Mohammed Vepari

Backend & Infrastructure Engineer

B.Sc. Computer Science (2026) • 87% GPA • Architecting fault-tolerant, auto-scaling distributed systems using Go, Node.js, AWS, and Terraform.

Reliability | Observability | FinOps | Operational Excellence

Available for on-site, hybrid, or remote roles (EST/EDT).

Brampton, ON

Live Systems: 7

Design Docs: 8

Engineering Posts: 4

Role Alignment Snapshot

If someone is scanning for role fit, this is the shortest honest summary of where I add value.

Backend Engineer

Node.js APIs, queue-worker execution paths, Postgres/Redis persistence, and clear service boundaries.

Platform Engineer

Terraform-managed cloud topology, ALB/Fargate deployment paths, and infrastructure-first design decisions.

Infrastructure / SRE

Monitoring, DLQ recovery, autoscaling guardrails, load validation, and public observability surfaces.

New Grad SWE

B.Sc. Computer Science (2026) with shipped live systems, architecture docs, and incident-style technical writing.

[Engineering Identity]

B.Sc. Computer Science (2026) • 87% GPA • Architecting fault-tolerant, auto-scaling distributed systems using Go, Node.js, AWS, and Terraform.

Focused on SRE-grade platform delivery with infrastructure provisioning, cost-aware scaling, and production failure-recovery automation.

// LIVE_CANDIDATE_SIGNAL: PORTFOLIO_NODE_01
[0.0002] CANDIDATE_PROFILE: backend_infrastructure_engineer
[0.0005] ACTIVE_SIGNAL: flagship_systems_live
[0.0008] CORE_STACK: go_node_terraform_fargate
[0.0011] PRINCIPLES: frugality_operational_excellence
[0.0014] STATUS: available_onsite_hybrid_remote_est_edt

Architecture_And_Runbooks

Engineering writing focused on architecture decisions, incident response, and operating economics. This is the evidence layer behind the project metrics.

Migration Notes: App Runner to ECS for the AI Gateway and Go Load Balancer

Why both services outgrew App Runner, why the AI gateway moved to ECS Express Mode while the load balancer moved to regular ECS, and which AWS alternatives remained viable.

ADR: Fargate Spot and Firecracker Isolation Strategy

Why I selected AWS Fargate Spot + isolation-first execution boundaries over EC2 worker fleets for asynchronous payload workloads.

Post-Mortem: Surviving a 15k Req/Min Payload Spike

Failure timeline, root-cause analysis, and queue-decoupling response path used to stabilize throughput and memory pressure.

FinOps Report: 70% Compute Reduction with Spot

Cost and scaling analysis for elastic worker execution using spot capacity, queue-depth triggers, and recovery guardrails.
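
The queue-depth trigger and its guardrails can be sketched roughly as follows. This is an illustrative TypeScript sketch under stated assumptions, not code from the report: `ScalingPolicy`, `desiredWorkers`, and every threshold are hypothetical names and values. It sizes the pool from the backlog, scales out immediately, and limits how quickly it scales in.

```typescript
// Hypothetical policy shape; none of these fields come from the original project.
interface ScalingPolicy {
  jobsPerWorker: number; // backlog one worker is expected to absorb
  minWorkers: number;    // floor, so the queue is never unattended
  maxWorkers: number;    // ceiling, so a spike cannot run up unbounded cost
  maxStepDown: number;   // guardrail: most workers removed in one evaluation
}

// Return the worker count to request for the current queue depth.
function desiredWorkers(
  queueDepth: number,
  currentWorkers: number,
  policy: ScalingPolicy,
): number {
  const target = Math.ceil(queueDepth / policy.jobsPerWorker);
  const clamped = Math.min(policy.maxWorkers, Math.max(policy.minWorkers, target));
  if (clamped < currentWorkers) {
    // Scale in gradually so a brief lull does not drop capacity all at once.
    return Math.max(clamped, currentWorkers - policy.maxStepDown);
  }
  // Scale out (or hold) immediately.
  return clamped;
}

// Example policy with made-up numbers.
const examplePolicy: ScalingPolicy = {
  jobsPerWorker: 50,
  minWorkers: 1,
  maxWorkers: 20,
  maxStepDown: 2,
};
const burstTarget = desiredWorkers(1000, 2, examplePolicy);
const idleTarget = desiredWorkers(0, 10, examplePolicy);
```

Scaling out fast and in slowly is one common way to trade a little extra spend for fewer oscillations; the step size is a tuning choice.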

Engineering_Notes

Incident-style writeups that show architecture decisions, load behavior, bottleneck analysis, and measurable outcomes.

2026-04-01 · 11 min

Migrating from AWS App Runner to ECS: Why I Split the AI Gateway and Go Load Balancer by Workload Fit

Why App Runner was useful for first delivery, where it stopped fitting these workloads, how I moved the AI gateway to ECS Express Mode and the Go load balancer to regular ECS, and which AWS alternatives remained viable.

- Structure: problem, architecture, stress testing, resolution, impact

- Target teams: Amazon, Canonical, Veeva, Stripe

2026-03-07 · 9 min

Queue-First Cloud Code Execution: Preventing Worker Starvation Under Burst Load

How I shifted from request-coupled execution to queue-worker isolation to keep execution throughput stable under burst traffic.

- Structure: problem, architecture, stress testing, resolution, impact

- Target teams: Amazon, Stripe, DoorDash

2026-03-06 · 8 min

Go Load Balancer Failure Handling: Circuit Breakers, Hysteresis, and Bounded Retries

A breakdown of how I hardened a Go load balancer against backend flapping with health-aware routing and controlled retry behavior.

- Structure: problem, architecture, stress testing, resolution, impact

- Target teams: Canonical, DoorDash, Amazon
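
The hysteresis and bounded-retry ideas in this post can be sketched as follows. The service itself is written in Go; this TypeScript sketch only illustrates the pattern, and `BackendBreaker`, `routeWithRetries`, and the thresholds are hypothetical names rather than anything taken from the project. Hysteresis here means a backend needs more consecutive successes to rejoin the pool than failures to leave it, which damps flapping.

```typescript
type BreakerState = "closed" | "open";

// Tracks one backend's health with asymmetric trip/recover thresholds.
class BackendBreaker {
  private state: BreakerState = "closed";
  private failures = 0;  // consecutive failures while closed
  private successes = 0; // consecutive successes while open
  private readonly tripAfter: number;
  private readonly recoverAfter: number;

  constructor(tripAfter: number, recoverAfter: number) {
    this.tripAfter = tripAfter;
    this.recoverAfter = recoverAfter;
  }

  // Record one probe or request outcome and return the resulting state.
  record(ok: boolean): BreakerState {
    if (this.state === "closed") {
      this.failures = ok ? 0 : this.failures + 1;
      if (this.failures >= this.tripAfter) {
        this.state = "open";
        this.successes = 0;
      }
    } else {
      // A single failure resets the recovery streak.
      this.successes = ok ? this.successes + 1 : 0;
      if (this.successes >= this.recoverAfter) {
        this.state = "closed";
        this.failures = 0;
      }
    }
    return this.state;
  }

  get routable(): boolean {
    return this.state === "closed";
  }
}

interface Backend<T> {
  breaker: BackendBreaker;
  call: () => T;
}

// Bounded retry: attempt at most maxAttempts routable backends, then give up.
function routeWithRetries<T>(backends: Backend<T>[], maxAttempts: number): T {
  let attempts = 0;
  let lastError: unknown = new Error("no routable backend");
  for (const backend of backends) {
    if (attempts >= maxAttempts) break;
    if (!backend.breaker.routable) continue; // open breakers cost no attempt
    attempts++;
    try {
      const result = backend.call();
      backend.breaker.record(true);
      return result;
    } catch (err) {
      backend.breaker.record(false);
      lastError = err;
    }
  }
  throw lastError;
}
```

Capping attempts keeps a failing request from multiplying load across the pool, and skipping open breakers means retries are only spent on backends that might succeed.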

Core Infrastructure Engineering

Production Systems Portfolio

Core infrastructure systems and reliability engineering projects. All systems are provisioned via Infrastructure as Code (Terraform), instrumented with deep observability pipelines, and rigorously tested through chaos drills and load validation.

Engineering Behavioral Signals

- Built systems with explicit failure handling, retries, and reliability controls.

- Documented architecture tradeoffs and scaling decisions in each project deep dive.

- Demonstrated deployment + observability readiness with public demos and live metrics.

Distributed Systems & Cloud APIs

Reliability-focused backend systems with routing, failover, and API-driven operations.

PgBouncer + mTLS | 10k+ Regional Write Spike Validation

NetPulse: Distributed Uptime Monitoring SaaS

Demonstrates secure and high-concurrency monitoring architecture with reliability controls that stay stable under aggressive regional write spikes.

Tech: Next.js, Node.js, PostgreSQL, PgBouncer, mTLS, Docker

Architecture: mTLS regional checkers -> queue -> monitoring engine -> PgBouncer + Postgres/Redis -> status dashboard + incident lifecycle.

- Put PgBouncer in front of PostgreSQL to pool connections, preventing connection exhaustion in load tests with 10,000+ concurrent regional worker writes.

- Enforced mutual TLS (mTLS) between the distributed regional checkers and the central monitoring engine, so each side authenticates the other before any data is exchanged (a zero-trust posture).

Update: Added a dedicated registration flow with Cognito email verification and a full login flow for production-style onboarding.

App Runner -> Regular ECS | Go pprof + Consul | Prometheus/Grafana

Mini Load Balancer (Go)

Shows when a networking-heavy service outgrows App Runner and needs regular ECS service-level control for proxying, health management, and deployment behavior.

Tech: AWS ECS, Go, Consul, Prometheus, Grafana, pprof

Architecture: miniloadbalancer.io -> ALB/TLS ingress -> regular ECS service running Go proxy + control plane -> Consul discovery -> backend pool -> Prometheus/Grafana telemetry.

- Migrated the service from AWS App Runner to regular ECS so service rollout policy, health-probe cadence, task behavior, and ingress wiring could be controlled directly.

- Profiled the proxy with Go pprof (heap and CPU) to find and remove memory-allocation bottlenecks and reduce goroutine scheduling overhead in high-throughput TCP proxying.

Update: Migrated the public deployment from AWS App Runner to regular ECS and cut over the live endpoint to miniloadbalancer.io.

Scaling & Messaging Systems

Queue-based and real-time data pipelines designed for throughput, isolation, and safe execution.

Fargate Spot FinOps + DLQ Recovery | 15k+ Req/Min Burst Tests

Cloud Code Execution Environment

Shows SRE-first backend platform engineering where autoscaling, queue durability, and cost-efficiency are designed as first-class requirements.

Tech: Node.js, AWS Fargate, EventBridge, Terraform (IaC), FinOps

Architecture: ALB execution ingress -> queue and DLQ lanes -> Fargate Spot worker pool -> result store -> recovery scheduler via EventBridge.

- Provisioned an elastic worker pool on AWS Fargate Spot capacity with Terraform, reducing compute costs by 70% for asynchronous payload processing.

- Built a self-healing queue path using Redis-backed dead-letter queues (DLQs) and scheduled AWS EventBridge triggers, recovering 100% of payloads during staged network-partition drills.

Update: Introduced a Terraform-managed dual-endpoint model that separates control-plane traffic from execution API traffic.
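
The dead-letter recovery loop can be illustrated with a small in-memory sketch. This is an assumption-heavy simplification: the real system uses Redis queues and EventBridge schedules, while `QueueWithDlq`, `drain`, and `redrive` are hypothetical names that use plain arrays. Jobs that keep failing are parked in the DLQ after a bounded number of attempts, and a scheduled sweep moves them back once the fault has cleared.

```typescript
interface Job {
  id: string;
  payload: string;
  attempts: number;
}

class QueueWithDlq {
  readonly main: Job[] = [];
  readonly dlq: Job[] = [];
  readonly done: string[] = [];
  private readonly maxAttempts: number;

  constructor(maxAttempts: number) {
    this.maxAttempts = maxAttempts;
  }

  enqueue(id: string, payload: string): void {
    this.main.push({ id, payload, attempts: 0 });
  }

  // Process the main queue; a failing job is retried until it has used
  // maxAttempts, then parked in the DLQ instead of being dropped.
  drain(handler: (job: Job) => void): void {
    let job: Job | undefined;
    while ((job = this.main.shift()) !== undefined) {
      job.attempts++;
      try {
        handler(job);
        this.done.push(job.id);
      } catch {
        if (job.attempts >= this.maxAttempts) {
          this.dlq.push(job);
        } else {
          this.main.push(job);
        }
      }
    }
  }

  // Recovery sweep, standing in for what a scheduled trigger would invoke:
  // move every dead-lettered job back to main with a fresh attempt budget.
  redrive(): number {
    const moved = this.dlq.splice(0);
    for (const job of moved) {
      this.main.push({ ...job, attempts: 0 });
    }
    return moved.length;
  }
}
```

Because failed jobs are parked rather than discarded, recovery becomes a matter of re-running the sweep after the partition heals.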

Live Dashboard | Real-Time Signal View

Real-Time Transit Telemetry Dashboard

Demonstrates real-time data engineering, stream correctness, and observability-first operations reporting.

Tech: Dashboard, Telemetry, JavaScript, AWS S3, Data Visualization

Architecture: Transit data feeds -> telemetry processor -> event store -> live dashboard + WebSocket broadcaster + alert hooks.

- WebSocket telemetry updates refreshed routes within ~1 second under normal load.

- Idempotency + late-event correction eliminated duplicate state writes in replay testing.

Update: Added event-time ordering, idempotency dedupe, and late-arrival correction for robust stream semantics.
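
One way to read these stream semantics is sketched below. It is a minimal illustration, not the dashboard's code: `TransitEvent`, `VehicleState`, and the field names are assumptions. Duplicate deliveries are dropped by event ID, and an event that arrives late (older event time than the stored state) is ignored rather than overwriting newer data.

```typescript
interface TransitEvent {
  eventId: string;   // unique per emitted event; used as the idempotency key
  vehicleId: string;
  eventTime: number; // milliseconds since epoch, stamped at the source
  position: string;
}

class VehicleState {
  private readonly seen = new Set<string>();
  private readonly latest = new Map<string, TransitEvent>();
  writes = 0; // number of state writes actually performed

  // Apply one event; returns true only when the current state changed.
  apply(ev: TransitEvent): boolean {
    if (this.seen.has(ev.eventId)) {
      return false; // duplicate delivery: already processed
    }
    this.seen.add(ev.eventId);

    const current = this.latest.get(ev.vehicleId);
    if (current !== undefined && current.eventTime >= ev.eventTime) {
      return false; // late arrival: older than what is already stored
    }

    this.latest.set(ev.vehicleId, ev);
    this.writes++;
    return true;
  }

  position(vehicleId: string): string | undefined {
    return this.latest.get(vehicleId)?.position;
  }
}
```

Ordering by the source's event time instead of arrival time is what lets a replayed or delayed feed converge to the same state.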

Algorithms & Visualization

Algorithmic systems that surface routing, optimization, and tradeoff decisions clearly.

Dijkstra + A* | Interactive Graph Simulation

Telecom Network Routing Visualizer

Makes core CS routing algorithms legible to reviewers while demonstrating algorithmic correctness under dynamic network conditions.

Tech: React, TypeScript, Vite, Algorithms, Visualization

Architecture: React visualization layer -> weighted graph model -> Dijkstra/A* engine -> congestion simulator -> route quality metrics.

- Supports deterministic Dijkstra and heuristic A* route comparisons on identical graph states.

- Recomputes route quality metrics live as congestion weights change.

Update: Initial release of telecom network routing visualizer with core graph/routing primitives.
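
The Dijkstra versus A* comparison on one graph state can be sketched with a single search routine, shown here as an illustration rather than the visualizer's actual engine. `Graph`, `shortestPath`, and the demo data are hypothetical. With a zero heuristic the routine behaves as Dijkstra; with a consistent heuristic it behaves as A* and returns the same cost while typically expanding fewer nodes.

```typescript
type Graph = Map<string, { to: string; weight: number }[]>;

interface SearchResult {
  cost: number;
  path: string[];
  expanded: number; // nodes taken off the frontier
}

// Best-first search over non-negative weights. A linear scan replaces a heap
// to keep the sketch short; the heuristic is assumed to be consistent.
function shortestPath(
  graph: Graph,
  start: string,
  goal: string,
  heuristic: (node: string) => number,
): SearchResult | undefined {
  const dist = new Map<string, number>([[start, 0]]);
  const prev = new Map<string, string>();
  const open = new Set<string>([start]);
  const closed = new Set<string>();
  let expanded = 0;

  while (open.size > 0) {
    // Pick the frontier node with the lowest dist + heuristic.
    let current: string | undefined;
    let best = Infinity;
    for (const node of open) {
      const f = (dist.get(node) ?? Infinity) + heuristic(node);
      if (f < best) {
        best = f;
        current = node;
      }
    }
    if (current === undefined) break;
    open.delete(current);
    closed.add(current);
    expanded++;

    if (current === goal) {
      const path = [goal];
      let step = goal;
      while (prev.has(step)) {
        step = prev.get(step)!;
        path.unshift(step);
      }
      return { cost: dist.get(goal)!, path, expanded };
    }

    for (const edge of graph.get(current) ?? []) {
      if (closed.has(edge.to)) continue;
      const candidate = dist.get(current)! + edge.weight;
      if (candidate < (dist.get(edge.to) ?? Infinity)) {
        dist.set(edge.to, candidate);
        prev.set(edge.to, current);
        open.add(edge.to);
      }
    }
  }
  return undefined; // goal unreachable
}

// Demo graph with made-up weights; E and F form a dead-end branch.
const demoGraph: Graph = new Map([
  ["A", [{ to: "B", weight: 1 }, { to: "C", weight: 4 }, { to: "E", weight: 1 }]],
  ["B", [{ to: "C", weight: 1 }, { to: "D", weight: 5 }]],
  ["C", [{ to: "D", weight: 1 }]],
  ["E", [{ to: "F", weight: 1 }]],
]);
const demoEstimate: Record<string, number> = { A: 3, B: 2, C: 1, D: 0, E: 10, F: 10 };

const viaDijkstra = shortestPath(demoGraph, "A", "D", () => 0);
const viaAStar = shortestPath(demoGraph, "A", "D", (n) => demoEstimate[n] ?? 0);
```

Running both searches against the same `Graph` object is what makes the comparison fair: only the heuristic differs, so any gap in `expanded` comes from the guidance alone.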

Full-Stack Product Engineering

End-to-end applications with product UX, backend workflows, and documented design decisions.

Search Performance Improved by 90%

moveYSplash: Social Platform Prototype

Demonstrates end-to-end product delivery, responsive UX, and measurable query optimization in a live academic project.

Tech: Next.js, TypeScript, Tailwind CSS, Supabase, PostgreSQL

Architecture: Next.js UI + app APIs -> Supabase Auth/Postgres -> feed composer + indexed search pipeline.

- Search latency improved by ~90% after SQL query and indexing optimization.

- Maintained responsive interaction across mobile and desktop breakpoints.

Update: Published a live deployment link to make academic project outcomes directly reviewable.

App Runner -> ECS Express Mode | Structured LLM Workflow

Shared AI Gateway

Shows applied AI product engineering plus infrastructure judgment: App Runner was fast for first launch, but ECS Express Mode became a better fit once deployment behavior, ingress policy, and service tuning mattered more.

Tech: AWS ECS Express Mode, Next.js, TypeScript, LLM Integration, Prompt Orchestration, Structured Outputs

Architecture: sharedaigateway.com -> ingress -> ECS Express Mode service -> requirement parser -> project evidence retrieval -> prompt orchestration -> structured fit brief UI.

- Migrated the AI service from AWS App Runner to ECS Express Mode to keep a lighter managed experience while gaining more explicit control over deployment behavior and service tuning.

- Converted unstructured job descriptions into normalized requirement signals and evidence-backed summaries for repeatable analysis.

Update: Migrated the public deployment from AWS App Runner to ECS Express Mode and cut over the live endpoint to sharedaigateway.com.

Capstone Delivery | Final Grade: A-

Online Tutoring Management System (Capstone)

Shows product ownership from user workflow design to backend persistence and secure scheduling controls.

Tech: Angular, React, Node.js, SQL, REST API

Architecture: Angular/React client -> Node.js API gateway -> auth + scheduling services -> SQL persistence layer.

- Delivered the capstone scope on schedule with a final grade of A-.

- Implemented authentication-gated scheduling with consistent SQL-backed session records.

Recent Engineering Upgrades

Deployed April 2026

Recent platform and portfolio updates with direct proof links so reviewers can verify shipped improvements without hunting through the site.

Deployment Architecture

ECS Migration Reflected for AI and Load Balancer

Shared AI Gateway and Mini Load Balancer now document the App Runner to ECS migration path, public domain cutovers, and why workload fit drove ECS Express Mode for AI versus regular ECS for the load balancer.

Homepage Structure

Core Infrastructure Positioned First

Core infrastructure systems now appear ahead of non-engineering history so reviewers see production work first.

Technical Positioning

Infrastructure Skills Map Refreshed

Technical skills now foreground Go, AWS, Terraform, Docker, Prometheus, Grafana, and Redis/BullMQ reliability workflows.

Access Paths

Live Portfolio Apps Centralized

Contact section now includes direct live links for NetPulse, Cloud Code Execution, Transit Telemetry, and Mini Load Balancer.

Project Evidence

Project Metrics Upgraded

Project cards now emphasize FinOps, DLQ recovery, PgBouncer/mTLS, and pprof-driven optimization outcomes.

Narrative Structure

Runbooks Elevated in Homepage Flow

Architecture runbooks and incident-style writing now sit near the top of the homepage to keep the portfolio centered on engineering proof.

Technical Skills

Systems_Stack_Map

Languages & Runtime

Go

Load balancing, concurrency control, and runtime profiling.

ACTIVE

Node.js

Distributed API services, queue workers, and async processing.

ACTIVE

TypeScript

Type-safe backend and frontend platform development.

ACTIVE

Python

Scripting, logic implementation, and systems tooling support.

STRONG

Java

Backend fundamentals and object-oriented systems design.

STRONG

SQL

Schema design, indexing, and query optimization.

ACTIVE

Infrastructure & Cloud

AWS (Fargate, ALB, VPC)

Cloud-native deployment and service networking.

ACTIVE

Terraform (IaC)

Repeatable infrastructure provisioning and change control.

ACTIVE

Docker

Containerized service packaging and runtime consistency.

ACTIVE

Linux / cgroups

Isolation and resource-bound execution constraints.

PROJECT

Observability & Reliability

Prometheus

Service metrics for traffic and health behavior visibility.

ACTIVE

Grafana

Operational dashboards for failure and scaling diagnostics.

ACTIVE

Go pprof

Heap and CPU profiling for runtime bottleneck elimination.

ACTIVE

Redis / BullMQ DLQ

Queue durability and dead-letter recovery workflows.

ACTIVE

mTLS + Incident Controls

Zero-trust communication and alert lifecycle hardening.

ACTIVE

Data & Web Platforms

PostgreSQL / MySQL

Relational data modeling and persistence.

ACTIVE

React / Next.js

Dashboards and technical web interfaces.

ACTIVE

REST / WebSocket APIs

Realtime and request-response service integration.

ACTIVE

Supabase

Rapid full-stack data and auth integration.

PROJECT

Angular

Capstone implementation for complex user workflows.

PROJECT

Experience

Operational ownership, reliability awareness, and delivery discipline carried into production-style project work.

Portfolio Engineering

Independent Systems & Infrastructure Developer

2024 - Present

- Architected and deployed production-grade distributed systems with live demos, system design docs, and measurable reliability metrics.

- Provisioned AWS cloud environments via Terraform, including ALB-routed services and queue-worker execution patterns.

- Implemented cost-aware autoscaling patterns, DLQ recovery workflows, and observability instrumentation for failure analysis.

- Published architecture decision records and incident-style post-mortems for public technical review.

Amazon Fulfillment

Fulfillment Associate

2016 - Present

- Managed high-volume inventory processing while consistently meeting critical path deadlines in a fast-paced logistics environment.

- Maintained 99% accuracy in order fulfillment through strict quality checks and proactive defect identification.

- Adapted quickly to shifting operational priorities and helped clear major backlogs during peak seasonal demand.

- Recognized for reliability, punctuality, and safety compliance across long-term tenure.

Academic_Foundation

Algoma University

Honours Bachelor of Computer Science

Expected Graduation: 2026
GPA: 87%
// Graduating with Honours

George Brown College

Computer Programming and Analysis

2022 - 2023
GPA: 3.72/4.0
// Graduated with Honours

Engineer_Profile

Systems Mindset with Product Delivery Discipline

B.Sc. Computer Science (2026) • 87% GPA • Architecting fault-tolerant, auto-scaling distributed systems using Go, Node.js, AWS, and Terraform. I focus on building systems that are both understandable and resilient under stress, then presenting that evidence in ways hiring teams can verify quickly.

Education

Honours BCS, Expected 2026

Primary Focus

Distributed Systems, Platform APIs, Reliability

Direct Contact

mvepari@algomau.ca

Operating Principles

- Design for failure first, then optimize for speed.

- Publish measurable impact, not vague implementation claims.

- Document architecture and tradeoffs so teams can reason quickly.

Contact

I am actively seeking full-time New Grad Software Engineer roles (2026). I have shipped public, production-style projects with live demos and source code. The fastest way to reach me is by email.