Engineering Blog

Architecture Breakdowns and Incident-Style Deep Dives

These posts follow a strict arc: problem, architecture, failure point, resolution, and business impact. They are written for engineering hiring teams that want production-grade thinking, not tutorial-style demos.

2026-06-08 · 12 min read

Portfolio App Upgrade Log: Queues, ECS, Pooling, and Evidence-Grounded AI

How I upgraded the portfolio apps from isolated demos into reviewable production-style systems with clearer runtime boundaries, system design docs, and live evidence paths.

Veeva, Amazon, Stripe, Canonical

- Problem under hard constraints

- Architecture data flow

- Breaking point and bottleneck fix

2026-04-01 · 11 min read

Migrating from AWS App Runner to ECS: Why I Split AI Gateway Platform and Edge Balancer by Workload Fit

Why App Runner was useful for first delivery, where it stopped fitting these workloads, how I moved AI Gateway Platform to ECS Express Mode and Edge Balancer to regular ECS, and which AWS alternatives remained viable.

Amazon, Canonical, Veeva, Stripe

- Problem under hard constraints

- Architecture data flow

- Breaking point and bottleneck fix

2026-03-07 · 9 min read

Queue-First Cloud Sandbox: Preventing Worker Starvation Under Burst Load

How I shifted from request-coupled execution to queue-worker isolation to keep execution throughput stable under burst traffic.

Amazon, Stripe, DoorDash

- Problem under hard constraints

- Architecture data flow

- Breaking point and bottleneck fix

2026-03-06 · 8 min read

Edge Balancer Failure Handling: Circuit Breakers, Hysteresis, and Bounded Retries

A breakdown of how I hardened a Go-based Edge Balancer against backend flapping with health-aware routing and controlled retry behavior.

Canonical, DoorDash, Amazon

- Problem under hard constraints

- Architecture data flow

- Breaking point and bottleneck fix

2026-03-05 · 7 min read

NetPulse Incident Noise Reduction: Multi-Region Checks Without Alert Flooding

How I designed incident lifecycle rules and alert deduplication in NetPulse to keep uptime signals trustworthy.

Veeva, Intuit, GM, Amazon

- Problem under hard constraints

- Architecture data flow

- Breaking point and bottleneck fix

How to Add a New Post

1. Open content/blog/index.ts and add a new object in blogPosts.

2. Follow this structure exactly: hook, architecture, stressTest, bottleneckResolution, businessImpact.

3. Add optional stress-test screenshots to public/ and set stressTest.screenshotUrl.
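
The steps above can be sketched as a single entry. This is a minimal illustration, not the actual contents of content/blog/index.ts: the BlogPost and StressTest types, the slug, title, date, and readMinutes fields, the missingSections helper, and the screenshot path are all assumptions made for the example. Only blogPosts, the five section names, and stressTest.screenshotUrl come from the steps above.

```typescript
// Hypothetical types: the real definitions in content/blog/index.ts may differ.
interface StressTest {
  summary: string;
  // Optional. Assumed here to be a path to an image placed under public/.
  screenshotUrl?: string;
}

interface BlogPost {
  slug: string;        // assumed field
  title: string;       // assumed field
  date: string;        // assumed field, ISO format
  readMinutes: number; // assumed field
  hook: string;
  architecture: string;
  stressTest: StressTest;
  bottleneckResolution: string;
  businessImpact: string;
}

// The five sections every post is expected to carry, in arc order.
const requiredSections = [
  "hook",
  "architecture",
  "stressTest",
  "bottleneckResolution",
  "businessImpact",
] as const;

// Hypothetical helper: lists any required section that is missing or empty.
function missingSections(post: Partial<BlogPost>): string[] {
  return requiredSections.filter(
    (key) => post[key] === undefined || post[key] === "",
  );
}

// Placeholder entry showing the structure; replace every value with real content.
const examplePost: BlogPost = {
  slug: "example-post",
  title: "Example Post Title",
  date: "2026-01-01",
  readMinutes: 8,
  hook: "The problem and the hard constraints it had to be solved under.",
  architecture: "How data flows through the system.",
  stressTest: {
    summary: "Where the system broke under load.",
    screenshotUrl: "/example-stress-test.png", // assumed path convention
  },
  bottleneckResolution: "What changed to remove the bottleneck.",
  businessImpact: "What the fix meant for users or the business.",
};

const blogPosts: BlogPost[] = [examplePost];
```

Running missingSections over each entry in blogPosts before publishing is one way to catch a post that skips a part of the arc.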