Shared AI Gateway System Design
Full-Stack Product Engineering
ECS Express Mode-hosted AI gateway that turns job requirements into structured fit briefs using project metadata, prompt orchestration, and a public web interface.
App Runner -> ECS Express Mode | Structured LLM Workflow
Why This Project Matters
Shows applied AI product engineering plus infrastructure judgment: App Runner was fast for first launch, but ECS Express Mode became a better fit once deployment behavior, ingress policy, and service tuning mattered more.
Tech + Architecture Summary
- Tech: AWS ECS Express Mode, Next.js, TypeScript, LLM Integration, Prompt Orchestration, Structured Outputs
- Architecture: sharedaigateway.com -> ingress -> ECS Express Mode service -> requirement parser -> project evidence retrieval -> prompt orchestration -> structured fit brief UI.
Impact Metrics
- Migrated the AI service from AWS App Runner to ECS Express Mode to keep a lighter managed experience while gaining more explicit control over deployment behavior and service tuning.
- Converted unstructured job descriptions into normalized requirement signals and evidence-backed summaries for repeatable analysis.
- Grounded generated output in project metadata so responses reference concrete systems, system design docs, and shipped artifacts instead of generic claims.
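To make the "normalized requirement signals" idea concrete, here is a minimal sketch of requirement extraction. The function name, fields, and keyword list are illustrative assumptions, not the gateway's actual parser.

```typescript
// Hypothetical sketch of requirement normalization. The real gateway's
// parser, field names, and skill vocabulary are assumed, not shown here.
interface RequirementSignal {
  skill: string;     // normalized skill token, e.g. "typescript"
  weight: number;    // crude emphasis score derived from mention count
  required: boolean; // true when the posting phrases it as a hard requirement
}

// Tiny stand-in vocabulary; a real system would use a curated skill taxonomy.
const KNOWN_SKILLS = ["typescript", "next.js", "aws", "ecs", "llm"];

function parseRequirements(jobDescription: string): RequirementSignal[] {
  const text = jobDescription.toLowerCase();
  return KNOWN_SKILLS.filter((skill) => text.includes(skill)).map((skill) => ({
    skill,
    weight: text.split(skill).length - 1, // mention count as a rough weight
    required:
      text.includes(`${skill} required`) || text.includes(`must have ${skill}`),
  }));
}
```

Normalizing free-form postings into a small, typed signal shape like this is what makes downstream evidence retrieval and scoring repeatable across roles.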
Core Problem
Generate useful role-alignment summaries while keeping outputs grounded in real project evidence and preventing generic or inflated AI responses.
High-Level Architecture
```mermaid
graph LR
  Client[Web Client] --> Parser[Requirement Parser]
  Parser --> Retriever[Project Evidence Retrieval]
  Retriever --> Prompt[Prompt Orchestration Layer]
  Prompt --> LLM[LLM Analysis Service]
  LLM --> Formatter[Structured Response Formatter]
  Formatter --> Client
```
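The stage chain in the diagram can be wired as one async pipeline. The stage implementations below are stubs so the sketch is runnable; the deployed service's actual stage logic (and its LLM call) is assumed, not reproduced.

```typescript
// Illustrative pipeline wiring for the diagram's stages; every stage body
// here is a stub, standing in for the real parser/retriever/LLM calls.
interface FitBrief {
  summary: string;
  evidence: string[];
}

const parseStage = async (jd: string): Promise<string[]> =>
  jd.toLowerCase().split(/\W+/).filter(Boolean); // Requirement Parser

const retrieveStage = async (reqs: string[]): Promise<string[]> =>
  reqs.filter((r) => ["ecs", "typescript"].includes(r)); // pretend metadata match

const orchestrateStage = async (reqs: string[], ev: string[]): Promise<string> =>
  `Requirements: ${reqs.join(", ")}\nEvidence: ${ev.join(", ")}`; // Prompt Orchestration

const analyzeStage = async (prompt: string): Promise<string> =>
  `SUMMARY grounded in ${prompt.length}-char prompt`; // LLM Analysis (stubbed)

const formatStage = async (raw: string, ev: string[]): Promise<FitBrief> =>
  ({ summary: raw, evidence: ev }); // Structured Response Formatter

async function handleRequest(jobDescription: string): Promise<FitBrief> {
  const requirements = await parseStage(jobDescription);
  const evidence = await retrieveStage(requirements);
  const prompt = await orchestrateStage(requirements, evidence);
  const raw = await analyzeStage(prompt);
  return formatStage(raw, evidence);
}
```

Keeping each stage a separate async function mirrors the diagram's boundaries and makes it easy to swap or tune one stage (e.g. retrieval) without touching the rest.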
Production-Grade Capabilities
- Requirement parsing and evidence retrieval before prompt execution.
- Structured response generation designed for consistent skimmability.
- ECS Express Mode runtime with explicit service configuration, ingress behavior, and architecture documentation.
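The "structured response" and "evidence retrieval before prompt execution" capabilities above can be sketched as a typed brief section plus a grounding check. The field names and assessment labels are assumptions chosen for illustration, not the gateway's actual schema.

```typescript
// Hypothetical shape of one section of a structured fit brief; the real
// schema is assumed. The point: every positive claim must cite an artifact.
interface FitBriefSection {
  requirement: string;                        // normalized requirement label
  assessment: "strong" | "partial" | "gap";   // coarse fit rating
  evidence: string[];                         // concrete project artifacts cited
}

function isGrounded(section: FitBriefSection): boolean {
  // "gap" needs no evidence; "strong"/"partial" must cite at least one artifact,
  // which is what keeps summaries tied to real projects instead of free claims.
  return section.assessment === "gap" || section.evidence.length > 0;
}
```

Enforcing a check like this after generation (and regenerating or downgrading ungrounded sections) is one straightforward way to keep outputs evidence-backed.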
Engineering Decisions
- More aggressive prompt shaping improves consistency but can reduce flexibility for unusual job descriptions.
- Evidence grounding improves trust, but requires curated project metadata to avoid thin or repetitive output.
- Structured outputs improve readability, but can hide nuance if the scoring schema is too rigid.
- ECS Express Mode retains more managed convenience than standard ECS, but still demands clearer service boundaries and deployment discipline than App Runner did.
Behavioral + Impact Signals
- Treated applied AI as a product and systems problem, not just a model wrapper.
- Optimized for explainability and grounding over novelty-only interaction patterns.
- Kept the default portfolio narrative understandable even without using the tool.
Quality Guarantees
- Generated summaries remain tied to explicit project evidence instead of free-form unsupported claims.
- Output structure is stable enough for quick skim and comparison across roles.
- Live UI remains usable without forcing the reviewer through a multi-step workflow.
Recent Upgrades
- Migrated the public deployment from AWS App Runner to ECS Express Mode and cut over the live endpoint to sharedaigateway.com.
- Documented the migration path, why App Runner stopped fitting the workload, and where regular ECS remained the better choice for networking-heavy services.
Outcome Highlights
- Cut over the public deployment to sharedaigateway.com on an ECS Express Mode service path.
- Grounded outputs in concrete portfolio evidence to reduce hallucinated summaries.
- Exposed both the deployed app and source repository for technical review.