Shared AI Gateway System Design
Full-Stack Product Engineering
ECS Express Mode-hosted AI gateway that turns job requirements into structured fit briefs using project metadata, prompt orchestration, and a public web interface.
App Runner -> ECS Express Mode | Structured LLM Workflow
Why This Project Matters
Shows applied AI product engineering plus infrastructure judgment: App Runner was fast for first launch, but ECS Express Mode became a better fit once deployment behavior, ingress policy, and service tuning mattered more.
Tech + Architecture Summary
- Tech: AWS ECS Express Mode, Next.js, TypeScript, LLM Integration, Prompt Orchestration, Structured Outputs
- Architecture: sharedaigateway.com -> ingress -> ECS Express Mode service -> requirement parser -> project evidence retrieval -> prompt orchestration -> structured fit brief UI.
Impact Metrics
- Migrated the AI service from AWS App Runner to ECS Express Mode to keep a lighter managed experience while gaining more explicit control over deployment behavior and service tuning.
- Converted unstructured job descriptions into normalized requirement signals and evidence-backed summaries for repeatable analysis.
- Grounded generated output in project metadata so responses reference concrete systems, system design docs, and shipped artifacts instead of generic claims.
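To make the "normalized requirement signals" idea concrete, here is a minimal sketch of requirement extraction. The function name, fields, and keyword list are illustrative assumptions, not the gateway's actual parser.

```typescript
// Hypothetical sketch of requirement normalization. The real gateway's
// parser, field names, and skill vocabulary are assumed, not shown here.
interface RequirementSignal {
  skill: string;     // normalized skill token, e.g. "typescript"
  weight: number;    // crude emphasis score derived from mention count
  required: boolean; // true when the posting phrases it as a hard requirement
}

// Tiny stand-in vocabulary; a real system would use a curated skill taxonomy.
const KNOWN_SKILLS = ["typescript", "next.js", "aws", "ecs", "llm"];

function parseRequirements(jobDescription: string): RequirementSignal[] {
  const text = jobDescription.toLowerCase();
  return KNOWN_SKILLS.filter((skill) => text.includes(skill)).map((skill) => ({
    skill,
    weight: text.split(skill).length - 1, // mention count as a rough weight
    required:
      text.includes(`${skill} required`) || text.includes(`must have ${skill}`),
  }));
}
```

Normalizing free-form postings into a small, typed signal shape like this is what makes downstream evidence retrieval and scoring repeatable across roles.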
Core Problem
Generate useful role-alignment summaries while keeping outputs grounded in real project evidence and preventing generic or inflated AI responses.
High-Level Architecture
```mermaid
graph LR
  Client[Web Client] --> Parser[Requirement Parser]
  Parser --> Retriever[Project Evidence Retrieval]
  Retriever --> Prompt[Prompt Orchestration Layer]
  Prompt --> LLM[LLM Analysis Service]
  LLM --> Formatter[Structured Response Formatter]
  Formatter --> Client
```
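The stage chain in the diagram can be wired as one async pipeline. The stage implementations below are stubs so the sketch is runnable; the deployed service's actual stage logic (and its LLM call) is assumed, not reproduced.

```typescript
// Illustrative pipeline wiring for the diagram's stages; every stage body
// here is a stub, standing in for the real parser/retriever/LLM calls.
interface FitBrief {
  summary: string;
  evidence: string[];
}

const parseStage = async (jd: string): Promise<string[]> =>
  jd.toLowerCase().split(/\W+/).filter(Boolean); // Requirement Parser

const retrieveStage = async (reqs: string[]): Promise<string[]> =>
  reqs.filter((r) => ["ecs", "typescript"].includes(r)); // pretend metadata match

const orchestrateStage = async (reqs: string[], ev: string[]): Promise<string> =>
  `Requirements: ${reqs.join(", ")}\nEvidence: ${ev.join(", ")}`; // Prompt Orchestration

const analyzeStage = async (prompt: string): Promise<string> =>
  `SUMMARY grounded in ${prompt.length}-char prompt`; // LLM Analysis (stubbed)

const formatStage = async (raw: string, ev: string[]): Promise<FitBrief> =>
  ({ summary: raw, evidence: ev }); // Structured Response Formatter

async function handleRequest(jobDescription: string): Promise<FitBrief> {
  const requirements = await parseStage(jobDescription);
  const evidence = await retrieveStage(requirements);
  const prompt = await orchestrateStage(requirements, evidence);
  const raw = await analyzeStage(prompt);
  return formatStage(raw, evidence);
}
```

Keeping each stage a separate async function mirrors the diagram's boundaries and makes it easy to swap or tune one stage (e.g. retrieval) without touching the rest.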
Production-Grade Capabilities
- Requirement parsing and evidence retrieval before prompt execution.
- Structured response generation designed for consistent skimmability.
- ECS Express Mode runtime with explicit service configuration, ingress behavior, and architecture documentation.
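The "structured response" and "evidence retrieval before prompt execution" capabilities above can be sketched as a typed brief section plus a grounding check. The field names and assessment labels are assumptions chosen for illustration, not the gateway's actual schema.

```typescript
// Hypothetical shape of one section of a structured fit brief; the real
// schema is assumed. The point: every positive claim must cite an artifact.
interface FitBriefSection {
  requirement: string;                        // normalized requirement label
  assessment: "strong" | "partial" | "gap";   // coarse fit rating
  evidence: string[];                         // concrete project artifacts cited
}

function isGrounded(section: FitBriefSection): boolean {
  // "gap" needs no evidence; "strong"/"partial" must cite at least one artifact,
  // which is what keeps summaries tied to real projects instead of free claims.
  return section.assessment === "gap" || section.evidence.length > 0;
}
```

Enforcing a check like this after generation (and regenerating or downgrading ungrounded sections) is one straightforward way to keep outputs evidence-backed.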
Engineering Decisions
- More aggressive prompt shaping improves consistency but can reduce flexibility for unusual job descriptions.
- Evidence grounding improves trust, but requires curated project metadata to avoid thin or repetitive output.
- Structured outputs improve readability, but can hide nuance if the scoring schema is too rigid.
- ECS Express Mode retains more managed convenience than standard ECS, but still demands clearer service boundaries and deployment discipline than App Runner did.
Behavioral + Impact Signals
- Treated applied AI as a product and systems problem, not just a model wrapper.
- Optimized for explainability and grounding over novelty-only interaction patterns.
- Kept the default portfolio narrative understandable even without using the tool.
Quality Guarantees
- Generated summaries remain tied to explicit project evidence instead of free-form unsupported claims.
- Output structure is stable enough for quick skim and comparison across roles.
- Live UI remains usable without forcing the reviewer through a multi-step workflow.
Recent Upgrades
- Migrated the public deployment from AWS App Runner to ECS Express Mode and cut over the live endpoint to sharedaigateway.com.
- Documented the migration path, why App Runner stopped fitting the workload, and where regular ECS remained the better choice for networking-heavy services.
Outcome Highlights
- Cut over the public deployment to sharedaigateway.com on an ECS Express Mode service path.
- Grounded outputs in concrete portfolio evidence to reduce hallucinated summaries.
- Exposed both the deployed app and source repository for technical review.