Shared AI Gateway System Design

Full-Stack Product Engineering

An AI gateway hosted on ECS Express Mode that turns job requirements into structured fit briefs using project metadata, prompt orchestration, and a public web interface.

App Runner -> ECS Express Mode | Structured LLM Workflow

Why This Project Matters

Demonstrates applied AI product engineering plus infrastructure judgment: App Runner was fast for the first launch, but ECS Express Mode became a better fit once deployment behavior, ingress policy, and service tuning mattered more.

Tech + Architecture Summary

  • Tech: AWS ECS Express Mode, Next.js, TypeScript, LLM Integration, Prompt Orchestration, Structured Outputs
  • Architecture: sharedaigateway.com -> ingress -> ECS Express Mode service -> requirement parser -> project evidence retrieval -> prompt orchestration -> structured fit brief UI.
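The front half of that path can be sketched as a typed pipeline. This is a minimal illustration only; the type and function names (`RequirementSignal`, `parseRequirements`, `retrieveEvidence`) are assumptions, not the production code:

```typescript
// Hypothetical types and stages for the gateway pipeline (names are assumptions).
interface RequirementSignal { skill: string; weight: number }
interface Evidence { project: string; artifact: string }

// Normalize a raw job description into weighted requirement signals.
function parseRequirements(jobDescription: string): RequirementSignal[] {
  return jobDescription
    .split(/[,\n]/)
    .map((s) => s.trim())
    .filter(Boolean)
    .map((skill) => ({ skill, weight: 1 }));
}

// Look up project evidence for one signal from a pre-built metadata index.
function retrieveEvidence(
  signal: RequirementSignal,
  index: Map<string, Evidence[]>,
): Evidence[] {
  return index.get(signal.skill.toLowerCase()) ?? [];
}
```

In the live service these stages would sit behind the ingress and feed the prompt orchestration layer.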

Impact Metrics

  • Migrated the AI service from AWS App Runner to ECS Express Mode to keep a lighter managed experience while gaining more explicit control over deployment behavior and service tuning.
  • Converted unstructured job descriptions into normalized requirement signals and evidence-backed summaries for repeatable analysis.
  • Grounded generated output in project metadata so responses reference concrete systems, system design docs, and shipped artifacts instead of generic claims.

Core Problem

Generate useful role-alignment summaries while keeping outputs grounded in real project evidence and preventing generic or inflated AI responses.

High-Level Architecture

```mermaid
graph LR
  Client[Web Client]-->Parser[Requirement Parser]
  Parser-->Retriever[Project Evidence Retrieval]
  Retriever-->Prompt[Prompt Orchestration Layer]
  Prompt-->LLM[LLM Analysis Service]
  LLM-->Formatter[Structured Response Formatter]
  Formatter-->Client
```

Production-Grade Capabilities

  • Requirement parsing and evidence retrieval before prompt execution.
  • Structured response generation designed for consistent skimmability.
  • ECS Express Mode runtime with explicit service configuration, ingress behavior, and architecture documentation.
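One way the structured-response capability can stay consistent is a schema check on model output before it reaches the UI. This is a hedged sketch; the `FitBrief` shape here is an assumption, not the deployed schema:

```typescript
// Assumed fit-brief schema; the live service's schema may differ.
interface FitBrief {
  role: string;
  matches: { requirement: string; evidence: string[] }[];
  gaps: string[];
}

// Validate raw LLM output so only schema-conforming briefs reach the UI.
function parseFitBrief(raw: string): FitBrief | null {
  try {
    const data = JSON.parse(raw);
    if (
      typeof data.role !== "string" ||
      !Array.isArray(data.matches) ||
      !Array.isArray(data.gaps)
    ) {
      return null; // reject output that drifted from the schema
    }
    return data as FitBrief;
  } catch {
    return null; // reject non-JSON output entirely
  }
}
```

Rejecting rather than repairing malformed output keeps the skim layout stable across roles.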

Engineering Decisions

  • More aggressive prompt shaping improves consistency but can reduce flexibility for unusual job descriptions.
  • Evidence grounding improves trust, but requires curated project metadata to avoid thin or repetitive output.
  • Structured outputs improve readability, but can hide nuance if the scoring schema is too rigid.
  • ECS Express Mode retains more managed convenience than standard ECS, but still demands clearer service boundaries and more deployment discipline than App Runner.
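The evidence-grounding trade-off can be illustrated with a prompt builder that injects only curated project metadata. A sketch under assumptions; `ProjectMeta` and the instruction wording are illustrative, not the production prompts:

```typescript
// Hypothetical curated project metadata record (an assumption for illustration).
interface ProjectMeta { name: string; summary: string; artifacts: string[] }

// Build a prompt that restricts the model to the supplied evidence.
function buildGroundedPrompt(jobText: string, projects: ProjectMeta[]): string {
  const evidence = projects
    .map((p) => `- ${p.name}: ${p.summary} (artifacts: ${p.artifacts.join(", ")})`)
    .join("\n");
  return [
    "Summarize role fit using ONLY the evidence below; cite project names.",
    "Evidence:",
    evidence,
    "Job description:",
    jobText,
  ].join("\n");
}
```

A curated list keeps claims verifiable, but it also means thin metadata yields thin briefs, which is exactly the trade-off the evidence-grounding decision accepts.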

Behavioral + Impact Signals

  • Treated applied AI as a product and systems problem, not just a model wrapper.
  • Optimized for explainability and grounding over novelty-only interaction patterns.
  • Kept the default portfolio narrative understandable even without using the tool.

Quality Guarantees

  • Generated summaries remain tied to explicit project evidence instead of free-form unsupported claims.
  • Output structure is stable enough for quick skim and comparison across roles.
  • Live UI remains usable without forcing the reviewer through a multi-step workflow.

Recent Upgrades

  • Migrated the public deployment from AWS App Runner to ECS Express Mode and cut over the live endpoint to sharedaigateway.com.
  • Documented the migration path, why App Runner stopped fitting the workload, and where regular ECS remained the better choice for networking-heavy services.

Outcome Highlights

  • Cut over the public deployment to sharedaigateway.com on an ECS Express Mode service path.
  • Grounded outputs in concrete portfolio evidence to reduce hallucinated summaries.
  • Exposed both the deployed app and source repository for technical review.