Non-agentic: progressive refinement of a Production AWS API service spec (compiling-architecture workflow)

An end-to-end demo of the compiling-architecture skill workflow for a typical AWS-hosted production API service written in JavaScript. It follows the README's five-section Progressive-Refinement guidance: one signal per spec change, with a compile after each change so the reader sees what the compiler does with the new signal.

Steps 1-4: the author writes the bare-minimum spec (cloud, language, platform, availability), reviews the baseline compile, then adds the latency NFR. The compiler REJECTS the spec because caching-required-low-latency activates on p99 <= 100ms and requires features.caching: true. The annotated body and the Suggestions block tell the author exactly which threshold fired and which constraint to flip.
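
The rejection described above can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the compiler's implementation: the function name check_caching_rule and the message wording are hypothetical, while the key names nfr.latency.p99Milliseconds and constraints.features.caching follow the compiled output shown in Step 2.

```python
from typing import Optional

# Threshold from the rule description above: the rule activates on p99 <= 100ms.
P99_THRESHOLD_MS = 100


def check_caching_rule(spec: dict) -> Optional[str]:
    """Return a rejection message if the spec trips the rule, else None.

    Hypothetical helper: it only mirrors the behaviour described in the text.
    """
    p99 = spec.get("nfr", {}).get("latency", {}).get("p99Milliseconds")
    caching = spec.get("constraints", {}).get("features", {}).get("caching", False)

    # The rule only matters when a tight p99 budget is declared.
    if p99 is not None and p99 <= P99_THRESHOLD_MS and not caching:
        return (
            f"caching-required-low-latency: p99 {p99}ms <= {P99_THRESHOLD_MS}ms "
            "requires constraints.features.caching: true"
        )
    return None


if __name__ == "__main__":
    tight_spec = {"nfr": {"latency": {"p99Milliseconds": 100}}}
    print(check_caching_rule(tight_spec))          # rejection message
    tight_spec["constraints"] = {"features": {"caching": True}}
    print(check_caching_rule(tight_spec))          # None once caching is opted in
```

The point of the sketch is the shape of the feedback: the message names both the threshold that fired and the constraint to flip, which is what the Suggestions block is described as doing.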

Steps 5-8: author opts in to caching, then adds real throughput numbers. cache-aside--redis joins the pattern set; a warn_nfr advisory fires briefly (caching enabled but no throughput data) and resolves once the real QPS numbers land. The composes graph at Step 8 visualises every inter-pattern relationship the registry declares for selected patterns — solid arrows for the wiring the implementing-architecture skill will consume, dashed for edges the registry declared but this spec didn't activate.

Steps 9-11: the author commits to a cost story (intent + operating_model + ceilings). The compiler runs full cost feasibility (Pattern OpEx + Ops Team Cost + CapEx) and surfaces a ceiling breach with three concrete architectural levers. The author picks one (raise the ceilings to fit the real cost shape), then promotes every assumptions.* key into the explicit spec body and prepends the # STATUS: APPROVED comment header, per the compiling-architecture skill's contract. The result is a self-contained architecture.yaml that the implementing-architecture skill (skills/implementing-architecture) reads as its input contract.

Non-agentic demo — no multi-agent topology, hosting platform, or agent-specific patterns. See the agentic counterpart for the agentic-scale version of the same workflow: hosting-mismatch rejection, multi-agent cost feasibility, per-pattern defaultConfig overrides, composes graph at agentic density.

Step 1: start with just the basics

The author starts with the bare minimum: cloud, language, platform, and a single availability target. No latency budget yet, no throughput, no feature flags, no cost ceilings. The compiler will fill in every other decision it has an opinion about and pick a baseline pattern set.

spec.yaml (full)

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws
  language: javascript
  platform: api
nfr:
  availability:
    target: 0.999

Step 2: compile (baseline patterns + every default filled in)

The compiler runs and produces a complete compiled spec. The assumptions section is the headline output — every default the compiler applied on the author's behalf is enumerated explicitly: feature flags off, low-throughput defaults, weekly deploy cadence, zero dedicated ops engineers for cost math, and so on. Verbose mode also annotates each spec field with the patterns that activated on it (# <pattern-id>, ...), and writes rejected-patterns.yaml for the patterns considered but dropped.
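
One way to picture the assumptions section is as "every default the author did not set". The sketch below is an assumption about the mechanism, not the compiler's code: fill_assumptions is a hypothetical name, and DEFAULTS is a small hand-picked subset of the values that appear in the compiled output for this demo.

```python
def fill_assumptions(spec: dict, defaults: dict) -> dict:
    """Return the subset of `defaults` that the author's spec did not set.

    Hypothetical helper: keys present in the spec win; everything else is
    reported back so the author can see what was assumed on their behalf.
    """
    assumptions = {}
    for key, default in defaults.items():
        if key not in spec:
            assumptions[key] = default
        elif isinstance(default, dict) and isinstance(spec[key], dict):
            # Recurse so that a partially specified section keeps its
            # explicit values and only the missing leaves are assumed.
            nested = fill_assumptions(spec[key], default)
            if nested:
                assumptions[key] = nested
    return assumptions


# The Step 1 spec from this demo.
STEP1_SPEC = {
    "project": {"name": "aws production api", "domain": "product-catalog"},
    "constraints": {"cloud": "aws", "language": "javascript", "platform": "api"},
    "nfr": {"availability": {"target": 0.999}},
}

# A subset of the defaults visible in the compiled output (not the full set).
DEFAULTS = {
    "constraints": {"tenantCount": 1, "features": {"caching": False, "oltp_workload": True}},
    "nfr": {"rpo_minutes": 60, "latency": {"p95Milliseconds": 500, "p99Milliseconds": 1000}},
    "operating_model": {"deploy_freq": "weekly", "ops_team_size": 0},
}

if __name__ == "__main__":
    import json
    print(json.dumps(fill_assumptions(STEP1_SPEC, DEFAULTS), indent=2))
```

With the Step 1 spec, none of these defaults are overridden, so all of them come back as assumptions. If the author later sets nfr.latency.p99Milliseconds explicitly, that one key drops out of the result while p95Milliseconds stays assumed.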

compiler output (verbose -v)

# ─── input spec with pattern-activation annotations ───
# Each `# pattern-id` shows the patterns that activated on this spec value.
constraints:
  cloud: aws  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
  language: javascript # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
  platform: api # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
nfr:
  availability:
    target: 0.999  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)

# ─── what the compiler FILLED IN as assumptions ───
assumptions:
  constraints:
    tenantCount: 1
    features:
      caching: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (9 more)
      async_messaging: false # arch-serverless--aws, resilience-timeouts-retries-backoff, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (11 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (11 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (11 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (9 more)
    latency:
      p95Milliseconds: 500  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (5 more)
      p99Milliseconds: 1000 # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (5 more)
    throughput:
      peak_query_per_second_read: 5  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (5 more)
      peak_query_per_second_write: 1 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (4 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, iac-cloudformation
      compliance:
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (2 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, idp-oidc--cognito, api-rest-resource-oriented, ... (2 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, api-rest-resource-oriented
      audit_logging: false # arch-serverless--aws, db-managed-postgres, api-rest-resource-oriented, ... (5 more)
    # … (more defaults follow; the full compiled-spec.yaml below lists them)

# ─── Matched Patterns based on input spec ───
# meta  = policy gates (always emitted when their feature flag is set)
# P0    = high priority — load-bearing architectural decisions
# P1    = mid priority — operational + observability + security baseline
# P2/P3 = lower priority — refinements + governance + docs
# Override priority by adding `patterns.<id>.recommended_priority: P0` to spec.
  patterns:
    P0:    # (4 patterns)
      - arch-serverless--aws  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
      - db-managed-postgres  # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
      - arch-serverless-pay-per-use--aws-lambda  # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
      - iac-cloudformation  # AWS-native IaC; deep service coverage; AWS-specific.
    P1:    # (15 patterns)
      - resilience-timeouts-retries-backoff  # Deadlines, bounded retries, exponential backoff with jitter; avoid retry storms.
          composes:
            wraps: ['sync-request-reply-rest']
      - idp-oidc--cognito  # Use AWS Cognito as a managed identity provider natively integrated with the AWS ecosystem. Provides user pools (authentication, registration, MFA), identity pools (federated AWS resource access), social and enterprise federation, and Lambda triggers for customization. Best choice for AWS-native applications requiring tight IAM integration. Free tier: 50,000 MAU.
      - resilience-circuit-breaker  # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
          composes:
            wraps: ['sync-request-reply-rest']
      - api-rest-resource-oriented  # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
      - sync-request-reply-rest  # Synchronous HTTP APIs; simple integration; needs timeouts/retries/backpressure.
      - deploy-blue-green  # Two environments; switch traffic; fast rollback; higher infra cost.
          composes:
            layered_after: ['iac-cloudformation']
      - sec-auth-oauth2-oidc  # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
          composes:
            wraps: ['api-rest-resource-oriented']
      - crud-single-model  # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
      - finops-cost-allocation-tags  # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
      - release-feature-flags  # Decouple deploy from release; safer experiments; needs kill switches and governance.
      - obs-telemetry-backend--aws-cloudwatch  # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
      - obs-golden-signals  # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
          composes:
            layered_after: ['obs-open-telemetry-baseline']
      - obs-open-telemetry-baseline  # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
          composes:
            co_runs_with: ['api-rest-resource-oriented']
      - finops-budget-guardrails  # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
      - ops-slo-error-budgets  # Define SLOs and error budgets to balance reliability and velocity.
    P2:    # (3 patterns)
      - arch-egress-minimization  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
      - api-versioning-header  # Version via headers/media types; keeps URLs stable; harder to debug and cache.
      - gov-system-manifest  # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
          composes:
            layered_after: ['iac-cloudformation']
            co_runs_with: ['release-feature-flags', 'gov-adrs-mandatory', 'ops-runbooks']
    P3:    # (2 patterns)
      - ops-runbooks  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
      - gov-adrs-mandatory  # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.

# ─── warns and cost feasibility ───
# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
# Total Patterns Selected: 24
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      50,650
# Pattern OpEx (monthly):     $         420
# Ops Team Cost (monthly):    $           0
# Total OpEx (monthly):       $         420
# Total TCO (24mo):           $      60,730
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✓ PASS
#
# ============================================================
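
The figures in the summary above are consistent with a simple roll-up, sketched here as a worked check. This is an inference from the printed numbers, not the compiler's source: the names CostInputs, ops_team_cost, and summarize are hypothetical, the multipliers are passed as 1.0 placeholders because their values are not shown in this section, and treating a ceiling as inclusive (<=) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CostInputs:
    capex_usd: float
    pattern_opex_monthly_usd: float
    ops_team_monthly_usd: float
    amortization_months: int
    capex_ceiling_usd: float
    opex_ceiling_monthly_usd: float


def ops_team_cost(team_size: int, per_engineer_monthly_usd: float,
                  on_call_multiplier: float, deploy_freq_multiplier: float) -> float:
    """Ops team cost per the formula printed in the detailed analysis."""
    return team_size * per_engineer_monthly_usd * on_call_multiplier * deploy_freq_multiplier


def summarize(c: CostInputs) -> dict:
    """Roll up OpEx and TCO, then compare against the ceilings."""
    total_opex = c.pattern_opex_monthly_usd + c.ops_team_monthly_usd
    tco = c.capex_usd + total_opex * c.amortization_months
    return {
        "total_opex_monthly_usd": total_opex,
        "tco_usd": tco,
        "capex_within_ceiling": c.capex_usd <= c.capex_ceiling_usd,      # assumed inclusive
        "opex_within_ceiling": total_opex <= c.opex_ceiling_monthly_usd,  # assumed inclusive
    }


if __name__ == "__main__":
    # ops_team_size is 0 in the baseline, so the multipliers do not matter here.
    ops = ops_team_cost(0, 10_000, 1.0, 1.0)
    baseline = CostInputs(50_650, 420, ops, 24, 1_000, 500)
    print(summarize(baseline))
```

Plugging in the baseline gives 420 + 0 = 420 per month, and 50,650 + 420 × 24 = 60,730 over 24 months, matching the summary. CapEx of 50,650 exceeds the 1,000 ceiling (FAIL) while monthly OpEx of 420 fits under 500 (PASS).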

compiled-spec.yaml (full)

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
  language: javascript # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
  platform: api # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
nfr:
  availability:
    target: 0.999  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
assumptions:
  constraints:
    saas-providers: []
    disallowed-saas-providers: []
    ai-inference-platforms: []
    disallowed-ai-inference-platforms: []
    model-vendors: []
    disallowed-model-vendors: []
    tenantCount: 1
    features:
      caching: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (9 more)
      async_messaging: false # arch-serverless--aws, resilience-timeouts-retries-backoff, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (11 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (12 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (10 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (11 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (11 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (9 more)
    latency:
      p95Milliseconds: 500  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (5 more)
      p99Milliseconds: 1000 # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (5 more)
    throughput:
      peak_query_per_second_read: 5  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (5 more)
      peak_query_per_second_write: 1 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (4 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, iac-cloudformation
      compliance:
        gdpr: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
        gdpr_rtbf: false
        ccpa: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
        hipaa: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (20 more)
        sox: false # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (21 more)
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, resilience-timeouts-retries-backoff, ... (2 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, idp-oidc--cognito, api-rest-resource-oriented, ... (2 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, api-rest-resource-oriented
      audit_logging: false # arch-serverless--aws, db-managed-postgres, api-rest-resource-oriented, ... (5 more)
  operating_model:
    on_call: false
    deploy_freq: weekly
    ops_team_size: 0
    single_resource_monthly_ops_usd: 10000
    amortization_months: 24
  cost:
    intent:
      priority: minimize-opex
    ceilings:
      monthly_operational_usd: 500
      one_time_setup_usd: 1000
    preferences:
      prefer_free_tier_if_possible: true  # db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, idp-oidc--cognito, ... (2 more)
      prefer_saas_first: false
  patterns:
    P0:
      arch-serverless--aws:  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
        compute_service: lambda  # Options: lambda, fargate
        api_gateway: api-gateway-http # Options: api-gateway-http, api-gateway-rest, alb, function-url
        database: dynamodb # Options: dynamodb, aurora-serverless, rds-proxy
        storage: s3-standard # Options: s3-standard, s3-intelligent-tiering, efs
        auth_service: cognito # Options: cognito, lambda-authorizer, iam
        event_bus: eventbridge # Options: eventbridge, sns-sqs, kinesis
      db-managed-postgres: # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
        provider: supabase  # Options: supabase, neon, render, railway, digitalocean-app-platform
        instance_size: small # Options: micro, small, medium, large
        storage_gb: 8 # Range: 1-500
        backup_retention_days: 7 # Range: 1-30
        connection_pooling: true # Boolean
        high_availability: false # Boolean
        ssl_mode: require # Options: disable, allow, prefer, require, verify-ca, verify-full
      arch-serverless-pay-per-use--aws-lambda: # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
        memory_size: 512  # Options: 128, 256, 512, 1024, 2048, 3008
        timeout: 30 # Range: 3-900
        architecture: x86_64 # Options: x86_64, arm64
        provisioned_concurrency: 0 # Range: 0-1000
      iac-cloudformation: # AWS-native IaC; deep service coverage; AWS-specific.
        stack_naming_convention: project-environment-resource  # Options: project-environment-resource, environment-project-resource, flat-naming, custom
        change_set_enabled: true # Boolean
        termination_protection: false # Boolean
        drift_detection: true # Boolean
        stack_policy: none # Options: none, protect-all, protect-data-resources, custom
    P1:
      resilience-timeouts-retries-backoff:  # Deadlines, bounded retries, exponential backoff with jitter; avoid retry storms.
        initial_timeout_ms: 5000  # Range: 100-60000
        max_retries: 3 # Range: 0-10
        backoff_strategy: exponential_jitter # Options: exponential, exponential_jitter, linear, constant
        backoff_multiplier: 2 # Range: 1-5
        max_backoff_ms: 30000 # Range: 1000-300000
        retry_on_timeout: true # Boolean
        retry_on_errors: # Options: network, 5xx, 429, throttle, 4xx, all
        - network
        - 5xx
        - throttle
        circuit_breaker_enabled: false # Boolean
        composes:
          wraps:
          - sync-request-reply-rest
      idp-oidc--cognito: # Use AWS Cognito as a managed identity provider natively integrated with the AWS ecosystem. Provides user pools (authentication, registration, MFA), identity pools (federated AWS resource access), social and enterprise federation, and Lambda triggers for customization. Best choice for AWS-native applications requiring tight IAM integration. Free tier: 50,000 MAU.
        pool_type: user-pool  # Options: user-pool, user-pool-and-identity-pool
        mfa_configuration: OPTIONAL # Options: OFF, OPTIONAL, ON
        password_policy: medium # Options: low, medium, high
        lambda_triggers_enabled: false # Boolean
      resilience-circuit-breaker: # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
        failure_threshold: 5  # Range: 1-20
        success_threshold: 2 # Range: 1-10
        timeout_duration: 60 # Range: 5-300
        half_open_max_calls: 1 # Range: 1-10
        fallback_strategy: fail_fast # Options: fail_fast, cached_response, default_value, alternative_service
        composes:
          wraps:
          - sync-request-reply-rest
      api-rest-resource-oriented: # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
        pagination_style: offset  # Options: offset, cursor, page_number
        max_page_size: 100 # Range: 10-1000
        versioning_strategy: uri # Options: uri, header, query_param, none
        filtering_style: query_params # Options: query_params, json_body, graphql_like
        cache_strategy: etag # Options: etag, last_modified, cache_control, none
        id_format: uuid # Options: uuid, integer, slug, composite
        response_envelope: false # Boolean
      sync-request-reply-rest: # Synchronous HTTP APIs; simple integration; needs timeouts/retries/backpressure.
        timeout_seconds: 30  # Range: 1-300
        retry_strategy: exponential_backoff # Options: none, fixed_delay, exponential_backoff, exponential_backoff_jitter
        max_retries: 3 # Range: 0-10
        circuit_breaker_enabled: true # Boolean
        rate_limiting_strategy: token_bucket # Options: none, token_bucket, leaky_bucket, fixed_window, sliding_window
        idempotency_required: false # Boolean
      deploy-blue-green: # Two environments; switch traffic; fast rollback; higher infra cost.
        traffic_switch_strategy: instant  # Options: instant, gradual, canary
        health_check_required: true # Boolean
        rollback_strategy: automatic # Options: automatic, manual, disabled
        environment_parity: full # Options: full, scaled-down, minimal
        warm_up_period_minutes: 5 # Range: 0-60
        composes:
          layered_after:
          - iac-cloudformation
      sec-auth-oauth2-oidc: # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
        oauth_flow: authorization_code  # Options: authorization_code, client_credentials, device_code, implicit
        token_storage: secure_storage # Options: secure_storage, memory_only, encrypted_storage, httponly_cookie
        pkce_enabled: true # Boolean
        scope_strategy: minimal # Options: minimal, role_based, resource_specific
        token_refresh: automatic # Options: automatic, manual, sliding_window
        id_token_validation: strict # Options: strict, standard, relaxed
        composes:
          wraps:
          - api-rest-resource-oriented
      crud-single-model: # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
        api_style: rest  # Options: rest, graphql, rpc
        validation_strategy: server-side # Options: server-side, client-side, both
        soft_delete: false # Boolean
        audit_logging: false # Boolean
        pagination_default_size: 20 # Range: 10-100
      finops-cost-allocation-tags: # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
        tagging_strategy: hierarchical  # Options: hierarchical, flat, hybrid
        enforcement_level: required # Options: required, recommended, optional
        cost_allocation_model: showback # Options: chargeback, showback, hybrid
        tag_inheritance: true # Boolean
        automated_tagging: true # Boolean
      release-feature-flags: # Decouple deploy from release; safer experiments; needs kill switches and governance.
        flag_storage: config_file  # Options: config_file, database, feature_flag_service, environment_variables
        evaluation_strategy: simple_boolean # Options: simple_boolean, percentage_rollout, user_targeting, multi_variate
        targeting_capability: none # Options: none, user_attributes, context_based, advanced_segments
        kill_switch_enabled: true # Boolean
        audit_logging: false # Boolean
      obs-telemetry-backend--aws-cloudwatch: # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
        log_retention_days: 30  # Options: 1, 3, 7, 14, 30, 60, 90, 180, 365
        xray_sampling_rate: 0.05 # Options: 0.01, 0.05, 0.1, 0.5, 1.0
      obs-golden-signals: # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
        latency_percentile: p95  # Options: p50, p95, p99, p99.9
        error_rate_threshold: 1_percent # Options: 0.1_percent, 1_percent, 5_percent
        saturation_metric: cpu_memory # Options: cpu_memory, queue_depth, connection_pool, disk_io
        sli_window: 30_days # Options: 7_days, 30_days, 90_days
        alert_severity_levels: critical_warning # Options: critical_only, critical_warning, critical_warning_info
        composes:
          layered_after:
          - obs-open-telemetry-baseline
      obs-open-telemetry-baseline: # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
        export_backend: otlp  # Options: otlp, jaeger, zipkin, prometheus, datadog, newrelic, honeycomb
        trace_sampling_strategy: parent-based # Options: always-on, always-off, parent-based, trace-id-ratio
        trace_sampling_rate: 1.0 # Range: 0.0-1.0
        metrics_export_interval: 60 # Range: 10-300
        log_correlation: true # Boolean
        resource_detection: true # Boolean
        propagation_format: w3c-tracecontext # Options: w3c-tracecontext, b3, jaeger, multi
        composes:
          co_runs_with:
          - api-rest-resource-oriented
      finops-budget-guardrails: # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
        budget_period: monthly  # Options: monthly, quarterly, annual
        alert_thresholds:
        - 50
        - 80
        - 100
        enforcement_action: alert # Options: alert, prevent, throttle
        tagging_strategy: mandatory # Options: mandatory, recommended, optional
        policy_enforcement: soft # Options: soft, hard, audit
        cost_allocation_level: project # Options: project, team, environment, service
      ops-slo-error-budgets: # Define SLOs and error budgets to balance reliability and velocity.
        slo_target_percentage: 99.9  # Range: 90-99.999
        measurement_window_days: 30 # Options: 7, 28, 30, 90
        error_budget_policy: halt-deployments # Options: halt-deployments, alert-only, slow-rollouts, require-approval
        sli_type: availability # Options: availability, latency, throughput, correctness, composite
        alerting_threshold_percentage: 80 # Range: 50-100
    P2:
      arch-egress-minimization:  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
        cdn_strategy: full  # Options: full, static-only, none
        data_locality: regional # Options: global, regional, single-zone
        cross_region_replication: minimal # Options: none, minimal, full
        compression_enabled: true # Boolean
        static_asset_strategy: cdn_edge # Options: cdn_edge, regional_storage, origin_only
      api-versioning-header: # Version via headers/media types; keeps URLs stable; harder to debug and cache.
        version_header_name: API-Version  # Options: API-Version, X-API-Version, Accept-Version, Custom-Header
        version_format: date-based # Options: semantic, date-based, sequential
        fallback_behavior: latest-stable # Options: latest-stable, oldest-supported, reject-request
        content_negotiation: false # Boolean
        deprecation_policy: warning-header # Options: sunset-header, warning-header, both
      gov-system-manifest: # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
        manifest_path: docs/architecture/manifest.yaml
        manifest_format: yaml  # Options: yaml, toml, json
        manifest_scope: # Options: agent-tools, agent-skills, agent-models, agent-prompts, data_sources, services, external_dependencies
        - agent-tools
        - agent-skills
        - agent-models
        - agent-prompts
        pin_versions: true # Boolean
        ci_validation: required # Options: required, optional, off
        drift_policy: fail-build # Options: fail-build, warn-only, off
        composes:
          layered_after:
          - iac-cloudformation
          co_runs_with:
          - release-feature-flags
          - gov-adrs-mandatory
          - ops-runbooks
    P3:
      ops-runbooks:  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
        runbook_format: markdown  # Options: markdown, wiki, structured_yaml, ticketing_system
        incident_severity_levels: 4 # Options: 3, 4, 5
        escalation_policy: tiered # Options: tiered, follow_the_sun, flat, hybrid
        automation_integration: manual # Options: manual, semi_automated, fully_automated
        review_frequency: quarterly # Options: monthly, quarterly, biannual, post_incident
      gov-adrs-mandatory: # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.
        adr_format: madr  # Options: madr, nygard, y-statements, custom
        storage_location: docs/adrs # Options: docs/adrs, docs/architecture/decisions, adr, wiki
        decision_threshold: significant # Options: all, significant, strategic-only
        review_requirement: peer-review # Options: peer-review, architect-approval, team-consensus, none

# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
# Total Patterns Selected: 24
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      50,650
# Pattern OpEx (monthly):     $         420
# Ops Team Cost (monthly):    $           0
# Total OpEx (monthly):       $         420
# Total TCO (24mo):         $      60,730
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✓ PASS
#
# ============================================================

# ============================================================
# Cost Feasibility Analysis (Details)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
#
# Ops team size: 0 engineers (no ops cost)
#
# Ops Team Cost Algorithm (for reference):
#   Formula: ops_team_size × single_resource_monthly_ops_usd × on_call_multiplier × deploy_freq_multiplier
#   Based on:
#     - Google SRE Handbook (2016): On-call burden = 25-50% FTE overhead
#     - DORA State of DevOps (2021): Deploy frequency impact on ops overhead
#
# Calculating costs for 24 selected patterns:
#
# PER-PATTERN COSTS:
# ────────────────────────────────────────────────────────────
#
#  1. arch-serverless--aws (match score: 35.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  2. db-managed-postgres (match score: 32.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  3. resilience-timeouts-retries-backoff (match score: 29.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  4. arch-serverless-pay-per-use--aws-lambda (match score: 28.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  5. idp-oidc--cognito (match score: 26.00)
#     Adoption: $200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  6. resilience-circuit-breaker (match score: 26.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  7. api-rest-resource-oriented (match score: 25.00)
#     Adoption: $750.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  8. sync-request-reply-rest (match score: 25.00)
#     Adoption: $300.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  9. deploy-blue-green (match score: 25.00)
#     Adoption: $2,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  10. sec-auth-oauth2-oidc (match score: 23.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  11. crud-single-model (match score: 22.00)
#     Adoption: $300.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  12. finops-cost-allocation-tags (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  13. arch-egress-minimization (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  14. release-feature-flags (match score: 19.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  15. api-versioning-header (match score: 16.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  16. obs-telemetry-backend--aws-cloudwatch (match score: 14.00)
#     Adoption: $500.0
#     Monthly (min): $20.0
#     Monthly (expected): $20.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $20.0
#
#  17. obs-golden-signals (match score: 12.00)
#     Adoption: $3,000.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  18. obs-open-telemetry-baseline (match score: 12.00)
#     Adoption: $3,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  19. finops-budget-guardrails (match score: 10.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  20. iac-cloudformation (match score: 10.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  21. ops-runbooks (match score: 8.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  22. ops-slo-error-budgets (match score: 8.00)
#     Adoption: $4,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  23. gov-system-manifest (match score: 7.00)
#     Adoption: $4,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  24. gov-adrs-mandatory (match score: 7.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
# Total Monthly OpEx: $420.0
# Monthly operational ceiling: $500 ✓ PASS
# ============================================================

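This is the first useful artefact. Nothing was rejected, but the author should still read the assumptions block: it records the compiler's silent decisions about feature flags, throughput, deploy cadence, and ops team size. Each one is a candidate for an explicit override in later iterations. The ✗ FAIL on the CapEx ceiling is measured against a ceiling the author has not set yet, so at this stage it is a signal to review, not a rejection.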
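The figures in the cost summary can be reproduced by hand. The sketch below is a minimal illustration, not the compiler's implementation: the `CostInputs` and `summarize` names are hypothetical, both multipliers default to 1.0 as placeholders because the real multiplier tables are not shown in this demo, and the `<=` ceiling comparison is an assumption. The TCO relation (CapEx plus total monthly OpEx times the amortization period) is inferred from the numbers printed above.

```python
from dataclasses import dataclass


@dataclass
class CostInputs:
    capex_usd: float                  # sum of the per-pattern "Adoption" figures
    pattern_opex_monthly_usd: float   # sum of the per-pattern "Monthly OpEx" figures
    ops_team_size: int
    single_resource_monthly_ops_usd: float
    amortization_months: int
    # Placeholders: the real multiplier values are not shown in the demo output.
    on_call_multiplier: float = 1.0
    deploy_freq_multiplier: float = 1.0


def ops_team_cost(c: CostInputs) -> float:
    """Formula quoted in the 'Ops Team Cost Algorithm' comment of the output."""
    return (c.ops_team_size * c.single_resource_monthly_ops_usd
            * c.on_call_multiplier * c.deploy_freq_multiplier)


def summarize(c: CostInputs, capex_ceiling: float, opex_ceiling: float) -> dict:
    total_opex = c.pattern_opex_monthly_usd + ops_team_cost(c)
    return {
        "total_opex_monthly": total_opex,
        "tco": c.capex_usd + total_opex * c.amortization_months,
        "capex_ok": c.capex_usd <= capex_ceiling,   # assumed comparison
        "opex_ok": total_opex <= opex_ceiling,      # assumed comparison
    }


# Values copied from the Step 2 summary and the defaulted operating model.
step2 = CostInputs(
    capex_usd=50_650,
    pattern_opex_monthly_usd=420,
    ops_team_size=0,
    single_resource_monthly_ops_usd=10_000,
    amortization_months=24,
)
report = summarize(step2, capex_ceiling=1_000, opex_ceiling=500)
print(report)
```

With an ops team size of 0 the ops term vanishes, so the 24-month TCO is 50,650 + 420 × 24 = 60,730, matching the summary.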

3 Step 3: add the latency NFR as the target becomes clear

The author has agreed a latency target with the team and adds it to the spec: p95 ≤ 50ms and p99 ≤ 100ms. Each new constraint narrows pattern selection; some patterns that fit before will be dropped on recompile because they cannot meet the 50ms p95 target.

spec.yaml (delta added in this step)

nfr:
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100

spec.yaml (full, at this step)

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws
  language: javascript
  platform: api
nfr:
  availability:
    target: 0.999
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100

4 Step 4: recompile — SPEC REJECTED, the latency budget triggered mandatory caching

The compiler refuses to produce a final architecture. The policy pattern caching-required-low-latency activated on /nfr/latency/p99Milliseconds <= 100 and requires caching to be opted in via constraints.features.caching: true. The output names the exact threshold that triggered the rule and the constraint to flip. The annotated body shows the gate firing on the latency field and the ❌ marker on the caching feature flag, so the architectural mismatch surfaces at spec time, not at deploy time.

This is the more useful face of the rejection signal called out in the README's Progressive-Refinement section: "if a hard NFR target cannot be satisfied by any available pattern, the compiler exits with code 1 and explains what failed."
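A policy gate of this kind can be pictured as an activation test plus a required value. The sketch below is an illustration under stated assumptions: `get_path`, `activates`, and `check_gate` are hypothetical names, and the registry's real rule schema is not shown in this demo. Only the activation fields, the required constraint, and the message shape are taken from the compiler output.

```python
ALLOWED_CLOUDS = {"agnostic", "aws", "azure", "gcp", "on-prem", "n/a"}
REQUIRED = ("/constraints/features/caching", True)


def get_path(doc, pointer):
    """Resolve a '/a/b/c' style path against nested dicts; None if a key is missing."""
    node = doc
    for key in pointer.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def activates(spec):
    """The two activation fields listed in the Suggestions block."""
    p99 = get_path(spec, "/nfr/latency/p99Milliseconds")
    cloud = get_path(spec, "/constraints/cloud")
    return cloud in ALLOWED_CLOUDS and p99 is not None and p99 <= 100


def check_gate(spec, assumptions):
    """Return a violation message, or None when the gate is inactive or satisfied."""
    if not activates(spec):
        return None
    pointer, required = REQUIRED
    actual = get_path(spec, pointer)
    if actual is None:                       # not declared: fall back to the default
        actual = get_path(assumptions, pointer)
    if actual == required:
        return None
    return f"[caching-required-low-latency] {pointer} == {required}"


assumptions = {"constraints": {"features": {"caching": False}}}
step3_spec = {
    "constraints": {"cloud": "aws", "language": "javascript", "platform": "api"},
    "nfr": {"availability": {"target": 0.999},
            "latency": {"p95Milliseconds": 50, "p99Milliseconds": 100}},
}
violation = check_gate(step3_spec, assumptions)
print(violation)
```

Because caching is absent from the spec, the check falls back to the defaulted `caching: false` and reports the violation. Declaring `constraints.features.caching: true` makes the same call return `None`.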

compiler output (verbose -v)

# ─── ❌ COMPILER REJECTED THIS SPEC ───
❌ Constraints/NFRs trade-off requirements not met:
  [caching-required-low-latency] /constraints/features/caching == True
  → At P99 <= 100ms, caching is mandatory to keep tail latency within the SLO. Set constraints.features.caching: true

💡 Suggestions — consider changing these activation fields:
  caching-required-low-latency activated by:
    /constraints/cloud in [agnostic | aws | azure | gcp | on-prem | n/a]
    /nfr/latency/p99Milliseconds <= 100


# ─── input spec with pattern-activation annotations ───
# Each `# pattern-id` shows the patterns that activated on this spec value.
  cloud: aws  # caching-required-low-latency
    p99Milliseconds: 100  # caching-required-low-latency

# ─── what the compiler FILLED IN as assumptions ───
assumptions:
  constraints:
    tenantCount: 1
    features:
      caching: false  # ❌ caching-required-low-latency
      async_messaging: false
      ai_inference: false
      agentic_system: null
      multi_tenancy: false
      batch_processing: false
      distributed_transactions: false
      real_time_streaming: false
      vector_search: false
      document_store: false
      key_value_store: false
      graph_database: false
      time_series_db: false
      oltp_workload: true
      olap_workload: false
      cold_archive_tiering: false
      messaging_delivery_guarantee: null
  nfr:
    rpo_minutes: 60
    rto_minutes: 60
    latency:
      jobStartP95Seconds: null
      jobStartP99Seconds: null
    throughput:
      peak_jobs_per_hour: null
      peak_query_per_second_read: 5
      peak_query_per_second_write: 1
    data:
      retention_days: 90
      pii: false
      compliance:
    consistency:
      needsReadYourWrites: false
    durability:
      strict: false
    security:
    # …  (more defaults below; expand the full output to see them)

compiled-spec.yaml (full output)

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws  # caching-required-low-latency
  language: javascript
  platform: api
nfr:
  availability:
    target: 0.999
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100  # caching-required-low-latency
assumptions:
  constraints:
    saas-providers:
    disallowed-saas-providers:
    ai-inference-platforms:
    disallowed-ai-inference-platforms:
    model-vendors:
    disallowed-model-vendors:
    tenantCount: 1
    features:
      caching: false  # ❌ caching-required-low-latency
      async_messaging: false
      ai_inference: false
      agentic_system: null
      multi_tenancy: false
      batch_processing: false
      distributed_transactions: false
      real_time_streaming: false
      vector_search: false
      document_store: false
      key_value_store: false
      graph_database: false
      time_series_db: false
      oltp_workload: true
      olap_workload: false
      cold_archive_tiering: false
      messaging_delivery_guarantee: null
  nfr:
    rpo_minutes: 60
    rto_minutes: 60
    latency:
      jobStartP95Seconds: null
      jobStartP99Seconds: null
    throughput:
      peak_jobs_per_hour: null
      peak_query_per_second_read: 5
      peak_query_per_second_write: 1
    data:
      retention_days: 90
      pii: false
      compliance:
        gdpr: false
        gdpr_rtbf: false
        ccpa: false
        hipaa: false
        sox: false
    consistency:
      needsReadYourWrites: false
    durability:
      strict: false
    security:
      auth: oauth2_oidc
      tenant_isolation: n/a
      audit_logging: false
  operating_model:
    on_call: false
    deploy_freq: weekly
    ops_team_size: 0
    single_resource_monthly_ops_usd: 10000
    amortization_months: 24
  cost:
    intent:
      priority: minimize-opex
    ceilings:
      monthly_operational_usd: 500
      one_time_setup_usd: 1000
    preferences:
      prefer_free_tier_if_possible: true
      prefer_saas_first: false
  patterns:
    meta:
      caching-required-low-latency:
    P0:
      arch-serverless--aws:
        compute_service: lambda
        api_gateway: api-gateway-http
        database: dynamodb
        storage: s3-standard
        auth_service: cognito
        event_bus: eventbridge
      db-managed-postgres:
        provider: supabase
        instance_size: small
        storage_gb: 8
        backup_retention_days: 7
        connection_pooling: true
        high_availability: false
        ssl_mode: require
      arch-serverless-pay-per-use--aws-lambda:
        memory_size: 512
        timeout: 30
        architecture: x86_64
        provisioned_concurrency: 0
        reserved_concurrency: null
      iac-cloudformation:
        stack_naming_convention: project-environment-resource
        change_set_enabled: true
        termination_protection: false
        drift_detection: true
        stack_policy: none
    P1:
      resilience-circuit-breaker:
        failure_threshold: 5
        success_threshold: 2
        timeout_duration: 60
        half_open_max_calls: 1
        fallback_strategy: fail_fast
      api-rest-resource-oriented:
        pagination_style: offset
        max_page_size: 100
        versioning_strategy: uri
        filtering_style: query_params
        cache_strategy: etag
        id_format: uuid
        response_envelope: false
      deploy-blue-green:
        traffic_switch_strategy: instant
        health_check_required: true
        rollback_strategy: automatic
        environment_parity: full
        warm_up_period_minutes: 5
        composes:
          layered_after:
            - iac-cloudformation
      sec-auth-oauth2-oidc:
        oauth_flow: authorization_code
        token_storage: secure_storage
        pkce_enabled: true
        scope_strategy: minimal
        token_refresh: automatic
        id_token_validation: strict
        composes:
          wraps:
            - api-rest-resource-oriented
      crud-single-model:
        api_style: rest
        validation_strategy: server-side
        soft_delete: false
        audit_logging: false
        pagination_default_size: 20
      finops-cost-allocation-tags:
        tagging_strategy: hierarchical
        enforcement_level: required
        cost_allocation_model: showback
        tag_inheritance: true
        automated_tagging: true
      release-feature-flags:
        flag_storage: config_file
        evaluation_strategy: simple_boolean
        targeting_capability: none
        kill_switch_enabled: true
        audit_logging: false
      obs-telemetry-backend--aws-cloudwatch:
        log_retention_days: 30
        xray_sampling_rate: 0.05
      obs-golden-signals:
        latency_percentile: p95
        error_rate_threshold: 1_percent
        saturation_metric: cpu_memory
        sli_window: 30_days
        alert_severity_levels: critical_warning
        composes:
          layered_after:
            - obs-open-telemetry-baseline
      obs-open-telemetry-baseline:
        export_backend: otlp
        trace_sampling_strategy: parent-based
        trace_sampling_rate: 1.0
        metrics_export_interval: 60
        log_correlation: true
        resource_detection: true
        propagation_format: w3c-tracecontext
        composes:
          co_runs_with:
            - api-rest-resource-oriented
      finops-budget-guardrails:
        budget_period: monthly
        alert_thresholds:
          - 50
          - 80
          - 100
        enforcement_action: alert
        tagging_strategy: mandatory
        policy_enforcement: soft
        cost_allocation_level: project
      ops-slo-error-budgets:
        slo_target_percentage: 99.9
        measurement_window_days: 30
        error_budget_policy: halt-deployments
        sli_type: availability
        alerting_threshold_percentage: 80
    P2:
      arch-egress-minimization:
        cdn_strategy: full
        data_locality: regional
        cross_region_replication: minimal
        compression_enabled: true
        static_asset_strategy: cdn_edge
      api-versioning-header:
        version_header_name: API-Version
        version_format: date-based
        fallback_behavior: latest-stable
        content_negotiation: false
        deprecation_policy: warning-header
      gov-system-manifest:
        manifest_path: docs/architecture/manifest.yaml
        manifest_format: yaml
        manifest_scope:
          - agent-tools
          - agent-skills
          - agent-models
          - agent-prompts
        pin_versions: true
        ci_validation: required
        drift_policy: fail-build
        composes:
          layered_after:
            - iac-cloudformation
          co_runs_with:
            - release-feature-flags
            - gov-adrs-mandatory
            - ops-runbooks
    P3:
      ops-runbooks:
        runbook_format: markdown
        incident_severity_levels: 4
        escalation_policy: tiered
        automation_integration: manual
        review_frequency: quarterly
      gov-adrs-mandatory:
        adr_format: madr
        storage_location: docs/adrs
        decision_threshold: significant
        review_requirement: peer-review

❌ Constraints/NFRs trade-off requirements not met:
  [caching-required-low-latency] /constraints/features/caching == True
  → At P99 <= 100ms, caching is mandatory to keep tail latency within the SLO. Set constraints.features.caching: true

💡 Suggestions — consider changing these activation fields:
  caching-required-low-latency activated by:
    /constraints/cloud in [agnostic | aws | azure | gcp | on-prem | n/a]
    /nfr/latency/p99Milliseconds <= 100

Rejection: at p99 <= 100ms, caching is mandatory. The compiler refuses to silently ignore this — it forces the decision to be explicit. Step 5 fixes it.

5 Step 5: opt in to the caching feature flag explicitly

Feature flags live under constraints.features, not under nfr: they represent opt-in capabilities, not performance targets. The author sets constraints.features.caching: true to bring cache-aside-related patterns into scope.

spec.yaml (delta added in this step)

constraints:
  features:
    caching: true

spec.yaml (full, at this step)

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws
  language: javascript
  platform: api
  features:
    caching: true
nfr:
  availability:
    target: 0.999
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100
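
Each step's delta combines with the previous spec by a recursive merge: nested mappings are merged key by key and everything else is overwritten. The sketch below shows that idea on plain dictionaries. `deep_merge` is a hypothetical helper written for illustration; in this workflow the author edits spec.yaml by hand, and nothing here is part of the compiler.

```python
import copy


def deep_merge(base, delta):
    """Return a new dict: nested dicts merge recursively, other values are replaced."""
    merged = copy.deepcopy(base)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# The Step 3 spec and the Step 5 delta, written as Python dicts.
step3_spec = {
    "project": {"name": "aws production api", "domain": "product-catalog"},
    "constraints": {"cloud": "aws", "language": "javascript", "platform": "api"},
    "nfr": {
        "availability": {"target": 0.999},
        "latency": {"p95Milliseconds": 50, "p99Milliseconds": 100},
    },
}
step5_delta = {"constraints": {"features": {"caching": True}}}

step5_spec = deep_merge(step3_spec, step5_delta)
print(step5_spec["constraints"])
```

The merge adds `features.caching` next to the existing constraints without touching `cloud`, `language`, or `platform`, and leaves the Step 3 dictionary unchanged.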

6 Step 6: recompile — cache-aside selected, warn_nfr advisory fires

cache-aside--redis joins the pattern set. However, peak read QPS isn't specified in the spec, so the compiler falls back to its assumed default of 5 req/s (assumptions.nfr.throughput.peak_query_per_second_read) and cannot confirm that caching is worth its overhead. A warn_nfr advisory fires: peak read QPS is below 10 req/s, so caching overhead (infrastructure, invalidation, serialization) may outweigh the benefit at this scale. The pattern stays selected; the advisory is a prompt to provide real throughput data or reconsider the feature flag.
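
The advisory logic can be sketched as a check that prefers the declared throughput and falls back to the assumed default. This is an illustration only: `lookup` and `read_qps_advisory` are hypothetical names, the message wording is abbreviated, and the 500 req/s figure in the usage lines is an invented example value, not a number from this demo. The 10 req/s threshold and the 5 req/s default come from the compiler output.

```python
def lookup(doc, *keys):
    """Walk nested dicts; None if any key along the way is missing."""
    for key in keys:
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def read_qps_advisory(spec, assumptions, threshold=10):
    """Return an advisory string when caching is on but read QPS is below threshold."""
    if lookup(spec, "constraints", "features", "caching") is not True:
        return None
    path = ("nfr", "throughput", "peak_query_per_second_read")
    declared = lookup(spec, *path)
    qps = declared if declared is not None else lookup(assumptions, *path)
    if qps is None or qps >= threshold:
        return None
    source = "declared" if declared is not None else "assumed default"
    return f"cache-aside: peak read QPS is {qps} req/s (<{threshold} req/s, {source})"


assumptions = {"nfr": {"throughput": {"peak_query_per_second_read": 5}}}
step5_spec = {"constraints": {"features": {"caching": True}}}

# No throughput declared: the assumed default of 5 req/s triggers the advisory.
warning = read_qps_advisory(step5_spec, assumptions)
print(warning)

# Hypothetical declared throughput above the threshold: the advisory clears.
with_qps = {**step5_spec, "nfr": {"throughput": {"peak_query_per_second_read": 500}}}
print(read_qps_advisory(with_qps, assumptions))
```

The advisory never removes the pattern; it only reports that the selection rests on a default the author has not yet confirmed.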

compiler output (verbose -v)

# ─── what the compiler FILLED IN as assumptions ───
assumptions:
  constraints:
    tenantCount: 1
    features:
      async_messaging: false  # arch-serverless--aws, arch-serverless-pay-per-use--aws-lambda, caching-required-low-latency, ... (8 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (7 more)
    throughput:
      peak_query_per_second_read: 5  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
      peak_query_per_second_write: 1 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      compliance:
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, caching-required-low-latency, api-rest-resource-oriented, ... (1 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      audit_logging: false # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (6 more)
  operating_model:
    on_call: false
    deploy_freq: weekly
    ops_team_size: 0
    # …  (more defaults below; expand the full output to see them)

# ─── Matched Patterns based on input spec ───
# meta  = policy gates (always emitted when their feature flag is set)
# P0    = high priority — load-bearing architectural decisions
# P1    = mid priority — operational + observability + security baseline
# P2/P3 = lower priority — refinements + governance + docs
# Override priority by adding `patterns.<id>.recommended_priority: P0` to spec.
  patterns:
    meta:    # (1 pattern)
      - caching-required-low-latency  # Policy pattern that enforces caching when P99 latency target is 100ms or below — the threshold at which direct database queries under any meaningful load are unlikely to reliably satisfy the SLO without a cache layer in front.
    P0:    # (4 patterns)
      - arch-serverless--aws  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
      - db-managed-postgres  # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
      - arch-serverless-pay-per-use--aws-lambda  # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
      - iac-cloudformation  # AWS-native IaC; deep service coverage; AWS-specific.
    P1:    # (13 patterns)
      - resilience-circuit-breaker  # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
      - cache-aside  # Application reads cache then DB on miss; writes update DB then invalidates/updates cache explicitly.
          composes:
            wraps: ['db-managed-postgres']
      - api-rest-resource-oriented  # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
      - deploy-blue-green  # Two environments; switch traffic; fast rollback; higher infra cost.
          composes:
            layered_after: ['iac-cloudformation']
      - sec-auth-oauth2-oidc  # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
          composes:
            wraps: ['api-rest-resource-oriented']
      - crud-single-model  # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
      - finops-cost-allocation-tags  # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
      - release-feature-flags  # Decouple deploy from release; safer experiments; needs kill switches and governance.
      - obs-telemetry-backend--aws-cloudwatch  # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
      - obs-golden-signals  # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
          composes:
            layered_after: ['obs-open-telemetry-baseline']
      - obs-open-telemetry-baseline  # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
          composes:
            co_runs_with: ['api-rest-resource-oriented']
      - finops-budget-guardrails  # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
      - ops-slo-error-budgets  # Define SLOs and error budgets to balance reliability and velocity.
    P2:    # (3 patterns)
      - arch-egress-minimization  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
      - api-versioning-header  # Version via headers/media types; keeps URLs stable; harder to debug and cache.
      - gov-system-manifest  # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
          composes:
            layered_after: ['iac-cloudformation']
            co_runs_with: ['release-feature-flags', 'gov-adrs-mandatory', 'ops-runbooks']
    P3:    # (2 patterns)
      - ops-runbooks  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
      - gov-adrs-mandatory  # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.

# ─── warns and cost feasibility ───
# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
# Total Patterns Selected: 23
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      49,750
# Pattern OpEx (monthly):     $         435
# Ops Team Cost (monthly):    $           0
# Total OpEx (monthly):       $         435
# Total TCO (24mo):         $      60,190
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✓ PASS
#
# ============================================================

# ============================================================
# ⚠️  Pattern Advisory Warnings
# (Patterns are still SELECTED — review these before finalizing)
# ============================================================
#
# [warning] warn_nfr:
#   cache-aside: peak read QPS is 5 req/s (<10 req/s). Caching overhead (infrastructure, invalidation, serialization) may outweigh benefit at this scale. Consider simplifying to direct DB access.
#
#   Suggestions:
#   - Cache-aside adds infrastructure complexity (cache store, invalidation logic, serialization). At <10 read req/s the overhead rarely justifies the benefit.
#
# ============================================================

compiled-spec.yaml (full output)

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (20 more)
  language: javascript # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (20 more)
  platform: api # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (20 more)
  features:
    caching: true  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (8 more)
nfr:
  availability:
    target: 0.999  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
  latency:
    p95Milliseconds: 50  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
    p99Milliseconds: 100 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (4 more)
assumptions:
  constraints:
    saas-providers: []
    disallowed-saas-providers: []
    ai-inference-platforms: []
    disallowed-ai-inference-platforms: []
    model-vendors: []
    disallowed-model-vendors: []
    tenantCount: 1
    features:
      async_messaging: false  # arch-serverless--aws, arch-serverless-pay-per-use--aws-lambda, caching-required-low-latency, ... (8 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (11 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (10 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (7 more)
    throughput:
      peak_query_per_second_read: 5  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
      peak_query_per_second_write: 1 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      compliance:
        gdpr: false  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (20 more)
        gdpr_rtbf: false
        ccpa: false  # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (20 more)
        hipaa: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (19 more)
        sox: false # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (20 more)
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, caching-required-low-latency, api-rest-resource-oriented, ... (1 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      audit_logging: false # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (6 more)
  operating_model:
    on_call: false
    deploy_freq: weekly
    ops_team_size: 0
    single_resource_monthly_ops_usd: 10000
    amortization_months: 24
  cost:
    intent:
      priority: minimize-opex
    ceilings:
      monthly_operational_usd: 500
      one_time_setup_usd: 1000
    preferences:
      prefer_free_tier_if_possible: true  # db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, caching-required-low-latency, ... (2 more)
      prefer_saas_first: false
  patterns:
    meta:
      caching-required-low-latency: {}  # Policy pattern that enforces caching when P99 latency target is 100ms or below — the threshold at which direct database queries under any meaningful load are unlikely to reliably satisfy the SLO without a cache layer in front.
    P0:
      arch-serverless--aws:  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
        compute_service: lambda  # Options: lambda, fargate
        api_gateway: api-gateway-http # Options: api-gateway-http, api-gateway-rest, alb, function-url
        database: dynamodb # Options: dynamodb, aurora-serverless, rds-proxy
        storage: s3-standard # Options: s3-standard, s3-intelligent-tiering, efs
        auth_service: cognito # Options: cognito, lambda-authorizer, iam
        event_bus: eventbridge # Options: eventbridge, sns-sqs, kinesis
      db-managed-postgres: # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
        provider: supabase  # Options: supabase, neon, render, railway, digitalocean-app-platform
        instance_size: small # Options: micro, small, medium, large
        storage_gb: 8 # Range: 1-500
        backup_retention_days: 7 # Range: 1-30
        connection_pooling: true # Boolean
        high_availability: false # Boolean
        ssl_mode: require # Options: disable, allow, prefer, require, verify-ca, verify-full
      arch-serverless-pay-per-use--aws-lambda: # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
        memory_size: 512  # Options: 128, 256, 512, 1024, 2048, 3008
        timeout: 30 # Range: 3-900
        architecture: x86_64 # Options: x86_64, arm64
        provisioned_concurrency: 0 # Range: 0-1000
      iac-cloudformation: # AWS-native IaC; deep service coverage; AWS-specific.
        stack_naming_convention: project-environment-resource  # Options: project-environment-resource, environment-project-resource, flat-naming, custom
        change_set_enabled: true # Boolean
        termination_protection: false # Boolean
        drift_detection: true # Boolean
        stack_policy: none # Options: none, protect-all, protect-data-resources, custom
    P1:
      resilience-circuit-breaker:  # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
        failure_threshold: 5  # Range: 1-20
        success_threshold: 2 # Range: 1-10
        timeout_duration: 60 # Range: 5-300
        half_open_max_calls: 1 # Range: 1-10
        fallback_strategy: fail_fast # Options: fail_fast, cached_response, default_value, alternative_service
      cache-aside: # Application reads cache then DB on miss; writes update DB then invalidates/updates cache explicitly.
        invalidation_strategy: ttl  # Options: ttl, event-based, manual, lru
        ttl_seconds: 3600 # Range: 1-86400
        max_memory_mb: 512 # Range: 128-16384
        cache_backend: redis # Options: redis, memcached, in-memory, hazelcast
        write_strategy: write-through # Options: write-through, write-behind, invalidate-only
        serialization_format: json # Options: json, msgpack, protobuf, pickle
        composes:
          wraps:
          - db-managed-postgres
      api-rest-resource-oriented: # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
        pagination_style: offset  # Options: offset, cursor, page_number
        max_page_size: 100 # Range: 10-1000
        versioning_strategy: uri # Options: uri, header, query_param, none
        filtering_style: query_params # Options: query_params, json_body, graphql_like
        cache_strategy: etag # Options: etag, last_modified, cache_control, none
        id_format: uuid # Options: uuid, integer, slug, composite
        response_envelope: false # Boolean
      deploy-blue-green: # Two environments; switch traffic; fast rollback; higher infra cost.
        traffic_switch_strategy: instant  # Options: instant, gradual, canary
        health_check_required: true # Boolean
        rollback_strategy: automatic # Options: automatic, manual, disabled
        environment_parity: full # Options: full, scaled-down, minimal
        warm_up_period_minutes: 5 # Range: 0-60
        composes:
          layered_after:
          - iac-cloudformation
      sec-auth-oauth2-oidc: # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
        oauth_flow: authorization_code  # Options: authorization_code, client_credentials, device_code, implicit
        token_storage: secure_storage # Options: secure_storage, memory_only, encrypted_storage, httponly_cookie
        pkce_enabled: true # Boolean
        scope_strategy: minimal # Options: minimal, role_based, resource_specific
        token_refresh: automatic # Options: automatic, manual, sliding_window
        id_token_validation: strict # Options: strict, standard, relaxed
        composes:
          wraps:
          - api-rest-resource-oriented
      crud-single-model: # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
        api_style: rest  # Options: rest, graphql, rpc
        validation_strategy: server-side # Options: server-side, client-side, both
        soft_delete: false # Boolean
        audit_logging: false # Boolean
        pagination_default_size: 20 # Range: 10-100
      finops-cost-allocation-tags: # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
        tagging_strategy: hierarchical  # Options: hierarchical, flat, hybrid
        enforcement_level: required # Options: required, recommended, optional
        cost_allocation_model: showback # Options: chargeback, showback, hybrid
        tag_inheritance: true # Boolean
        automated_tagging: true # Boolean
      release-feature-flags: # Decouple deploy from release; safer experiments; needs kill switches and governance.
        flag_storage: config_file  # Options: config_file, database, feature_flag_service, environment_variables
        evaluation_strategy: simple_boolean # Options: simple_boolean, percentage_rollout, user_targeting, multi_variate
        targeting_capability: none # Options: none, user_attributes, context_based, advanced_segments
        kill_switch_enabled: true # Boolean
        audit_logging: false # Boolean
      obs-telemetry-backend--aws-cloudwatch: # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
        log_retention_days: 30  # Options: 1, 3, 7, 14, 30, 60, 90, 180, 365
        xray_sampling_rate: 0.05 # Options: 0.01, 0.05, 0.1, 0.5, 1.0
      obs-golden-signals: # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
        latency_percentile: p95  # Options: p50, p95, p99, p99.9
        error_rate_threshold: 1_percent # Options: 0.1_percent, 1_percent, 5_percent
        saturation_metric: cpu_memory # Options: cpu_memory, queue_depth, connection_pool, disk_io
        sli_window: 30_days # Options: 7_days, 30_days, 90_days
        alert_severity_levels: critical_warning # Options: critical_only, critical_warning, critical_warning_info
        composes:
          layered_after:
          - obs-open-telemetry-baseline
      obs-open-telemetry-baseline: # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
        export_backend: otlp  # Options: otlp, jaeger, zipkin, prometheus, datadog, newrelic, honeycomb
        trace_sampling_strategy: parent-based # Options: always-on, always-off, parent-based, trace-id-ratio
        trace_sampling_rate: 1.0 # Range: 0.0-1.0
        metrics_export_interval: 60 # Range: 10-300
        log_correlation: true # Boolean
        resource_detection: true # Boolean
        propagation_format: w3c-tracecontext # Options: w3c-tracecontext, b3, jaeger, multi
        composes:
          co_runs_with:
          - api-rest-resource-oriented
      finops-budget-guardrails: # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
        budget_period: monthly  # Options: monthly, quarterly, annual
        alert_thresholds:
        - 50
        - 80
        - 100
        enforcement_action: alert # Options: alert, prevent, throttle
        tagging_strategy: mandatory # Options: mandatory, recommended, optional
        policy_enforcement: soft # Options: soft, hard, audit
        cost_allocation_level: project # Options: project, team, environment, service
      ops-slo-error-budgets: # Define SLOs and error budgets to balance reliability and velocity.
        slo_target_percentage: 99.9  # Range: 90-99.999
        measurement_window_days: 30 # Options: 7, 28, 30, 90
        error_budget_policy: halt-deployments # Options: halt-deployments, alert-only, slow-rollouts, require-approval
        sli_type: availability # Options: availability, latency, throughput, correctness, composite
        alerting_threshold_percentage: 80 # Range: 50-100
    P2:
      arch-egress-minimization:  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
        cdn_strategy: full  # Options: full, static-only, none
        data_locality: regional # Options: global, regional, single-zone
        cross_region_replication: minimal # Options: none, minimal, full
        compression_enabled: true # Boolean
        static_asset_strategy: cdn_edge # Options: cdn_edge, regional_storage, origin_only
      api-versioning-header: # Version via headers/media types; keeps URLs stable; harder to debug and cache.
        version_header_name: API-Version  # Options: API-Version, X-API-Version, Accept-Version, Custom-Header
        version_format: date-based # Options: semantic, date-based, sequential
        fallback_behavior: latest-stable # Options: latest-stable, oldest-supported, reject-request
        content_negotiation: false # Boolean
        deprecation_policy: warning-header # Options: sunset-header, warning-header, both
      gov-system-manifest: # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
        manifest_path: docs/architecture/manifest.yaml
        manifest_format: yaml  # Options: yaml, toml, json
        manifest_scope: # Options: agent-tools, agent-skills, agent-models, agent-prompts, data_sources, services, external_dependencies
        - agent-tools
        - agent-skills
        - agent-models
        - agent-prompts
        pin_versions: true # Boolean
        ci_validation: required # Options: required, optional, off
        drift_policy: fail-build # Options: fail-build, warn-only, off
        composes:
          layered_after:
          - iac-cloudformation
          co_runs_with:
          - release-feature-flags
          - gov-adrs-mandatory
          - ops-runbooks
    P3:
      ops-runbooks:  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
        runbook_format: markdown  # Options: markdown, wiki, structured_yaml, ticketing_system
        incident_severity_levels: 4 # Options: 3, 4, 5
        escalation_policy: tiered # Options: tiered, follow_the_sun, flat, hybrid
        automation_integration: manual # Options: manual, semi_automated, fully_automated
        review_frequency: quarterly # Options: monthly, quarterly, biannual, post_incident
      gov-adrs-mandatory: # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.
        adr_format: madr  # Options: madr, nygard, y-statements, custom
        storage_location: docs/adrs # Options: docs/adrs, docs/architecture/decisions, adr, wiki
        decision_threshold: significant # Options: all, significant, strategic-only
        review_requirement: peer-review # Options: peer-review, architect-approval, team-consensus, none

# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
# Total Patterns Selected: 23
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      49,750
# Pattern OpEx (monthly):     $         435
# Ops Team Cost (monthly):    $           0
# Total OpEx (monthly):       $         435
# Total TCO (24mo):         $      60,190
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✓ PASS
#
# ============================================================

# ============================================================
# Cost Feasibility Analysis (Details)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
#
# Ops team size: 0 engineers (no ops cost)
#
# Ops Team Cost Algorithm (for reference):
#   Formula: ops_team_size × single_resource_monthly_ops_usd × on_call_multiplier × deploy_freq_multiplier
#   Based on:
#     - Google SRE Handbook (2016): On-call burden = 25-50% FTE overhead
#     - DORA State of DevOps (2021): Deploy frequency impact on ops overhead
#
# Calculating costs for 23 selected patterns:
#
# PER-PATTERN COSTS:
# ────────────────────────────────────────────────────────────
#
#  1. arch-serverless--aws (match score: 35.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  2. db-managed-postgres (match score: 32.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  3. arch-serverless-pay-per-use--aws-lambda (match score: 28.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  4. caching-required-low-latency (match score: 27.00)
#     Adoption: $0.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  5. resilience-circuit-breaker (match score: 26.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  6. cache-aside (match score: 26.00)
#     Adoption: $800.0
#     Monthly (min): $15.0
#     Monthly (expected): $15.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $15.0
#
#  7. api-rest-resource-oriented (match score: 25.00)
#     Adoption: $750.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  8. deploy-blue-green (match score: 25.00)
#     Adoption: $2,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  9. sec-auth-oauth2-oidc (match score: 23.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  10. crud-single-model (match score: 22.00)
#     Adoption: $300.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  11. finops-cost-allocation-tags (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  12. arch-egress-minimization (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  13. release-feature-flags (match score: 19.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  14. api-versioning-header (match score: 16.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  15. obs-telemetry-backend--aws-cloudwatch (match score: 14.00)
#     Adoption: $500.0
#     Monthly (min): $20.0
#     Monthly (expected): $20.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $20.0
#
#  16. obs-golden-signals (match score: 12.00)
#     Adoption: $3,000.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  17. obs-open-telemetry-baseline (match score: 12.00)
#     Adoption: $3,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  18. finops-budget-guardrails (match score: 10.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  19. iac-cloudformation (match score: 10.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  20. ops-runbooks (match score: 8.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  21. ops-slo-error-budgets (match score: 8.00)
#     Adoption: $4,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  22. gov-system-manifest (match score: 7.00)
#     Adoption: $4,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  23. gov-adrs-mandatory (match score: 7.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
# Total Monthly OpEx: $435.0
# Monthly operational ceiling: $500 ✓ PASS
# ============================================================

# ============================================================
# ⚠️  Pattern Advisory Warnings
# (Patterns are still SELECTED — review these before finalizing)
# ============================================================
#
# [warning] warn_nfr:
#   cache-aside: peak read QPS is 5 req/s (<10 req/s). Caching overhead (infrastructure, invalidation, serialization) may outweigh benefit at this scale. Consider simplifying to direct DB access.
#
#   Suggestions:
#   - Cache-aside adds infrastructure complexity (cache store, invalidation logic, serialization). At <10 read req/s the overhead rarely justifies the benefit.
#
# ============================================================

warn_nfr advisory: a selected pattern is under-utilised given the current NFR values. The fix is either to add real throughput data (next step) or to reconsider whether this feature flag is worth keeping for this workload.
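
The cost figures in the compiler output above can be checked by hand. The sketch below is not the compiler's implementation: the function and variable names are hypothetical, the per-pattern figures are copied from the Details block, and the TCO formula (CapEx plus monthly OpEx times the amortization window) and the "at or below the ceiling" comparison are assumptions that happen to reproduce the printed totals. The two multiplier values are placeholders, since the output does not state them; with ops_team_size 0 the product is zero whatever they are.

```python
# Illustrative re-computation of the cost summary shown above.
# All names are hypothetical; figures are copied from the compiler output.

# pattern id -> (adoption_usd, monthly_expected_usd), in listing order
PATTERN_COSTS = {
    "arch-serverless--aws": (3500, 0),
    "db-managed-postgres": (1200, 0),
    "arch-serverless-pay-per-use--aws-lambda": (1500, 0),
    "caching-required-low-latency": (0, 0),
    "resilience-circuit-breaker": (1500, 0),
    "cache-aside": (800, 15),
    "api-rest-resource-oriented": (750, 0),
    "deploy-blue-green": (2500, 100),
    "sec-auth-oauth2-oidc": (3500, 0),
    "crud-single-model": (300, 0),
    "finops-cost-allocation-tags": (2500, 0),
    "arch-egress-minimization": (2500, 0),
    "release-feature-flags": (2000, 0),
    "api-versioning-header": (1200, 0),
    "obs-telemetry-backend--aws-cloudwatch": (500, 20),
    "obs-golden-signals": (3000, 100),
    "obs-open-telemetry-baseline": (3500, 100),
    "finops-budget-guardrails": (2500, 0),
    "iac-cloudformation": (3500, 0),
    "ops-runbooks": (2500, 0),
    "ops-slo-error-budgets": (4500, 100),
    "gov-system-manifest": (4000, 0),
    "gov-adrs-mandatory": (2000, 0),
}


def ops_team_cost(team_size, per_engineer_usd, on_call_multiplier, deploy_freq_multiplier):
    """Formula quoted in the Details block; multiplier values are not given there."""
    return team_size * per_engineer_usd * on_call_multiplier * deploy_freq_multiplier


def cost_summary(pattern_costs, ops_cost_usd, amortization_months, capex_ceiling, opex_ceiling):
    """Aggregate per-pattern costs and compare them with the two ceilings."""
    capex = sum(adoption for adoption, _ in pattern_costs.values())
    pattern_opex = sum(monthly for _, monthly in pattern_costs.values())
    total_opex = pattern_opex + ops_cost_usd
    return {
        "capex": capex,
        "pattern_opex": pattern_opex,
        "total_opex": total_opex,
        # Assumed formula: one-time cost plus monthly cost over the window.
        "tco": capex + total_opex * amortization_months,
        # Assumed comparison: a value equal to the ceiling still passes.
        "capex_ok": capex <= capex_ceiling,
        "opex_ok": total_opex <= opex_ceiling,
    }


# 1.0 multipliers are placeholders; team size 0 makes the ops cost zero anyway.
ops = ops_team_cost(team_size=0, per_engineer_usd=10000,
                    on_call_multiplier=1.0, deploy_freq_multiplier=1.0)
summary = cost_summary(PATTERN_COSTS, ops, amortization_months=24,
                       capex_ceiling=1000, opex_ceiling=500)
print(summary)
```

Under these assumptions the sketch arrives at the same CapEx, monthly OpEx and 24-month TCO as the summary block, and it shows why only the CapEx ceiling fails: the adoption costs alone are far above the one_time_setup_usd value of 1000.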

Step 7: provide throughput data to resolve the warn_nfr advisory

The author looks up actual peak load: 20 read QPS, 10 write QPS. Modest by AWS API standards, but real numbers. With concrete throughput data the compiler can make a definitive call — either the advisory disappears (caching is justified at this rate) or it persists with sharper reasoning. Either outcome is more useful than a pattern selected in the dark.
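
The advisory text gives its own trigger: peak read QPS below 10 req/s. A minimal sketch of such a check follows, assuming that threshold is the only condition; the function name, constant name and message wording are hypothetical, and the real compiler rule may look at more than read QPS.

```python
# Hypothetical sketch of a low-throughput advisory check for cache-aside.
# The threshold of 10 req/s is taken from the advisory text "(<10 req/s)".
LOW_READ_QPS_THRESHOLD = 10


def cache_aside_advisory(peak_read_qps, threshold=LOW_READ_QPS_THRESHOLD):
    """Return an advisory message when read traffic is below the threshold, else None."""
    if peak_read_qps < threshold:
        return (
            f"cache-aside: peak read QPS is {peak_read_qps} req/s "
            f"(<{threshold} req/s)."
        )
    return None


# 5 read QPS is the value the compiler assumed before real numbers were supplied.
print(cache_aside_advisory(5))
# 20 read QPS is the measured peak the author adds in this step.
print(cache_aside_advisory(20))
```

Read this way, the assumed 5 read QPS trips the advisory, while the measured 20 read QPS sits above the threshold and the function returns nothing to report.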

spec.yaml (delta added in this step)

nfr:
  throughput:
    peak_query_per_second_read: 20
    peak_query_per_second_write: 10

full spec.yaml at this step

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws
  language: javascript
  platform: api
  features:
    caching: true
nfr:
  availability:
    target: 0.999
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100
  throughput:
    peak_query_per_second_read: 20
    peak_query_per_second_write: 10

Step 8: recompile (pattern set stable, composes graph reveals the wiring)

The pattern set stabilises. The verbose output lists every selected pattern under its priority bucket (meta / P0 / P1 / P2 / P3) with the full per-pattern defaultConfig values plus inline # Options: alternatives on each line. Below the compile output, the Mermaid composes graph visualises every inter-pattern relationship the registry declares for the selected patterns. Solid arrows are kept edges (target pattern is in the selection — the implementing agent will wire these); dashed arrows are pruned edges (target wasn't selected, so the relationship doesn't survive into this architecture).
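
The kept/pruned distinction can be illustrated with a small sketch. It assumes an edge is kept exactly when its target pattern is in the selection, as the paragraph above describes; the Edge type and classify_edges function are hypothetical, the first two edges are copied from composes entries in the compiler output, and the third edge and its target name are invented purely to show a pruned case. The selection is a small subset chosen for illustration, not the full pattern set.

```python
# Hypothetical sketch: split declared composes edges into kept and pruned.
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str  # e.g. "wraps", "layered_after", "co_runs_with"
    target: str


def classify_edges(declared, selected):
    """Kept edges point at a selected pattern; pruned edges do not."""
    kept = [edge for edge in declared if edge.target in selected]
    pruned = [edge for edge in declared if edge.target not in selected]
    return kept, pruned


# Subset of the selection, for illustration only.
selected = {
    "deploy-canary",
    "iac-cloudformation",
    "resilience-rate-limiting",
    "api-rest-resource-oriented",
}

declared = [
    Edge("deploy-canary", "layered_after", "iac-cloudformation"),
    Edge("resilience-rate-limiting", "wraps", "api-rest-resource-oriented"),
    # Invented edge and target, present only to demonstrate pruning.
    Edge("resilience-rate-limiting", "co_runs_with", "example-unselected-pattern"),
]

kept, pruned = classify_edges(declared, selected)
for edge in kept:
    print(f"kept:   {edge.source} --{edge.relation}--> {edge.target}")
for edge in pruned:
    print(f"pruned: {edge.source} -.{edge.relation}.-> {edge.target}")
```

In the rendered graph the first group would correspond to solid arrows and the second to dashed ones.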

compiler output (verbose -v)

# ─── what the compiler FILLED IN as assumptions ───
assumptions:
  constraints:
    tenantCount: 1
    features:
      async_messaging: false  # arch-serverless--aws, deploy-canary, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (8 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      compliance:
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, caching-required-low-latency, api-rest-resource-oriented, ... (1 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      audit_logging: false # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (6 more)
  operating_model:
    on_call: false
    deploy_freq: weekly
    ops_team_size: 0
    single_resource_monthly_ops_usd: 10000
    amortization_months: 24
  cost:
    # …  (more defaults below; expand the full output to see them)

# ─── pattern SELECTION with per-pattern config + alternatives ───
# Each line under a pattern shows the value the compiler ASSUMED.
# The `# Options: …` annotation lists alternatives you can override
# by setting `patterns.<pid>.<field>` in the spec (see next step).
  patterns:
    meta:
      caching-required-low-latency: {}  # Policy pattern that enforces caching when P99 latency target is 100ms or below — the threshold at which direct database queries under any meaningful load are unlikely to reliably satisfy the SLO without a cache layer in front.
    P0:
      arch-serverless--aws:  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
        compute_service: lambda  # Options: lambda, fargate
        api_gateway: api-gateway-http # Options: api-gateway-http, api-gateway-rest, alb, function-url
        database: dynamodb # Options: dynamodb, aurora-serverless, rds-proxy
        storage: s3-standard # Options: s3-standard, s3-intelligent-tiering, efs
        auth_service: cognito # Options: cognito, lambda-authorizer, iam
        event_bus: eventbridge # Options: eventbridge, sns-sqs, kinesis
      db-managed-postgres: # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
        provider: supabase  # Options: supabase, neon, render, railway, digitalocean-app-platform
        instance_size: small # Options: micro, small, medium, large
        storage_gb: 8 # Range: 1-500
        backup_retention_days: 7 # Range: 1-30
        connection_pooling: true # Boolean
        high_availability: false # Boolean
        ssl_mode: require # Options: disable, allow, prefer, require, verify-ca, verify-full
      arch-serverless-pay-per-use--aws-lambda: # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
        memory_size: 512  # Options: 128, 256, 512, 1024, 2048, 3008
        timeout: 30 # Range: 3-900
        architecture: x86_64 # Options: x86_64, arm64
        provisioned_concurrency: 0 # Range: 0-1000
      iac-cloudformation: # AWS-native IaC; deep service coverage; AWS-specific.
        stack_naming_convention: project-environment-resource  # Options: project-environment-resource, environment-project-resource, flat-naming, custom
        change_set_enabled: true # Boolean
        termination_protection: false # Boolean
        drift_detection: true # Boolean
        stack_policy: none # Options: none, protect-all, protect-data-resources, custom
    P1:
      deploy-canary:  # Release to small traffic slice; monitor; gradually increase; needs good metrics and routing controls.
        initial_traffic_percentage: 5  # Range: 1-50
        increment_percentage: 10 # Range: 5-50
        increment_interval_minutes: 15 # Range: 5-120
        monitoring_window_minutes: 10 # Range: 5-60
        rollback_on_error_threshold: 5 # Range: 0.1-10
        success_criteria: error_rate_and_latency # Options: error_rate_only, error_rate_and_latency, error_rate_and_latency_and_saturation, custom_metrics
        composes:
          layered_after:
          - iac-cloudformation
      resilience-rate-limiting: # Protect from overload; enforce per-tenant quotas; supports fairness and cost control.
        algorithm: token-bucket  # Options: token-bucket, leaky-bucket, fixed-window, sliding-window
        scope: per-tenant # Options: global, per-tenant, per-user, per-ip
        enforcement_point: application # Options: gateway, application, distributed
        quota_type: requests # Options: requests, compute-time, data-volume, cost-based
        burst_allowance: enabled # Options: enabled, disabled, limited
        composes:
          wraps:
          - api-rest-resource-oriented
      resilience-circuit-breaker: # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
        failure_threshold: 5  # Range: 1-20
        success_threshold: 2 # Range: 1-10
        timeout_duration: 60 # Range: 5-300
        half_open_max_calls: 1 # Range: 1-10
        fallback_strategy: fail_fast # Options: fail_fast, cached_response, default_value, alternative_service
      cache-aside: # Application reads cache then DB on miss; writes update DB then invalidates/updates cache explicitly.
        invalidation_strategy: ttl  # Options: ttl, event-based, manual, lru
        ttl_seconds: 3600 # Range: 1-86400
        max_memory_mb: 512 # Range: 128-16384
        cache_backend: redis # Options: redis, memcached, in-memory, hazelcast
        write_strategy: write-through # Options: write-through, write-behind, invalidate-only
        serialization_format: json # Options: json, msgpack, protobuf, pickle
        composes:
          wraps:
          - db-managed-postgres
      api-rest-resource-oriented: # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
        pagination_style: offset  # Options: offset, cursor, page_number
        max_page_size: 100 # Range: 10-1000
        versioning_strategy: uri # Options: uri, header, query_param, none
        filtering_style: query_params # Options: query_params, json_body, graphql_like
        cache_strategy: etag # Options: etag, last_modified, cache_control, none
        id_format: uuid # Options: uuid, integer, slug, composite
        response_envelope: false # Boolean
      sec-auth-oauth2-oidc: # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
        oauth_flow: authorization_code  # Options: authorization_code, client_credentials, device_code, implicit
        token_storage: secure_storage # Options: secure_storage, memory_only, encrypted_storage, httponly_cookie
        pkce_enabled: true # Boolean
        scope_strategy: minimal # Options: minimal, role_based, resource_specific
        token_refresh: automatic # Options: automatic, manual, sliding_window
        id_token_validation: strict # Options: strict, standard, relaxed
        composes:
          wraps:
          - api-rest-resource-oriented
      crud-single-model: # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
        api_style: rest  # Options: rest, graphql, rpc
        validation_strategy: server-side # Options: server-side, client-side, both
        soft_delete: false # Boolean
        audit_logging: false # Boolean
        pagination_default_size: 20 # Range: 10-100
      finops-cost-allocation-tags: # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
        tagging_strategy: hierarchical  # Options: hierarchical, flat, hybrid
        enforcement_level: required # Options: required, recommended, optional
        cost_allocation_model: showback # Options: chargeback, showback, hybrid
        tag_inheritance: true # Boolean
        automated_tagging: true # Boolean
      release-feature-flags: # Decouple deploy from release; safer experiments; needs kill switches and governance.
        flag_storage: config_file  # Options: config_file, database, feature_flag_service, environment_variables
        evaluation_strategy: simple_boolean # Options: simple_boolean, percentage_rollout, user_targeting, multi_variate
        targeting_capability: none # Options: none, user_attributes, context_based, advanced_segments
        kill_switch_enabled: true # Boolean
        audit_logging: false # Boolean
      obs-telemetry-backend--aws-cloudwatch: # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
        log_retention_days: 30  # Options: 1, 3, 7, 14, 30, 60, 90, 180, 365
        xray_sampling_rate: 0.05 # Options: 0.01, 0.05, 0.1, 0.5, 1.0
      obs-golden-signals: # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
        latency_percentile: p95  # Options: p50, p95, p99, p99.9
        error_rate_threshold: 1_percent # Options: 0.1_percent, 1_percent, 5_percent
        saturation_metric: cpu_memory # Options: cpu_memory, queue_depth, connection_pool, disk_io
        sli_window: 30_days # Options: 7_days, 30_days, 90_days
        alert_severity_levels: critical_warning # Options: critical_only, critical_warning, critical_warning_info
        composes:
          layered_after:
          - obs-open-telemetry-baseline
      obs-open-telemetry-baseline: # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
        export_backend: otlp  # Options: otlp, jaeger, zipkin, prometheus, datadog, newrelic, honeycomb
        trace_sampling_strategy: parent-based # Options: always-on, always-off, parent-based, trace-id-ratio
        trace_sampling_rate: 1.0 # Range: 0.0-1.0
        metrics_export_interval: 60 # Range: 10-300
        log_correlation: true # Boolean
        resource_detection: true # Boolean
        propagation_format: w3c-tracecontext # Options: w3c-tracecontext, b3, jaeger, multi
        composes:
          co_runs_with:
          - api-rest-resource-oriented
      finops-budget-guardrails: # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
        budget_period: monthly  # Options: monthly, quarterly, annual
        alert_thresholds:
        - 50
        - 80
        - 100
        enforcement_action: alert # Options: alert, prevent, throttle
        tagging_strategy: mandatory # Options: mandatory, recommended, optional
        policy_enforcement: soft # Options: soft, hard, audit
        cost_allocation_level: project # Options: project, team, environment, service
      ops-slo-error-budgets: # Define SLOs and error budgets to balance reliability and velocity.
        slo_target_percentage: 99.9  # Range: 90-99.999
        measurement_window_days: 30 # Options: 7, 28, 30, 90
        error_budget_policy: halt-deployments # Options: halt-deployments, alert-only, slow-rollouts, require-approval
        sli_type: availability # Options: availability, latency, throughput, correctness, composite
        alerting_threshold_percentage: 80 # Range: 50-100
    P2:
      arch-egress-minimization:  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
        cdn_strategy: full  # Options: full, static-only, none
        data_locality: regional # Options: global, regional, single-zone
        cross_region_replication: minimal # Options: none, minimal, full
        compression_enabled: true # Boolean
        static_asset_strategy: cdn_edge # Options: cdn_edge, regional_storage, origin_only
      api-versioning-header: # Version via headers/media types; keeps URLs stable; harder to debug and cache.
        version_header_name: API-Version  # Options: API-Version, X-API-Version, Accept-Version, Custom-Header
        version_format: date-based # Options: semantic, date-based, sequential
        fallback_behavior: latest-stable # Options: latest-stable, oldest-supported, reject-request
        content_negotiation: false # Boolean
        deprecation_policy: warning-header # Options: sunset-header, warning-header, both
      gov-system-manifest: # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
        manifest_path: docs/architecture/manifest.yaml
        manifest_format: yaml  # Options: yaml, toml, json
        manifest_scope: # Options: agent-tools, agent-skills, agent-models, agent-prompts, data_sources, services, external_dependencies
        - agent-tools
        - agent-skills
        - agent-models
        - agent-prompts
        pin_versions: true # Boolean
        ci_validation: required # Options: required, optional, off
        drift_policy: fail-build # Options: fail-build, warn-only, off
        composes:
          layered_after:
          - iac-cloudformation
          co_runs_with:
          - release-feature-flags
          - gov-adrs-mandatory
          - ops-runbooks
    P3:
      ops-runbooks:  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
        runbook_format: markdown  # Options: markdown, wiki, structured_yaml, ticketing_system
        incident_severity_levels: 4 # Options: 3, 4, 5
        escalation_policy: tiered # Options: tiered, follow_the_sun, flat, hybrid
        automation_integration: manual # Options: manual, semi_automated, fully_automated
        review_frequency: quarterly # Options: monthly, quarterly, biannual, post_incident
      gov-adrs-mandatory: # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.
        adr_format: madr  # Options: madr, nygard, y-statements, custom
        storage_location: docs/adrs # Options: docs/adrs, docs/architecture/decisions, adr, wiki
        decision_threshold: significant # Options: all, significant, strategic-only
        review_requirement: peer-review # Options: peer-review, architect-approval, team-consensus, none
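
The `composes` keys in the listing above declare typed relationships between selected patterns (`wraps`, `layered_after`, `co_runs_with`). As a rough illustration of how those declarations flatten into an edge list that a composes graph could be drawn from, the sketch below copies only the `composes` entries from the listing into a plain dict. The `composes_edges` and `wrapped_by` helpers and the tuple layout are hypothetical names chosen for this example; they are not the compiler's actual API.

```python
# composes entries copied from the compiled patterns listing; every other
# pattern setting is omitted. The dict layout is an illustrative assumption.
COMPOSES = {
    "deploy-canary": {"layered_after": ["iac-cloudformation"]},
    "resilience-rate-limiting": {"wraps": ["api-rest-resource-oriented"]},
    "cache-aside": {"wraps": ["db-managed-postgres"]},
    "sec-auth-oauth2-oidc": {"wraps": ["api-rest-resource-oriented"]},
    "obs-golden-signals": {"layered_after": ["obs-open-telemetry-baseline"]},
    "obs-open-telemetry-baseline": {"co_runs_with": ["api-rest-resource-oriented"]},
    "gov-system-manifest": {
        "layered_after": ["iac-cloudformation"],
        "co_runs_with": ["release-feature-flags", "gov-adrs-mandatory", "ops-runbooks"],
    },
}


def composes_edges(composes):
    """Flatten {source: {relation: [targets]}} into (source, relation, target) tuples."""
    return [
        (source, relation, target)
        for source, relations in composes.items()
        for relation, targets in relations.items()
        for target in targets
    ]


def wrapped_by(edges, target):
    """Return, sorted, the patterns that declare `wraps` on the given target."""
    return sorted(s for s, rel, t in edges if rel == "wraps" and t == target)


edges = composes_edges(COMPOSES)
for source, relation, target in edges:
    print(f"{source} --{relation}--> {target}")
```

Calling `wrapped_by(edges, "api-rest-resource-oriented")` on this data returns the two patterns that sit in front of the REST API, `resilience-rate-limiting` and `sec-auth-oauth2-oidc`.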

# ─── warns and cost feasibility ───
# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
# Total Patterns Selected: 24
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      52,250
# Pattern OpEx (monthly):     $         435
# Ops Team Cost (monthly):    $           0
# Total OpEx (monthly):       $         435
# Total TCO (24mo):           $      62,690
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✓ PASS
#
# ============================================================
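
The summary figures can be checked by hand. The sketch below reproduces them under two assumptions that match the printed numbers but are not confirmed as the compiler's exact logic: TCO is CapEx plus Total OpEx multiplied by the amortization months, and a ceiling passes when the cost is at or below it. The ops-team term uses the formula the compiler prints in its Details output (`ops_team_size × single_resource_monthly_ops_usd × on_call_multiplier × deploy_freq_multiplier`). The multiplier values passed here are placeholders, since this output does not show the real ones; with `ops_team_size: 0` they do not affect the result. All function and field names are illustrative.

```python
from dataclasses import dataclass


def ops_team_cost(ops_team_size, single_resource_monthly_ops_usd,
                  on_call_multiplier, deploy_freq_multiplier):
    """Monthly ops team cost, following the formula printed in the Details output."""
    return (ops_team_size * single_resource_monthly_ops_usd
            * on_call_multiplier * deploy_freq_multiplier)


@dataclass
class CostSummary:
    capex_usd: float          # Total CapEx (one-time)
    pattern_opex_usd: float   # Pattern OpEx (monthly)
    ops_team_usd: float       # Ops Team Cost (monthly)
    amortization_months: int

    @property
    def total_opex_usd(self):
        return self.pattern_opex_usd + self.ops_team_usd

    @property
    def tco_usd(self):
        # Assumed relation: one-time cost plus monthly cost over the amortization window.
        return self.capex_usd + self.total_opex_usd * self.amortization_months


def check_ceilings(summary, capex_ceiling_usd, opex_ceiling_usd):
    """True means the cost fits under the ceiling (assumed to be an inclusive bound)."""
    return {
        "capex": summary.capex_usd <= capex_ceiling_usd,
        "opex": summary.total_opex_usd <= opex_ceiling_usd,
    }


# Placeholder multipliers (1.0): irrelevant here because ops_team_size is 0.
ops = ops_team_cost(0, 10_000, 1.0, 1.0)
summary = CostSummary(capex_usd=52_250, pattern_opex_usd=435,
                      ops_team_usd=ops, amortization_months=24)

print(f"Total OpEx (monthly): ${summary.total_opex_usd:,.0f}")
print(f"Total TCO (24mo):     ${summary.tco_usd:,.0f}")
print(check_ceilings(summary, capex_ceiling_usd=1_000, opex_ceiling_usd=500))
```

With the values from the summary, monthly OpEx stays 65 USD under its 500 USD ceiling, while CapEx of 52,250 USD is far above the 1,000 USD ceiling, which is the failure the report flags.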

Full compiled-spec.yaml:

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
  language: javascript # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
  platform: api # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
  features:
    caching: true  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (9 more)
nfr:
  availability:
    target: 0.999  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
  latency:
    p95Milliseconds: 50  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (5 more)
    p99Milliseconds: 100 # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (6 more)
  throughput:
    peak_query_per_second_read: 20  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (5 more)
    peak_query_per_second_write: 10 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
assumptions:
  constraints:
    saas-providers: []
    disallowed-saas-providers: []
    ai-inference-platforms: []
    disallowed-ai-inference-platforms: []
    model-vendors: []
    disallowed-model-vendors: []
    tenantCount: 1
    features:
      async_messaging: false  # arch-serverless--aws, deploy-canary, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (8 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      compliance:
        gdpr: false  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
        gdpr_rtbf: false
        ccpa: false  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
        hipaa: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (20 more)
        sox: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, caching-required-low-latency, api-rest-resource-oriented, ... (1 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      audit_logging: false # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (6 more)
  operating_model:
    on_call: false
    deploy_freq: weekly
    ops_team_size: 0
    single_resource_monthly_ops_usd: 10000
    amortization_months: 24
  cost:
    intent:
      priority: minimize-opex
    ceilings:
      monthly_operational_usd: 500
      one_time_setup_usd: 1000
    preferences:
      prefer_free_tier_if_possible: true  # db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, caching-required-low-latency, ... (2 more)
      prefer_saas_first: false
  patterns:
    meta:
      caching-required-low-latency: {}  # Policy pattern that enforces caching when P99 latency target is 100ms or below — the threshold at which direct database queries under any meaningful load are unlikely to reliably satisfy the SLO without a cache layer in front.
    P0:
      arch-serverless--aws:  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
        compute_service: lambda  # Options: lambda, fargate
        api_gateway: api-gateway-http # Options: api-gateway-http, api-gateway-rest, alb, function-url
        database: dynamodb # Options: dynamodb, aurora-serverless, rds-proxy
        storage: s3-standard # Options: s3-standard, s3-intelligent-tiering, efs
        auth_service: cognito # Options: cognito, lambda-authorizer, iam
        event_bus: eventbridge # Options: eventbridge, sns-sqs, kinesis
      db-managed-postgres: # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
        provider: supabase  # Options: supabase, neon, render, railway, digitalocean-app-platform
        instance_size: small # Options: micro, small, medium, large
        storage_gb: 8 # Range: 1-500
        backup_retention_days: 7 # Range: 1-30
        connection_pooling: true # Boolean
        high_availability: false # Boolean
        ssl_mode: require # Options: disable, allow, prefer, require, verify-ca, verify-full
      arch-serverless-pay-per-use--aws-lambda: # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
        memory_size: 512  # Options: 128, 256, 512, 1024, 2048, 3008
        timeout: 30 # Range: 3-900
        architecture: x86_64 # Options: x86_64, arm64
        provisioned_concurrency: 0 # Range: 0-1000
      iac-cloudformation: # AWS-native IaC; deep service coverage; AWS-specific.
        stack_naming_convention: project-environment-resource  # Options: project-environment-resource, environment-project-resource, flat-naming, custom
        change_set_enabled: true # Boolean
        termination_protection: false # Boolean
        drift_detection: true # Boolean
        stack_policy: none # Options: none, protect-all, protect-data-resources, custom
    P1:
      deploy-canary:  # Release to small traffic slice; monitor; gradually increase; needs good metrics and routing controls.
        initial_traffic_percentage: 5  # Range: 1-50
        increment_percentage: 10 # Range: 5-50
        increment_interval_minutes: 15 # Range: 5-120
        monitoring_window_minutes: 10 # Range: 5-60
        rollback_on_error_threshold: 5 # Range: 0.1-10
        success_criteria: error_rate_and_latency # Options: error_rate_only, error_rate_and_latency, error_rate_and_latency_and_saturation, custom_metrics
        composes:
          layered_after:
          - iac-cloudformation
      resilience-rate-limiting: # Protect from overload; enforce per-tenant quotas; supports fairness and cost control.
        algorithm: token-bucket  # Options: token-bucket, leaky-bucket, fixed-window, sliding-window
        scope: per-tenant # Options: global, per-tenant, per-user, per-ip
        enforcement_point: application # Options: gateway, application, distributed
        quota_type: requests # Options: requests, compute-time, data-volume, cost-based
        burst_allowance: enabled # Options: enabled, disabled, limited
        composes:
          wraps:
          - api-rest-resource-oriented
      resilience-circuit-breaker: # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
        failure_threshold: 5  # Range: 1-20
        success_threshold: 2 # Range: 1-10
        timeout_duration: 60 # Range: 5-300
        half_open_max_calls: 1 # Range: 1-10
        fallback_strategy: fail_fast # Options: fail_fast, cached_response, default_value, alternative_service
      cache-aside: # Application reads cache then DB on miss; writes update DB then invalidates/updates cache explicitly.
        invalidation_strategy: ttl  # Options: ttl, event-based, manual, lru
        ttl_seconds: 3600 # Range: 1-86400
        max_memory_mb: 512 # Range: 128-16384
        cache_backend: redis # Options: redis, memcached, in-memory, hazelcast
        write_strategy: write-through # Options: write-through, write-behind, invalidate-only
        serialization_format: json # Options: json, msgpack, protobuf, pickle
        composes:
          wraps:
          - db-managed-postgres
      api-rest-resource-oriented: # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
        pagination_style: offset  # Options: offset, cursor, page_number
        max_page_size: 100 # Range: 10-1000
        versioning_strategy: uri # Options: uri, header, query_param, none
        filtering_style: query_params # Options: query_params, json_body, graphql_like
        cache_strategy: etag # Options: etag, last_modified, cache_control, none
        id_format: uuid # Options: uuid, integer, slug, composite
        response_envelope: false # Boolean
      sec-auth-oauth2-oidc: # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
        oauth_flow: authorization_code  # Options: authorization_code, client_credentials, device_code, implicit
        token_storage: secure_storage # Options: secure_storage, memory_only, encrypted_storage, httponly_cookie
        pkce_enabled: true # Boolean
        scope_strategy: minimal # Options: minimal, role_based, resource_specific
        token_refresh: automatic # Options: automatic, manual, sliding_window
        id_token_validation: strict # Options: strict, standard, relaxed
        composes:
          wraps:
          - api-rest-resource-oriented
      crud-single-model: # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
        api_style: rest  # Options: rest, graphql, rpc
        validation_strategy: server-side # Options: server-side, client-side, both
        soft_delete: false # Boolean
        audit_logging: false # Boolean
        pagination_default_size: 20 # Range: 10-100
      finops-cost-allocation-tags: # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
        tagging_strategy: hierarchical  # Options: hierarchical, flat, hybrid
        enforcement_level: required # Options: required, recommended, optional
        cost_allocation_model: showback # Options: chargeback, showback, hybrid
        tag_inheritance: true # Boolean
        automated_tagging: true # Boolean
      release-feature-flags: # Decouple deploy from release; safer experiments; needs kill switches and governance.
        flag_storage: config_file  # Options: config_file, database, feature_flag_service, environment_variables
        evaluation_strategy: simple_boolean # Options: simple_boolean, percentage_rollout, user_targeting, multi_variate
        targeting_capability: none # Options: none, user_attributes, context_based, advanced_segments
        kill_switch_enabled: true # Boolean
        audit_logging: false # Boolean
      obs-telemetry-backend--aws-cloudwatch: # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
        log_retention_days: 30  # Options: 1, 3, 7, 14, 30, 60, 90, 180, 365
        xray_sampling_rate: 0.05 # Options: 0.01, 0.05, 0.1, 0.5, 1.0
      obs-golden-signals: # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
        latency_percentile: p95  # Options: p50, p95, p99, p99.9
        error_rate_threshold: 1_percent # Options: 0.1_percent, 1_percent, 5_percent
        saturation_metric: cpu_memory # Options: cpu_memory, queue_depth, connection_pool, disk_io
        sli_window: 30_days # Options: 7_days, 30_days, 90_days
        alert_severity_levels: critical_warning # Options: critical_only, critical_warning, critical_warning_info
        composes:
          layered_after:
          - obs-open-telemetry-baseline
      obs-open-telemetry-baseline: # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
        export_backend: otlp  # Options: otlp, jaeger, zipkin, prometheus, datadog, newrelic, honeycomb
        trace_sampling_strategy: parent-based # Options: always-on, always-off, parent-based, trace-id-ratio
        trace_sampling_rate: 1.0 # Range: 0.0-1.0
        metrics_export_interval: 60 # Range: 10-300
        log_correlation: true # Boolean
        resource_detection: true # Boolean
        propagation_format: w3c-tracecontext # Options: w3c-tracecontext, b3, jaeger, multi
        composes:
          co_runs_with:
          - api-rest-resource-oriented
      finops-budget-guardrails: # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
        budget_period: monthly  # Options: monthly, quarterly, annual
        alert_thresholds:
        - 50
        - 80
        - 100
        enforcement_action: alert # Options: alert, prevent, throttle
        tagging_strategy: mandatory # Options: mandatory, recommended, optional
        policy_enforcement: soft # Options: soft, hard, audit
        cost_allocation_level: project # Options: project, team, environment, service
      ops-slo-error-budgets: # Define SLOs and error budgets to balance reliability and velocity.
        slo_target_percentage: 99.9  # Range: 90-99.999
        measurement_window_days: 30 # Options: 7, 28, 30, 90
        error_budget_policy: halt-deployments # Options: halt-deployments, alert-only, slow-rollouts, require-approval
        sli_type: availability # Options: availability, latency, throughput, correctness, composite
        alerting_threshold_percentage: 80 # Range: 50-100
    P2:
      arch-egress-minimization:  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
        cdn_strategy: full  # Options: full, static-only, none
        data_locality: regional # Options: global, regional, single-zone
        cross_region_replication: minimal # Options: none, minimal, full
        compression_enabled: true # Boolean
        static_asset_strategy: cdn_edge # Options: cdn_edge, regional_storage, origin_only
      api-versioning-header: # Version via headers/media types; keeps URLs stable; harder to debug and cache.
        version_header_name: API-Version  # Options: API-Version, X-API-Version, Accept-Version, Custom-Header
        version_format: date-based # Options: semantic, date-based, sequential
        fallback_behavior: latest-stable # Options: latest-stable, oldest-supported, reject-request
        content_negotiation: false # Boolean
        deprecation_policy: warning-header # Options: sunset-header, warning-header, both
      gov-system-manifest: # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
        manifest_path: docs/architecture/manifest.yaml
        manifest_format: yaml  # Options: yaml, toml, json
        manifest_scope: # Options: agent-tools, agent-skills, agent-models, agent-prompts, data_sources, services, external_dependencies
        - agent-tools
        - agent-skills
        - agent-models
        - agent-prompts
        pin_versions: true # Boolean
        ci_validation: required # Options: required, optional, off
        drift_policy: fail-build # Options: fail-build, warn-only, off
        composes:
          layered_after:
          - iac-cloudformation
          co_runs_with:
          - release-feature-flags
          - gov-adrs-mandatory
          - ops-runbooks
    P3:
      ops-runbooks:  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
        runbook_format: markdown  # Options: markdown, wiki, structured_yaml, ticketing_system
        incident_severity_levels: 4 # Options: 3, 4, 5
        escalation_policy: tiered # Options: tiered, follow_the_sun, flat, hybrid
        automation_integration: manual # Options: manual, semi_automated, fully_automated
        review_frequency: quarterly # Options: monthly, quarterly, biannual, post_incident
      gov-adrs-mandatory: # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.
        adr_format: madr  # Options: madr, nygard, y-statements, custom
        storage_location: docs/adrs # Options: docs/adrs, docs/architecture/decisions, adr, wiki
        decision_threshold: significant # Options: all, significant, strategic-only
        review_requirement: peer-review # Options: peer-review, architect-approval, team-consensus, none

# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
# Total Patterns Selected: 24
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      52,250
# Pattern OpEx (monthly):     $         435
# Ops Team Cost (monthly):    $           0
# Total OpEx (monthly):       $         435
# Total TCO (24mo):           $      62,690
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✓ PASS
#
# ============================================================

# ============================================================
# Cost Feasibility Analysis (Details)
# ============================================================
#
# Intent: minimize-opex
# Amortization: 24 months
#
# Ops team size: 0 engineers (no ops cost)
#
# Ops Team Cost Algorithm (for reference):
#   Formula: ops_team_size × single_resource_monthly_ops_usd × on_call_multiplier × deploy_freq_multiplier
#   Based on:
#     - Google SRE Handbook (2016): On-call burden = 25-50% FTE overhead
#     - DORA State of DevOps (2021): Deploy frequency impact on ops overhead
#
# Calculating costs for 24 selected patterns:
#
# PER-PATTERN COSTS:
# ────────────────────────────────────────────────────────────
#
#  1. arch-serverless--aws (match score: 35.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  2. db-managed-postgres (match score: 32.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  3. arch-serverless-pay-per-use--aws-lambda (match score: 28.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  4. deploy-canary (match score: 28.00)
#     Adoption: $3,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  5. caching-required-low-latency (match score: 27.00)
#     Adoption: $0.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  6. resilience-rate-limiting (match score: 27.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  7. resilience-circuit-breaker (match score: 26.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  8. cache-aside (match score: 26.00)
#     Adoption: $800.0
#     Monthly (min): $15.0
#     Monthly (expected): $15.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $15.0
#
#  9. api-rest-resource-oriented (match score: 25.00)
#     Adoption: $750.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  10. sec-auth-oauth2-oidc (match score: 23.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  11. crud-single-model (match score: 22.00)
#     Adoption: $300.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  12. finops-cost-allocation-tags (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  13. arch-egress-minimization (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  14. release-feature-flags (match score: 19.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  15. api-versioning-header (match score: 16.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  16. obs-telemetry-backend--aws-cloudwatch (match score: 14.00)
#     Adoption: $500.0
#     Monthly (min): $20.0
#     Monthly (expected): $20.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $20.0
#
#  17. obs-golden-signals (match score: 12.00)
#     Adoption: $3,000.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  18. obs-open-telemetry-baseline (match score: 12.00)
#     Adoption: $3,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  19. finops-budget-guardrails (match score: 10.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  20. iac-cloudformation (match score: 10.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  21. ops-runbooks (match score: 8.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  22. ops-slo-error-budgets (match score: 8.00)
#     Adoption: $4,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $100.0
#
#  23. gov-system-manifest (match score: 7.00)
#     Adoption: $4,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
#  24. gov-adrs-mandatory (match score: 7.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     Monthly OpEx: $0.0
#
# Total Monthly OpEx: $435.0
# Monthly operational ceiling: $500 ✓ PASS
# ============================================================
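
The Summary totals can be reproduced from the per-pattern listing above. The sketch below is illustrative, not the compiler's code: it assumes CapEx is the plain sum of each pattern's Adoption figure and that TCO is CapEx plus total monthly OpEx times the amortization window. Both assumptions match the printed figures for this compile. The `roll_up` helper and the tuple layout are hypothetical names introduced only for this example.

```python
# Figures copied from the PER-PATTERN COSTS listing:
# (pattern id, adoption USD, expected monthly USD).
PATTERN_COSTS = [
    ("arch-serverless--aws", 3500, 0),
    ("db-managed-postgres", 1200, 0),
    ("arch-serverless-pay-per-use--aws-lambda", 1500, 0),
    ("deploy-canary", 3500, 100),
    ("caching-required-low-latency", 0, 0),
    ("resilience-rate-limiting", 1500, 0),
    ("resilience-circuit-breaker", 1500, 0),
    ("cache-aside", 800, 15),
    ("api-rest-resource-oriented", 750, 0),
    ("sec-auth-oauth2-oidc", 3500, 0),
    ("crud-single-model", 300, 0),
    ("finops-cost-allocation-tags", 2500, 0),
    ("arch-egress-minimization", 2500, 0),
    ("release-feature-flags", 2000, 0),
    ("api-versioning-header", 1200, 0),
    ("obs-telemetry-backend--aws-cloudwatch", 500, 20),
    ("obs-golden-signals", 3000, 100),
    ("obs-open-telemetry-baseline", 3500, 100),
    ("finops-budget-guardrails", 2500, 0),
    ("iac-cloudformation", 3500, 0),
    ("ops-runbooks", 2500, 0),
    ("ops-slo-error-budgets", 4500, 100),
    ("gov-system-manifest", 4000, 0),
    ("gov-adrs-mandatory", 2000, 0),
]


def roll_up(costs, amortization_months, ops_team_monthly=0):
    """Aggregate per-pattern costs into the Summary totals (assumed formulas)."""
    capex = sum(adoption for _, adoption, _ in costs)
    pattern_opex = sum(monthly for _, _, monthly in costs)
    total_opex = pattern_opex + ops_team_monthly
    tco = capex + total_opex * amortization_months
    return {
        "capex": capex,
        "pattern_opex": pattern_opex,
        "total_opex": total_opex,
        "tco": tco,
    }


summary = roll_up(PATTERN_COSTS, amortization_months=24)
print(summary)
# {'capex': 52250, 'pattern_opex': 435, 'total_opex': 435, 'tco': 62690}
```

Only six of the 24 patterns carry a non-zero monthly figure, so the $435 Pattern OpEx comes entirely from deploy-canary, cache-aside, the CloudWatch backend, and the three observability and SLO patterns.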

composes graph — pattern relationships (10 kept, 12 pruned)

graph LR
  n_gov_system_manifest["gov-system-manifest"] -->|co-runs with| n_gov_adrs_mandatory["gov-adrs-mandatory"]
  n_gov_system_manifest["gov-system-manifest"] -->|co-runs with| n_ops_runbooks["ops-runbooks"]
  n_gov_system_manifest["gov-system-manifest"] -->|co-runs with| n_release_feature_flags["release-feature-flags"]
  n_obs_open_telemetry_baseline["obs-open-telemetry-baseline"] -->|co-runs with| n_api_rest_resource_oriented["api-rest-resource-oriented"]
  n_deploy_canary["deploy-canary"] -->|layered after| n_iac_cloudformation["iac-cloudformation"]
  n_gov_system_manifest["gov-system-manifest"] -->|layered after| n_iac_cloudformation["iac-cloudformation"]
  n_obs_golden_signals["obs-golden-signals"] -->|layered after| n_obs_open_telemetry_baseline["obs-open-telemetry-baseline"]
  n_cache_aside["cache-aside"] -->|wraps| n_db_managed_postgres["db-managed-postgres"]
  n_resilience_rate_limiting["resilience-rate-limiting"] -->|wraps| n_api_rest_resource_oriented["api-rest-resource-oriented"]
  n_sec_auth_oauth2_oidc["sec-auth-oauth2-oidc"] -->|wraps| n_api_rest_resource_oriented["api-rest-resource-oriented"]
  n_obs_open_telemetry_baseline["obs-open-telemetry-baseline"] -.->|co-runs with| n_api_graphql_schema_first["api-graphql-schema-first"]
  n_deploy_canary["deploy-canary"] -.->|layered after| n_iac_bicep["iac-bicep"]
  n_deploy_canary["deploy-canary"] -.->|layered after| n_iac_terraform["iac-terraform"]
  n_gov_system_manifest["gov-system-manifest"] -.->|layered after| n_iac_terraform["iac-terraform"]
  n_cache_aside["cache-aside"] -.->|wraps| n_db_key_value__generic["db-key-value--generic"]
  n_cache_aside["cache-aside"] -.->|wraps| n_db_managed_mysql__planetscale["db-managed-mysql--planetscale"]
  n_cache_aside["cache-aside"] -.->|wraps| n_db_nosql_document__generic["db-nosql-document--generic"]
  n_resilience_circuit_breaker["resilience-circuit-breaker"] -.->|wraps| n_async_fire_and_forget["async-fire-and-forget"]
  n_resilience_circuit_breaker["resilience-circuit-breaker"] -.->|wraps| n_sync_request_reply_grpc["sync-request-reply-grpc"]
  n_resilience_circuit_breaker["resilience-circuit-breaker"] -.->|wraps| n_sync_request_reply_rest["sync-request-reply-rest"]
  n_resilience_rate_limiting["resilience-rate-limiting"] -.->|wraps| n_api_graphql_schema_first["api-graphql-schema-first"]
  n_sec_auth_oauth2_oidc["sec-auth-oauth2-oidc"] -.->|wraps| n_api_graphql_schema_first["api-graphql-schema-first"]
  classDef pruned stroke-dasharray:4,color:#aaa,fill:#222,stroke:#888
  class n_api_graphql_schema_first pruned
  class n_async_fire_and_forget pruned
  class n_db_key_value__generic pruned
  class n_db_managed_mysql__planetscale pruned
  class n_db_nosql_document__generic pruned
  class n_iac_bicep pruned
  class n_iac_terraform pruned
  class n_sync_request_reply_grpc pruned
  class n_sync_request_reply_rest pruned

Solid arrows are kept edges (target pattern is in the selected set — the implementing agent will wire these). Dashed arrows + dimmed nodes are pruned edges (target pattern was not selected for this spec, so the compiler dropped the edge from the inlined graph — the pattern still ships, but without that relationship). Edge labels: layered after (build/deploy order), wraps (request-time concern), co-runs with (runtime siblings), dispatches to (handoff).
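
The kept/pruned rule described above can be sketched in a few lines. This is a minimal illustration and not the compiler's implementation: it assumes an edge is kept exactly when its target pattern is in the selected set. The `Edge`, `partition_edges`, `node_id`, and `to_mermaid` names are hypothetical. The node-id convention (an `n_` prefix with hyphens turned into underscores) is inferred from the graph text itself.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str  # e.g. "wraps", "layered after", "co-runs with"
    target: str


def partition_edges(edges, selected):
    """Split edges into (kept, pruned) by whether the target was selected."""
    kept = [e for e in edges if e.target in selected]
    pruned = [e for e in edges if e.target not in selected]
    return kept, pruned


def node_id(pattern_id):
    # Mirrors the ids visible in the graph: "n_" prefix, hyphens to underscores.
    return "n_" + pattern_id.replace("-", "_")


def to_mermaid(edge, is_kept):
    # Solid arrow for kept edges, dashed arrow for pruned ones.
    arrow = "-->" if is_kept else "-.->"
    return (
        f'{node_id(edge.source)}["{edge.source}"] {arrow}|{edge.relation}| '
        f'{node_id(edge.target)}["{edge.target}"]'
    )


# A small subset of the edges and selected patterns from this compile.
selected = {"cache-aside", "db-managed-postgres", "deploy-canary", "iac-cloudformation"}
edges = [
    Edge("cache-aside", "wraps", "db-managed-postgres"),
    Edge("cache-aside", "wraps", "db-key-value--generic"),
    Edge("deploy-canary", "layered after", "iac-cloudformation"),
    Edge("deploy-canary", "layered after", "iac-terraform"),
    Edge("deploy-canary", "layered after", "iac-bicep"),
]

kept, pruned = partition_edges(edges, selected)
for e in kept:
    print("  " + to_mermaid(e, True))
for e in pruned:
    print("  " + to_mermaid(e, False))
```

For this subset the sketch yields two kept edges and three pruned ones, and the emitted lines have the same shape as the corresponding lines in the graph above.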

Pattern set sized to real load. The advisory either cleared or re-fired with sharper reasoning. Both outcomes are signal — the architecture is now grounded in numbers, not guesses.

Step 9: declare the cost intent + operating model + ceilings

Up to now everything has been pattern fit. Step 9 commits to the cost story so the compiler can run a full feasibility check. Intent: optimize-tco. Ceilings: $500/month operational, $1000 one-time setup. Operating model: 2 dedicated ops engineers at $10k/mo loaded, on-call enabled (1.5× multiplier per the Google SRE Book), daily deploys (1.0× multiplier per DORA State of DevOps), 24-month CapEx amortisation.

spec.yaml (delta added in this step)

cost:
  intent:
    priority: optimize-tco
  ceilings:
    monthly_operational_usd: 500
    one_time_setup_usd: 1000
operating_model:
  ops_team_size: 2
  single_resource_monthly_ops_usd: 10000
  on_call: true
  deploy_freq: daily
  amortization_months: 24

full spec.yaml at this step

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws
  language: javascript
  platform: api
  features:
    caching: true
nfr:
  availability:
    target: 0.999
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100
  throughput:
    peak_query_per_second_read: 20
    peak_query_per_second_write: 10
cost:
  intent:
    priority: optimize-tco
  ceilings:
    monthly_operational_usd: 500
    one_time_setup_usd: 1000
operating_model:
  ops_team_size: 2
  single_resource_monthly_ops_usd: 10000
  on_call: true
  deploy_freq: daily
  amortization_months: 24

Step 10: recompile, and the cost feasibility check fails loudly with concrete fixes

The compiler runs a full cost feasibility analysis across three buckets: Pattern OpEx (the sum of each selected pattern's estimated monthly infra cost), Ops Team Cost (ops_team_size × single_resource_monthly_ops_usd × on_call_multiplier × deploy_freq_multiplier), and CapEx (one-time adoption + setup). For this spec the breakdown lands at Pattern OpEx $435/mo, Ops Team Cost $18,000/mo, CapEx $52,250 one-time, and TCO over 24 months $494,690. Both declared ceilings fail: Total OpEx of $18,435/mo against the $500/mo operational ceiling, and CapEx of $52,250 against the $1,000 one-time ceiling. The [high] cost_tco_exceeds_ceiling warning fires, referencing the optimize-tco intent and the combined 24-month ceiling of $13,000 ($1,000 + 24 × $500). It lists three concrete levers the author can pull: raise the ceilings, drop specific high-TCO patterns (the compiler names them), or shorten the amortisation horizon.
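
The arithmetic behind the ceiling check can be sketched as follows, using the figures the compiler prints in its Summary block for this step. This is an assumption-laden illustration, not the compiler's source. The combined TCO ceiling is taken to be the CapEx ceiling plus the monthly OpEx ceiling times the amortization window, which reproduces the $13,000 in the warning. The Ops Team Cost is used as reported ($18,000) and is not recomputed from the multipliers. `CostInputs` and `check_feasibility` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CostInputs:
    capex_usd: int
    pattern_opex_monthly_usd: int
    ops_team_monthly_usd: int  # taken as reported by the compiler
    amortization_months: int
    capex_ceiling_usd: int
    opex_ceiling_monthly_usd: int


def check_feasibility(c: CostInputs) -> dict:
    """Compare costs against ceilings (assumed formulas, see lead-in)."""
    total_opex = c.pattern_opex_monthly_usd + c.ops_team_monthly_usd
    tco = c.capex_usd + total_opex * c.amortization_months
    tco_ceiling = (
        c.capex_ceiling_usd + c.opex_ceiling_monthly_usd * c.amortization_months
    )
    return {
        "total_opex": total_opex,
        "tco": tco,
        "tco_ceiling": tco_ceiling,
        "tco_overage": tco - tco_ceiling,
        "capex_pass": c.capex_usd <= c.capex_ceiling_usd,
        "opex_pass": total_opex <= c.opex_ceiling_monthly_usd,
    }


result = check_feasibility(
    CostInputs(
        capex_usd=52_250,
        pattern_opex_monthly_usd=435,
        ops_team_monthly_usd=18_000,
        amortization_months=24,
        capex_ceiling_usd=1_000,
        opex_ceiling_monthly_usd=500,
    )
)
print(result)
# {'total_opex': 18435, 'tco': 494690, 'tco_ceiling': 13000,
#  'tco_overage': 481690, 'capex_pass': False, 'opex_pass': False}
```

Under these assumptions the ops team accounts for almost all of the monthly figure, so ceilings sized only for pattern infrastructure will fail as soon as an operating model is declared.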

compiler output (verbose -v)

# ─── what the compiler FILLED IN as assumptions ───
assumptions:
  constraints:
    tenantCount: 1
    features:
      async_messaging: false  # arch-serverless--aws, deploy-canary, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (8 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      compliance:
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, caching-required-low-latency, api-rest-resource-oriented, ... (1 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      audit_logging: false # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (6 more)
  cost:

# ─── Matched Patterns based on input spec ───
# meta  = policy gates (always emitted when their feature flag is set)
# P0    = high priority — load-bearing architectural decisions
# P1    = mid priority — operational + observability + security baseline
# P2/P3 = lower priority — refinements + governance + docs
# Override priority by adding `patterns.<id>.recommended_priority: P0` to spec.
  patterns:
    meta:    # (1 pattern)
      - caching-required-low-latency  # Policy pattern that enforces caching when P99 latency target is 100ms or below — the threshold at which direct database queries under any meaningful load are unlikely to reliably satisfy the SLO without a cache layer in front.
    P0:    # (4 patterns)
      - arch-serverless--aws  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
      - db-managed-postgres  # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
      - arch-serverless-pay-per-use--aws-lambda  # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
      - iac-cloudformation  # AWS-native IaC; deep service coverage; AWS-specific.
    P1:    # (14 patterns)
      - deploy-canary  # Release to small traffic slice; monitor; gradually increase; needs good metrics and routing controls.
          composes:
            layered_after: ['iac-cloudformation']
      - resilience-rate-limiting  # Protect from overload; enforce per-tenant quotas; supports fairness and cost control.
          composes:
            wraps: ['api-rest-resource-oriented']
      - resilience-circuit-breaker  # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
      - cache-aside  # Application reads cache then DB on miss; writes update DB then invalidates/updates cache explicitly.
          composes:
            wraps: ['db-managed-postgres']
      - api-rest-resource-oriented  # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
      - sec-auth-oauth2-oidc  # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
          composes:
            wraps: ['api-rest-resource-oriented']
      - crud-single-model  # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
      - finops-cost-allocation-tags  # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
      - release-feature-flags  # Decouple deploy from release; safer experiments; needs kill switches and governance.
      - obs-telemetry-backend--aws-cloudwatch  # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
      - obs-golden-signals  # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
          composes:
            layered_after: ['obs-open-telemetry-baseline']
      - obs-open-telemetry-baseline  # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
          composes:
            co_runs_with: ['api-rest-resource-oriented']
      - finops-budget-guardrails  # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
      - ops-slo-error-budgets  # Define SLOs and error budgets to balance reliability and velocity.
    P2:    # (3 patterns)
      - arch-egress-minimization  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
      - api-versioning-header  # Version via headers/media types; keeps URLs stable; harder to debug and cache.
      - gov-system-manifest  # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
          composes:
            layered_after: ['iac-cloudformation']
            co_runs_with: ['release-feature-flags', 'gov-adrs-mandatory', 'ops-runbooks']
    P3:    # (2 patterns)
      - ops-runbooks  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
      - gov-adrs-mandatory  # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.

# ─── warns and cost feasibility ───
# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: optimize-tco
# Amortization: 24 months
# Total Patterns Selected: 24
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      52,250
# Pattern OpEx (monthly):     $         435
# Ops Team Cost (monthly):    $      18,000  (2 × $10,000)
# Total OpEx (monthly):       $      18,435
# Total TCO (24mo):         $     494,690
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✗ FAIL
#
# ⚠️  WARNINGS:
# ────────────────────────────────────────────────────────────
# [high] cost_tco_exceeds_ceiling:
#   Total cost of ownership over 24 months ($494690) exceeds ceiling ($13000) by $481690 (intent: optimize-tco)
#
#   Suggestions:
#   - Increase ceilings (capex: $1000, opex: $500)
#   - Remove high-TCO patterns: ops-slo-error-budgets, deploy-canary, obs-open-telemetry-baseline
#   - Reduce amortization period from 24 months
#
# ============================================================

full compiled-spec.yaml

project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
  language: javascript # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
  platform: api # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
  features:
    caching: true  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (9 more)
nfr:
  availability:
    target: 0.999  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
  latency:
    p95Milliseconds: 50  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (5 more)
    p99Milliseconds: 100 # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (6 more)
  throughput:
    peak_query_per_second_read: 20  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (5 more)
    peak_query_per_second_write: 10 # arch-serverless--aws, db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, ... (3 more)
cost:
  intent:
    priority: optimize-tco
  ceilings:
    monthly_operational_usd: 500
    one_time_setup_usd: 1000
operating_model:
  ops_team_size: 2
  single_resource_monthly_ops_usd: 10000
  on_call: true
  deploy_freq: daily
  amortization_months: 24
assumptions:
  constraints:
    saas-providers: []
    disallowed-saas-providers: []
    ai-inference-platforms: []
    disallowed-ai-inference-platforms: []
    model-vendors: []
    disallowed-model-vendors: []
    tenantCount: 1
    features:
      async_messaging: false  # arch-serverless--aws, deploy-canary, arch-serverless-pay-per-use--aws-lambda, ... (9 more)
      ai_inference: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      multi_tenancy: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      batch_processing: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      distributed_transactions: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      real_time_streaming: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      vector_search: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (12 more)
      document_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      key_value_store: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      graph_database: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      time_series_db: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (10 more)
      oltp_workload: true # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      olap_workload: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (11 more)
      cold_archive_tiering: false
  nfr:
    rpo_minutes: 60  # finops-budget-guardrails
    rto_minutes: 60 # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (8 more)
    data:
      retention_days: 90
      pii: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      compliance:
        gdpr: false  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
        gdpr_rtbf: false
        ccpa: false  # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
        hipaa: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (20 more)
        sox: false # arch-serverless--aws, db-managed-postgres, deploy-canary, ... (21 more)
    consistency:
      needsReadYourWrites: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    durability:
      strict: false  # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (3 more)
    security:
      auth: oauth2_oidc  # arch-serverless--aws, caching-required-low-latency, api-rest-resource-oriented, ... (1 more)
      tenant_isolation: n/a # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (2 more)
      audit_logging: false # arch-serverless--aws, db-managed-postgres, caching-required-low-latency, ... (6 more)
  cost:
    preferences:
      prefer_free_tier_if_possible: true  # db-managed-postgres, arch-serverless-pay-per-use--aws-lambda, caching-required-low-latency, ... (2 more)
      prefer_saas_first: false
  patterns:
    meta:
      caching-required-low-latency: {}  # Policy pattern that enforces caching when P99 latency target is 100ms or below — the threshold at which direct database queries under any meaningful load are unlikely to reliably satisfy the SLO without a cache layer in front.
    P0:
      arch-serverless--aws:  # Structure the system as stateless, event-driven function handlers backed by AWS managed services (Lambda, API Gateway, DynamoDB, S3, SQS). No persistent servers — each function activates on demand, executes, and terminates. The architectural commitment is to build around events and AWS-managed primitives rather than long-running processes.
        compute_service: lambda  # Options: lambda, fargate
        api_gateway: api-gateway-http # Options: api-gateway-http, api-gateway-rest, alb, function-url
        database: dynamodb # Options: dynamodb, aurora-serverless, rds-proxy
        storage: s3-standard # Options: s3-standard, s3-intelligent-tiering, efs
        auth_service: cognito # Options: cognito, lambda-authorizer, iam
        event_bus: eventbridge # Options: eventbridge, sns-sqs, kinesis
      db-managed-postgres: # Use low-ops managed Postgres DBaaS providers (e.g., Supabase and managed cloud Postgres offerings) to reduce DB operations overhead; validate quotas, compliance, and scale limits.
        provider: supabase  # Options: supabase, neon, render, railway, digitalocean-app-platform
        instance_size: small # Options: micro, small, medium, large
        storage_gb: 8 # Range: 1-500
        backup_retention_days: 7 # Range: 1-30
        connection_pooling: true # Boolean
        high_availability: false # Boolean
        ssl_mode: require # Options: disable, allow, prefer, require, verify-ca, verify-full
      arch-serverless-pay-per-use--aws-lambda: # Cost optimisation pattern that eliminates idle infrastructure spend by running workloads on AWS Lambda’s per-invocation billing model. Well-suited to bursty or unpredictable workloads where provisioned servers would sit idle most of the time; accepts Lambda cold-start latency as the trade-off.
        memory_size: 512  # Options: 128, 256, 512, 1024, 2048, 3008
        timeout: 30 # Range: 3-900
        architecture: x86_64 # Options: x86_64, arm64
        provisioned_concurrency: 0 # Range: 0-1000
      iac-cloudformation: # AWS-native IaC; deep service coverage; AWS-specific.
        stack_naming_convention: project-environment-resource  # Options: project-environment-resource, environment-project-resource, flat-naming, custom
        change_set_enabled: true # Boolean
        termination_protection: false # Boolean
        drift_detection: true # Boolean
        stack_policy: none # Options: none, protect-all, protect-data-resources, custom
    P1:
      deploy-canary:  # Release to small traffic slice; monitor; gradually increase; needs good metrics and routing controls.
        initial_traffic_percentage: 5  # Range: 1-50
        increment_percentage: 10 # Range: 5-50
        increment_interval_minutes: 15 # Range: 5-120
        monitoring_window_minutes: 10 # Range: 5-60
        rollback_on_error_threshold: 5 # Range: 0.1-10
        success_criteria: error_rate_and_latency # Options: error_rate_only, error_rate_and_latency, error_rate_and_latency_and_saturation, custom_metrics
        composes:
          layered_after:
          - iac-cloudformation
      resilience-rate-limiting: # Protect from overload; enforce per-tenant quotas; supports fairness and cost control.
        algorithm: token-bucket  # Options: token-bucket, leaky-bucket, fixed-window, sliding-window
        scope: per-tenant # Options: global, per-tenant, per-user, per-ip
        enforcement_point: application # Options: gateway, application, distributed
        quota_type: requests # Options: requests, compute-time, data-volume, cost-based
        burst_allowance: enabled # Options: enabled, disabled, limited
        composes:
          wraps:
          - api-rest-resource-oriented
      resilience-circuit-breaker: # Stop calls to failing dependencies; half-open probing; fallback. Prevents cascading failures.
        failure_threshold: 5  # Range: 1-20
        success_threshold: 2 # Range: 1-10
        timeout_duration: 60 # Range: 5-300
        half_open_max_calls: 1 # Range: 1-10
        fallback_strategy: fail_fast # Options: fail_fast, cached_response, default_value, alternative_service
      cache-aside: # Application reads cache then DB on miss; writes update DB then invalidates/updates cache explicitly.
        invalidation_strategy: ttl  # Options: ttl, event-based, manual, lru
        ttl_seconds: 3600 # Range: 1-86400
        max_memory_mb: 512 # Range: 128-16384
        cache_backend: redis # Options: redis, memcached, in-memory, hazelcast
        write_strategy: write-through # Options: write-through, write-behind, invalidate-only
        serialization_format: json # Options: json, msgpack, protobuf, pickle
        composes:
          wraps:
          - db-managed-postgres
      api-rest-resource-oriented: # REST API designed around resources (nouns) manipulated via standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Resources are identified by stable URLs, responses are cacheable by default, and pagination/filtering are expressed as query parameters. Simpler tooling and stronger HTTP cache semantics than GraphQL; well-suited to public APIs and CRUD-heavy domains.
        pagination_style: offset  # Options: offset, cursor, page_number
        max_page_size: 100 # Range: 10-1000
        versioning_strategy: uri # Options: uri, header, query_param, none
        filtering_style: query_params # Options: query_params, json_body, graphql_like
        cache_strategy: etag # Options: etag, last_modified, cache_control, none
        id_format: uuid # Options: uuid, integer, slug, composite
        response_envelope: false # Boolean
      sec-auth-oauth2-oidc: # Use OAuth2 flows with OIDC identity tokens; standardized claims; delegated auth support.
        oauth_flow: authorization_code  # Options: authorization_code, client_credentials, device_code, implicit
        token_storage: secure_storage # Options: secure_storage, memory_only, encrypted_storage, httponly_cookie
        pkce_enabled: true # Boolean
        scope_strategy: minimal # Options: minimal, role_based, resource_specific
        token_refresh: automatic # Options: automatic, manual, sliding_window
        id_token_validation: strict # Options: strict, standard, relaxed
        composes:
          wraps:
          - api-rest-resource-oriented
      crud-single-model: # Simple CRUD on one canonical model; lowest complexity; best for straightforward domains.
        api_style: rest  # Options: rest, graphql, rpc
        validation_strategy: server-side # Options: server-side, client-side, both
        soft_delete: false # Boolean
        audit_logging: false # Boolean
        pagination_default_size: 20 # Range: 10-100
      finops-cost-allocation-tags: # Tagging/labeling strategy for per-tenant/product cost allocation and chargeback/showback.
        tagging_strategy: hierarchical  # Options: hierarchical, flat, hybrid
        enforcement_level: required # Options: required, recommended, optional
        cost_allocation_model: showback # Options: chargeback, showback, hybrid
        tag_inheritance: true # Boolean
        automated_tagging: true # Boolean
      release-feature-flags: # Decouple deploy from release; safer experiments; needs kill switches and governance.
        flag_storage: config_file  # Options: config_file, database, feature_flag_service, environment_variables
        evaluation_strategy: simple_boolean # Options: simple_boolean, percentage_rollout, user_targeting, multi_variate
        targeting_capability: none # Options: none, user_attributes, context_based, advanced_segments
        kill_switch_enabled: true # Boolean
        audit_logging: false # Boolean
      obs-telemetry-backend--aws-cloudwatch: # AWS-native observability backend using CloudWatch Metrics, CloudWatch Logs, and AWS X-Ray for distributed tracing. Zero infrastructure to operate; deeply integrated with all AWS services.
        log_retention_days: 30  # Options: 1, 3, 7, 14, 30, 60, 90, 180, 365
        xray_sampling_rate: 0.05 # Options: 0.01, 0.05, 0.1, 0.5, 1.0
      obs-golden-signals: # Monitor latency/traffic/errors/saturation; define SLIs and alert policies.
        latency_percentile: p95  # Options: p50, p95, p99, p99.9
        error_rate_threshold: 1_percent # Options: 0.1_percent, 1_percent, 5_percent
        saturation_metric: cpu_memory # Options: cpu_memory, queue_depth, connection_pool, disk_io
        sli_window: 30_days # Options: 7_days, 30_days, 90_days
        alert_severity_levels: critical_warning # Options: critical_only, critical_warning, critical_warning_info
        composes:
          layered_after:
          - obs-open-telemetry-baseline
      obs-open-telemetry-baseline: # Standardize traces/metrics/log correlation via OpenTelemetry; export to vendor or OSS backends.
        export_backend: otlp  # Options: otlp, jaeger, zipkin, prometheus, datadog, newrelic, honeycomb
        trace_sampling_strategy: parent-based # Options: always-on, always-off, parent-based, trace-id-ratio
        trace_sampling_rate: 1.0 # Range: 0.0-1.0
        metrics_export_interval: 60 # Range: 10-300
        log_correlation: true # Boolean
        resource_detection: true # Boolean
        propagation_format: w3c-tracecontext # Options: w3c-tracecontext, b3, jaeger, multi
        composes:
          co_runs_with:
          - api-rest-resource-oriented
      finops-budget-guardrails: # Implement budgets, alerts, tagging, and policy-as-code guardrails to enforce cost ceilings and prevent runaway spend.
        budget_period: monthly  # Options: monthly, quarterly, annual
        alert_thresholds:
        - 50
        - 80
        - 100
        enforcement_action: alert # Options: alert, prevent, throttle
        tagging_strategy: mandatory # Options: mandatory, recommended, optional
        policy_enforcement: soft # Options: soft, hard, audit
        cost_allocation_level: project # Options: project, team, environment, service
      ops-slo-error-budgets: # Define SLOs and error budgets to balance reliability and velocity.
        slo_target_percentage: 99.9  # Range: 90-99.999
        measurement_window_days: 30 # Options: 7, 28, 30, 90
        error_budget_policy: halt-deployments # Options: halt-deployments, alert-only, slow-rollouts, require-approval
        sli_type: availability # Options: availability, latency, throughput, correctness, composite
        alerting_threshold_percentage: 80 # Range: 50-100
    P2:
      arch-egress-minimization:  # Reduce cloud egress cost by co-locating compute and data, using CDNs, and avoiding cross-region data flows.
        cdn_strategy: full  # Options: full, static-only, none
        data_locality: regional # Options: global, regional, single-zone
        cross_region_replication: minimal # Options: none, minimal, full
        compression_enabled: true # Boolean
        static_asset_strategy: cdn_edge # Options: cdn_edge, regional_storage, origin_only
      api-versioning-header: # Version via headers/media types; keeps URLs stable; harder to debug and cache.
        version_header_name: API-Version  # Options: API-Version, X-API-Version, Accept-Version, Custom-Header
        version_format: date-based # Options: semantic, date-based, sequential
        fallback_behavior: latest-stable # Options: latest-stable, oldest-supported, reject-request
        content_negotiation: false # Boolean
        deprecation_policy: warning-header # Options: sunset-header, warning-header, both
      gov-system-manifest: # Pin and govern the inventory of components (agent-tools, agent-skills, agent-models, agent-prompts, services, data sources, external dependencies) the system depends on at a declared manifest path; CI validates on every PR and drift between manifest and built system fails the build.
        manifest_path: docs/architecture/manifest.yaml
        manifest_format: yaml  # Options: yaml, toml, json
        manifest_scope: # Options: agent-tools, agent-skills, agent-models, agent-prompts, data_sources, services, external_dependencies
        - agent-tools
        - agent-skills
        - agent-models
        - agent-prompts
        pin_versions: true # Boolean
        ci_validation: required # Options: required, optional, off
        drift_policy: fail-build # Options: fail-build, warn-only, off
        composes:
          layered_after:
          - iac-cloudformation
          co_runs_with:
          - release-feature-flags
          - gov-adrs-mandatory
          - ops-runbooks
    P3:
      ops-runbooks:  # Standard runbooks for incidents and routine ops; reduces MTTR and on-call stress.
        runbook_format: markdown  # Options: markdown, wiki, structured_yaml, ticketing_system
        incident_severity_levels: 4 # Options: 3, 4, 5
        escalation_policy: tiered # Options: tiered, follow_the_sun, flat, hybrid
        automation_integration: manual # Options: manual, semi_automated, fully_automated
        review_frequency: quarterly # Options: monthly, quarterly, biannual, post_incident
      gov-adrs-mandatory: # Record architecture decisions and tradeoffs; improves continuity; keep lightweight.
        adr_format: madr  # Options: madr, nygard, y-statements, custom
        storage_location: docs/adrs # Options: docs/adrs, docs/architecture/decisions, adr, wiki
        decision_threshold: significant # Options: all, significant, strategic-only
        review_requirement: peer-review # Options: peer-review, architect-approval, team-consensus, none

# ============================================================
# Cost Feasibility Analysis (Summary)
# ============================================================
#
# Intent: optimize-tco
# Amortization: 24 months
# Total Patterns Selected: 24
#
# COST BREAKDOWN:
# ────────────────────────────────────────────────────────────
# Total CapEx (one-time):     $      52,250
# Pattern OpEx (monthly):     $         435
# Ops Team Cost (monthly):    $      18,000  (2 × $10,000)
# Total OpEx (monthly):       $      18,435
# Total TCO (24mo):         $     494,690
#
# COST CEILINGS:
# ────────────────────────────────────────────────────────────
# CapEx Ceiling:              $       1,000 ✗ FAIL
# OpEx Ceiling (monthly):     $         500 ✗ FAIL
#
# ⚠️  WARNINGS:
# ────────────────────────────────────────────────────────────
# [high] cost_tco_exceeds_ceiling:
#   Total cost of ownership over 24 months ($494690) exceeds ceiling ($13000) by $481690 (intent: optimize-tco)
#
#   Suggestions:
#   - Increase ceilings (capex: $1000, opex: $500)
#   - Remove high-TCO patterns: ops-slo-error-budgets, deploy-canary, obs-open-telemetry-baseline
#   - Reduce amortization period from 24 months
#
# ============================================================

# ============================================================
# Cost Feasibility Analysis (Details)
# ============================================================
#
# Intent: optimize-tco
# Amortization: 24 months
#
# Ops Team Cost Breakdown:
#   Base: 2 engineers × $10,000/month = $20,000
#   On-call multiplier: 1.5x (on-call burden)
#   Deploy frequency multiplier: 0.6x (deploy_freq: daily, high automation)
#   Adjusted ops cost: $20,000 × 1.5 × 0.6 = $18,000/month
#
#   Deploy Frequency Options (DORA State of DevOps):
#     on-demand: 0.5x  (very high automation)
#     daily:     0.6x  (high automation)
#     weekly:    0.8x  (moderate automation)
#     biweekly:  0.9x  (manual processes)
#     monthly:   1.0x  (very manual)
#     quarterly: 1.1x  (extremely manual)
#
#
# Ops Team Cost Algorithm (for reference):
#   Formula: ops_team_size × single_resource_monthly_ops_usd × on_call_multiplier × deploy_freq_multiplier
#   Based on:
#     - Google SRE Handbook (2016): On-call burden = 25-50% FTE overhead
#     - DORA State of DevOps (2021): Deploy frequency impact on ops overhead
#
# Calculating costs for 24 selected patterns:
#
# PER-PATTERN COSTS:
# ────────────────────────────────────────────────────────────
#
#  1. arch-serverless--aws (match score: 35.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $3,500.0 + ($0.0 × 24) = $3,500.0
#
#  2. db-managed-postgres (match score: 32.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $1,200.0 + ($0.0 × 24) = $1,200.0
#
#  3. arch-serverless-pay-per-use--aws-lambda (match score: 28.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $1,500.0 + ($0.0 × 24) = $1,500.0
#
#  4. deploy-canary (match score: 28.00)
#     Adoption: $3,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $3,500.0 + ($100.0 × 24) = $5,900.0
#
#  5. caching-required-low-latency (match score: 27.00)
#     Adoption: $0.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $0.0 + ($0.0 × 24) = $0.0
#
#  6. resilience-rate-limiting (match score: 27.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $1,500.0 + ($0.0 × 24) = $1,500.0
#
#  7. resilience-circuit-breaker (match score: 26.00)
#     Adoption: $1,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $1,500.0 + ($0.0 × 24) = $1,500.0
#
#  8. cache-aside (match score: 26.00)
#     Adoption: $800.0
#     Monthly (min): $15.0
#     Monthly (expected): $15.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $800.0 + ($15.0 × 24) = $1,160.0
#
#  9. api-rest-resource-oriented (match score: 25.00)
#     Adoption: $750.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $750.0 + ($0.0 × 24) = $750.0
#
#  10. sec-auth-oauth2-oidc (match score: 23.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $3,500.0 + ($0.0 × 24) = $3,500.0
#
#  11. crud-single-model (match score: 22.00)
#     Adoption: $300.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $300.0 + ($0.0 × 24) = $300.0
#
#  12. finops-cost-allocation-tags (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $2,500.0 + ($0.0 × 24) = $2,500.0
#
#  13. arch-egress-minimization (match score: 21.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $2,500.0 + ($0.0 × 24) = $2,500.0
#
#  14. release-feature-flags (match score: 19.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $2,000.0 + ($0.0 × 24) = $2,000.0
#
#  15. api-versioning-header (match score: 16.00)
#     Adoption: $1,200.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $1,200.0 + ($0.0 × 24) = $1,200.0
#
#  16. obs-telemetry-backend--aws-cloudwatch (match score: 14.00)
#     Adoption: $500.0
#     Monthly (min): $20.0
#     Monthly (expected): $20.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $500.0 + ($20.0 × 24) = $980.0
#
#  17. obs-golden-signals (match score: 12.00)
#     Adoption: $3,000.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $3,000.0 + ($100.0 × 24) = $5,400.0
#
#  18. obs-open-telemetry-baseline (match score: 12.00)
#     Adoption: $3,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $3,500.0 + ($100.0 × 24) = $5,900.0
#
#  19. finops-budget-guardrails (match score: 10.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $2,500.0 + ($0.0 × 24) = $2,500.0
#
#  20. iac-cloudformation (match score: 10.00)
#     Adoption: $3,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $3,500.0 + ($0.0 × 24) = $3,500.0
#
#  21. ops-runbooks (match score: 8.00)
#     Adoption: $2,500.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $2,500.0 + ($0.0 × 24) = $2,500.0
#
#  22. ops-slo-error-budgets (match score: 8.00)
#     Adoption: $4,500.0
#     Monthly (min): $100.0
#     Monthly (expected): $100.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $4,500.0 + ($100.0 × 24) = $6,900.0
#
#  23. gov-system-manifest (match score: 7.00)
#     Adoption: $4,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $4,000.0 + ($0.0 × 24) = $4,000.0
#
#  24. gov-adrs-mandatory (match score: 7.00)
#     Adoption: $2,000.0
#     Monthly (min): $0.0
#     Monthly (expected): $0.0
#     Ops cost: $0 (no infrastructure)
#     ──────────────────────────────────────
#     TCO (24mo): $2,000.0 + ($0.0 × 24) = $2,000.0
#
# Total TCO (24mo): $494,690.0
# Monthly operational ceiling: $500 ✗ FAIL
# One-time setup ceiling: $1,000 ✗ FAIL
# ============================================================

Cost feasibility failed. The compiler tells the author exactly which numbers don't fit and which three levers bring TCO under the ceiling. Without an explicit operating_model, the compiler would have defaulted to ops_team_size: 0, silently understating real TCO. Declaring it surfaced the mismatch.
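The arithmetic behind the Step 10 panels can be reproduced in a few lines. The sketch below is not the compiler's code; it re-implements the ops-team formula and the TCO sum printed above, using the multipliers and totals from the panels. The names OperatingModel, ops_team_cost, tco and tco_ceiling are illustrative, the 1.0 multiplier for on_call: false is an assumption, and the ceiling formula (setup ceiling plus monthly ceiling times amortization months) is inferred from the $13,000 figure in the warning text.

```python
from dataclasses import dataclass

# Multipliers copied from the "Deploy Frequency Options" table in the details panel.
DEPLOY_FREQ_MULTIPLIER = {
    "on-demand": 0.5, "daily": 0.6, "weekly": 0.8,
    "biweekly": 0.9, "monthly": 1.0, "quarterly": 1.1,
}


@dataclass
class OperatingModel:
    ops_team_size: int
    single_resource_monthly_ops_usd: int
    on_call: bool
    deploy_freq: str
    amortization_months: int


def ops_team_cost(om: OperatingModel) -> float:
    """Team size x monthly cost x on-call multiplier x deploy-frequency multiplier."""
    on_call_multiplier = 1.5 if om.on_call else 1.0  # 1.0 without on-call is an assumption
    base = om.ops_team_size * om.single_resource_monthly_ops_usd
    return base * on_call_multiplier * DEPLOY_FREQ_MULTIPLIER[om.deploy_freq]


def tco(capex: float, pattern_opex: float, om: OperatingModel) -> float:
    """One-time CapEx plus total monthly OpEx over the amortization window."""
    return capex + (pattern_opex + ops_team_cost(om)) * om.amortization_months


def tco_ceiling(setup_ceiling: float, monthly_ceiling: float, months: int) -> float:
    """Combined ceiling; formula inferred from the $13,000 in the warning text."""
    return setup_ceiling + monthly_ceiling * months


om = OperatingModel(ops_team_size=2, single_resource_monthly_ops_usd=10_000,
                    on_call=True, deploy_freq="daily", amortization_months=24)
total = tco(capex=52_250, pattern_opex=435, om=om)
ceiling = tco_ceiling(setup_ceiling=1_000, monthly_ceiling=500, months=24)

print(f"ops team cost: ${ops_team_cost(om):,.0f}/month")
print(f"TCO (24mo):    ${total:,.0f}")
print(f"ceiling:       ${ceiling:,.0f}")
print(f"excess:        ${total - ceiling:,.0f}")
```

With the Step 10 inputs this reproduces the $18,000 monthly ops cost, the $494,690 TCO and the $481,690 excess over the $13,000 ceiling. Setting ops_team_size to 0 in the same sketch drops the TCO to $62,690, which is the understatement the paragraph above warns about.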

11 Step 11: raise ceilings, promote three patterns, swap gov-system-manifest scope to non-agentic, then approve

The author makes three decisions before approving the architecture.

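Pick the first cost lever. Step 10 surfaced Pattern OpEx $435/mo + Ops Team Cost $18,000/mo = $18,435/mo total operational, and CapEx $52,250 one-time. The team commits to $20,000/mo OpEx and $50,000 CapEx. Over the 24-month amortization window the combined ceiling becomes $50,000 + 24 × $20,000 = $530,000, above the $494,690 TCO, so the cost_tco_exceeds_ceiling warning clears. Note that the $50,000 one-time ceiling still sits $2,250 below the $52,250 CapEx reported in Step 10; a team that wants headroom on that line item as well would need a higher setup ceiling.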
Promote three patterns to P0. resilience-rate-limiting, resilience-circuit-breaker, and cache-aside all sit at P1 by registry default, a reasonable generic recommendation. For this team they are load-bearing: rate limiting and the circuit breaker protect the database and downstream services from overload and cascading failures, and the cache-aside read path is on the critical path for the p99 ≤ 100ms latency budget. The team bumps all three to P0 via the bucket-grouped patterns.P0.<pid>: {} form (an empty body means a priority-only override; the registry defaultConfig is backfilled into the approved architecture).
Swap gov-system-manifest.manifest_scope to the non-agentic classes. The pattern's registry default pins the four agentic-flavoured classes (agent-tools, agent-skills, agent-models, agent-prompts) because the most common adopter is an agentic system. This spec is non-agentic — the four agentic classes don't apply. The team swaps to [data_sources, services, external_dependencies] so the CI drift check exercises the right surface (e.g. the AWS managed-Postgres endpoint, the third-party licensing SaaS, the inbound webhook providers). The override rides on the same bucket-grouped form: patterns.P2.gov-system-manifest.manifest_scope: [data_sources, services, external_dependencies].
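The two override shapes used in these decisions (an empty body that only moves a pattern to another bucket, and a non-empty body that also replaces a config field) can be sketched as a small merge routine. This is an illustration, not the compiler's implementation: REGISTRY is a hypothetical three-pattern slice with trimmed defaultConfig values, resolve_patterns is an invented name, and the shallow update (a list value replaces the default list wholesale) is an assumption about how overrides combine with defaults.

```python
from copy import deepcopy

# Hypothetical three-pattern slice of the registry: default bucket plus defaultConfig.
REGISTRY = {
    "cache-aside": {
        "bucket": "P1",
        "defaultConfig": {"invalidation_strategy": "ttl", "ttl_seconds": 3600},
    },
    "deploy-canary": {
        "bucket": "P1",
        "defaultConfig": {"initial_traffic_percentage": 5},
    },
    "gov-system-manifest": {
        "bucket": "P2",
        "defaultConfig": {
            "manifest_scope": ["agent-tools", "agent-skills", "agent-models", "agent-prompts"],
            "drift_policy": "fail-build",
        },
    },
}


def resolve_patterns(overrides: dict) -> dict:
    """Apply bucket-grouped overrides and return {bucket: {pattern_id: config}}."""
    resolved: dict = {}
    overridden = set()
    for bucket, entries in overrides.items():
        for pid, body in entries.items():
            config = deepcopy(REGISTRY[pid]["defaultConfig"])  # backfill registry defaults
            config.update(body)  # an empty body changes nothing except the bucket
            resolved.setdefault(bucket, {})[pid] = config
            overridden.add(pid)
    for pid, entry in REGISTRY.items():
        if pid not in overridden:  # untouched patterns keep their registry bucket
            resolved.setdefault(entry["bucket"], {})[pid] = deepcopy(entry["defaultConfig"])
    return resolved


overrides = {
    "P0": {"cache-aside": {}},  # priority-only override
    "P2": {"gov-system-manifest": {  # config override, bucket unchanged
        "manifest_scope": ["data_sources", "services", "external_dependencies"],
    }},
}
resolved = resolve_patterns(overrides)
print(resolved["P0"]["cache-aside"])
print(resolved["P2"]["gov-system-manifest"]["manifest_scope"])
```

In this sketch cache-aside lands in P0 with its defaults intact, gov-system-manifest keeps drift_policy: fail-build while its manifest_scope is replaced, and deploy-canary stays at P1 because no override mentions it.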

The compiler then runs the compiling-architecture skill's promotion contract: EVERY key under assumptions.* is lifted into the top-level spec body so the result has no assumptions block. The # STATUS: APPROVED comment header is prepended at the top. The footer below the panel verifies the handoff: re-compiling the promoted architecture must exit 0 (idempotent). Approver name and date are anonymised placeholders here; in practice the author fills them in at commit time.
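A minimal sketch of the promotion step described above, under stated assumptions: the assumptions block is taken to mirror the shape of the spec body, explicit values are taken to win over assumed ones on conflict, and the function names (deep_merge, promote, approval_header) are invented for illustration. The actual contract lives in the compiling-architecture skill and may differ in detail.

```python
from copy import deepcopy


def deep_merge(explicit: dict, assumed: dict) -> dict:
    """Fill gaps in `explicit` from `assumed`; explicit values win on conflict."""
    merged = deepcopy(explicit)
    for key, value in assumed.items():
        if key not in merged:
            merged[key] = deepcopy(value)
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
    return merged


def promote(spec: dict) -> dict:
    """Lift every key under `assumptions` into the spec body and drop the block."""
    body = {key: value for key, value in spec.items() if key != "assumptions"}
    return deep_merge(body, spec.get("assumptions", {}))


def approval_header(approver: str, approved_at: str) -> str:
    """First three comment lines, in the shape shown in the approved architecture.yaml."""
    return (
        "# STATUS: APPROVED\n"
        f"# Approved by: {approver}\n"
        f"# Approved at: {approved_at}\n"
    )


spec = {
    "constraints": {"cloud": "aws", "features": {"caching": True}},
    "assumptions": {
        "constraints": {"features": {"async_messaging": False}, "tenantCount": 1},
        "nfr": {"rpo_minutes": 60, "rto_minutes": 60},
    },
}
promoted = promote(spec)
print(approval_header("<architect-on-record>", "<YYYY-MM-DD>"), end="")
print(promoted)
```

Running promote a second time on its own output changes nothing, which loosely mirrors the idempotency that the footer check relies on.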

architecture.yaml (approved + no assumptions block)

# STATUS: APPROVED
# Approved by: <architect-on-record>
# Approved at: <YYYY-MM-DD>
#
# This header is consumed by skills/implementing-architecture/SKILL.md
# to verify the architecture is handoff-ready. Recompilation of the
# underlying spec invalidates this approval — fresh review required.
project:
  name: aws production api
  domain: product-catalog
constraints:
  cloud: aws
  language: javascript
  platform: api
  features:
    caching: true
    async_messaging: false
    ai_inference: false
    multi_tenancy: false
    batch_processing: false
    distributed_transactions: false
    real_time_streaming: false
    vector_search: false
    document_store: false
    key_value_store: false
    graph_database: false
    time_series_db: false
    oltp_workload: true
    olap_workload: false
    cold_archive_tiering: false
  saas-providers: []
  disallowed-saas-providers: []
  ai-inference-platforms: []
  disallowed-ai-inference-platforms: []
  model-vendors: []
  disallowed-model-vendors: []
  tenantCount: 1
cost:
  intent:
    priority: optimize-tco
  ceilings:
    monthly_operational_usd: 20000
    one_time_setup_usd: 50000
  preferences:
    prefer_free_tier_if_possible: true
    prefer_saas_first: false
operating_model:
  ops_team_size: 2
  single_resource_monthly_ops_usd: 10000
  on_call: true
  deploy_freq: daily
  amortization_months: 24
nfr:
  availability:
    target: 0.999
  latency:
    p95Milliseconds: 50
    p99Milliseconds: 100
  throughput:
    peak_query_per_second_read: 20
    peak_query_per_second_write: 10
  rpo_minutes: 60
  rto_minutes: 60
  data:
    retention_days: 90
    pii: false
    compliance:
      gdpr: false
      gdpr_rtbf: false
      ccpa: false
      hipaa: false
      sox: false
  consistency:
    needsReadYourWrites: false
  durability:
    strict: false
  security:
    auth: oauth2_oidc
    tenant_isolation: n/a
    audit_logging: false
patterns:
  meta:
    caching-required-low-latency: {}
  P0:
    arch-serverless--aws:
      compute_service: lambda
      api_gateway: api-gateway-http
      database: dynamodb
      storage: s3-standard
      auth_service: cognito
      event_bus: eventbridge
    db-managed-postgres:
      provider: supabase
      instance_size: small
      storage_gb: 8
      backup_retention_days: 7
      connection_pooling: true
      high_availability: false
      ssl_mode: require
    arch-serverless-pay-per-use--aws-lambda:
      memory_size: 512
      timeout: 30
      architecture: x86_64
      provisioned_concurrency: 0
      reserved_concurrency: null
    iac-cloudformation:
      stack_naming_convention: project-environment-resource
      change_set_enabled: true
      termination_protection: false
      drift_detection: true
      stack_policy: none
    resilience-rate-limiting:
      algorithm: token-bucket
      scope: per-tenant
      enforcement_point: application
      quota_type: requests
      burst_allowance: enabled
    resilience-circuit-breaker:
      failure_threshold: 5
      success_threshold: 2
      timeout_duration: 60
      half_open_max_calls: 1
      fallback_strategy: fail_fast
    cache-aside:
      invalidation_strategy: ttl
      ttl_seconds: 3600
      max_memory_mb: 512
      cache_backend: redis
      write_strategy: write-through
      serialization_format: json
  P1:
    deploy-canary:
      initial_traffic_percentage: 5
      increment_percentage: 10
      increment_interval_minutes: 15
      monitoring_window_minutes: 10
      rollback_on_error_threshold: 5
      success_criteria: error_rate_and_latency
      composes:
        layered_after:
        - iac-cloudformation
    api-rest-resource-oriented:
      pagination_style: offset
      max_page_size: 100
      versioning_strategy: uri
      filtering_style: query_params
      cache_strategy: etag
      id_format: uuid
      response_envelope: false
    sec-auth-oauth2-oidc:
      oauth_flow: authorization_code
      token_storage: secure_storage
      pkce_enabled: true
      scope_strategy: minimal
      token_refresh: automatic
      id_token_validation: strict
      composes:
        wraps:
        - api-rest-resource-oriented
    crud-single-model:
      api_style: rest
      validation_strategy: server-side
      soft_delete: false
      audit_logging: false
      pagination_default_size: 20
    finops-cost-allocation-tags:
      tagging_strategy: hierarchical
      enforcement_level: required
      cost_allocation_model: showback
      tag_inheritance: true
      automated_tagging: true
    release-feature-flags:
      flag_storage: config_file
      evaluation_strategy: simple_boolean
      targeting_capability: none
      kill_switch_enabled: true
      audit_logging: false
    obs-telemetry-backend--aws-cloudwatch:
      log_retention_days: 30
      xray_sampling_rate: 0.05
    obs-golden-signals:
      latency_percentile: p95
      error_rate_threshold: 1_percent
      saturation_metric: cpu_memory
      sli_window: 30_days
      alert_severity_levels: critical_warning
      composes:
        layered_after:
        - obs-open-telemetry-baseline
    obs-open-telemetry-baseline:
      export_backend: otlp
      trace_sampling_strategy: parent-based
      trace_sampling_rate: 1.0
      metrics_export_interval: 60
      log_correlation: true
      resource_detection: true
      propagation_format: w3c-tracecontext
      composes:
        co_runs_with:
        - api-rest-resource-oriented
    finops-budget-guardrails:
      budget_period: monthly
      alert_thresholds:
      - 50
      - 80
      - 100
      enforcement_action: alert
      tagging_strategy: mandatory
      policy_enforcement: soft
      cost_allocation_level: project
    ops-slo-error-budgets:
      slo_target_percentage: 99.9
      measurement_window_days: 30
      error_budget_policy: halt-deployments
      sli_type: availability
      alerting_threshold_percentage: 80
  P2:
    arch-egress-minimization:
      cdn_strategy: full
      data_locality: regional
      cross_region_replication: minimal
      compression_enabled: true
      static_asset_strategy: cdn_edge
    api-versioning-header:
      version_header_name: API-Version
      version_format: date-based
      fallback_behavior: latest-stable
      content_negotiation: false
      deprecation_policy: warning-header
    gov-system-manifest:
      manifest_scope:
      - data_sources
      - services
      - external_dependencies
      manifest_path: docs/architecture/manifest.yaml
      manifest_format: yaml
      pin_versions: true
      ci_validation: required
      drift_policy: fail-build
  P3:
    ops-runbooks:
      runbook_format: markdown
      incident_severity_levels: 4
      escalation_policy: tiered
      automation_integration: manual
      review_frequency: quarterly
    gov-adrs-mandatory:
      adr_format: madr
      storage_location: docs/adrs
      decision_threshold: significant
      review_requirement: peer-review

✓ compiler verified: re-compiling the promoted architecture exits 0 (clean)

Handoff-ready. Write this architecture.yaml to <app-repo>/docs/architecture/architecture.yaml and commit it. skills/implementing-architecture/SKILL.md reads from that path. Recompilation of the underlying spec invalidates the approval header — fresh review required.
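As a rough illustration of the handoff check, a consumer could refuse to proceed unless the file opens with the approval marker. The demo does not show how skills/implementing-architecture actually verifies the header, so the function name and the "first non-blank line" rule below are assumptions.

```python
APPROVAL_MARKER = "# STATUS: APPROVED"


def is_handoff_ready(architecture_text: str) -> bool:
    """True when the first non-blank line is exactly the approval marker."""
    for line in architecture_text.splitlines():
        if line.strip():
            return line.strip() == APPROVAL_MARKER
    return False  # empty file: nothing to hand off


approved = "# STATUS: APPROVED\n# Approved by: <architect-on-record>\nproject:\n  name: aws production api\n"
unapproved = "project:\n  name: aws production api\n"

print(is_handoff_ready(approved))
print(is_handoff_ready(unapproved))
```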

What this demo did NOT cover

The Progressive-Refinement walkthrough above ends with an approved architecture. Several facilities of the compiler and the adjacent skills were intentionally out of scope:

The other two cost-feasibility levers. Step 11 above resolved the cost warning by raising ceilings, one of three levers the compiler offered. The other two are: (a) drop specific high-TCO patterns, which the compiler names in the warning text (here ops-slo-error-budgets, deploy-canary, and obs-open-telemetry-baseline) so the author can decide which ones are negotiable for this workload; and (b) shorten the amortization period, which counts fewer months of OpEx in the TCO total and may change the optimal pattern set under optimize-tco. The lever choice is an architectural decision the compiler intentionally leaves to the author.
Hard rejection on a spec that has no feasible pattern. Step 4 above shows a rejection that the author can fix with one constraint flip (features.caching: true). The harder case is when a hard NFR (e.g. 10ms p95 latency, or a very high availability requirement combined with on-prem deployment) cannot be satisfied by any registered pattern at all — the compiler exits 1 and names the pattern whose supports_nfr gate failed. The agentic demo's Step 2 shows a structurally similar rejection (n8n + multi-agent hosting mismatch).
Per-pattern defaultConfig / NFR overrides. Step 11 above demonstrates the priority-override leg via the bucket-grouped patterns.<bucket>.<pid>: {} form, plus a single-field override of gov-system-manifest.manifest_scope. The same shape with a non-empty body overrides any per-pattern defaultConfig field value (e.g. patterns.P0.cache-aside.invalidation_strategy: event-based, one of the options the registry lists for that field) or NFR contributions. Useful when registry defaults don't fit your data-handling, latency, or operational requirements. See the agentic demo's Step 9 for the priority-promotion + config-override combined example with rich BEFORE/AFTER rendering.
The rejected-patterns.yaml side file — produced by verbose mode (-v), lists every pattern the compiler considered AND dropped, with per-pattern reasoning (which activation gate failed, which conflict fired, which NFR contribution couldn't be met). The flip side of "why was this pattern selected?" — equally useful for debugging.
Implementing the approved architecture. Once the approved architecture.yaml is committed, the skills/implementing-architecture skill reads it and scaffolds the project. The composes graph (visible in Step 8 above) is the implementing agent's primary signal for build order and runtime layering. See skills/implementing-architecture/SKILL.md for the workflow that picks up where this demo ends.
Brownfield specs. The compiling-architecture skill documents how to read an existing prototype's choices into the spec before compiling. See its "Brownfield" section for the protocol.
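The build-order role of the composes graph mentioned in the list above can be illustrated with a topological sort. The sketch uses only the layered_after edges visible in the YAML panels of this section and reads "layered_after X" as "build after X"; that reading, and the helper name build_order, are assumptions, since the implementing skill's own ordering logic is not shown here. It relies on graphlib from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# pattern -> patterns it lists under composes.layered_after in the panels above
LAYERED_AFTER = {
    "deploy-canary": {"iac-cloudformation"},
    "gov-system-manifest": {"iac-cloudformation"},
    "obs-golden-signals": {"obs-open-telemetry-baseline"},
}


def build_order(layered_after: dict) -> list:
    """Return one valid order in which every pattern follows the ones it layers on."""
    # TopologicalSorter expects {node: predecessors}, which matches layered_after.
    return list(TopologicalSorter(layered_after).static_order())


order = build_order(LAYERED_AFTER)
print(order)
```

Any order the sorter returns places iac-cloudformation before deploy-canary and gov-system-manifest, and obs-open-telemetry-baseline before obs-golden-signals; a cycle in the edges would raise graphlib.CycleError instead.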

Every selected pattern carries reference_design_url + reference_developer_doc_url fields under patterns/<pid>.json pointing at the canonical product / SDK docs for that technology — the implementing agent uses these to ground its scaffolding decisions.

Generated by scripts/build_demo.py. Scenario source: test-specs-demo/scenarios/non-agentic-aws-api-progressive.yaml