
ADR-002: No LLM at Runtime

Status: Accepted | Date: 2026-02-10

Context

Fascia uses large language models (LLMs) to help users design backend systems during the build phase. The architectural question is whether AI should also participate in runtime execution -- interpreting requests, making decisions, or generating responses on the fly.

Many AI-powered platforms blur the line between design-time intelligence and runtime intelligence. This can lead to non-deterministic behavior, unpredictable costs, variable latency, and security concerns such as prompt injection.

Decision

LLM calls are permitted only during the build and design phase. This includes Chat Studio (natural language design), the Safety Agent (spec analysis and risk classification), and the Risk Engine (policy enforcement).
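One way to picture this boundary is a build pipeline that ends in an immutable artifact: the design-time components run, and compilation freezes the result into the only thing the runtime ever sees. A minimal sketch, assuming a hypothetical `SpecBundle` shape (the class, field names, and node format are illustrative, not Fascia's actual schema):

```python
from dataclasses import dataclass
from types import MappingProxyType

@dataclass(frozen=True)
class SpecBundle:
    """Immutable output of the build phase; the only input the runtime sees."""
    tool_name: str
    flow_graph: MappingProxyType  # node id -> operation descriptor

def compile_spec(tool_name: str, draft_graph: dict) -> SpecBundle:
    # Design-time steps (Chat Studio, Safety Agent, Risk Engine) would run
    # before this point; compilation freezes the result so nothing at
    # runtime can mutate or reinterpret it.
    return SpecBundle(tool_name=tool_name,
                      flow_graph=MappingProxyType(dict(draft_graph)))

bundle = compile_spec("orders", {"start": {"op": "validate"}})
# bundle.flow_graph is read-only: any attempt to modify it raises TypeError.
```

The frozen dataclass plus the read-only mapping view makes the "compiled, then sealed" property explicit: whatever intelligence shaped the graph, it is inert data by the time it reaches the Executor.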

The runtime Executor never calls any LLM API. All runtime behavior is deterministic and derived from compiled spec bundles. Once a Tool is deployed, its execution follows the exact flow graph defined in the spec with no AI involvement.
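A deterministic executor over such a compiled graph reduces to a plain graph walk: every step applies a spec-defined operation, and every edge was fixed at compile time. A sketch under assumed conventions (the `op`/`next` node format and the toy operations are hypothetical, not Fascia's real Executor):

```python
def execute(flow_graph: dict, request: dict) -> dict:
    """Walk the compiled flow graph from 'start'; same input -> same output."""
    ops = {
        # Only operations named in the spec exist; all are pure functions.
        "uppercase": lambda v: {k: s.upper() if isinstance(s, str) else s
                                for k, s in v.items()},
        "echo": lambda v: v,
    }
    state = dict(request)
    node_id = "start"
    while node_id is not None:
        node = flow_graph[node_id]
        state = ops[node["op"]](state)  # deterministic, spec-defined step
        node_id = node.get("next")      # edges are fixed at compile time
    return state

graph = {"start": {"op": "uppercase", "next": "done"},
         "done": {"op": "echo"}}
result = execute(graph, {"name": "ada"})  # → {"name": "ADA"}
```

Because the operation table and the edges are closed sets, replaying a request is trivially reproducible: there is nothing in the loop that can consult a model, the clock, or any other source of nondeterminism.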

Alternatives Considered

  • No LLM at runtime (chosen) -- Pros: deterministic, predictable costs, consistent latency, auditable. Cons: less flexible at runtime.
  • LLM-assisted runtime -- Pros: could handle ambiguous inputs and natural language queries. Cons: non-deterministic, unpredictable costs, latency spikes, hallucination risk.
  • Hybrid (LLM for specific runtime tasks) -- Pros: theoretical best of both worlds. Cons: two execution models to maintain, hard to reason about behavior guarantees.

Consequences

Positive

  • Determinism -- Every API call returns the same result for the same input. Behavior is fully reproducible.
  • Predictable costs -- Runtime costs scale with compute and database usage, not with LLM token consumption per request.
  • Consistent latency -- No waiting for LLM inference during request processing. Response times are bounded by flow graph complexity and database queries.
  • Auditability -- Every execution step can be traced through the flow graph. There are no opaque AI decisions in the execution path.
  • Security -- No prompt injection attack surface at runtime. The Executor processes structured data, not natural language.
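The boundary described above can also be enforced mechanically, for example with a CI check that refuses LLM SDK imports anywhere under the runtime package. A sketch, assuming illustrative SDK names and a hypothetical check function (neither is Fascia's actual tooling):

```python
import ast

# Illustrative SDK module names; a real check would list the SDKs in use.
FORBIDDEN = {"openai", "anthropic", "google.generativeai"}

def llm_imports(source: str) -> list[str]:
    """Return any forbidden imports found in a runtime module's source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if any(name == f or name.startswith(f + ".") for f in FORBIDDEN):
                found.append(name)
    return found

# A CI job would fail the build if llm_imports(...) is non-empty for any
# file under the Executor's source tree.
```

A check like this turns the architectural rule into a regression test: the "no LLM at runtime" guarantee holds not by convention but because a violating import cannot reach production.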

Negative

  • No natural language at runtime -- End users of customer products interact through structured API calls, not conversational interfaces. Natural language processing must happen in the customer's own frontend layer if needed.
  • Intelligence is compiled away -- all AI assistance must produce concrete specs at design time. The deployed system cannot adapt its behavior at runtime.

Risks

  • Users may expect AI-powered runtime features (such as natural language API queries or intelligent routing). Product messaging must clearly communicate the design-time vs. runtime boundary.
  • As the market moves toward AI-native backends, this decision may need revisiting in future phases -- but only with strong guarantees around determinism and cost predictability.