REFERENCE
The agentic stack
A reference architecture for engineering AI-native systems. Two hierarchies, one resolution boundary, four cross-cutting concerns.
The agentic stack is the conceptual backbone this site points to. Every essay here lives somewhere on it: specifying an environment layer, pressure-testing a runtime boundary, or operating a cross-cutting concern. When a post calls out context, orchestration, specs, or validation, it is talking about a specific place on this diagram.
Two hierarchies, not one
Most "agentic systems" collapse the declarative surface and the runtime surface into one codebase, one deploy, one mental model. They do not operate as systems — they operate as monoliths with agent names.
A real agentic stack separates what exists (the declarative environment) from what happens (the runtime operational stack), and binds one to the other through a single, auditable resolution step.
Environment (declarative, E0–E4)
Versioned, inspectable surface. Declared, resolved, and bound before any task runs. If resolution fails, the stack fails fast — before work enters the pipeline.
- E0 Compute. Hardware, containers, GPUs. The substrate every agent ultimately runs on.
- E1 Models. Inference endpoints with explicit context limits, latency profiles, and cost. A model is an endpoint contract, not a vibe.
- E2 Tools. MCP servers, APIs, callables the agent can invoke. Every tool is a typed interface with declared side effects. Unversioned tool-signature changes are a production-incident factory.
- E3 Skills. Composed capabilities and workflows loaded on demand. A skill is a named context pattern ("when you see X, do Y") without burning context every session.
- E4 Agent. Persona, role, and permission grants. An agent is a declared identity with a capability list and a governance envelope, not a prompt.
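To make "declared, versioned, inspectable" concrete, here is a minimal sketch of E1 through E4 as plain data, with a check for dangling references. Every class name, field name, and value below is a hypothetical illustration rather than a schema the stack prescribes, and E0 is left out on the assumption that compute is declared in infrastructure tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEndpoint:  # E1: an endpoint contract
    name: str
    url: str
    context_limit_tokens: int
    p95_latency_ms: int
    cost_per_1k_tokens_usd: float


@dataclass(frozen=True)
class Tool:  # E2: versioned interface with declared side effects
    name: str
    version: str
    side_effects: tuple[str, ...] = ()  # empty tuple means read-only


@dataclass(frozen=True)
class Skill:  # E3: named pattern that depends on tools
    name: str
    trigger: str
    requires_tools: tuple[str, ...] = ()


@dataclass(frozen=True)
class Agent:  # E4: identity, capability list, permission grants
    name: str
    model: str
    capabilities: tuple[str, ...]
    permissions: tuple[str, ...]
    skills: tuple[str, ...] = ()


@dataclass(frozen=True)
class Environment:
    version: str
    models: dict[str, ModelEndpoint]
    tools: dict[str, Tool]
    skills: dict[str, Skill]
    agents: dict[str, Agent]


def unresolved_references(env: Environment) -> list[str]:
    """List every name a declaration points at that is not itself declared."""
    problems = []
    for skill in env.skills.values():
        for tool in skill.requires_tools:
            if tool not in env.tools:
                problems.append(f"skill {skill.name}: unknown tool {tool}")
    for agent in env.agents.values():
        if agent.model not in env.models:
            problems.append(f"agent {agent.name}: unknown model {agent.model}")
        for skill in agent.skills:
            if skill not in env.skills:
                problems.append(f"agent {agent.name}: unknown skill {skill}")
    return problems


# Hypothetical example environment; the URL and numbers are placeholders.
env = Environment(
    version="v1",
    models={"small": ModelEndpoint("small", "https://models.example/small", 32_000, 800, 0.002)},
    tools={"search": Tool("search", "1.2.0")},
    skills={"triage": Skill("triage", "incoming bug report", ("search",))},
    agents={"support": Agent("support", "small", ("triage_tickets",), ("tool:search",), ("triage",))},
)
print(unresolved_references(env))
```

Because the environment is plain data, it can be diffed, reviewed, and checked for broken references before anything runs.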
Environment resolution boundary
The phase transition from declarative to runtime. Everything above this line is declared; everything below it runs.
At the boundary, E4 binds to L5. Tools register. Permissions are confirmed. Memory initializes. Model endpoints are confirmed reachable. Agent capability declarations are checked against task requirements. The boundary enforces an admission gate: if any of these checks fails, the task is rejected before L5–L8 ever run. Without this gate, misconfiguration surfaces mid-workflow, three steps in, with an unrelated error message.
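The gate can be sketched as a single function that runs every check up front and rejects the task with all failures at once. This is an illustration under stated assumptions: `Task`, `ResolvedAgent`, `AdmissionError`, and `admit` are hypothetical names, and `endpoint_reachable` and `memory_ready` stand in for the results of real health probes.

```python
from dataclasses import dataclass


class AdmissionError(Exception):
    """Raised when a task is rejected at the resolution boundary."""


@dataclass(frozen=True)
class Task:
    task_id: str
    required_capabilities: frozenset[str]
    required_tools: frozenset[str]


@dataclass(frozen=True)
class ResolvedAgent:
    name: str
    capabilities: frozenset[str]
    granted_tools: frozenset[str]


def admit(task: Task, agent: ResolvedAgent, registered_tools: frozenset[str],
          endpoint_reachable: bool, memory_ready: bool) -> None:
    """Return None if the task may enter L5; raise AdmissionError otherwise."""
    failures = []
    unregistered = task.required_tools - registered_tools
    if unregistered:
        failures.append(f"tools not registered: {sorted(unregistered)}")
    ungranted = task.required_tools - agent.granted_tools
    if ungranted:
        failures.append(f"tools not permitted: {sorted(ungranted)}")
    if not memory_ready:
        failures.append("memory not initialized")
    if not endpoint_reachable:
        failures.append("model endpoint unreachable")
    missing = task.required_capabilities - agent.capabilities
    if missing:
        failures.append(f"capabilities not declared: {sorted(missing)}")
    if failures:
        # Report every failed check together so the rejection names its own cause.
        raise AdmissionError(f"task {task.task_id} rejected: " + "; ".join(failures))


agent = ResolvedAgent("support", frozenset({"triage"}), frozenset({"search"}))
ok_task = Task("t-1", frozenset({"triage"}), frozenset({"search"}))
bad_task = Task("t-2", frozenset({"deploy"}), frozenset({"shell"}))

admit(ok_task, agent, frozenset({"search"}), endpoint_reachable=True, memory_ready=True)
try:
    admit(bad_task, agent, frozenset({"search"}), endpoint_reachable=True, memory_ready=True)
except AdmissionError as err:
    print(err)
```

Collecting all failures before raising is a design choice in this sketch. A gate that stops at the first failure would also satisfy the fail-fast property, at the cost of one fix-and-retry cycle per misconfiguration.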
Operational stack (runtime, L5–L8)
Runs against the resolved environment. The operational stack is what happens.
- L5 Routing. Agent discovery and logical endpoints. The router identifies which agent in E4 has the declared capability to handle a task. Without routing, every task goes to "the AI", a destination so generic it is not a destination at all.
- L6 Delivery. Count and ordering guarantees for inter-agent messages. At-least-once with acknowledgment. Receiver-side deduplication. Correlation IDs threading every exchange end-to-end.
- L7 Prompt / Encoding. Templates, schemas, and structured payloads crossing agent boundaries. Where unstructured chat turns into typed contract. Invalid payloads fail at the boundary.
- L8 Workflow. DAGs, agent tasks, business logic. Decomposes a real task into routable steps, fans out parallel work, reduces results, enforces dependencies.
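As one illustration of the L8 description above, the sketch below runs a small DAG with Python's standard-library `graphlib.TopologicalSorter`. The step names and the `run_workflow` helper are hypothetical, and the steps within each wave run sequentially here; a real workflow layer would presumably dispatch them in parallel through L5 and L6.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, deps):
    """Run steps in dependency order.

    steps: step name -> callable taking the dict of results so far
    deps:  step name -> set of step names that must finish first
    Returns (results, waves), where each wave lists steps that were
    ready at the same time and could have been fanned out together.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results, waves = {}, []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        for name in ready:
            results[name] = steps[name](results)
            sorter.done(name)  # unblocks steps that depend on this one
    return results, waves


# Hypothetical task: two independent fetches, one merge, one report.
steps = {
    "fetch_a": lambda r: [1, 2],
    "fetch_b": lambda r: [3],
    "merge": lambda r: r["fetch_a"] + r["fetch_b"],
    "report": lambda r: f"{len(r['merge'])} items",
}
deps = {
    "fetch_a": set(),
    "fetch_b": set(),
    "merge": {"fetch_a", "fetch_b"},
    "report": {"merge"},
}

results, waves = run_workflow(steps, deps)
print(waves)
print(results["report"])
```

The first wave contains both fetches (the fan-out), the merge step is the reduce, and the report cannot start until the merge is marked done.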
Cross-cutting concerns
Four concerns span both hierarchies and do not belong to any single layer. They are the properties that make the stack operable.
- Ontology. Schema contracts and contract continuity. The shared typed language E2 and L7 must speak.
- Memory. Three tiers. M1 window context (ephemeral content inside a single prompt). M2 session state (conversation and accumulated results across a multi-step task). M3 durable memory (project knowledge, persistent conventions). Most teams build M1 and M3 and skip M2, and then wonder why the agent "feels confused."
- Governance. Permission enforcement, memory provenance, schema enforcement. The policy envelope the admission gate enforces.
- Observability. Tracing, logging, session recording, provenance. The diagnostic surface the whole stack is inspectable through. A system without observability is not a system; it is a slot machine.
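The three memory tiers can be sketched as one small class in which M1 is assembled fresh for every prompt from M2 and M3. This is a toy model under stated assumptions: the `Memory` class, its method names, and the bracketed line format are hypothetical, and real tiers would sit behind stores rather than in-process collections.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    # M3: durable project knowledge that outlives any session
    durable: dict[str, str] = field(default_factory=dict)
    # M2: (step, decision) pairs accumulated during one multi-step task
    session: list[tuple[int, str]] = field(default_factory=list)

    def record(self, step: int, decision: str) -> None:
        self.session.append((step, decision))

    def end_session(self) -> None:
        self.session.clear()  # M2 is dropped; M3 is untouched

    def build_window(self, instruction: str, max_session_items: int = 5) -> str:
        """M1: the ephemeral text for a single prompt, rebuilt on every call."""
        lines = [f"[convention] {k}: {v}" for k, v in sorted(self.durable.items())]
        lines += [f"[step {s}] {d}" for s, d in self.session[-max_session_items:]]
        lines.append(f"[task] {instruction}")
        return "\n".join(lines)


mem = Memory(durable={"naming": "snake_case"})
mem.record(3, "store results in Postgres")
window = mem.build_window("step 7: write the migration")
print(window)
```

Without the `session` list, the window for step seven would contain the naming convention and the instruction but not the decision made at step three, which is the failure the M2 tier exists to prevent.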
Commonly missing components
Three components show up everywhere in the reference architecture and get built almost nowhere in practice.
- Delivery guarantees between agents (L6). Without at-least-once delivery with acknowledgment and receiver-side deduplication, "slow agent" and "lost message" become indistinguishable. Retries turn into guesswork.
- A session memory tier (M2). Decisions made three steps ago must stay fresh and queryable for the agent making step seven. Without M2 the agent operates on outdated or missing context.
- An explicit admission gate. The checkpoint between resolution and L5 where tools bind, permissions confirm, memory initializes, and endpoints verify — before any task runs.
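The first of these, delivery guarantees, is small enough to sketch in full. The example below simulates at-least-once delivery with acknowledgment and receiver-side deduplication, using an in-process transport that loses the first acknowledgment. All names (`Message`, `Receiver`, `send_at_least_once`, `lossy_ack_transport`) are hypothetical, and a real L6 would add timeouts, backoff, and a bounded or persistent deduplication store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    message_id: str      # unique per message; the deduplication key
    correlation_id: str  # threads every message of one exchange together
    payload: str


class Receiver:
    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, msg: Message) -> str:
        # Process each message_id once, but acknowledge every copy so a
        # sender whose earlier ack was lost can stop retrying.
        if msg.message_id not in self.seen:
            self.seen.add(msg.message_id)
            self.processed.append(msg.payload)
        return msg.message_id


def send_at_least_once(msg: Message, transport, max_attempts: int = 3) -> int:
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if transport(msg) == msg.message_id:
            return attempt
    raise TimeoutError(f"no ack for {msg.message_id} after {max_attempts} attempts")


def lossy_ack_transport(receiver: Receiver, drop_first_n_acks: int):
    """Simulated link: the message always arrives, but early acks are lost."""
    dropped = 0

    def transport(msg: Message):
        nonlocal dropped
        ack = receiver.handle(msg)
        if dropped < drop_first_n_acks:
            dropped += 1
            return None  # the sender never sees this ack
        return ack

    return transport


receiver = Receiver()
msg = Message("m-1", "c-42", "summarize ticket")
attempts = send_at_least_once(msg, lossy_ack_transport(receiver, drop_first_n_acks=1))
print(attempts, receiver.processed)
```

The sender needed two attempts, yet the receiver processed the payload once. An exhausted retry budget raises an explicit error, so a lost message surfaces as a delivery failure instead of looking like a slow agent.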
The diagnostic test
When your next production failure happens, can you point to the layer where it occurred?
- A prompt problem is L7.
- A routing problem is L5.
- A delivery problem is L6.
- A tool-binding problem is the admission gate.
- A stale-memory problem is M2 (or M3, if someone forgot to update it).
- A capability-mismatch problem is E4 → L5 resolution.
- A model-capacity problem is E1.
If you cannot answer, the architecture is incomplete.
For the full argument behind the stack (why the two hierarchies, why the boundary, why the three commonly missing components), read the essay:
