Reasoning & Planning
Prompt Chaining
A sequence of LLM calls where each pass narrows, transforms, or verifies the output of the previous. One model, one thread — no coordination overhead.
- Each step has a single well-defined transformation (extract → classify → respond)
- Simpler than multi-agent: no state synchronisation, no handoffs
- Fail fast: if step N produces bad output, the chain halts before wasting downstream calls
- Natural fit for structured extraction, multi-step reasoning, and document pipelines
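The extract → classify → respond flow above can be sketched as a short Python loop. `call_llm` is a stand-in for a real model call (an assumption, not any provider's API); each step is scripted here so the control flow is runnable as-is.

```python
# Minimal prompt-chaining sketch with a stubbed model call.
def call_llm(prompt: str) -> str:
    # Placeholder: canned behaviour per step, for illustration only.
    if prompt.startswith("Extract"):
        return "refund request"
    if prompt.startswith("Classify"):
        return "billing"
    return "Routing to the billing team for a refund."

def run_chain(message: str) -> str:
    """extract -> classify -> respond, halting fast on bad output."""
    steps = [
        lambda text: call_llm(f"Extract the customer's intent: {text}"),
        lambda text: call_llm(f"Classify this intent into a department: {text}"),
        lambda text: call_llm(f"Draft a reply for the {text} department."),
    ]
    out = message
    for i, step in enumerate(steps):
        out = step(out)
        if not out.strip():  # fail fast: halt before wasting downstream calls
            raise RuntimeError(f"step {i} produced no output; chain halted")
    return out

reply = run_chain("I was double-charged last month.")
```

Each step sees only the previous step's output — no shared state, no coordination.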
ReAct — Reason + Act
The backbone of most agents. Interleaves reasoning traces with tool calls: think, call a tool, observe the result, think again.
- Thought — internal reasoning about what to do next
- Action — call a tool or produce output
- Observation — result returned from the tool, injected into context
- Repeat until task complete or max steps reached
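The thought/action/observation cycle can be sketched as a loop over a transcript. The "model" here is a scripted policy (an assumption for illustration); a real agent would ask an LLM to choose the next action given the transcript so far.

```python
# Bare-bones ReAct loop with a stubbed decision function.
def tool_lookup_population(city: str) -> str:
    data = {"Paris": "about 2.1 million"}  # toy knowledge base
    return data.get(city, "unknown")

def model_decide(transcript: list[str]) -> tuple[str, str]:
    # Scripted stand-in for the LLM: act once, then finish.
    if not any(line.startswith("Observation:") for line in transcript):
        return ("act", "Paris")
    return ("finish", "Paris has about 2.1 million residents.")

def react(question: str, max_steps: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        transcript.append("Thought: decide whether I need a tool.")  # Thought
        kind, payload = model_decide(transcript)
        if kind == "finish":
            return payload
        observation = tool_lookup_population(payload)        # Action
        transcript.append(f"Observation: {observation}")     # Observation
    return "max steps reached without an answer"

answer = react("How many people live in Paris?")
```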
Plan-and-Execute
Separates planning from execution. A planner decomposes the goal upfront; an executor works through steps, optionally with its own ReAct loop.
- Planner — decomposes the goal into an ordered list of steps
- Executor — runs each step; may have an inner ReAct loop per step
- Replanner — optional: revises the plan when a step fails or context changes
- More predictable than pure ReAct; less adaptive to mid-run surprises
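A minimal planner/executor/replanner sketch, with all three components stubbed: the planner returns a fixed step list (standing in for an LLM planning call), and a failed step triggers a revise-and-continue replan.

```python
# Plan-and-Execute sketch: decompose upfront, replan on failure.
def plan(goal: str) -> list[str]:
    # Stand-in for an LLM planner decomposing the goal.
    return [f"research {goal}", f"draft {goal}", f"polish {goal}"]

def execute_step(step: str) -> str:
    if step.startswith("draft") and "v2" not in step:
        raise RuntimeError("draft failed")       # simulate a mid-run surprise
    return f"done: {step}"

def replan(failed_step: str, remaining: list[str]) -> list[str]:
    # Stand-in for an LLM replanner revising the failed step.
    return [failed_step + " v2"] + remaining

def plan_and_execute(goal: str) -> list[str]:
    steps, results = plan(goal), []
    while steps:
        step, steps = steps[0], steps[1:]
        try:
            results.append(execute_step(step))
        except RuntimeError:
            steps = replan(step, steps)          # revise plan, keep going
    return results

log = plan_and_execute("report")
```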
Tree of Thoughts
Branching + backtracking reasoning. Agent explores multiple candidate paths simultaneously, scores each branch, and prunes dead ends.
- Branch — from each state, generate N candidate next thoughts
- Score — evaluate each branch with a critic or heuristic
- Prune — discard low-scoring branches early to focus compute
- Backtrack — if a branch dead-ends, return to the last branch point
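A toy branch/score/prune search, written as a beam search over strings: the "thought generator" and "critic" are plain functions here (assumptions for illustration; in practice both are LLM calls), and keeping a beam of alternatives is what makes backtracking possible when the top branch dead-ends.

```python
# Toy Tree-of-Thoughts: branch, score, prune, repeat.
TARGET = "cat"

def branch(state: str) -> list[str]:
    # Generate N candidate next thoughts (here: append one letter).
    return [state + ch for ch in "abct"]

def score(state: str) -> int:
    # Critic/heuristic: leading characters matching the target.
    return sum(1 for a, b in zip(state, TARGET) if a == b)

def tree_of_thoughts(beam_width: int = 2, max_depth: int = 3) -> str:
    frontier = [""]
    for _ in range(max_depth):
        candidates = [c for s in frontier for c in branch(s)]
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]   # prune low-scoring branches
        for state in frontier:
            if state == TARGET:
                return state
    return max(frontier, key=score)          # best effort after budget

result = tree_of_thoughts()
```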
State Machine
Agent operates through explicit states and labeled transitions. Current state determines which actions are available — more structured than free-form ReAct.
- States — discrete phases with defined entry and exit conditions
- Transitions — labeled edges triggered by conditions or tool results
- Guard clauses — prevent invalid transitions from the current state
- Useful when business logic has defined phases (draft → review → deploy)
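The draft → review → deploy example can be expressed as a transition table with guard clauses; the states and actions below mirror that example and are otherwise illustrative.

```python
# Minimal agent state machine with guarded transitions.
TRANSITIONS = {
    ("draft", "submit"): "review",
    ("review", "approve"): "deploy",
    ("review", "reject"): "draft",
}

class AgentStateMachine:
    def __init__(self) -> None:
        self.state = "draft"

    def available_actions(self) -> list[str]:
        # Current state determines which actions are legal.
        return [a for (s, a) in TRANSITIONS if s == self.state]

    def fire(self, action: str) -> str:
        if (self.state, action) not in TRANSITIONS:  # guard clause
            raise ValueError(f"'{action}' is invalid from '{self.state}'")
        self.state = TRANSITIONS[(self.state, action)]
        return self.state

sm = AgentStateMachine()
sm.fire("submit")
final = sm.fire("approve")
```

An agent wrapped in this machine can only be offered `available_actions()` at each turn, which rules out invalid moves by construction.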
Grounding & Action
Tool Use / Function Calling
Agent operates on a typed schema of tools. The model selects a tool and fills in parameters; parameters are validated before execution; results are injected back as observations.
- Schema — each tool has a name, description, and typed parameter list
- Selection — model chooses based on current reasoning state
- Validation — parameters checked against schema before execution
- Injection — result returned to context as a structured observation
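The select → validate → execute → inject cycle, sketched with a toy schema format. The schema shape and the model's JSON reply are assumptions for illustration; real providers each define their own function-calling wire format.

```python
# Tool-use sketch: validate model-chosen parameters, execute, inject result.
import json

TOOLS = {
    "get_weather": {
        "description": "Look up current weather for a city.",
        "parameters": {"city": str},
        "fn": lambda city: f"Sunny in {city}",
    },
}

def validate(tool_name: str, args: dict) -> None:
    schema = TOOLS[tool_name]["parameters"]
    for name, typ in schema.items():
        if name not in args or not isinstance(args[name], typ):
            raise TypeError(f"parameter '{name}' missing or not {typ.__name__}")

def handle_model_reply(reply: str) -> dict:
    call = json.loads(reply)              # model's tool selection
    validate(call["tool"], call["args"])  # check against schema before execution
    result = TOOLS[call["tool"]]["fn"](**call["args"])
    return {"role": "tool", "name": call["tool"], "content": result}

# Simulated model output choosing a tool and filling parameters:
observation = handle_model_reply('{"tool": "get_weather", "args": {"city": "Oslo"}}')
```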
Sandboxed Execution
Code runs in an isolated environment with no access to host filesystem, network, or env vars. Output is safely captured and returned as an observation.
- Isolation — separate container, subprocess, or WASM runtime
- Resource limits — CPU, memory, and time caps prevent runaway processes
- Capture — stdout, stderr, exit code returned as structured output
- Teardown — environment destroyed after each execution
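A sketch of the capture and teardown side only: run code in a subprocess with a time cap and return stdout/stderr/exit code as a structured observation. A bare subprocess is NOT real isolation; production sandboxes use containers, gVisor, or WASM runtimes for the filesystem/network/env-var isolation described above.

```python
# Capture-side sketch of sandboxed execution (not real isolation).
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: float = 5.0) -> dict:
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
            capture_output=True, text=True, timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr,
                "exit_code": proc.returncode}
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "timed out", "exit_code": -1}
    # Teardown is implicit here: the subprocess and its state are gone.

obs = run_sandboxed("print(2 + 2)")
```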
Agentic RAG
Agent steers retrieval — decides what to fetch, evaluates chunk relevance, refines its query, and iterates until the retrieved context is sufficient to generate a grounded answer.
- Query — agent generates a targeted query, not just the raw user message
- Evaluate — agent scores chunk relevance; low scores trigger re-query
- Refine — query is reformulated based on what was missing
- Iterate — repeats until context is sufficient; distinct from static RAG
Citation / Attribution
Every claim the agent makes is linked to a retrieved source. A verification pass checks citations are accurate. Ungrounded claims are flagged as uncertain.
- Source IDs — retrieved chunks tagged with origin (doc, url, chunk index)
- Inline citations — agent required to cite [n] after each factual statement
- Verification — second pass checks citations against the source material
- Essential for research agents where hallucination cost is high
Memory & Knowledge
In-Context Memory
The active prompt window — conversation history, chain-of-thought traces, and working state. Zero latency, bounded by context length, lost when the session ends.
- Conversation history — all prior turns visible in the current window
- Working set — current task state, intermediate values, scratch reasoning
- Bounded — attention degrades for tokens far from the present
- Ephemeral — lost on session end; must be persisted externally to survive
Context Distillation
When the context window fills up, the agent compresses older history into a dense summary block and continues from there — without losing the thread.
- Trigger — token count approaching the model's context limit
- Summarize — agent compresses older messages into a denser summary block
- Replace — original messages removed; summary takes their place
- Different from RAG — this manages growing in-context state, not external retrieval
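The trigger → summarize → replace cycle can be sketched as below. Token counting and the summarizer are stand-ins (word-count proxy and truncation respectively); a real agent would use the model's tokenizer and an LLM summary call.

```python
# Context-distillation sketch: compress old turns when nearing the limit.
TOKEN_LIMIT = 50  # illustrative, far below any real context window

def count_tokens(messages: list[str]) -> int:
    return sum(len(m.split()) for m in messages)  # crude proxy

def summarize(messages: list[str]) -> str:
    # Stand-in for an LLM call compressing older history into one block.
    return "SUMMARY: " + " | ".join(m[:20] for m in messages)

def maybe_distill(messages: list[str], keep_recent: int = 2) -> list[str]:
    if count_tokens(messages) <= TOKEN_LIMIT:
        return messages                           # trigger not hit
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent              # summary replaces old turns

history = [f"turn {i}: " + "lorem " * 10 for i in range(8)]
compact = maybe_distill(history)
```

Recent turns survive verbatim; only the older span is collapsed, so the agent keeps the thread.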
Semantic Memory
Agent-curated, structured fact storage. Not passive retrieval — the agent writes and updates entries. Two models: a single profile JSON or a collection of narrow documents.
- Profile model — single JSON document updated in place (well-scoped facts)
- Collection model — many narrow documents, each representing one entity or fact
- Agent-curated — the agent decides what to store and when to update
- Queryable — retrieved by semantic search or exact key lookup
Episodic Memory
A persistent log of past runs and their outcomes. Before acting, the agent retrieves relevant episodes to avoid known failure modes and reuse successful strategies.
- Write — after each run, record what happened and what worked
- Read — at task start, retrieve relevant past episodes by similarity
- Decay — old or low-relevance memories pruned over time
- Update — memories revised when new evidence contradicts them
Procedural Memory
The agent's operating instructions stored as updateable text. After a run, the agent evaluates its own process and proposes rule changes — writing better procedures over time.
- Instructions — operating procedures stored as editable rules or system prompt
- Learn — agent evaluates its own process and proposes improvements
- Write back — approved rules update the instruction store
- Version — previous procedures retained for rollback
External Memory (RAG)
Documents chunked, embedded, and retrieved by nearest-neighbor search. Static retrieval: one pass, then generate. Decouples knowledge from context length.
- Embed — documents chunked and embedded into a vector store offline
- Query — agent generates a retrieval query from current context
- Retrieve — top-k nearest chunks fetched by cosine similarity
- Augment — chunks injected into prompt before generation (no iteration)
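Embed → query → retrieve → augment, sketched with toy 3-dimensional vectors standing in for a real embedding model (an assumption; production systems use learned embeddings and an approximate-nearest-neighbor index).

```python
# Static RAG sketch: one retrieval pass, then build the prompt.
import math

STORE = [  # (chunk text, precomputed toy embedding)
    ("Cats are small domesticated felines.", [0.9, 0.1, 0.0]),
    ("The Eiffel Tower is in Paris.",        [0.0, 0.2, 0.9]),
    ("Dogs are loyal companions.",           [0.7, 0.3, 0.1]),
]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    ranked = sorted(STORE, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]         # top-k nearest chunks

def augment(question: str, query_vec: list[float]) -> str:
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"  # one pass, no iteration

prompt = augment("Where is the Eiffel Tower?", [0.0, 0.1, 1.0])
```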
Memory Write Strategies
A meta-pattern: when and how memories are committed. Hot path writes inline during execution; background defers to an async task. The choice trades latency for consistency.
- Hot path — write inline during the run; immediate availability, adds latency
- Background — defer write to an async task; no latency, eventual consistency
- Choose hot path when — the agent needs to read its own writes in the same session
- Choose background when — memory is for future sessions, throughput matters more
Quality & Verification
Reflection / Self-Critique
Agent evaluates its own output against a set of criteria and retries if the score falls below a threshold. The same model generates and critiques.
- Generate — produce an initial output
- Critique — score output against criteria (correctness, completeness, safety)
- Refine — revise based on critique, repeat until threshold met
- Max iterations — hard cap prevents infinite loops
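The generate → critique → refine loop, with both roles played by deterministic stand-ins so the sketch runs; in a real agent both are calls to the same model with different prompts.

```python
# Reflection sketch: same "model" generates and critiques.
THRESHOLD = 0.9
MAX_ITERATIONS = 4  # hard cap prevents infinite loops

def generate(draft, note):
    if draft is None:
        return "Paris is the capital."
    return draft + " It is the capital of France."  # revise per critique

def critique(output: str):
    # Score against a completeness criterion (illustrative).
    if "France" in output:
        return 1.0, "ok"
    return 0.5, "missing: which country?"

def reflect_loop() -> str:
    draft, note = None, None
    for _ in range(MAX_ITERATIONS):
        draft = generate(draft, note)
        score, note = critique(draft)
        if score >= THRESHOLD:
            return draft
    return draft  # best effort when the cap is hit

final_answer = reflect_loop()
```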
Evaluator-Optimizer
Two separate LLMs: one generates, one evaluates. The evaluator returns structured feedback; the generator revises. Not a self-critique loop — separate roles, separate prompts.
- Generator — produces a candidate output
- Evaluator — separate model scores it and returns structured critique
- Optimizer — generator revises using evaluator's specific feedback
- Key distinction from Reflection — evaluator is a separate LLM, not self-critique
Debate / Critique
Two agents argue opposing positions; a judge evaluates the exchange and synthesises a resolution. Surfaces assumptions a single agent misses.
- Proposer — generates a solution or claim
- Critic — argues against it, finds flaws and edge cases
- Judge — evaluates the exchange, picks winner or synthesises
- Useful for verification, adversarial testing, and high-stakes decisions
Guardrails
Validation layer on every input and output. Passive mode checks rules and flags; active mode blocks and escalates to human or replanning. Always on — not event-driven.
- Passive — regex, schema, or classifier checks on every pass; flag violations
- Active — violations trigger human escalation or agent replanning
- Layered — input guardrails before the model, output guardrails after
- Distinct from Approval Gate — always running, not triggered by action type
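Layered passive/active guardrails sketched with regex checks; the rules themselves are illustrative assumptions, and real deployments combine regex, schemas, and classifier models.

```python
# Guardrails sketch: every pass is checked; active rules block.
import re

RULES = [
    ("pii_email", re.compile(r"[\w.]+@[\w.]+"), "active"),             # block
    ("all_caps_shouting", re.compile(r"^[A-Z\s!]{10,}$"), "passive"),  # flag
]

def check(text: str) -> dict:
    flags, blocked = [], False
    for name, pattern, mode in RULES:
        if pattern.search(text):
            flags.append(name)
            if mode == "active":
                blocked = True  # active mode: escalate or replan
    return {"text": text, "flags": flags, "blocked": blocked}

# Input guardrail before the model, output guardrail after:
inbound = check("Contact me at jane@example.com")
outbound = check("here is your answer")
```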
Multi-Agent Coordination
Pipeline
Output of one agent becomes input to the next. Each stage has a single well-defined transformation. Simple, auditable — a failed stage halts the pipeline.
- Each agent has one transformation: spec → code → tests → review → docs
- No agent needs awareness of stages before or after it
- Simple and auditable — clear data lineage through each stage
- Failed stage halts the pipeline; the artifact doesn't advance
Fan-out / Parallel
Orchestrator decomposes a task into N independent chunks and dispatches them simultaneously. Collects results when all complete, then reduces.
- Decompose — identify chunks that can run independently
- Dispatch — all chunks sent simultaneously to N agents
- Wait — collect when all complete (or a quorum)
- Reduce — merge results, resolve conflicts, synthesise final output
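Decompose → dispatch → wait → reduce with a thread pool; each "agent" is a plain function here (an assumption for illustration), where real subagents would be separate LLM sessions or processes.

```python
# Fan-out sketch: independent chunks processed in parallel, then reduced.
from concurrent.futures import ThreadPoolExecutor

def summarize_chunk(chunk: str) -> str:
    # Stand-in for an independent subagent working on one chunk.
    return chunk.upper()

def fan_out(task: str) -> str:
    chunks = task.split(". ")                  # decompose into independent parts
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        results = list(pool.map(summarize_chunk, chunks))  # dispatch + wait all
    return " / ".join(results)                 # reduce into a final output

merged = fan_out("first part. second part. third part")
```

`pool.map` preserves input order, so the reduce step sees results in the order the chunks were dispatched.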
Routing / Triage
Classify the incoming request and dispatch to the right specialist or workflow branch. Pure dispatch logic — once routed, the classifier is done.
- Classify — identify the intent, domain, or urgency of the request
- Route — select the appropriate specialist agent or workflow branch
- Stateless — each request classified and routed independently
- No coordination — distinct from orchestration; the router doesn't synthesise
Orchestrator + Subagents
One coordinator agent holds overall context and delegates focused subtasks to specialist subagents. Orchestrator synthesises outputs into a final result.
- Orchestrator — plans, delegates, holds context, aggregates
- Subagents — specialist roles: coder, researcher, reviewer, tester
- Context passing — orchestrator decides what each subagent needs
- Result synthesis — orchestrator merges outputs into final answer
Hierarchical Agent
Multi-level delegation tree: manager → supervisors → workers. Each tier receives only the context it needs. Scales to problems that exceed a single context window.
- Manager — holds top-level goal and overall context
- Supervisors — hold team-level context, break goal into subtasks
- Workers — receive focused subtask input only; no awareness of broader goal
- Context scoping — prevents pollution; each tier gets exactly what it needs
Handoff
Explicit transfer of control and conversation state from one specialist to the next. State packet travels with the handoff — each agent is fully active one at a time.
- State packet — context, history, and intent transferred explicitly
- Specialist — receiving agent activates with full context; prior agent deactivates
- Chain — can hand off multiple times (triage → billing → senior support)
- Unlike orchestration — agents are sequential, not concurrent
Swarm / Decentralised
Agents communicate peer-to-peer with no central coordinator. Self-select tasks based on availability and signals. Resilient — no single point of failure.
- Peer-to-peer — agents communicate directly, no orchestrator
- Self-selection — agents pick up tasks based on local signals
- Emergent — overall behaviour arises from local decisions
- Resilient — any node can drop without halting the system
Event-Driven / Actor Model
Each agent has an inbox queue and processes messages independently. Agents emit events and continue without waiting. No blocking — fully async.
- Actors — each agent has an inbox and processes messages independently
- Async — agents emit events and continue without waiting for replies
- Decoupled — agents don't know who consumes their output events
- Inbox depth — a backpressure signal; deep queues indicate overload
Control Flow & Reliability
Retry with Fallback
On tool failure, classify the error: transient errors get retried with backoff; structural errors trigger a fallback strategy. Avoids hitting the same wall repeatedly.
- Classify — transient (network, rate limit) vs structural (wrong key, unsupported op)
- Backoff — exponential: 1s, 2s, 4s for transient errors only
- Fallback chain — ordered list of alternative strategies
- Dead letter — if all fallbacks fail, surface to human or dead-letter queue
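Classify → backoff → fallback chain → dead letter, sketched below. The error taxonomy and fallback strategies are illustrative; the backoff delays are kept tiny so the sketch runs quickly.

```python
# Retry-with-fallback sketch: transient errors retry, structural errors fall through.
import time

class TransientError(Exception): pass   # network, rate limit
class StructuralError(Exception): pass  # wrong key, unsupported op

def with_retry(fn, retries: int = 3, base_delay: float = 0.01):
    for attempt in range(retries):
        try:
            return fn()
        except TransientError:
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
        except StructuralError:
            break                                    # retrying won't help
    return None

def run_with_fallbacks(strategies) -> str:
    for strategy in strategies:        # ordered fallback chain
        result = with_retry(strategy)
        if result is not None:
            return result
    return "DEAD_LETTER"               # surface to a human / dead-letter queue

calls = {"n": 0}
def flaky_primary():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return "primary succeeded"

result = run_with_fallbacks([flaky_primary, lambda: "fallback succeeded"])
```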
Checkpoint / Approval Gate
Agent runs autonomously until it reaches an irreversible or high-blast-radius action. At the gate it surfaces its plan to a human and waits for approval.
- Classify — reversible vs irreversible, local vs shared system
- Gate triggers — file deletion, push to remote, external API side effects
- Present context — show plan and reasoning before asking
- Resume or replan — on approval continue; on denial replan or stop
Confidence-Gated Autonomy
Agent scores its own confidence before acting. Above threshold: proceed autonomously. Below threshold: escalate to human. Thresholds vary by action type.
- Score — agent rates confidence 0–1 before each action
- Threshold — tuned per action type (0.95 for destructive, 0.60 for safe reads)
- Escalate — below threshold → ask human; above → proceed
- Calibrate — track actual accuracy vs confidence over time
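The score → threshold → escalate/proceed gate, with per-action thresholds matching the numbers above. The confidence scores themselves would come from the agent's self-rating; here they are plain inputs for illustration.

```python
# Confidence gate sketch: thresholds tuned per action type.
THRESHOLDS = {"destructive": 0.95, "safe_read": 0.60}

def gate(action_type: str, confidence: float) -> str:
    threshold = THRESHOLDS[action_type]
    return "proceed" if confidence >= threshold else "escalate_to_human"

decisions = [
    gate("safe_read", 0.70),    # above 0.60 threshold
    gate("destructive", 0.90),  # below 0.95 threshold
]
```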
Speculative Execution
Spawn N agents with different strategies in parallel. The first to succeed wins; remaining branches are cancelled. Trades compute for latency.
- Branch — spawn N agents with different strategies simultaneously
- Race — first successful result wins; remaining branches cancelled
- Merge — alternative: collect all results and combine instead of racing
- Useful when the correct approach is uncertain upfront
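Branch → race → cancel with futures: spawn strategies in parallel and take the first completed result. The strategies here are toy functions with different delays (an illustrative assumption); note that cancelling an already-running thread is best-effort only.

```python
# Speculative execution sketch: first strategy to finish wins.
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
import time

def slow_but_sure():
    time.sleep(0.2)
    return "slow answer"

def fast_heuristic():
    time.sleep(0.01)
    return "fast answer"

def speculate(strategies):
    with ThreadPoolExecutor(max_workers=len(strategies)) as pool:
        futures = [pool.submit(s) for s in strategies]
        done, pending = wait(futures, return_when=FIRST_COMPLETED)
        for f in pending:
            f.cancel()  # best-effort cancellation of losing branches
        return next(iter(done)).result()

winner = speculate([slow_but_sure, fast_heuristic])
```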
Prompt Caching
Cache repeated long prefixes (system prompt, tool definitions, document context) across calls. The provider skips re-encoding the prefix — significant latency and cost reduction.
- Identify — system prompt, tool schemas, and document context are cacheable
- Cache — provider stores the prefix KV state after first call
- Reuse — subsequent calls skip re-encoding the cached prefix
- Impact — meaningful latency and token cost reduction for long-context agents
Coding Agent Patterns
Code → Test → Fix Loop
The tightest feedback loop in coding agents. Write code, run the test suite in a sandbox, parse failure messages, apply targeted patches, repeat.
- Write — generate implementation from spec
- Execute — run test suite in sandbox; parse stderr, assertions, stack traces
- Patch — targeted edits based on failure messages; not full rewrites
- Repeat — until all tests pass or max iterations hit
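The write → execute → patch → repeat loop, sketched with the "test suite" as a plain function and the patch step scripted — stand-ins for sandboxed pytest runs and LLM-generated targeted edits.

```python
# Code -> test -> fix sketch: parse failures, patch, rerun.
def run_tests(code: str) -> list[str]:
    # Stand-in for running the suite in a sandbox and parsing failures.
    namespace = {}
    exec(code, namespace)
    add = namespace["add"]
    failures = []
    if add(2, 3) != 5:
        failures.append("add(2, 3) should be 5")
    return failures

def patch(code: str, failures: list[str]) -> str:
    # Targeted edit based on the failure message, not a full rewrite.
    return code.replace("a - b", "a + b")

def code_test_fix(code: str, max_iterations: int = 3) -> str:
    for _ in range(max_iterations):
        failures = run_tests(code)
        if not failures:
            return code
        code = patch(code, failures)
    raise RuntimeError("max iterations hit with failing tests")

buggy = "def add(a, b):\n    return a - b\n"
fixed = code_test_fix(buggy)
```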
Linter-in-the-Loop
Static analysis runs after code generation and before execution. Lint errors are parsed — file, line, rule, message — and fed back as observations for targeted fixes.
- Write — generate code
- Lint — run static analyzer (ESLint, mypy, clippy) before execution
- Parse — extract file, line, rule, message from lint output
- Fix — targeted edits per error; re-lint to verify fixes are clean
Scaffolded Execution
Agent writes code into an isolated runtime, executes it, and reads stdout/stderr/exit code as structured observations. The environment itself becomes a tool.
- Scaffold — set up isolated runtime (container, WASM, subprocess)
- Write — agent writes code to a file in the sandbox
- Execute — run the file; capture stdout, stderr, exit code
- Observe — output fed back as structured tool result; sandbox torn down
Git-Aware Agent
Agent reads git log, diff, and blame as context before acting. Understands what changed and why — not just the current state — by mining commit history and authorship.
- git log — understand recent trajectory and intent behind changes
- git diff — see what changed since last commit; understand current state
- git blame — trace a line back to its author, commit, and reason
- Branch context — main vs feature vs hotfix informs risk level of proposed changes