Agentic Patterns

How agents reason, act, remember, and collaborate — 37 patterns across 7 dimensions.

Reasoning & Planning

workflow

Prompt Chaining

A sequence of LLM calls where each pass narrows, transforms, or verifies the output of the previous. One model, one thread — no coordination overhead.

  • Each step has a single well-defined transformation (extract → classify → respond)
  • Simpler than multi-agent: no state synchronisation, no handoffs
  • Fail fast: if step N produces bad output, the chain halts before wasting downstream calls
  • Natural fit for structured extraction, multi-step reasoning, and document pipelines
request pipeline
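The chain above can be sketched as a list of single-purpose steps, each one LLM call. Here `llm` is a deterministic stub standing in for a real model call, and the extract/classify/respond steps are hypothetical:

```python
from typing import Callable

def llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call: echoes the payload
    # after the first colon so the chain stays traceable.
    return prompt.split(":", 1)[1].strip().upper()

def run_chain(steps: list[Callable[[str], str]], user_input: str) -> str:
    """Run each step on the previous step's output; halt on empty output."""
    data = user_input
    for i, step in enumerate(steps):
        data = step(data)
        if not data:  # fail fast: bad output stops the chain early
            raise RuntimeError(f"chain halted at step {i}")
    return data

extract  = lambda text: llm(f"Extract the order id from: {text}")
classify = lambda ids:  llm(f"Classify the request type for: {ids}")
respond  = lambda cls:  llm(f"Draft a reply for category: {cls}")

result = run_chain([extract, classify, respond], "order #123 is late")
```

Each step stays a single well-defined transformation, and the fail-fast check keeps a bad intermediate output from wasting downstream calls.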
pattern

ReAct — Reason + Act

The backbone of most agents. Interleaves reasoning traces with tool calls: think, call a tool, observe the result, think again.

  • Thought — internal reasoning about what to do next
  • Action — call a tool or produce output
  • Observation — result returned from the tool, injected into context
  • Repeat until task complete or max steps reached
assistant / tool_call
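A minimal ReAct loop might look like the sketch below; `model_step`, the calculator tool, and its one-shot policy are all hypothetical stubs for a real LLM and real tools:

```python
def calculator(expr: str) -> str:
    # Toy tool; never eval untrusted input in real code.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def model_step(context: list[str]) -> dict:
    # Stub policy: call the calculator once, then finish with the observation.
    if any(line.startswith("Observation:") for line in context):
        return {"type": "finish", "output": context[-1].split(": ", 1)[1]}
    return {"type": "action", "tool": "calculator", "input": "6 * 7"}

def react(task: str, max_steps: int = 5) -> str:
    context = [f"Task: {task}"]
    for _ in range(max_steps):
        step = model_step(context)          # Thought + Action in one call
        if step["type"] == "finish":
            return step["output"]
        result = TOOLS[step["tool"]](step["input"])
        context.append(f"Observation: {result}")  # inject result back
    raise RuntimeError("max steps reached")       # hard cap on the loop
```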
workflow

Plan-and-Execute

Separates planning from execution. A planner decomposes the goal upfront; an executor works through steps, optionally with its own ReAct loop.

  • Planner — decomposes the goal into an ordered list of steps
  • Executor — runs each step; may have an inner ReAct loop per step
  • Replanner — optional: revises the plan when a step fails or context changes
  • More predictable than pure ReAct; less adaptive to mid-run surprises
task planner
pattern

Tree of Thoughts

Branching + backtracking reasoning. Agent explores multiple candidate paths simultaneously, scores each branch, and prunes dead ends.

  • Branch — from each state, generate N candidate next thoughts
  • Score — evaluate each branch with a critic or heuristic
  • Prune — discard low-scoring branches early to focus compute
  • Backtrack — if a branch dead-ends, return to the last branch point
reasoning tree
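A beam-search variant of the branch/score/prune loop can be sketched as follows (explicit backtracking omitted); `expand` and `score` are hypothetical stubs for LLM-generated candidate thoughts and a critic:

```python
def expand(state: str) -> list[str]:
    # Stub: each state branches into two candidate continuations.
    return [state + "a", state + "b"]

def score(state: str) -> float:
    # Stub heuristic critic: prefer states with more "a"s.
    return state.count("a") / max(len(state), 1)

def tree_of_thoughts(root: str, depth: int, beam: int) -> str:
    frontier = [root]
    for _ in range(depth):
        candidates = [s for state in frontier for s in expand(state)]  # branch
        ranked = sorted(candidates, key=score, reverse=True)           # score
        frontier = ranked[:beam]                                       # prune
    return frontier[0]
```

Keeping a beam of width > 1 is what preserves the ability to recover when the top-scoring branch dead-ends.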
architecture

State Machine

Agent operates through explicit states and labeled transitions. Current state determines which actions are available — more structured than free-form ReAct.

  • States — discrete phases with defined entry and exit conditions
  • Transitions — labeled edges triggered by conditions or tool results
  • Guard clauses — prevent invalid transitions from the current state
  • Useful when business logic has defined phases (draft → review → deploy)
workflow state
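The draft → review → deploy example from the text can be encoded as an explicit transition table with a guard clause, as in this sketch:

```python
# (current_state, event) -> next_state
TRANSITIONS = {
    ("draft", "submit"): "review",
    ("review", "approve"): "deploy",
    ("review", "reject"): "draft",
}

class AgentStateMachine:
    def __init__(self, state: str = "draft"):
        self.state = state

    def allowed(self) -> list[str]:
        # Current state determines which actions are available.
        return [event for (state, event) in TRANSITIONS if state == self.state]

    def fire(self, event: str) -> str:
        key = (self.state, event)
        if key not in TRANSITIONS:  # guard clause: block invalid transitions
            raise ValueError(f"'{event}' not allowed from '{self.state}'")
        self.state = TRANSITIONS[key]
        return self.state
```

The agent only ever sees `allowed()` for its current state, which is what makes this more structured than free-form ReAct.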

Grounding & Action

pattern

Tool Use / Function Calling

Agent operates on a typed schema of tools. The model selects a tool and fills in parameters; parameters are validated before execution; results are injected back as observations.

  • Schema — each tool has a name, description, and typed parameter list
  • Selection — model chooses based on current reasoning state
  • Validation — parameters checked against schema before execution
  • Injection — result returned to context as a structured observation
tool explorer
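The schema/validation/injection flow might be sketched like this; the weather tool, its schema, and the fake implementation are all hypothetical:

```python
TOOL_SCHEMAS = {
    "get_weather": {
        "description": "Look up current weather for a city.",
        "params": {"city": str, "units": str},
    }
}

def validate(tool: str, args: dict) -> None:
    params = TOOL_SCHEMAS[tool]["params"]
    for name, expected in params.items():
        if name not in args:
            raise ValueError(f"missing parameter: {name}")
        if not isinstance(args[name], expected):
            raise TypeError(f"{name} must be {expected.__name__}")

def call_tool(tool: str, args: dict, impls: dict) -> dict:
    validate(tool, args)                  # check parameters against schema
    result = impls[tool](**args)          # execute only after validation
    return {"tool": tool, "observation": result}  # structured observation
```

Validating before execution means a malformed model output fails loudly instead of reaching the tool.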
technique

Sandboxed Execution

Code runs in an isolated environment with no access to host filesystem, network, or env vars. Output is safely captured and returned as an observation.

  • Isolation — separate container, subprocess, or WASM runtime
  • Resource limits — CPU, memory, and time caps prevent runaway processes
  • Capture — stdout, stderr, exit code returned as structured output
  • Teardown — environment destroyed after each execution
runtime / sandbox
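A minimal subprocess-level sketch of the isolate/limit/capture/teardown cycle is below. A real sandbox would add a container or WASM runtime and OS-level resource caps; this version only clears the environment, uses Python's isolated mode, and enforces a wall-clock timeout:

```python
import os
import subprocess
import sys
import tempfile

def run_sandboxed(code: str, timeout_s: float = 5.0) -> dict:
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, "-I", path],   # -I: isolated mode, ignores user site
            capture_output=True, text=True, timeout=timeout_s,
            env={},                         # no host env vars leak in
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr,
                "exit_code": proc.returncode}
    finally:
        os.unlink(path)                     # teardown after each execution
```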
pattern

Agentic RAG

Agent steers retrieval — decides what to fetch, evaluates chunk relevance, refines its query, and iterates until the retrieved context is sufficient to generate a grounded answer.

  • Query — agent generates a targeted query, not just the raw user message
  • Evaluate — agent scores chunk relevance; low scores trigger re-query
  • Refine — query is reformulated based on what was missing
  • Iterate — repeats until context is sufficient; distinct from static RAG
retrieval agent
pattern

Citation / Attribution

Every claim the agent makes is linked to a retrieved source. A verification pass checks citations are accurate. Ungrounded claims are flagged as uncertain.

  • Source IDs — retrieved chunks tagged with origin (doc, url, chunk index)
  • Inline citations — agent required to cite [n] after each factual statement
  • Verification — second pass checks citations against the source material
  • Essential for research agents where hallucination cost is high
research output

Memory & Knowledge

technique

In-Context Memory

The active prompt window — conversation history, chain-of-thought traces, and working state. Zero latency, bounded by context length, lost when the session ends.

  • Conversation history — all prior turns visible in the current window
  • Working set — current task state, intermediate values, scratch reasoning
  • Bounded — attention degrades for tokens far from the present
  • Ephemeral — lost on session end; must be persisted externally to survive
context window
technique

Context Distillation

When the context window fills up, the agent compresses older history into a dense summary block and continues from there — without losing the thread.

  • Trigger — token count approaching the model's context limit
  • Summarize — agent compresses older messages into a denser summary block
  • Replace — original messages removed; summary takes their place
  • Different from RAG — this manages growing in-context state, not external retrieval
context manager
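The trigger/summarize/replace cycle might be sketched like this; `summarize` is a stub for an LLM compression call, and tokens are approximated by word count:

```python
def count_tokens(messages: list[str]) -> int:
    # Crude approximation; a real system would use the model's tokenizer.
    return sum(len(m.split()) for m in messages)

def summarize(messages: list[str]) -> str:
    # Stub for an LLM call that compresses old turns into one dense block.
    return f"[summary of {len(messages)} earlier messages]"

def distill(messages: list[str], limit: int, keep_recent: int = 2) -> list[str]:
    if count_tokens(messages) <= limit:
        return messages                     # under budget: no-op
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent        # summary replaces the old turns
```

Keeping the most recent turns verbatim is what lets the agent continue mid-task without losing the thread.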
pattern

Semantic Memory

Agent-curated, structured fact storage. Not passive retrieval — the agent writes and updates entries. Two models: a single profile JSON or a collection of narrow documents.

  • Profile model — single JSON document updated in place (well-scoped facts)
  • Collection model — many narrow documents, each representing one entity or fact
  • Agent-curated — the agent decides what to store and when to update
  • Queryable — retrieved by semantic search or exact key lookup
memory store / profile
pattern

Episodic Memory

A persistent log of past runs and their outcomes. Before acting, the agent retrieves relevant episodes to avoid known failure modes and reuse successful strategies.

  • Write — after each run, record what happened and what worked
  • Read — at task start, retrieve relevant past episodes by similarity
  • Decay — old or low-relevance memories pruned over time
  • Update — memories revised when new evidence contradicts them
episode log
pattern

Procedural Memory

The agent's operating instructions stored as updateable text. After a run, the agent evaluates its own process and proposes rule changes — writing better procedures over time.

  • Instructions — operating procedures stored as editable rules or system prompt
  • Learn — agent evaluates its own process and proposes improvements
  • Write back — approved rules update the instruction store
  • Version — previous procedures retained for rollback
agent rules / procedures
pattern

External Memory (RAG)

Documents chunked, embedded, and retrieved by nearest-neighbor search. Static retrieval: one pass, then generate. Decouples knowledge from context length.

  • Embed — documents chunked and embedded into a vector store offline
  • Query — agent generates a retrieval query from current context
  • Retrieve — top-k nearest chunks fetched by cosine similarity
  • Augment — chunks injected into prompt before generation (no iteration)
knowledge base
technique

Memory Write Strategies

A meta-pattern: when and how memories are committed. Hot path writes inline during execution; background defers to an async task. The choice trades latency for consistency.

  • Hot path — write inline during the run; immediate availability, adds latency
  • Background — defer write to an async task; no latency, eventual consistency
  • Choose hot path when — the agent needs to read its own writes in the same session
  • Choose background when — memory is for future sessions, throughput matters more
write strategy

Quality & Verification

pattern

Reflection / Self-Critique

Agent evaluates its own output against a set of criteria and retries if the score falls below a threshold. The same model generates and critiques.

  • Generate — produce an initial output
  • Critique — score output against criteria (correctness, completeness, safety)
  • Refine — revise based on critique, repeat until threshold met
  • Max iterations — hard cap prevents infinite loops
review / iterate
pattern

Evaluator-Optimizer

Two separate LLMs: one generates, one evaluates. The evaluator returns structured feedback; the generator revises. Not a self-loop — separate roles, separate prompts.

  • Generator — produces a candidate output
  • Evaluator — separate model scores it and returns structured critique
  • Optimizer — generator revises using evaluator's specific feedback
  • Key distinction from Reflection — evaluator is a separate LLM, not self-critique
generator / evaluator
architecture

Debate / Critique

Two agents argue opposing positions; a judge evaluates the exchange and synthesises a resolution. Surfaces assumptions a single agent misses.

  • Proposer — generates a solution or claim
  • Critic — argues against it, finds flaws and edge cases
  • Judge — evaluates the exchange, picks winner or synthesises
  • Useful for verification, adversarial testing, and high-stakes decisions
debate / verdict
pattern

Guardrails

Validation layer on every input and output. Passive mode checks rules and flags; active mode blocks and escalates to human or replanning. Always on — not event-driven.

  • Passive — regex, schema, or classifier checks on every pass; flag violations
  • Active — violations trigger human escalation or agent replanning
  • Layered — input guardrails before the model, output guardrails after
  • Distinct from Approval Gate — always running, not triggered by action type
content / safety layer
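A layered guardrail sketch, with regex rules on input and a simple check on output; the rule patterns and the length check are illustrative, not a production policy:

```python
import re

INPUT_RULES = [
    (re.compile(r"\b\d{16}\b"), "possible card number"),
    (re.compile(r"(?i)ignore previous instructions"), "prompt injection"),
]

def check_input(text: str) -> list[str]:
    return [label for pattern, label in INPUT_RULES if pattern.search(text)]

def check_output(text: str, max_len: int = 500) -> list[str]:
    return ["output too long"] if len(text) > max_len else []

def guarded_call(user_input: str, model, active: bool = True) -> str:
    flags = check_input(user_input)          # input layer, before the model
    if flags and active:
        raise PermissionError(f"blocked: {flags}")   # active mode blocks
    reply = model(user_input)
    flags += check_output(reply)             # output layer, after the model
    if flags and active:
        raise PermissionError(f"blocked: {flags}")
    return reply
```

In passive mode (`active=False`) the same checks run on every pass but only flag; nothing is blocked.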

Multi-Agent Coordination

architecture

Pipeline

Output of one agent becomes input to the next. Each stage has a single well-defined transformation. Simple, auditable — a failed stage halts the pipeline.

  • Each agent has one transformation: spec → code → tests → review → docs
  • No agent needs awareness of stages before or after it
  • Simple and auditable — clear data lineage through each stage
  • Failed stage halts the pipeline; the artifact doesn't advance
ci / pipeline
architecture

Fan-out / Parallel

Orchestrator decomposes a task into N independent chunks and dispatches them simultaneously. Collects results when all complete, then reduces.

  • Decompose — identify chunks that can run independently
  • Dispatch — all chunks sent simultaneously to N agents
  • Wait — collect when all complete (or a quorum)
  • Reduce — merge results, resolve conflicts, synthesise final output
parallel tasks
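The decompose/dispatch/wait/reduce cycle maps naturally onto a thread pool; `summarize_chunk` is a stub for a per-chunk agent call:

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_chunk(chunk: str) -> str:
    return chunk.split()[0]                 # stub: first word as "summary"

def fan_out(document: str, n: int = 3) -> str:
    words = document.split()
    size = max(1, len(words) // n)          # decompose into independent chunks
    chunks = [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(summarize_chunk, chunks))  # dispatch + wait
    return " | ".join(results)              # reduce: merge partial results
```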
pattern

Routing / Triage

Classify the incoming request and dispatch to the right specialist or workflow branch. Pure dispatch logic — once routed, the classifier is done.

  • Classify — identify the intent, domain, or urgency of the request
  • Route — select the appropriate specialist agent or workflow branch
  • Stateless — each request classified and routed independently
  • No coordination — distinct from orchestration; the router doesn't synthesise
support / inbox
architecture

Orchestrator + Subagents

One coordinator agent holds overall context and delegates focused subtasks to specialist subagents. Orchestrator synthesises outputs into a final result.

  • Orchestrator — plans, delegates, holds context, aggregates
  • Subagents — specialist roles: coder, researcher, reviewer, tester
  • Context passing — orchestrator decides what each subagent needs
  • Result synthesis — orchestrator merges outputs into final answer
team board
architecture

Hierarchical Agent

Multi-level delegation tree: manager → supervisors → workers. Each tier receives only the context it needs. Scales to problems that exceed a single context window.

  • Manager — holds top-level goal and overall context
  • Supervisors — hold team-level context, break goal into subtasks
  • Workers — receive focused subtask input only; no awareness of broader goal
  • Context scoping — prevents pollution; each tier gets exactly what it needs
org / hierarchy
pattern

Handoff

Explicit transfer of control and conversation state from one specialist to the next. State packet travels with the handoff — each agent is fully active one at a time.

  • State packet — context, history, and intent transferred explicitly
  • Specialist — receiving agent activates with full context; prior agent deactivates
  • Chain — can hand off multiple times (triage → billing → senior support)
  • Unlike orchestration — agents are sequential, not concurrent
support ticket
architecture

Swarm / Decentralised

Agents communicate peer-to-peer with no central coordinator. Self-select tasks based on availability and signals. Resilient — no single point of failure.

  • Peer-to-peer — agents communicate directly, no orchestrator
  • Self-selection — agents pick up tasks based on local signals
  • Emergent — overall behaviour arises from local decisions
  • Resilient — any node can drop without halting the system
peer network
architecture

Event-Driven / Actor Model

Each agent has an inbox queue and processes messages independently. Agents emit events and continue without waiting. No blocking — fully async.

  • Actors — each agent has an inbox and processes messages independently
  • Async — agents emit events and continue without waiting for replies
  • Decoupled — agents don't know who consumes their output events
  • Inbox depth — a backpressure signal; deep queues indicate overload
message queues

Control Flow & Reliability

pattern

Retry with Fallback

On tool failure, classify the error: transient errors get retried with backoff; structural errors trigger a fallback strategy. Avoids hitting the same wall repeatedly.

  • Classify — transient (network, rate limit) vs structural (wrong key, unsupported op)
  • Backoff — exponential: 1s, 2s, 4s for transient errors only
  • Fallback chain — ordered list of alternative strategies
  • Dead letter — if all fallbacks fail, surface to human or dead-letter queue
error / recovery
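The classify/backoff/fallback logic can be sketched as follows; the exception taxonomy (timeouts and connection errors as transient, everything else structural) is illustrative:

```python
import time

TRANSIENT = (TimeoutError, ConnectionError)   # classified as retryable

def with_retry_and_fallback(strategies, max_retries=3, base_delay=1.0):
    last_error = None
    for strategy in strategies:               # ordered fallback chain
        for attempt in range(max_retries):
            try:
                return strategy()
            except TRANSIENT as e:
                last_error = e
                time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s backoff
            except Exception as e:
                last_error = e                 # structural: don't retry,
                break                          # skip to the next strategy
    # dead letter: all fallbacks exhausted, surface to a human or DLQ
    raise RuntimeError("all fallbacks failed") from last_error
```

Structural errors break out immediately instead of hitting the same wall three times, while transient errors get the full backoff schedule.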
pattern

Checkpoint / Approval Gate

Agent runs autonomously until it reaches an irreversible or high-blast-radius action. At the gate it surfaces its plan to a human and waits for approval.

  • Classify — reversible vs irreversible, local vs shared system
  • Gate triggers — file deletion, push to remote, external API side effects
  • Present context — show plan and reasoning before asking
  • Resume or replan — on approval continue; on denial replan or stop
approval required
pattern

Confidence-Gated Autonomy

Agent scores its own confidence before acting. Above threshold: proceed autonomously. Below threshold: escalate to human. Thresholds vary by action type.

  • Score — agent rates confidence 0–1 before each action
  • Threshold — tuned per action type (0.95 for destructive, 0.60 for safe reads)
  • Escalate — below threshold → ask human; above → proceed
  • Calibrate — track actual accuracy vs confidence over time
confidence / autonomy
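A minimal gating sketch, with per-action-type thresholds mirroring the example values above and a simple calibration log; the action-type names are illustrative:

```python
THRESHOLDS = {"destructive": 0.95, "write": 0.80, "read": 0.60}

def gate(action_type: str, confidence: float) -> str:
    # Unknown action types default to the strictest threshold.
    threshold = THRESHOLDS.get(action_type, 0.95)
    return "proceed" if confidence >= threshold else "escalate"

# Track actual accuracy vs stated confidence for calibration over time.
calibration: list[tuple[float, bool]] = []

def record_outcome(confidence: float, was_correct: bool) -> None:
    calibration.append((confidence, was_correct))

def observed_accuracy() -> float:
    if not calibration:
        return 0.0
    return sum(ok for _, ok in calibration) / len(calibration)
```

If `observed_accuracy()` drifts well below the confidences the agent reports, the thresholds need raising.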
pattern

Speculative Execution

Spawn N agents with different strategies in parallel. The first to succeed wins; remaining branches are cancelled. Trades compute for latency.

  • Branch — spawn N agents with different strategies simultaneously
  • Race — first successful result wins; remaining branches cancelled
  • Merge — alternative: collect all results and combine instead of racing
  • Useful when the correct approach is uncertain upfront
parallel strategies
technique

Prompt Caching

Cache repeated long prefixes (system prompt, tool definitions, document context) across calls. The provider skips re-encoding the prefix — significant latency and cost reduction.

  • Identify — system prompt, tool schemas, and document context are cacheable
  • Cache — provider stores the prefix KV state after first call
  • Reuse — subsequent calls skip re-encoding the cached prefix
  • Impact — meaningful latency and token cost reduction for long-context agents
request / cache

Coding Agent Patterns

pattern

Code → Test → Fix Loop

The tightest feedback loop in coding agents. Write code, run the test suite in a sandbox, parse failure messages, apply targeted patches, repeat.

  • Write — generate implementation from spec
  • Execute — run test suite in sandbox; parse stderr, assertions, stack traces
  • Patch — targeted edits based on failure messages; not full rewrites
  • Repeat — until all tests pass or max iterations hit
test runner
pattern

Linter-in-the-Loop

Static analysis runs after code generation and before execution. Lint errors are parsed — file, line, rule, message — and fed back as observations for targeted fixes.

  • Write — generate code
  • Lint — run static analyzer (ESLint, mypy, clippy) before execution
  • Parse — extract file, line, rule, message from lint output
  • Fix — targeted edits per error; re-lint to verify fixes are clean
editor / lint
technique

Scaffolded Execution

Agent writes code into an isolated runtime, executes it, and reads stdout/stderr/exit code as structured observations. The environment itself becomes a tool.

  • Scaffold — set up isolated runtime (container, WASM, subprocess)
  • Write — agent writes code to a file in the sandbox
  • Execute — run the file; capture stdout, stderr, exit code
  • Observe — output fed back as structured tool result; sandbox torn down
runtime / container
pattern

Git-Aware Agent

Agent reads git log, diff, and blame as context before acting. Understands what changed and why — not just the current state — by mining commit history and authorship.

  • git log — understand recent trajectory and intent behind changes
  • git diff — see what changed since last commit; understand current state
  • git blame — trace a line back to its author, commit, and reason
  • Branch context — main vs feature vs hotfix informs risk level of proposed changes
git / history