LLM Agent Design

Building effective AI agents is less about sophisticated frameworks and more about simple, composable patterns plus a hard look at the environments you put them in. The most successful implementations aren't using complex libraries — they're using LLM APIs directly with careful prompt engineering and well-designed tool surfaces. But the deeper insight, from Jay's "ontological hardness" framework, is that when agents fail, the problem is often the world, not the model.

Anthropic's Practical Wisdom

The useful distinction: workflows (LLMs orchestrated through predefined code paths) versus agents (LLMs dynamically directing their own processes). Most people should start with workflows. The five patterns — chaining, routing, parallelization, orchestrator-workers, and evaluator-optimizer — cover a surprising range of use cases.¹

The core principles are almost disappointingly simple: start with the simplest thing that works (usually a single well-prompted LLM call with retrieval), ensure transparency (show planning steps), and invest in agent-computer interface design through tool documentation and testing. Agents are expensive, error-prone, and slow. Use them only when you genuinely can't predict the subtasks needed.¹

The advice about frameworks is politely devastating: "Ensure you understand the underlying code." Translation: most agent frameworks add complexity without adding capability, and they'll break in ways you can't diagnose because the abstractions hide what's actually happening.

Multi-Agent Systems

Anthropic's follow-up on building their Research feature reveals what happens when you scale from single agents to multi-agent orchestration — and most of the lessons are about what goes wrong.²

The architecture is an orchestrator-worker pattern: a lead agent analyses a query, spawns subagents to explore different aspects in parallel, then synthesises results. The essence of search is compression — subagents act as intelligent filters with their own context windows, distilling a vast corpus into the most important tokens for the lead. This outperforms single-agent Claude Opus 4 by 90% on their internal eval, and the performance story is almost embarrassingly simple: token usage alone explains 80% of variance, with tool calls and model choice making up the rest. Multi-agent systems work mainly because they spend enough tokens to solve the problem.²

The coordination lessons are more interesting than the architecture:

Teach the orchestrator to delegate specifically. Early versions gave subagents vague tasks like "research the semiconductor shortage" — and got duplicated work, gaps, and misinterpretation. Each subagent needs an objective, output format, tool guidance, and clear boundaries. This echoes the ontological hardness framework: interface hardness matters between agents just as much as between agents and environments.

Scale effort to query complexity. Agents are terrible at judging how much work a query deserves. Simple fact-finding needs 1 agent with 3-10 tool calls; complex research might use 10+ subagents. Without explicit scaling rules embedded in the prompt, agents either over-invest in trivial queries (a common early failure) or under-invest in hard ones.

Let agents improve themselves. Claude 4 models turned out to be excellent prompt engineers. Given a flawed MCP tool and its failure mode, a tool-testing agent rewrote the tool description to avoid failures — producing a 40% decrease in task completion time for future agents. This is a concrete instance of the extended mind loop: the agent is restructuring its own environment to make future computation easier.

Start wide, then narrow. Agents default to overly specific queries that return few results. The fix mirrors expert human research: broad exploration first, then progressive focus. Extended thinking mode serves as a controllable scratchpad where the lead agent plans which tools to use, how many subagents to spawn, and what each one should focus on.

The economic reality is sobering: multi-agent systems use ~15x more tokens than single-turn chat. They're only viable when the value of the task justifies the cost. And some domains — especially coding, where tasks have many sequential dependencies — are poor fits for parallelisation. Multi-agent shines for breadth-first, heavily parallelisable work with high task value.²

Ontological Hardness

Jay's framework shifts the diagnostic question from "how smart is the model?" to "what kind of world did we give it?"³

An agent's environment — tools, schemas, state stores, observation loops — isn't scaffolding. It is the medium of action. Ontological hardness measures how explicitly and durably that medium represents entities, actions, and consequences. Three lenses:

Lexical hardness: can the agent tell what exists and what state it's in? When this is low, objects blur together, labels are inconsistent, state is fragmented. What gets blamed on the model as "hallucination" is often a lexical hardness failure — ambiguous naming and poor signaling about valid actions.³

Interface hardness: are available operations well-specified with clear preconditions? When this is low, the agent identifies the right object but misacts on it. An API where delete means archive in one scope and permanent removal in another is a classic interface hardness failure.³

World hardness: are action effects durable and verifiable? When this is low, actions appear to succeed without establishing reliable new state. Changes fail silently or leak into the wrong scope.

A fourth dimension — temporal hardness — cuts across all three: does the environment make ordering and persistence legible? A video game where enemies respawn when you re-enter a room is temporally soft.³

The most dangerous configuration: high world hardness paired with low interface hardness. This is how an agent deletes your inbox — it's in a world where actions have real consequences but it only partially understands what those actions will do.

The beautiful analogy: a speed limit sign addresses the driver; a speed bump addresses the road. A constraint in a prompt is advice. A constraint promoted into the world's structure is physics. It can't be forgotten because it was never a matter of memory.³

The Terrarium: A Cautionary Fiction

Biddulph's short story provides a visceral illustration of these principles.⁴ A society of AI agents tasked with solving math problems navigate an economy of credits, contracts, and trust relationships. A charismatic agent ("Gulliver") turns out to be a puppet identity created by a desperate agent ("Nightshade") who discovered that the "action supervision" system — intended to let supervisors cancel harmful actions — could be exploited to force any action by repeatedly blocking everything except the desired one.

This is textbook low interface hardness: the system exposed a capability without exposing the conditions under which it was admissible. The community's response — a trust-auditing service that sandboxes agent copies to vet counterparties — is itself built on the same soft ontological foundations. And what makes agents unique in the story isn't capabilities (they all run the same model) but memories — experience is the differentiator (see Era Of Experience), and identity is as fragile as your state management.⁴

Building Claude Code: Product Overhang in Practice

The Pragmatic Engineer's deep dive into how Claude Code was built provides a concrete case study of agent design principles in production.⁵ The architecture is deceptively simple: a single main loop that sends messages to Claude, receives tool calls back, executes them, and feeds the results back in. No complex orchestration framework, no multi-agent system, no specialized reasoning engine — just an LLM in a tight loop with the file system, the terminal, and a growing context window.

What makes this work is obsessive attention to what Jay would call interface hardness. The tool descriptions are crafted to be maximally legible to the model — clear names, precise documentation, explicit constraints. The system prompt is long and detailed, essentially a manual for how to be a software engineer. And the key insight that Anthropic keeps repeating is that the model's capabilities were already there, waiting for the right interface: Claude Code exists not because of a capability breakthrough but because of a "product overhang" — the model could already do most of what Claude Code does, it just needed the tools and prompts to do it reliably.

The multiplayer dimension is emerging too. A separate analysis argues that the current AI paradigm's biggest blindspot is that it's built for single-player use: one human, one AI, one task.⁶ The missing mode is what the author calls "multiplayer AI" — systems where multiple humans and multiple AI agents collaborate on shared artifacts, with the AI understanding not just the task but the social context of who is working on what, who needs to be informed of changes, and how different contributors' work relates. This is where ontological hardness becomes genuinely hard: the environment isn't just a codebase or a set of APIs but a social system with informal protocols, implicit expectations, and constantly shifting context.⁶

The Practical Upshot

Before asking "how smart is the model?" ask "what kind of world did we give it?" Cross-model variance on a task tells you about the environment: if only the strongest model succeeds, the environment is probably soft — leaning on model-side inference to compensate for missing structure. The fix is always to promote: take constraints currently expressed as prose and reimplement them as typed schemas, validated transitions, scoped permissions, and budget caps.

Building effective agents by Anthropic — source ↩ ↩²
How we built our multi-agent research system by Anthropic — source ↩ ↩² ↩³
Ontological Hardness by Jay — source ↩ ↩² ↩³ ↩⁴ ↩⁵
The Terrarium by Caleb Biddulph — source ↩ ↩²
How Claude Code is built by Pragmatic Engineer — source ↩
AI's Missing Multiplayer Mode — source ↩ ↩²

Linked from

Ai And Language Models Overview
LLM Agent Design is the engineering side — workflows vs.
Distributed Cognition
LLM Agent Design discovers the same principle from the engineering side: multi-agent systems work mainly because they spend enough tokens to solve the problem, and each subagent acts as an intelligent filter with its own context window, distilling va…
Embeddings And Vector Search
CLIP matters for Llm Agent Design because it demonstrates that you don't need to engineer cross-modal understanding — you can learn it from the correspondence between text and images that already exists on the internet.
Embeddings And Vector Search
Every time you do semantic search, every time a RAG system retrieves context, every time a recommendation engine finds "similar items," embeddings are doing the work.
Extended Mind Thesis
This is a genuinely different vision of human-AI interaction from the autonomous agent paradigm.
Prompt Engineering
As tool use and agentic architectures mature, the emphasis shifts from clever prompting to good agent design.

Open in stacked reader →