The Axe Came Down Again

The Year Is 2014

Domain-Driven Design is the new gospel. The message is simple and intoxicating: find the nouns, draw the lines, ship the services.

Customer. Microservice. Order. Microservice. Payment. Microservice. Email. Microservice.

Teams took functional, working monoliths and carved them apart. Not because the seams were real, but because the nouns were obvious. The architecture diagrams looked beautiful. Everyone was doing it.

There was just one problem: the nouns were the wrong decomposition axis.

What teams got was a distributed monolith. All the coupling of the original system, plus network hops, serialization overhead, and a prayer that the message broker wouldn’t drop something important between Order and Payment.

The Distributed Systems Bill Arrived

Distributed queries. A SQL join became a choreography across three services, each with its own eventual consistency window.

Distributed transactions. Sagas are elegant in theory and brutal in practice. Half the teams got compensating actions wrong in edge cases that only surfaced under load.

Distributed monitoring. A single checkout became a correlation ID scavenger hunt across dashboards. “The request failed somewhere between Payment and Fulfillment” became a sentence people said with straight faces in production incidents.

These problems are all solvable. But solvable is not free. The bitter realization was that teams were paying this cost not because they needed to, but because they split things that should have stayed together.

The lesson crystallized: decompose along behavioral boundaries, not noun boundaries.

The Year Is 2026

The new gospel is agentic AI. The slides now have colored agent boxes instead of bounded contexts. And the axes are coming down again.

Configuration Agent. Checkout Agent. FAQ Agent. Recommendation Agent.

“We have configuration, checkout, and support, therefore three agents.” This is the exact reasoning that produced microservice-per-entity in 2014. The noun is not the seam.

The hidden costs rhyme eerily.

Context loss at handoffs. The next agent hallucinates because it’s missing information from the previous context window.

Coordination overhead. Routing decisions made by yet another LLM call, moving complexity without reducing it.

Distributed observability. Tracing a user journey across non-deterministic agents is the correlation ID scavenger hunt all over again.

Latency multiplication. Every handoff is a new LLM call with new prompt assembly and inference time.

Token cost duplication. Each agent carries its own system prompt, few-shot examples, and tool definitions.
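The last two costs are easy to put rough numbers on. The sketch below is back-of-the-envelope arithmetic, not a benchmark: the function name `journey_cost` and every figure in it (three user turns, two seconds per call, the prompt sizes) are hypothetical placeholders chosen only to show how the totals scale when each turn needs a routing call plus a specialist call.

```python
def journey_cost(turns: int, calls_per_turn: int,
                 seconds_per_call: float, overhead_tokens_per_call: int):
    """Return (total latency in seconds, total fixed prompt tokens) for one journey.

    overhead_tokens_per_call stands for the system prompt, few-shot examples,
    and tool definitions that are resent on every call.
    """
    calls = turns * calls_per_turn
    return calls * seconds_per_call, calls * overhead_tokens_per_call


# Assumed numbers, for illustration only.
single_agent = journey_cost(turns=3, calls_per_turn=1,
                            seconds_per_call=2.0, overhead_tokens_per_call=1500)
# Multi-agent: each turn pays for a router call plus a specialist call.
multi_agent = journey_cost(turns=3, calls_per_turn=2,
                           seconds_per_call=2.0, overhead_tokens_per_call=1200)

print(single_agent)  # (6.0, 4500)
print(multi_agent)   # (12.0, 7200)
```

Even with a smaller prompt per agent, the extra call per turn doubles latency and raises the fixed token bill under these assumptions. Plug in your own measurements before drawing conclusions.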

The Noun Trap, Revisited

“FAQ” feels like a distinct agent until you realize a user asking about return policy during checkout needs FAQ capability woven into the checkout context, not a handoff to a separate agent that doesn’t know they just configured a vehicle.

When your motivation for splitting is organizational convenience, the thing you’re decomposing almost always turns out to have been a workflow all along. You just hadn’t noticed.

The Diagnostic That Cuts Through

One question does most of the work.

Would the user perceive and benefit from a specialist handoff?

“Let me transfer you to our tax advice specialist.” That’s a real agent boundary. The user expects different expertise and different behavior. “Let me transfer you to the checkout agent.” Nobody says this. This is a workflow stage masquerading as an agent boundary.

Beyond handoff perception, four structural tests matter:

Prompt incompatibility. Do the capabilities genuinely conflict in one system prompt, or are you just uncomfortable with a longer prompt?

Context window pressure. Is the context unreasonably large even after tool subsetting and state injection?

Trust boundaries. Do the capabilities operate under different compliance or audit constraints?

Independent evolution. Can you write meaningful evals for each capability in isolation, with no shared state?
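One way to keep yourself honest is to write the answers down per proposed boundary. The sketch below is a hypothetical checklist, not a scoring rule from any framework: the class `SplitTests`, its field names, and the choice to simply list which tests passed are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SplitTests:
    """Answers to the handoff question and the four structural tests."""
    user_perceives_specialist_handoff: bool
    prompts_genuinely_conflict: bool
    context_too_large_after_mitigation: bool
    different_trust_boundary: bool
    evaluable_in_isolation: bool


def reasons_to_split(tests: SplitTests) -> list[str]:
    """Return the names of the tests that passed; an empty list means keep one agent."""
    return [f.name for f in fields(tests) if getattr(tests, f.name)]


# Illustrative answers, not measurements.
checkout = SplitTests(False, False, False, False, False)
tax_advice = SplitTests(True, True, False, True, True)

print(reasons_to_split(checkout))    # []
print(reasons_to_split(tax_advice))  # four reasons, in field order
```

An empty list is the interesting result: it means the proposed "agent" has no structural argument behind it, only a noun.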

What Actually Warrants a Split

Most problems people reach for multi-agent designs to solve have a predictable order: configuration, then checkout, then confirmation. But a predictable order means you already know the path in advance, so you can wire the orchestration in code. Orchestration wired in code is the definition of a workflow, not a multi-agent system.

The real value of splitting comes from different behavioral loops and genuine parallelism. A code research agent is the clearest example: an orchestrator breaks a task into independent subproblems, fans out worker agents to search different parts of the codebase simultaneously, then synthesizes their findings. Each worker runs the same tight loop (search, read, summarize) but they run it in parallel and their results don’t depend on each other mid-flight. That’s a real boundary because the work is genuinely concurrent and the orchestrator can’t do it faster alone.
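The fan-out shape described above can be sketched with `asyncio.gather`. This is a minimal illustration under stated assumptions: `search_worker`, `split_task`, and `research` are hypothetical names, the worker fakes its search with a short sleep instead of calling a model, and the directory names are invented.

```python
import asyncio


async def search_worker(subproblem: str) -> str:
    """Stand-in for one worker's search, read, summarize loop (no real LLM call)."""
    await asyncio.sleep(0.01)  # simulates I/O-bound search and inference time
    return f"summary of {subproblem}"


def split_task(task: str) -> list[str]:
    """Hypothetical decomposition; a real orchestrator would ask a model to plan."""
    return [f"{task} in {area}" for area in ("api/", "storage/", "tests/")]


async def research(task: str) -> str:
    subproblems = split_task(task)
    # Fan out: workers run concurrently and never read each other's results.
    findings = await asyncio.gather(*(search_worker(s) for s in subproblems))
    # Synthesize: a real orchestrator would make one more model call here.
    return "\n".join(findings)
```

Calling `asyncio.run(research("find retry logic"))` returns three summaries in the order the subproblems were created, since `asyncio.gather` preserves input order. The point of the sketch is the independence: no worker needs anything from another until the synthesis step.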

The same logic applies when the reasoning patterns actually differ: a ReAct loop for open-ended configuration reasoning, a Chain-of-Verification loop for compliance checks that must cite sources, a Plan-Execute-Replan loop for multi-step orchestration that needs to adapt mid-run. These need different token budgets, different retry behaviors, different tool sets. You can’t collapse them into one prompt without one pattern interfering with another.

“Configuration” vs. “checkout” vs. “FAQ”? Those are phases of one customer journey. They want the same context, the same conversation history, the same user state. There’s no parallelism, no conflicting loop structure, no isolation requirement. Splitting them doesn’t reduce complexity; it just moves it somewhere harder to see.

The Single-Agent Toolkit Is Deeper Than You Think

Before reaching for multi-agent, exhaust what a single agent can actually do.

Phase-aware system prompts with a stable spine and swappable body sections. The parts of the prompt that describe the agent’s fundamental behavior stay constant. The parts that describe the current task get swapped by the runtime.

Tool subsetting by phase. Only expose tools relevant to the current stage. The configuration phase doesn’t need payment tools, so don’t expose any.

Tool-call-driven phase transitions. When the agent calls start_checkout, the runtime swaps the tool subset and prompt section automatically. No handoff, no routing LLM, no information loss.

Context offloading via RAG, summarization at thresholds, and on-demand detail retrieval. Most context pressure problems have solutions that don’t require a new agent.
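Summarization at a threshold is the simplest of these to sketch. The code below is an assumption-heavy illustration: `estimate_tokens` uses a crude four-characters-per-token rule of thumb rather than a real tokenizer, `summarize` is a placeholder where a model call would go, and the `budget` and `keep_recent` parameters are hypothetical knobs.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def summarize(messages: list[str]) -> str:
    """Placeholder; a real system would call a model here."""
    return f"[summary of {len(messages)} earlier messages]"


def compact(history: list[str], budget: int, keep_recent: int = 4) -> list[str]:
    """Fold older messages into one summary once the history exceeds the budget."""
    total = sum(estimate_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history  # under budget, or nothing old enough to fold
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent
```

The recent turns stay verbatim so the agent keeps the detail it is most likely to need; only the older tail is compressed. Full transcripts can still live outside the prompt for on-demand retrieval.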

The killer advantage: in a single-agent setup, phase detection lives in the same context as the action. No information loss, no routing overhead. In multi-agent, phase detection becomes a separate routing LLM call with its own prompt and its own failure modes. You’ve moved complexity without reducing it.
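A minimal sketch of such a runtime, assuming a two-phase vehicle sales flow. Apart from `start_checkout`, every name here (`SPINE`, `PHASES`, `TRANSITIONS`, `AgentRuntime`, and the tool names) is hypothetical, and the model call itself is left out; the sketch only shows how the prompt section and tool subset swap while the history stays in place.

```python
from dataclasses import dataclass, field

# Stable spine: the part of the system prompt that never changes.
SPINE = "You are a vehicle sales assistant. Be accurate and concise."

# Hypothetical phases: each has a swappable prompt section and a tool subset.
PHASES = {
    "configuration": {
        "prompt": "Help the user configure a vehicle.",
        "tools": ["list_options", "set_option", "start_checkout"],
    },
    "checkout": {
        "prompt": "Collect payment and delivery details.",
        "tools": ["quote_price", "submit_payment", "lookup_policy"],
    },
}

# Tool calls that move the conversation to another phase.
TRANSITIONS = {"start_checkout": "checkout"}


@dataclass
class AgentRuntime:
    phase: str = "configuration"
    history: list[str] = field(default_factory=list)  # shared across every phase

    def system_prompt(self) -> str:
        return f"{SPINE}\n\n{PHASES[self.phase]['prompt']}"

    def tools(self) -> list[str]:
        return PHASES[self.phase]["tools"]

    def on_tool_call(self, name: str) -> None:
        if name not in self.tools():
            raise ValueError(f"{name} is not exposed in phase {self.phase}")
        self.history.append(f"tool:{name}")
        # The transition swaps prompt section and tools; history is untouched.
        self.phase = TRANSITIONS.get(name, self.phase)
```

Starting from `AgentRuntime()`, `submit_payment` is not exposed. After `on_tool_call("start_checkout")` the phase is `checkout`, the payment and policy tools appear, and the history still contains everything from configuration. No second agent ever needed to be told what the user just built.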

The Pattern Rhymes

| | 2014: Microservices | 2026: Multi-Agent |
|---|---|---|
| Seductive heuristic | One service per domain noun | One agent per role noun |
| What actually happened | Distributed monolith | Distributed workflow |
| Hidden cost | Distributed transactions, queries, monitoring | Context loss, routing overhead, observability gaps |
| Real split criterion | Independent deployability, failure isolation | Independent reasoning loops, trust boundaries |

Where to Land

The correct number of agents is not a product of how many boxes you can draw. It emerges from real differences in reasoning loop, context pressure, trust boundary, or control flow pattern. If those differences aren’t there, a single agent with a well-structured workflow isn’t a compromise. It’s the better architecture.

Start with one agent. Make it excellent. Add more only when you can articulate, precisely and structurally, what the second agent can do that a workflow or a phase-based agent cannot.

The nouns will always tempt you to split. Resist until the behavior forces your hand.
