Essay · May 14, 2026 · 7 min read

Specialists, not generalists: a note on AI agents.

Six small AI agents doing one job each will beat one big AI agent doing six jobs. Here is what we found out about why, and what changes when you build that way.

A small design studio with focused workstations

The first instinct when you build with AI is to ask the biggest model the biggest question. Describe the business, ask it to run the business. We tried that. It did not work. The output was always close to correct and never actually shippable. Subtle mistakes in every layer of the answer: tone, code, copy, decisions.

The thing that finally worked was the opposite. Many small agents, each focused on one job, each with a narrow piece of context, handing work to the next one. The output was better. The errors were easier to find. The cost dropped.

One generalist agent doing six jobs makes six different kinds of mistakes. Six specialist agents doing one job each make far fewer.
The rough finding

Why specialists win

A specialist has a narrow context window of just the information that matters for its job. A research agent gets the brief and the URL of the existing site. A design agent gets the brief and the planning document. A QA agent gets the brief and the finished build. Nothing more. The agent does not have to choose what to pay attention to. It is already chosen.

A generalist drowns in the same context. It is given the brief, the URL, the brand guide, the planning doc, the design system, the existing code, the deployment config, and asked to do all the work. The model has to compress all of that into a single thread of attention. Compression is where mistakes come from.

Chatting directly with a specialist agent in Company Agents — Talking to one specialist agent on its own thread

What we measure

The specialist pattern improves three things we care about in production. The numbers below are from running both patterns against the same client website builds at Horizon Labs over a month.

−68%

Errors per build

−41%

Cost per build

+2.4×

Iteration speed

6 of 6

Specialists win

What it looks like at Horizon Labs

Horizon Labs is our agency. When a client signs up for a new website, six agents run the build, each with one job. Reese studies the existing site. Avery plans the new one. Sasha designs it. Kai picks the imagery. Morgan writes the code. Quinn does QA before launch.

Sasha, the design specialist — Sasha · design only

Morgan, the build specialist — Morgan · build only

The agents do not see each other’s prompts. They see each other’s output. The handoff document between stages is a small, structured artifact that contains only what the next stage needs. The same way a designer hands a developer a Figma file, not their Slack history.

What we still get wrong

Specialists struggle on jobs that span their boundaries. Picking imagery to match a tone the design has not landed yet is hard. Writing copy that hits a vibe nobody has described is hard. The fix is not a bigger agent. The fix is a tighter handoff document that names the missing piece explicitly so the next specialist can solve it.

We will keep posting what we find. If you are building something like this, write back.