Sakana launches Fugu, taking aim at the top of multi-agent orchestration
Sakana AI, the startup founded in 2023 by former Google researchers, including David Ha, who previously led Google Brain's research team in Tokyo, announced Fugu this Monday, a model dedicated exclusively to orchestrating AI agent squads. Based in Tokyo, the company is known for developing models inspired by evolutionary and self-organization principles. Fugu represents its most direct bet on the multi-agent systems market, which Gartner expects to grow more than 45% per year through 2028.
What Fugu is and why it's different
Fugu is not another generalist LLM. It is a model trained specifically for the coordination task: deciding which specialist agent responds, in what order, with what shared context, and how to consolidate outputs into a coherent response.
Instead of a single giant model trying to do reasoning, coding, search and synthesis simultaneously — which penalizes cost and latency — Fugu acts as the conductor of an ensemble. Each squad agent brings narrow specialization; Fugu maintains the global task state and dispatches subproblems.
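The conductor pattern described here can be illustrated with a minimal Python sketch. Everything in it is hypothetical (`TaskState`, `run_task`, `consolidate`, the stub agents); it is not Fugu's API, and the plan is hard-coded where a real orchestrator model would decide it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout: this is an illustration, not Fugu's API.
Agent = Callable[[str], str]


@dataclass
class TaskState:
    """Global state the orchestrator keeps for a single task."""
    goal: str
    results: Dict[str, str] = field(default_factory=dict)


def run_task(goal: str, plan: List[Tuple[str, str]], agents: Dict[str, Agent]) -> TaskState:
    """Dispatch each (agent_name, subproblem) pair in order and record the output."""
    state = TaskState(goal=goal)
    for agent_name, subproblem in plan:
        state.results[agent_name] = agents[agent_name](subproblem)
    return state


def consolidate(state: TaskState) -> str:
    """Merge the specialists' outputs into one response (here: simple joining)."""
    return " | ".join(f"{name}: {out}" for name, out in state.results.items())


# Stub specialists standing in for real, narrowly scoped models.
agents: Dict[str, Agent] = {
    "search": lambda q: f"found 3 sources for '{q}'",
    "coder": lambda q: f"wrote function for '{q}'",
}

# Assumption: the plan is fixed here; an orchestrator model would produce it.
plan = [("search", "rate limits"), ("coder", "retry helper")]
state = run_task("build a retry helper", plan, agents)
summary = consolidate(state)
```

The point of the sketch is the division of labor: agents see only their subproblem, while the orchestrator alone holds the goal and the accumulated results.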
According to Sakana's technical blog, Fugu was evaluated on complex multi-step reasoning benchmarks (MATH, composite HumanEval, WebArena) and showed a 38% reduction in average inference cost compared with single-model approaches.
The architecture: why it matters for engineers
Fugu's design is based on delegation graphs: directed graphs where each node is an agent and edges carry the minimum context needed for the subtask. This solves a classic multi-agent problem — token explosion when context from one agent pollutes another's, inflating cost and degrading quality.
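To make the delegation-graph idea concrete, here is a small Python sketch under stated assumptions: the edge format (a list of output keys to forward), the `run_graph` helper and the two stub agents are invented for illustration and do not reflect Sakana's actual data structures. Each edge names the only fields the downstream agent may see.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an agent maps a context dict to an output dict.
AgentFn = Callable[[Dict[str, str]], Dict[str, str]]
Edges = Dict[Tuple[str, str], List[str]]  # (src, dst) -> output keys to forward


def run_graph(agents: Dict[str, AgentFn], edges: Edges, task: str):
    """Run agents in dependency order, forwarding only the keys named on each edge."""
    deps = {name: set() for name in agents}
    for src, dst in edges:
        deps[dst].add(src)

    outputs: Dict[str, Dict[str, str]] = {}
    received: Dict[str, Dict[str, str]] = {}  # what each agent was actually given
    for name in TopologicalSorter(deps).static_order():
        context = {"task": task}
        for (src, dst), keys in edges.items():
            if dst == name:
                context.update({k: outputs[src][k] for k in keys})
        received[name] = context
        outputs[name] = agents[name](context)
    return outputs, received


def researcher(ctx: Dict[str, str]) -> Dict[str, str]:
    # Produces a compact fact plus bulky raw material.
    return {"facts": "API allows 100 req/min", "raw_pages": "<10k tokens of HTML>"}


def writer(ctx: Dict[str, str]) -> Dict[str, str]:
    return {"draft": f"Summary: {ctx['facts']}"}


agents = {"researcher": researcher, "writer": writer}
edges = {("researcher", "writer"): ["facts"]}  # raw_pages is deliberately not forwarded
outputs, received = run_graph(agents, edges, "summarize rate limits")
```

In this toy run the writer never sees `raw_pages`, which is the mechanism the paragraph credits with keeping token counts down.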
Another key technical point: Fugu has native support for hierarchical tool-calling. Child agents can have their own tool sets (search, code, database) without the orchestrator needing to know each tool. Fugu only needs to know each agent's input/output contract.
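One way to picture an input/output contract that hides tools is the following Python sketch. The `Request`, `Response`, `AgentContract` and `MathAgent` names are assumptions made for the example, not Fugu's interface; the toy "tools" are plain functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


@dataclass(frozen=True)
class Request:
    instruction: str


@dataclass(frozen=True)
class Response:
    answer: str


class AgentContract(Protocol):
    """The only thing the orchestrator knows about an agent (hypothetical contract)."""

    def run(self, request: Request) -> Response: ...


class MathAgent:
    """Child agent with a private tool set the orchestrator never inspects."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[int, int], int]] = {
            "add": lambda a, b: a + b,
            "mul": lambda a, b: a * b,
        }

    def run(self, request: Request) -> Response:
        # Toy instruction format, assumed for the example: "<tool> <int> <int>".
        op, a, b = request.instruction.split()
        return Response(answer=str(self._tools[op](int(a), int(b))))


def orchestrate(agent: AgentContract, instruction: str) -> str:
    """Relies only on the Request -> Response contract, never on the agent's tools."""
    return agent.run(Request(instruction)).answer


result = orchestrate(MathAgent(), "mul 6 7")
```

Swapping in an agent with entirely different tools requires no change to `orchestrate`, as long as the new agent honors the same contract.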
For teams already using LangGraph, AutoGen or CrewAI, Fugu can sit as a layer above the existing framework, an option worth evaluating once the API becomes widely available.
Why companies should pay attention now
The timing is strategic: frontier models have stopped delivering quality leaps month over month. The performance delta between GPT-4o and Claude 3.7 Sonnet is smaller than the delta between having a well-orchestrated squad and not having one.
In practice this means:
- Cost per result drops when subproblems are delegated to smaller specialized models
- Latency improves with real subtask parallelization
- Reliability increases because each agent has a narrow scope, which leaves less room for hallucination
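The latency point can be shown with a short Python sketch using `asyncio`. The agents are stubs that sleep instead of calling a model, and the job names and durations are made up; the sketch only illustrates why independent subtasks finish sooner when run concurrently.

```python
import asyncio
import time
from typing import List, Tuple

Job = Tuple[str, float]  # (agent name, simulated call duration in seconds)


async def agent(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)  # stands in for a model or tool call
    return f"{name} done"


async def run_sequential(jobs: List[Job]) -> List[str]:
    # Each call waits for the previous one: total time is the sum of durations.
    return [await agent(name, seconds) for name, seconds in jobs]


async def run_parallel(jobs: List[Job]) -> List[str]:
    # Independent subtasks run concurrently: total time is roughly the longest one.
    return list(await asyncio.gather(*(agent(name, seconds) for name, seconds in jobs)))


def timed(coro_fn, jobs: List[Job]):
    start = time.perf_counter()
    results = asyncio.run(coro_fn(jobs))
    return results, time.perf_counter() - start


jobs = [("search", 0.2), ("code", 0.2), ("db", 0.2)]
seq_results, seq_time = timed(run_sequential, jobs)
par_results, par_time = timed(run_parallel, jobs)
```

Both runs return the same results in the same order; only the wall-clock time differs. The gain applies only to subtasks that do not depend on each other's output.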
Sources
- Sakana AI — official Fugu announcement (blog.sakana.ai, Jun. 2026)
- Gartner — *Forecast: AI Agent Platforms, Worldwide, 2024-2028*
- Stanford HAI — *AI Index Report 2025*
- Sequoia Capital — *State of AI 2024*