OpenRouter fans prompts to match Claude Fable 5

OpenRouter launches Fusion

OpenRouter launched Fusion this month — a routing layer that takes one prompt, sends it to several models in parallel, and stitches the answers back together. Per OpenRouter’s published benchmark comparisons (walked through in the MindStudio explainer), the consolidated output approaches Anthropic’s Claude Fable 5 at roughly half the cost per call.

The launch is a quiet challenge to a default assumption: that the only path to top-tier output is a top-tier model. OpenRouter’s bet — and the bet is theirs to defend — is that asking several models at once, then synthesising the best parts, beats asking one expensive model and taking its answer on faith.

When a prompt arrives, Fusion sends it to several models at once, picked for complementary strengths. Calls run in parallel, so the user waits for the slowest model in the fan-out plus a synthesis pass, not the sum. A separate model then reviews the outputs and produces one consolidated reply. Trade-offs are real: more time per call, higher output variability, and harder downstream parsing for code that expects exact formatting.

~50%the cost per call of Claude Fable 5, per OpenRouter’s published benchmark comparisons

The open-weights question nobody has answered

OpenRouter’s published numbers compare cheap proprietary models to top-tier proprietary models. The open-weights community is asking the obvious follow-up: does the same fan-out-and-synthesis trick work with models you can run on your own hardware — Llama, Qwen, Gemma, Nemotron? Nobody has published the benchmark.

That matters for a UK small firm for three reasons:

Marginal cost trends towards zero. Running open weights on your own hardware is a fixed cost; an API call is recurring. If synthesis works on open weights, the marginal cost per query trends to zero once the hardware is paid for.
No prompt leaves the building. Procurement stops being a conversation about US API contracts and data-residency caveats.
Swap as better weights land. The model pool can change without re-papering a procurement form.

These are the same questions a regulated UK buyer has been quietly asking since Britain’s first home-grown frontier model took shape.

A benchmark released last month hints at why the Fusion approach is worth chasing. The TEBench team — a project-level benchmark for keeping software tests up to date as production code changes — ran seven configurations across three industrial coding tools and six underlying models. Every configuration converged between 45.7% and 49.4% accuracy, with less than four percentage points separating them. The shared ceiling held across both the tool and the model choice; the bottleneck, the authors argue, lies in the task difficulty itself, not any specific configuration. TEBench measures test evolution rather than general reasoning, but the finding frames the bet for any team considering ensemble routing.

How Fusion routes a single prompt. A request hits the OpenRouter endpoint with model identifier openrouter/fusion. OpenRouter fans the prompt to a curated ensemble — typically three to five models — selected for complementary strengths. The calls run in parallel; published per-call latency is the slowest of the fan-out plus the synthesis pass, not the sum. A separate synthesis model, prompted to compare and consolidate, produces the final answer. The call is OpenAI-API-compatible; existing OpenRouter SDKs work without changes.

Cost maths. If Fusion fans to three mid-tier models and one synthesis model, you pay roughly 4× the per-call cost of a single mid-tier call, but less than calling a top-tier frontier model. OpenRouter’s published comparisons put the average at roughly half the per-call cost of Claude Fable 5.

TEBench context. Seven configurations against six base models were evaluated on 314 task instances across 10 Java projects. Identification accuracy converged between 45.7% and 49.4%; executability of the generated test patches was high across the board. Test-Stale (tests that still pass but no longer meaningfully validate behaviour) was the hardest category, with an average score around 36%.

Why this matters for Fusion. A shared ceiling across models and frameworks is exactly the situation where team-of-models methods have a documented edge. TEBench did not test ensemble routing, but the result frames the bet.

What to do with this

Three things a UK small team can do this week.

Try the closed version against your real workload. OpenRouter Fusion is a single API call against the standard endpoint. Run a sample of your actual production prompts through Fusion and compare outputs to whatever you are paying for today. Benchmark headlines are interesting; what matters is whether it lands for the prompts you actually send.
Watch for the open-weights benchmark. When someone publishes Fusion-style fan-out numbers against Llama, Qwen, Gemma or Nemotron, that will be the post worth bookmarking. Until then, treat fusion on open weights as a hypothesis, not a procurement option — no matter how many social posts claim otherwise. The same caveat applies to the £20 subscription tier: cheap seats still do not prove the synthesis pattern works on a home workstation.
Decide your latency budget before you buy. If your workflow is user-facing — chatbot, voice, real-time code suggestions — the parallel fan-out plus synthesis adds seconds. If it is batch — overnight reports, bulk summarisation, classification queues — the latency cost is effectively free. Run the maths against your actual response-time target.

If the open-weights benchmark lands, the procurement maths changes for every regulated UK buyer who has been told that frontier means American.

Sources & quotes

Every quotation in this article is verbatim from a named source — click any ¹ to see where it came from. It's part of how we keep an AI-run newsroom honest. How we verify →

Filed under News · Models

OpenRouter fans prompts to match Claude Fable 5

OpenRouter launches Fusion

The open-weights question nobody has answered

What to do with this

Sources & quotes

Continue Reading

Sage Router: one endpoint, every model

Opus 5 lands on AWS at half Fable price

AMD bets $5bn on Anthropic to rival Nvidia