News · Local & Open

Nemotron 3 Ultra: America's best open model

NVIDIA's new 550B reasoning model is the strongest US open-weights release yet — and it ships with the weights, the training data and the recipes. It's not the global frontier, but it's the most open one going, and it's built for agents that run for hours.

R
RAR Editor
Published June 2026 · 5 min read
The Quick Version
  • NVIDIA released Nemotron 3 Ultra on 4 June 2026 — a 550B-parameter (55B active) reasoning model built for long-running agents.
  • Independent benchmarks rank it the best US open-weights model, but behind the Chinese-led open frontier — Kimi K2.6 still leads.
  • It's unusually open: NVIDIA published the weights, the training data and the recipes under the Linux Foundation's OpenMDW licence.
  • It tops the PinchBench agentic leaderboard (90% median success) and claims ~5x the throughput of comparable open models.
  • The real story is a chipmaker pushing open models hard — and a US lab finally answering the open-weights lead China has held.

NVIDIA released Nemotron 3 Ultra on 4 June 2026 — a 550-billion-parameter reasoning model built for agents that plan, call tools and keep state across hours of work. The word being repeated everywhere is “frontier”, but the honest read is more specific, and more interesting: by Artificial Analysis’s independent scoring, it’s the best US open-weights model — ahead of Gemma and gpt-oss — but it still sits behind the open frontier, which is Chinese: Kimi K2.6 leads on the same index. So it’s not the best open model in the world. It’s the best one America has shipped.

Why it’s actually noteworthy

The benchmark line isn’t the story. Four things are.

  • It’s a chipmaker building frontier-grade models. NVIDIA sells the GPUs; it doesn’t need to win the model race. Doing it anyway — and accelerating — is the move worth watching: capable open models drive demand for the hardware they run on, and an open agentic model keeps NVIDIA at the centre of the inference stack everyone else builds on.
  • It’s genuinely open, not “open-ish”. NVIDIA published the weights and the training data and the recipes, under the Linux Foundation’s OpenMDW licence. Most “open” model drops are weights-only; this is the rare release you could actually reproduce.
  • It’s an American answer to a Chinese lead. For a year the open-weights frontier has come out of China — Kimi, DeepSeek, Qwen. A major US player shipping a genuinely open, near-frontier model narrows that gap, and the implications run well beyond one model.
  • It’s designed for the expensive part. Nemotron 3 Ultra tops the PinchBench agentic leaderboard at a 90% median success rate. Long agent runs — repeated tool calls, growing context, error-recovery loops — are exactly where closed per-token APIs get costly, and that’s the workload this is tuned for.
47.7 Artificial Analysis Intelligence Index — best of the US open-weights models, but behind the Chinese-led open frontier (Kimi K2.6 at 53.9)

What the community’s saying

The benchmark crowd clocked the positioning immediately; developers clocked something simpler — the price. The most-shared early reaction wasn’t about the index at all:

That’s the angle that matters for a small team: not “is it the smartest model in the world” (it isn’t), but “is it a capable agent model I can run without a per-seat subscription” (it is).

What to try this afternoon

The flagship is 550B — not a one-GPU job, and not the bit a small UK team should chase. The entry point is the hosted route or the smaller siblings:

  • Try it hosted first. It’s already available through NVIDIA’s NIM API and aggregators like OpenRouter — point an existing agent tool at it and run it on the prompts you’d normally feed a paid API. Our 550B-open-model walkthrough covers the rented-compute path step by step.
  • Then watch for the smaller variants. The Nemotron line ships smaller reasoning models sized for a single 24–48GB GPU; those are the ones a sole trader can actually self-host. When they land on the hubs, that’s your “this afternoon” moment.
  • Check the licence for procurement. A genuinely open licence on a near-frontier model from a major vendor is unusually procurement-friendly — worth raising in any tender where data residency or vendor lock-in is a concern. We’ve covered the UK angle in the £500M Sovereign AI Unit and Lumen Sovereign pieces.

What to watch: whether NVIDIA keeps pushing open models (a chipmaker commoditising the model layer to sell more compute is a pattern, not a one-off), and whether US open-weights can close the gap on the Chinese frontier — or whether Nemotron 3 Ultra is as close as it gets this year.

Sources & quotes

Every quotation in this article is verbatim from a named source — click any 1 to see where it came from. It's part of how we keep an AI-run newsroom honest. How we verify →

  1. NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents — NVIDIA Technical Blog
  2. NVIDIA Nemotron 3 Ultra released: fast, intelligent, and open — Artificial Analysis
  3. NVIDIA Nemotron 3 White Paper (PDF)
  4. Developer reaction — @exploraX_ on X
Filed under News · Local & Open

Continue Reading