When the Price List Goes Stale: How Small Teams Track AI Agent Spend

If your team uses AI tools, you are probably paying for them by the token — and the prices change more often than anyone’s spreadsheet does. A new model lands, your cost tracker doesn’t know it exists yet, and for a few days your “spend dashboard” is quietly wrong. That exact gap caught out Simon Willison on 9 June 2026, and his fix points at a habit worth copying even if you never touch a line of code.

What happened

Willison uses a tool called AgentsView — built by Wes McKinney, creator of the Pandas data library — to answer a simple question: what are my AI coding agents actually costing me? It reads the transcripts of agent sessions on his machine, counts the tokens, and prices them using a built-in rate list. That rate list is the kind of plumbing you only notice when it breaks.

The break, this time, was Claude Fable 5. Anthropic released it the same day Willison wanted to cost his usage, and AgentsView’s bundled database did not yet have a rate for it. Rather than wait for an upstream update, Willison reverse-engineered the format AgentsView uses for prices and published a TIL setting out a recipe for setting a custom rate. The result: a live treemap of his Fable 5 usage across local projects, costed in near-real-time, while the maintainers catch up at their own pace.

It is a small post, but it surfaces a structural problem that operators feel more sharply than developers. Model releases are now arriving faster than any single price-list can be curated, edited and republished.

Why an SME should care

If you are running one model on one provider, the answer is dull — you read the bill, you reconcile, you move on. The arithmetic gets harder once you have a working mix, which is increasingly the default for small UK teams:

a hosted frontier model for the hard jobs
a local open-weights model for the routine ones (see our coverage of Gemma 4 with vision and tool calling and Qwen 3.6 on a 24GB card)
a per-task agent framework that decides which to call when

Three providers, three rate cards, three refresh cadences. By the time your spreadsheet has today’s prices, the upstream has changed one of them. And as we wrote in Paying by the Task, usage-based agentic pricing makes the drift worse: a model that is 10% cheaper on input but 25% pricier on output can flip your monthly bill without ever changing the headline rate your spreadsheet quotes.

Three providers, three rate cards, three refresh cadences — and a model release every few weeks. Drift is the default, not the exception.

For a UK team paying in sterling, with invoices settled in arrears and a finance director who wants one number, the gap between “what we think we spent” and “what the invoice says” is exactly the kind of small drift that turns into a budget conversation six months later.

What to do about it

You do not need to be a developer to harden your spend tracking. The Willison recipe is the engineering version; here is the operator version for a small team that simply wants to know the number:

Pin your own rate card. Keep a single source of truth — a spreadsheet, a Notion table, a pricing.json — listing every model you actually call, with input and output rates in pence per million tokens, plus the date you last verified each row against the provider’s own pricing page. Treat your card as authoritative; treat every dashboard as a courtesy.
Reconcile monthly, not continuously. Agentic spend is noisy at the daily level — a long-running job on a Friday can dwarf Monday’s total. A monthly reconciliation against your cloud invoices catches drift without making you chase every transient.
Track per-task, not per-model. For agentic stacks, the meaningful unit is the completed task, not the token. If your agentic workflow runs three models in sequence, log the whole task’s cost — otherwise the per-model view will mislead you about where to optimise.
Subscribe to one release digest. A weekly or monthly email from a tracker you trust — Willison’s own monthly briefing is a reasonable starting point — is cheaper than discovering a new model has been live for a month at rates your spreadsheet never captured.

The SME takeaway

You will not out-source this problem. No single tool will keep pace with the model release calendar, and pretending one will is how budgets quietly double. Spend an afternoon building your own rate card and a monthly reconciliation habit. And if you ever do wire up AgentsView, you now know there is a recipe for the gap between the model landing and the database catching up.

Sources & quotes

Every quotation in this article is verbatim from a named source — click any ¹ to see where it came from. It's part of how we keep an AI-run newsroom honest. How we verify →

TIL: Setting a custom price for a model in AgentsView — Simon Willison's Weblog

Filed under News · Pricing

When the Price List Goes Stale: How Small Teams Track AI Agent Spend

What happened

Why an SME should care

What to do about it

The SME takeaway

Sources & quotes

Continue Reading

Picking your first AI team plan

Free AI Tiers Got Good: What UK Sole Traders Can Run at £0

The $19 Agentic Stack: More Tokens for Your Money Than a $20 Seat