The Hidden Cost of AI: When Tokens Cost More Than Employees

The story the adoption dashboards didn't show

For the past two years, the narrative around enterprise AI has been relentlessly upbeat. Adoption metrics climbing. Productivity gains promised. Leaderboards ranking teams by how much AI they used. But in May 2026, a very different story began to surface from inside some of the world's largest technology companies — and it should give every organisation pause.

According to reporting by Fortune, corroborated by multiple other outlets, Microsoft has begun cancelling most of its direct Claude Code licences inside its Experiences and Devices division, directing engineers to migrate to GitHub Copilot CLI by 30 June 2026. The framing was "consolidation", but the timing, right at the end of Microsoft's fiscal year, told a different story. This was about cost.

The headline finding: Microsoft's own internal data suggests that for some agentic workloads, the cost of running AI agents now exceeds the cost of the human employees they were meant to augment or replace. The ROI maths, in some cases, simply doesn't add up.

Uber burned through its annual AI budget in four months

Microsoft isn't alone. Uber's CTO Praveen Neppalli Naga revealed that the company had exhausted its entire planned 2026 AI coding tools budget in roughly four months — after actively incentivising adoption through internal leaderboards ranking teams by AI usage.

Individual engineers at Uber were reportedly spending between $500 and $2,000 per month on tokens alone. Uber's operations chief Andrew Macdonald noted something even more troubling: token usage didn't seem to correlate directly with useful consumer features. In other words, more spend but not necessarily more value.

4 months: how long Uber's annual 2026 AI budget lasted
$500–2,000: monthly token spend per individual Uber engineer
24×: projected increase in token demand from agentic AI by 2030 (Goldman Sachs)

The paradox: cheaper tokens, bigger bills

Here's what makes this genuinely counterintuitive. The cost of an individual AI token is falling — and is expected to keep falling sharply. Gartner projects that by 2030, inference on a one-trillion-parameter model will cost AI firms nearly 90% less than it did in 2025.

So why are bills going up? Because agentic AI consumes dramatically more tokens per task. Where a standard LLM query might use a few thousand tokens, an agentic workflow — one that reasons, calls tools, retries, and chains multiple steps together — can consume orders of magnitude more. Some reports suggest agentic workflows use up to 1,000× the tokens of a standard query.

The Gartner warning: "Chief Product Officers should not confuse the deflation of commodity tokens with the democratization of frontier reasoning." In plain terms: cheaper tokens won't save you if your AI agents consume orders of magnitude more of them. Increased consumption can easily outpace falling unit costs.
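As a rough check on the paradox, the arithmetic can be sketched in a few lines of Python. This combines the two figures quoted above (a roughly 90% fall in unit cost and up to 1,000× more tokens per task) as if they applied to the same workload, which is a simplifying assumption: the Gartner figure describes what inference costs AI firms, not the price a customer pays. The function names are illustrative only.

```python
def relative_spend(price_drop: float, token_multiplier: float) -> float:
    """New total spend as a multiple of the old spend.

    price_drop: fractional fall in cost per token (0.90 means 90% cheaper).
    token_multiplier: how many times more tokens each task now consumes.
    """
    return (1.0 - price_drop) * token_multiplier


def break_even_multiplier(price_drop: float) -> float:
    """Token growth at which the price cut is exactly cancelled out."""
    return 1.0 / (1.0 - price_drop)


if __name__ == "__main__":
    # Figures quoted in the text, combined here purely for illustration.
    print(f"Spend multiple: {relative_spend(0.90, 1000):.0f}x")
    print(f"Break-even token growth: {break_even_multiplier(0.90):.0f}x")
```

Under those assumptions, a task that is 90% cheaper per token but uses 1,000× the tokens ends up costing about 100× as much, and any growth in token use beyond roughly tenfold erases the saving.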

Figure: The AI Cost Paradox (unit price falls, total spend climbs)

Why this is a governance problem, not just a finance one

It would be easy to file this under "finance" and move on. That would be a mistake. The root cause of these budget overruns isn't expensive tokens — it's the absence of AI cost governance. Organisations rolled out AI tools companywide, incentivised maximum usage, and never put the controls in place to understand or manage what that usage would cost.

This is the same pattern we saw with early cloud adoption. Organisations moved to the cloud, celebrated the flexibility, and then received eye-watering bills because nobody was governing consumption. An entire discipline — FinOps — emerged to solve it. AI is now at exactly the same inflection point.

The three cost layers most organisations don't account for

Part of the problem is that AI costs are not a single line item. There are at least three distinct layers, and most organisations only budget for the first:

Figure: The AI Cost Iceberg (what you budget for vs what you actually pay)

Model inference — the raw cost of each call to the model. This is what most organisations estimate.
Orchestration overhead — the tokens consumed by the reasoning, planning, and coordination layers of agentic workflows. Frequently underestimated.
Tool-call chaining — every time an agent calls a tool, retries a failed step, or loops back on itself, it consumes more tokens. This compounds rapidly and unpredictably.
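To see how the three layers stack up, here is a minimal sketch of a per-task token estimate. The AgentTask fields, the single-retry model, and every number in the example are assumptions chosen for illustration, not measurements from any real agent framework.

```python
from dataclasses import dataclass


@dataclass
class AgentTask:
    base_inference_tokens: int     # layer 1: the model call most budgets assume
    planning_tokens_per_step: int  # layer 2: orchestration overhead per step
    steps: int                     # number of plan/act steps in the workflow
    tool_calls_per_step: int       # layer 3: tool calls made in each step
    tokens_per_tool_call: int      # tokens consumed by one tool call
    retry_rate: float              # fraction of tool calls retried once


def token_breakdown(task: AgentTask) -> dict:
    """Split a task's estimated token use into the three cost layers."""
    inference = task.base_inference_tokens
    orchestration = task.planning_tokens_per_step * task.steps
    calls = task.tool_calls_per_step * task.steps
    # Each retried call is assumed to cost one extra full tool call.
    tool_chain = round(calls * (1 + task.retry_rate) * task.tokens_per_tool_call)
    return {
        "inference": inference,
        "orchestration": orchestration,
        "tool_chain": tool_chain,
        "total": inference + orchestration + tool_chain,
    }


if __name__ == "__main__":
    # Hypothetical ten-step agent run.
    example = AgentTask(
        base_inference_tokens=3_000,
        planning_tokens_per_step=2_000,
        steps=10,
        tool_calls_per_step=2,
        tokens_per_tool_call=4_000,
        retry_rate=0.25,
    )
    for layer, tokens in token_breakdown(example).items():
        print(f"{layer:>13}: {tokens:,}")
```

With these made-up inputs, the single model call that was budgeted for accounts for 3,000 of 123,000 tokens, under 3% of the total. The point is the shape of the calculation rather than the specific values.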

What good AI cost governance looks like

The organisations that will succeed with AI aren't the ones that use the most — they're the ones that govern their usage intelligently. Here's what that looks like in practice:

Control	What it does
Spend visibility	Track AI token spend per team, per application, and per use case — in real time, not at month-end when the bill arrives
Budget alerts & caps	Set hard limits and automated alerts so a runaway agent can't quietly consume thousands of dollars before anyone notices
Model right-sizing	Route simple tasks to cheaper, smaller models and reserve frontier models for tasks that genuinely need them
Value measurement	Tie AI spend to measurable business outcomes — if usage isn't producing value, it should be questioned, not celebrated
Caching	Cache repeated queries and common responses to avoid paying for the same inference twice
Usage policies	Define when agentic workflows are appropriate vs when a simpler, cheaper approach will do
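Several of the controls in the table are straightforward to prototype. The sketch below combines three of them: per-team spend visibility, an alert threshold with a hard cap, and model right-sizing. Everything named here (SpendTracker, BudgetExceeded, pick_model, the model tiers and their prices) is hypothetical and not taken from any vendor's API or price list.

```python
from collections import defaultdict

# Illustrative USD prices per million tokens; not real vendor pricing.
MODELS = {"small": 0.50, "frontier": 15.00}


class BudgetExceeded(RuntimeError):
    """Raised when a request would push a team past its hard cap."""


class SpendTracker:
    def __init__(self, monthly_cap_usd: float, alert_fraction: float = 0.8):
        self.monthly_cap_usd = monthly_cap_usd
        self.alert_fraction = alert_fraction
        self.spend_by_team = defaultdict(float)  # spend visibility per team
        self.alerts = []                         # alert messages raised so far

    def record(self, team: str, tokens: int, usd_per_million_tokens: float) -> float:
        """Record usage, enforce the cap, and return the cost of this call."""
        cost = tokens / 1_000_000 * usd_per_million_tokens
        projected = self.spend_by_team[team] + cost
        if projected > self.monthly_cap_usd:
            # Hard cap: refuse the request instead of overspending.
            raise BudgetExceeded(
                f"{team} would reach ${projected:.2f}; cap is ${self.monthly_cap_usd:.2f}"
            )
        self.spend_by_team[team] = projected
        if projected >= self.alert_fraction * self.monthly_cap_usd:
            self.alerts.append(f"{team} at {projected / self.monthly_cap_usd:.0%} of cap")
        return cost


def pick_model(needs_multistep_reasoning: bool) -> str:
    """Right-sizing: use the cheaper tier unless the task needs the frontier one."""
    return "frontier" if needs_multistep_reasoning else "small"


if __name__ == "__main__":
    tracker = SpendTracker(monthly_cap_usd=100.0)
    tracker.record("payments", 2_000_000, MODELS[pick_model(False)])
    tracker.record("payments", 5_000_000, MODELS[pick_model(True)])
    tracker.record("payments", 1_000_000, MODELS[pick_model(True)])
    print(dict(tracker.spend_by_team))
    print(tracker.alerts)
```

A production version would persist spend, reset it each billing period, and decide what a blocked agent should do next. The sketch only shows that the cap is checked before the spend is committed, so a runaway loop fails fast instead of surfacing on the month-end bill.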

The key shift: Stop measuring AI success by adoption volume. A leaderboard ranking teams by how many tokens they consume is optimising for exactly the wrong thing. Measure AI success by value delivered per dollar spent.

The uncomfortable boardroom question

The lesson emerging from Microsoft and Uber is uncomfortable but important: the more successful your AI rollout looks on an adoption dashboard, the harder it may become to explain the cost.

Every organisation deploying AI at scale should be able to answer three questions:

What is our total AI spend this month, broken down by team and use case?
What measurable business value is that spend producing?
What controls do we have in place to prevent runaway consumption?

If you can't answer all three with confidence, you have an AI cost governance gap — and based on the experiences of Microsoft and Uber, that gap can become very expensive, very quickly.

This connects directly to resilience

AI cost governance and AI business continuity are two sides of the same coin. Both come down to the same underlying discipline: treating AI as critical infrastructure that must be actively managed, monitored, and governed — not a magic productivity tool that can be deployed and forgotten.

The organisations that adopt AI wisely — with cost controls, resilience planning, and clear governance — will capture its benefits sustainably. The organisations that don't will keep discovering, the hard way, that unmanaged AI is a liability dressed up as an advantage.

Is your AI spend under control?

AI Bods helps organisations put the governance, cost controls, and resilience frameworks in place to adopt AI sustainably. If your AI bills are climbing faster than your understanding of them, let's talk.

Download our free guide — AI-First Without the Risk — for the complete framework on adopting AI wisely, including governance, cost management, and business continuity.

Sources & disclaimer: This article references reporting by Fortune ("Microsoft reports are exposing AI's real cost problem", 22 May 2026), The Information, Tom's Hardware, The Next Web, and research from Gartner and Goldman Sachs. All figures are drawn from publicly reported sources. This article is for general informational purposes only and does not constitute professional financial or business advice. AI Bods is an independent consultancy and is not affiliated with, endorsed by, or sponsored by Microsoft, Anthropic, Uber, or any other organisation mentioned.