Big Tech pulls back on AI tokens as $500M bills pile up

Amazon, Meta, Microsoft and Uber have all pulled back on internal AI token-consumption metrics in recent weeks after employees burned through billions of dollars' worth of compute with little measurable business output. The retreat marks the most significant enterprise AI spending correction since the technology went mainstream.

"Please don't use AI just for the sake of using AI," Dave Treadwell, Amazon's senior vice president of engineering, told staff this week. "Use AI to help you solve customer problems, to help you solve business problems, to innovate."

Amazon shut down KiroRank, an internal leaderboard that tracked AI token usage on its Kiro developer platform, on May 29, according to Business Insider. The dashboard had encouraged employees to inflate their scores through "tokenmaxxing" — running meaningless tasks through AI agents to consume tokens and climb rankings. An Amazon spokesperson confirmed the tool had been "deprecated" and said it "was never intended to promote the use of AI for usage's sake."

The retreat is not isolated. Meta killed its own AI usage leaderboard, called Claudenomics, the same week. That dashboard had tracked token consumption across 85,000 employees and singled out the top 250 users, according to Fortune. Uber's chief operating officer, Andrew Macdonald, recently said the company found no clear relationship between increased AI spending and successful product delivery, a conclusion it reached after its engineers exhausted the full-year Claude Code budget by April. Microsoft canceled Claude Code licenses across its Experiences and Devices division earlier this month, redirecting engineers to its own GitHub Copilot CLI.

The cost of treating tokens as a productivity metric

Tokens are the units large language models use to process text and generate responses. Under token-based pricing, costs scale with usage, not outcomes. When companies incentivized consumption without measuring results, they created a system where inflating numbers was rational for individuals but destructive for budgets.
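
As a rough illustration of why costs track usage rather than outcomes, the sketch below prices two workloads under token-based billing. The per-million-token prices and the token counts are hypothetical placeholders chosen for easy arithmetic, not any vendor's actual rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    # Hypothetical prices in US dollars per one million tokens.
    input_per_million: float
    output_per_million: float


def workload_cost(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Return the bill in dollars for a given number of input and output tokens."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Assumed rates, for illustration only.
pricing = TokenPricing(input_per_million=3.0, output_per_million=15.0)

# A workload that ships one feature.
useful = workload_cost(pricing, input_tokens=2_000_000, output_tokens=500_000)

# The same feature, plus nine times as much filler work run to climb a leaderboard.
padded = workload_cost(pricing, input_tokens=20_000_000, output_tokens=5_000_000)

print(f"useful: ${useful:.2f}, padded: ${padded:.2f}")
```

Under these assumed rates the padded workload costs ten times as much while delivering the same single feature, which is the gap a consumption leaderboard cannot see.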

One unnamed company accidentally spent $500 million in a single month on Anthropic's Claude after failing to set usage limits, according to Axios. That one month's bill was equal to roughly 11% of Anthropic's estimated $4.7 billion annualized revenue run rate. The company has not been publicly identified, though speculation on social media has centered on Amazon, which is spending approximately $200 billion on capital expenditures in 2026, primarily for AI and data centers.

Amazon had set a target for more than 80% of its developers to use AI tools weekly, according to Fortune. The company has now replaced raw token counts with a metric called "normalized deployments," which measures AI-assisted code that actually ships rather than tokens consumed.

Duolingo CEO Luis von Ahn recently acknowledged similar internal friction. "We're not being held accountable for actual results," he said on a podcast in April, after the company withdrew a plan to tie AI usage to employee performance reviews.

What this means for AI infrastructure spending

The pullback does not signal a retreat from AI investment. Amazon's $200 billion CapEx commitment remains intact. Google announced at its I/O conference that Gemini usage had jumped from 480 trillion tokens per month in May 2025 to 3.2 quadrillion tokens per month in May 2026, a nearly sevenfold increase driven largely by agentic AI and coding tools that burn through far more compute than basic chatbot queries.

But the shift from consumption-based to outcome-based metrics represents a meaningful change in how enterprise AI value is measured. Companies that sell AI tools on a per-token basis, including Anthropic and, to a lesser extent, OpenAI, face growing pressure to demonstrate return on investment as clients tighten budgets. Platform vendors such as Microsoft, with GitHub Copilot, stand to benefit as companies redirect spending toward integrated platforms.

DeepSeek, the Chinese AI lab, noted in its V4 technical report that its model now outperforms Claude Sonnet 4.5 in its internal testing while costing less, a reminder that the next phase of enterprise AI competition will be defined by efficiency, not raw consumption.

For investors, the message is nuanced. Nvidia, whose GPUs power the vast majority of AI training and inference workloads, has been the primary beneficiary of the token arms race. If enterprise clients begin optimizing for fewer tokens per task rather than more, the demand growth curve for AI compute could flatten. Nvidia shares trade at roughly 35 times forward earnings, a valuation that prices in continued exponential growth in data center revenue.

"Companies are still figuring things out," Will McGough, chief investment officer at Prime Capital Financial, told the Wall Street Journal. The correction from "use more AI" to "use AI that works" is just beginning.

This article is for informational purposes only and does not constitute investment advice.