Token Discipline Is the New Skill IT Engineers Need

Token Usage Costs vs Output

You’re mid-sprint. Claude Code is running through your backlog — refactoring a 400-line module, generating test cases, surfacing edge cases you missed at 11 pm. For the first time in months, you feel like you are actually ahead of the work rather than managing the pile.

Then comes the memo. Licenses being pulled. Migrate to the in-house tool by June 30.

Not because the tool was broken. Not because the output was wrong. Because it worked so well that nobody had modelled what “working” actually costs at scale. The gap between how AI tools behave and how enterprise budgets are built is the central story of mid-2026.

The tool succeeded at its job. However, the pricing model it ran on didn’t survive real adoption. And if you’re an engineer, tech lead, or delivery practitioner at an Indian IT firm using AI coding tools right now, you’re working inside that same unresolved tension.

Uber’s CTO Burned ₹1 Lakh in Two Hours

In April 2026, Uber CTO Praveen Neppalli Naga made a disclosure that became the most-cited data point in the enterprise AI cost conversation: the company had exhausted its entire 2026 AI budget in just four months, primarily through Claude Code usage.

Uber had deployed Claude Code to roughly 5,000 engineers in December 2025. Adoption was fast and wide. By March 2026, 84% of those engineers were classified as agentic coding users; 95% used AI tools monthly; nearly 70% of committed code was AI-assisted.

$500–$2,000
Monthly Claude Code cost per engineer for Uber’s heaviest users
Uber CTO disclosure via The Information, reported by Fortune

The costs weren’t evenly distributed. Most engineers ran between $150 and $250 monthly. Power users (the ones executing complex, multi-step agentic workflows) generated $500 to $2,000 each month. The CTO himself burned $1,200 worth of tokens in a single two-hour session.

The company had been ranking engineers on internal leaderboards based on Claude Code activity, accelerating adoption faster than the finance team had modelled. “I’m back to the drawing board, because the budget I thought I would need is blown away already,” Naga told The Information. The COO followed weeks later, telling Fortune that “the link is not there yet” between Claude Code spend and useful consumer features shipped.

Microsoft’s license cancellation followed within months.

What Changed: Token-Based Pricing Meets Agentic Scale

On May 14, 2026, The Verge’s Tom Warren reported that Microsoft was canceling most internal Claude Code licenses across its Experiences and Devices division, the team responsible for Windows, Microsoft 365, Outlook, Teams, and Surface hardware. The deadline is June 30, the final day of Microsoft’s fiscal year. Engineers were directed to migrate to GitHub Copilot CLI, Microsoft’s own command-line AI coding tool.

The timing isn’t coincidental. Canceling at the fiscal year boundary prevents token costs from rolling into FY27 budgets and gives Microsoft a clean window to rebuild governance frameworks before any future large-scale AI coding rollout. The Claude Code pilot ran for roughly six months: introduced in December 2025, pulled in June 2026.

30×
Cost per AI interaction: agentic workflows in 2026 (~$1.20) vs. linear workflows in 2023 (~$0.04)
Agentic AI Total Cost of Ownership, EY Insights (June 2026)

This is a structural issue, not one specific to Uber or Microsoft. A simple linear AI workflow in 2023 cost roughly $0.04 per interaction. An orchestrated agentic system in 2026 costs approximately $1.20, about 30 times more, because agents execute reasoning loops, call external tools, and iterate across multi-step tasks in ways that conversational AI never did. Meanwhile, per-token prices have dropped 98% since 2022, yet total enterprise AI spend still grew an estimated 483% from 2024 to 2026, driven entirely by volume.

483%
Growth in total enterprise AI spend from 2024 to 2026, even as per-token prices fell 98% since 2022
Industry analyses via The Next Web (June 2026)
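
The arithmetic behind the figures above is simple enough to check by hand. The sketch below is illustrative only: the per-interaction costs and the 98% price drop are the figures quoted above, while `INTERACTIONS_PER_MONTH` is an assumed number chosen purely for the example, not something reported by any source.

```python
# Figures quoted in the article.
LINEAR_COST_2023 = 0.04      # USD per interaction, linear workflow
AGENTIC_COST_2026 = 1.20     # USD per interaction, agentic workflow
PER_TOKEN_PRICE_DROP = 0.98  # fractional price decline since 2022

# Assumption for illustration only: not a reported figure.
INTERACTIONS_PER_MONTH = 200


def cost_multiple(old_cost: float, new_cost: float) -> float:
    """How many times more expensive the new workflow is per interaction."""
    return new_cost / old_cost


def monthly_cost(cost_per_interaction: float, interactions: int) -> float:
    """Monthly spend for one engineer at a fixed interaction count."""
    return cost_per_interaction * interactions


def break_even_volume_multiple(price_drop: float) -> float:
    """Since spend = price x volume, this is the volume growth that keeps spend flat."""
    return 1.0 / (1.0 - price_drop)


multiple = cost_multiple(LINEAR_COST_2023, AGENTIC_COST_2026)
linear_monthly = monthly_cost(LINEAR_COST_2023, INTERACTIONS_PER_MONTH)
agentic_monthly = monthly_cost(AGENTIC_COST_2026, INTERACTIONS_PER_MONTH)
break_even = break_even_volume_multiple(PER_TOKEN_PRICE_DROP)

print(f"Cost multiple per interaction: {multiple:.0f}x")
print(f"Monthly cost, linear workflow:  ${linear_monthly:.2f}")
print(f"Monthly cost, agentic workflow: ${agentic_monthly:.2f}")
print(f"Volume growth needed to keep spend flat: {break_even:.0f}x")
```

At the assumed 200 interactions a month, the same workload moves from about $8 to about $240. And because a 98% price cut leaves each token at one-fiftieth of its old price, volume has to grow more than 50-fold before total spend rises at all, which is why falling prices and rising bills can coexist.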

Anthropic added its own signal on May 13, 2026: paying Claude subscribers now face a separate monthly credit meter for agent tools and third-party harnesses, billed at API rates from June 15. The flat-fee era for agentic AI is ending at the same moment that agentic adoption is accelerating. Taken together, those two facts describe the situation every practitioner is now in.

What This Changes for AI Practitioners in Indian IT

For Engineers and Developers

Infosys and Anthropic announced a formal collaboration at India’s AI Impact Summit in New Delhi in February 2026, integrating Claude Code into Infosys’s Exponential Engineering division to build client-facing AI agents for telecom, BFSI, and manufacturing. Infosys is already deploying Claude Code internally — not as a pilot, but as the operating model for production-grade client AI engagements. Infosys developers are inside this cost dynamic, not watching it from the outside.

Separately, as of June 3, 2026, Microsoft confirmed that TCS, Infosys, and Wipro have rolled out Microsoft 365 Copilot to more than 300,000 employees combined. India’s large IT firms aren’t spectators to this cost crisis; they’re running some of the largest enterprise AI tool deployments in the industry.

For engineers on AI-augmented delivery projects, the immediate implication is that license access is no longer guaranteed just because a tool was approved. If your team can’t demonstrate that AI tool usage produced measurable output (code shipped, test coverage gained, defect rate on AI-assisted modules tracked), a license review is on its way. At Microsoft, it arrived six months after rollout. At Uber, four months.

For Tech Leads and Delivery Managers

When Uber COO Andrew Macdonald said publicly in May 2026 that the company “can’t draw a direct line between Claude Code usage and useful consumer features shipped,” he was describing the exact accountability gap that will show up in delivery review meetings. That gap is no longer a strategic observation. It’s now a line item on the cost side of a P&L, and clients are watching.

Citrini Research’s “The 2028 Global Intelligence Crisis” report in February 2026 documented Fortune 500 procurement managers using AI productivity claims to renegotiate IT services contracts at 30% discounts. That pressure doesn’t stay at the global account level — it travels down into delivery SOWs and project governance structures, and it arrives faster than most tech leads expect.

For Practitioners at Consulting Firms

For practitioners at Deloitte, KPMG, EY, Accenture, and Capgemini, the client-facing dimension is sharpening. Clients are walking into AI governance conversations with real case studies — Microsoft, Uber — and real numbers: $500 to $2,000 per engineer per month. The consultant who can map a specific AI workflow to a specific cost and a specific output is no longer just technically useful. They’re also the ones clients want in the room when governance gets tight.

What Everyone’s Missing in the Enterprise AI Cost Story

Here’s the tension in the data that the dominant narrative glosses over. Research from engineering management platform Jellyfish, cited by The Next Web in June 2026, found that the engineers who used the most tokens were roughly twice as productive as lighter users — but they consumed ten times the tokens to reach that output.

That’s not a waste problem. That’s a measurement problem.

The standard interpretation of the Microsoft cancellation — “AI coding tools are too expensive, slow down” — misreads what the numbers actually show. The highest-cost engineers were the most productive ones. Restricting access wholesale punishes exactly the engineers you most want working at full capacity. And nobody in the mainstream coverage is asking the question that matters most: when you pull AI access from your best engineers, what exactly have you governed?

The right question isn’t “how do we use less AI.” It’s “how do we measure output against the tokens spent to produce it.”
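
To make the measurement question concrete, here is a minimal sketch of the Jellyfish ratio quoted above (roughly twice the output for ten times the tokens). The `UsageProfile` class and its field names are hypothetical, and the numbers are relative multiples rather than measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageProfile:
    """Relative figures, normalised so that a light user is 1.0 on both axes."""
    name: str
    relative_output: float
    relative_tokens: float

    @property
    def output_per_token(self) -> float:
        return self.relative_output / self.relative_tokens

    @property
    def tokens_per_output(self) -> float:
        return self.relative_tokens / self.relative_output


light = UsageProfile("light user", relative_output=1.0, relative_tokens=1.0)
heavy = UsageProfile("heavy user", relative_output=2.0, relative_tokens=10.0)

for profile in (light, heavy):
    print(
        f"{profile.name}: output per token = {profile.output_per_token:.2f}, "
        f"tokens per unit of output = {profile.tokens_per_output:.2f}"
    )
```

On these ratios a heavy user delivers twice the output but spends five times as many tokens per unit of it. Whether that trade is worth making depends on what a unit of output is worth to the business, which is exactly the number most teams have not measured.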

What I keep seeing missed in the Indian IT context is this: the collision isn’t primarily between cost and productivity. It’s between your firm’s official AI narrative and your client’s new skepticism. Infosys has a formal Anthropic partnership, is deploying Claude Code internally, and is selling AI-augmented delivery as a core differentiator. At the same time, enterprise clients just watched Microsoft cancel licenses because engineers used a tool “perhaps a little too much.” When those two realities collide in a delivery governance call (and they will), the AI practitioners who have actual usage data and output metrics will be rated much higher than those who can only say they use AI tools. Theirs will be a very different conversation.

There’s an uncomfortable corollary here. The enterprise instinct — restrict access, standardize on the cheapest tool, pull the licenses — might be the exact response that slows down the AI practitioners who’d been doing the most useful work. It’s not guaranteed to be wrong. But before treating it as the obvious lesson, it’s worth sitting with the question.

My Take: This Is a Forcing Function, Not a Verdict on AI Tools

This isn’t primarily a threat. Looked at closely, the current round of license reviews is a serious attempt to understand how AI actually changes the work, not just how fast it runs. The enterprises struggling right now aren’t struggling because Claude Code doesn’t work. They’re struggling because they deployed at speed without building the measurement infrastructure to know what “working” produces in their specific delivery context.

That gap is the opportunity for AI practitioners who move first.

Here’s my concrete prediction for Indian IT over the next six months: enterprise clients, particularly those in BFSI, telecom, and regulated manufacturing (the sectors where Infosys is explicitly deploying Claude Code under its Anthropic collaboration), will begin requiring AI tool cost visibility as a standard element of delivery governance. SOWs and project charters will carry AI tool usage parameters alongside the usual deliverable definitions. The Citrini Research data shows that procurement pressure is already reshaping commercial conversations at the account level. Governance requirements will follow into the delivery layer.

The practitioners who get there first, the ones who can map a specific prompt workflow to a specific, verifiable output metric, will be the ones retained on AI-augmented engagements when frameworks tighten. The ones who can’t draw that line will be the ones whose access gets reviewed first. Not because they’re bad engineers. Because they never built the case for themselves.
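
One way to picture “mapping a workflow to a verifiable output metric” is a small personal log, aggregated by workflow. This is a sketch under stated assumptions: the `TaskRecord` fields, the workflow labels, and every number in `sample_log` are invented for illustration, and the cost figures would in practice be read off whatever usage dashboard your tool provides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class TaskRecord:
    workflow: str        # hypothetical label, e.g. "test-generation"
    cost_usd: float      # token spend for the task, copied from a usage dashboard
    verified_units: int  # output a delivery lead can check, e.g. merged test cases


def cost_per_verified_unit(records: Iterable[TaskRecord]) -> Dict[str, Optional[float]]:
    """Total spend divided by total verified output, per workflow.

    Returns None for a workflow that produced no verified output yet.
    """
    cost: Dict[str, float] = defaultdict(float)
    units: Dict[str, int] = defaultdict(int)
    for record in records:
        cost[record.workflow] += record.cost_usd
        units[record.workflow] += record.verified_units
    return {
        workflow: (cost[workflow] / units[workflow] if units[workflow] else None)
        for workflow in cost
    }


# Invented sample data, for illustration only.
sample_log = [
    TaskRecord("test-generation", cost_usd=6.0, verified_units=12),
    TaskRecord("test-generation", cost_usd=4.0, verified_units=8),
    TaskRecord("refactor", cost_usd=9.0, verified_units=3),
    TaskRecord("exploration", cost_usd=5.0, verified_units=0),
]

summary = cost_per_verified_unit(sample_log)
for workflow, unit_cost in summary.items():
    label = "no verified output yet" if unit_cost is None else f"${unit_cost:.2f} per unit"
    print(f"{workflow}: {label}")
```

The point of the `None` case is that spend with no verifiable output is not necessarily waste, but it is the spend you will be asked about first.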

What This Means for You: Three Decisions to Make This Week

1. Start tracking your own AI tool usage, before your firm does it for you. Most enterprise AI tools (GitHub Copilot, Claude Code) surface usage dashboards. Know what you’re spending per sprint, per module, per task type. If you don’t have that number, you can’t defend the access you have when governance arrives.
2. Ask yourself: can you draw a direct line from your AI tool usage this sprint to something your delivery lead can verify? Lines of code reviewed. Test cases generated without additional rework. Defect rate on AI-assisted modules compared to hand-written ones. If that line doesn’t exist yet, building it is more career-protecting than any certification running right now.
3

Try This — under 30 minutes

Pick one task you ran through an AI coding tool this week. Rewrite the prompt with two specific additions: first, a boundary on what NOT to generate (“don’t scaffold boilerplate — fix the logic error between lines 45 and 60 only”); second, the exact output format you need. Run it. Compare the output volume and actual usability against the original prompt. You’ve just run your first prompt efficiency test. This skill determines whether your access survives governance tightening.
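
The comparison described above can be scored with a few lines of code. This is a rough sketch, not a real tokenizer: `approx_tokens` counts whitespace-separated words as a stand-in for tokens, `usable_lines` is your own manual judgement of how much output you kept, and the two runs below use placeholder text and invented counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRun:
    label: str
    output_text: str   # what the tool generated
    usable_lines: int  # lines you kept after review (a manual judgement)


def approx_tokens(text: str) -> int:
    """Crude proxy: whitespace-separated words. Real tokenizers count differently."""
    return len(text.split())


def usable_per_100_tokens(run: PromptRun) -> float:
    """Usable lines per 100 approximate output tokens; 0.0 for empty output."""
    tokens = approx_tokens(run.output_text)
    return 100.0 * run.usable_lines / tokens if tokens else 0.0


# Placeholder outputs standing in for real generations; counts are invented.
original = PromptRun("original prompt", "word " * 400, usable_lines=10)
bounded = PromptRun("bounded prompt", "word " * 80, usable_lines=10)

for run in (original, bounded):
    print(
        f"{run.label}: ~{approx_tokens(run.output_text)} tokens, "
        f"{usable_per_100_tokens(run):.1f} usable lines per 100 tokens"
    )
```

If the bounded prompt yields the same usable lines from a fraction of the output, you have a number to bring to a governance conversation instead of an impression.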

The Engineers Who Got Their Licenses Pulled Weren’t the Unproductive Ones

Smart AI practitioners know the token cost of each output. This is no longer just a productivity habit. It’s how you keep access in the first place.

Sources

  1. Microsoft cancels most Claude Code licenses in Experiences & Devices division; engineers directed to GitHub Copilot CLI by June 30, 2026 — Notepad newsletter, The Verge (Tom Warren, May 14, 2026); also confirmed by Windows Central
  2. Uber CTO Praveen Neppalli Naga discloses company exhausted entire 2026 AI budget in four months — The Information (via Fortune; Yahoo Finance)
  3. Per-engineer Claude Code costs: $150–$250 average monthly, $500–$2,000 for heavy users; CTO spent $1,200 in single two-hour session — Forbes report, cited in Storyboard18; Yahoo Finance
  4. Agentic workflow cost approximately $1.20 per interaction vs. $0.04 for linear workflows in 2023 (30x increase) — EY Insights: Agentic AI Total Cost of Ownership (June 1, 2026)
  5. Per-token prices down 98% since 2022; enterprise AI spend up 483% from 2024 to 2026 — The Next Web (June 5, 2026)
  6. Anthropic pricing change: separate credit meter for agent tools and third-party harnesses starting June 15, 2026 — Storyboard18; BeInCrypto
  7. Engineers using most tokens were approximately 2x as productive but consumed 10x the tokens — Nicholas Arcolano, Jellyfish, cited in The Next Web (June 5, 2026)
  8. Infosys–Anthropic strategic collaboration, integrating Claude Code into Exponential Engineering division — Anthropic newsroom; Infosys press release (February 17, 2026)
  9. TCS, Infosys, and Wipro deploy Microsoft 365 Copilot to 300,000+ employees combined — Microsoft announcement, Windows News (June 3, 2026)
  10. Fortune 500 procurement managers renegotiating IT services contracts at 30% discount using AI productivity as leverage — Citrini Research “The 2028 Global Intelligence Crisis,” cited in BusinessToday (February 24, 2026)

