
Why the AI Service Subscription Model Is Doomed to Fade Away
The subscription model will be drained—use it while it lasts.
Author | Zhang Yongyi

On June 9, Anthropic launched Claude Fable 5, the strongest publicly available model it has released to date. Ordinarily, this would have been a holiday for paying users: the money you pay each month finally buys you first access to the flagship model.
But one line in the announcement triggered massive controversy as soon as it was released: “Starting June 23, Fable 5 will be removed from all subscription plans; continued usage will require purchasing usage credits separately.”
In other words, even if you subscribe, you’ll only get 14 days of access to the flagship model.
A model launching with its own built-in “eviction notice” is unprecedented in the large language model industry.
Many viewed this as an error—or arrogance—on Anthropic’s part. I hold the opposite view: This isn’t a mistake—it’s a preview.
AI subscription models are heading toward an inevitable demise—not because any company is greedy, but because the very premise on which subscriptions rely is being dismantled by AI itself.
01 A Flagship Model With a 14-Day Countdown
Let’s lay out the facts first. According to Anthropic’s official schedule (as of June 9, 2026), Fable 5 is included free of charge in Pro, Max, Team, and per-seat enterprise plans from launch day until June 22. Starting June 23, it will be removed from these plans entirely—and every subsequent token will be deducted from prepaid usage credits at the same rate as the API.
That rate is not cheap: $10 per million input tokens, $50 per million output tokens—exactly double the pricing of the previous flagship, Opus 4.8. More subtly, even during the free window, Fable 5 consumes subscription quotas at roughly twice the weight—meaning the same task burns through quota twice as fast as Opus.
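To make the rate concrete, here is a minimal sketch of the per-token arithmetic. The two Fable 5 rates come from the figures above, and the Opus 4.8 rates are inferred from the "exactly double" statement. The token counts for the sample session are purely illustrative assumptions, not measurements.

```python
# Fable 5 rates quoted above, in USD per million tokens.
INPUT_RATE_PER_M = 10.0
OUTPUT_RATE_PER_M = 50.0


def session_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = INPUT_RATE_PER_M,
                 output_rate: float = OUTPUT_RATE_PER_M) -> float:
    """Return the USD cost of one session at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical agent session (assumed numbers): 6M input, 0.8M output tokens.
fable_cost = session_cost(6_000_000, 800_000)

# Same session at half the rates, i.e. the Opus 4.8 pricing implied above.
opus_cost = session_cost(6_000_000, 800_000, input_rate=5.0, output_rate=25.0)

print(f"Fable 5: ${fable_cost:.2f}, Opus 4.8: ${opus_cost:.2f}")
```

Under these assumed token counts, input tokens dominate by volume, yet output tokens still account for 40% of the bill because they are priced five times higher.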
User reactions were predictable. On Hacker News, some called this “give-then-take” approach unsettling, suspecting Anthropic was trying to push subscribers toward pay-per-use billing. Developers tested it and found that a single agent programming session under the $100/month Max plan consumed nearly $100 worth of tokens.

Users vented online about insufficient token allowances|Source: Twitter
And this isn’t just Anthropic. Over the past eight weeks, the entire industry has moved in unison: OpenAI shifted Codex from per-message to per-token pricing aligned with its API on April 2, then rolled that change out to all existing enterprise customers.
GitHub froze new sign-ups for Copilot’s individual plan on April 20, announced a full transition to AI Credits billing one week later, and completed the switch on June 1—where the $10/month Pro tier includes exactly $10 in credits.
Anthropic’s moves were the most aggressive. Starting April 4, third-party agent frameworks like OpenClaw were barred from consuming subscription quotas; such usage now requires pay-per-use billing. On April 21, the “Claude Code” row on the Pro plan pricing page quietly turned red, sparking community outrage before the change was retracted within 24 hours, with Anthropic describing it as “a small-scale test targeting ~2% of new sign-ups.” On May 14, it officially announced that, effective June 15, the Agent SDK and headless API calls would be removed from subscription plans and billed instead via separate credits priced identically to the API.
Three companies. Eight weeks. One direction—not coincidence, but the entire industry delivering the same answer to the same math problem.
So what does that math problem look like?
02 Pricing Was Never About Compute
Research firm SemiAnalysis recently brought that math problem into plain sight. It purchased every subscription tier offered by Anthropic and OpenAI, ran long-duration programming tasks until weekly quotas were exhausted, and then converted those usages into dollar values using API list prices.
The prevailing industry assumption had been that a $200/month plan could generate at most around $2,000 worth of tokens. Real-world testing vastly exceeded expectations: the $20 Claude Pro plan yields up to ~$400 worth of tokens; the $200 Max 20x plan reaches ~$8,000. OpenAI’s results were even more extreme—the $20 ChatGPT Plus plan delivers ~$700, while the $200 Pro 20x plan hits ~$14,000.

The highest subsidy multiplier is 70x|Source: SemiAnalysis
Two caveats are worth stating upfront. First, these are upper-bound “quota-exhaustion” values, not typical daily usage for average users. Second, API list prices include gross margin, so these conversions don’t reflect actual compute costs. Yet pricing must cover the upper bound, just as insurers can’t assume no claims will occur.

SemiAnalysis’s real-world comparison of usage limits across subscription tiers|Source: X @kimmonismus / SemiAnalysis
The subsidies themselves aren’t fatal. Streaming services ran on subsidies, and so did ride-hailing apps; burning cash for growth is the internet industry’s ancestral craft. What’s truly fatal is a fundamental difference between AI subscriptions and those predecessors.
Netflix dared to sell monthly subscriptions because of two things: near-zero marginal cost for adding another title, and the fact that one person has only 24 hours per day to watch. Spotify worked similarly. The implicit premise of flat-rate subscriptions has always been that consumption volume is capped by human physiology: a flat rate never really prices content, it prices human time.
AI chatbots barely satisfied that premise. Even the most talkative user types only so much per day; idle quota from light users comfortably covered overuse by heavy users.
Then agents arrived.
What does a single agent task look like? It reads 20 files, plans, modifies code, runs tests, parses errors, and iterates; one such cycle consumes 5–30× more tokens than an ordinary conversation. Worse still, it doesn’t require your presence. I experienced this firsthand: recently, I asked an agent to compile flight data from two airports. I went to take a shower, and when I returned, the task was done and my quota was depleted. While you sleep, the meter spins.
Agents don’t remove price caps—they remove consumption caps. And the entire trajectory of AI evolution—longer tasks, greater autonomy, parallel instances—is racing toward one destination:
Removing humans entirely from the consumption loop.
GitHub stated this plainly in its announcement: agent usage “is becoming the default.” That means the remaining niche where subscriptions barely hold up—that is, people sitting at screens typing line-by-line—will occupy an ever-shrinking share of AI’s value map.
At this point, someone might ask: If subsidies run too deep, why not just raise prices?
They tried—and got a worse outcome. Revisit SemiAnalysis’s chart: one anomalous detail stands out—the more expensive the tier, the higher the subsidy multiplier.
For Claude, the $20 tier offers a 20× multiplier, while the $200 tier jumps to 40×; for OpenAI, it climbs from 35× to 70×. Half of this stems from pricing design: higher tiers scale quotas multiplicatively, which amounts to a volume discount for the biggest customers. The other half reflects user behavior: people who spend $200 on a 20x plan do so precisely to max it out, and light users simply don’t appear in that tier.
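The multipliers follow directly from the SemiAnalysis figures quoted earlier: divide the upper-bound token value by the monthly price. A short sketch of that arithmetic, using only the numbers reported in this article (the dictionary layout and function name are illustrative choices):

```python
# (monthly price in USD, upper-bound token value in USD at API list prices),
# as reported in the SemiAnalysis test described in this article.
PLANS = {
    "Claude Pro": (20, 400),
    "Claude Max 20x": (200, 8_000),
    "ChatGPT Plus": (20, 700),
    "ChatGPT Pro 20x": (200, 14_000),
}


def subsidy_multiplier(price: float, token_value: float) -> float:
    """Token value delivered per dollar of subscription fee."""
    return token_value / price


multipliers = {
    name: subsidy_multiplier(price, value)
    for name, (price, value) in PLANS.items()
}

for name, m in multipliers.items():
    print(f"{name}: {m:.0f}x")
```

Within each vendor, the price rises 10× between tiers while the token value rises 20×, which is why the multiplier doubles.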
In insurance, this phenomenon has a name: adverse selection. When a policy’s pricing attracts only the highest-risk customers, it has no actuarial path forward. Any fixed price will perfectly filter for users whose usage exceeds that price—not an operational issue, but a structural one; price adjustments merely refine the sieve.
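A toy model can make the sieve visible. The sketch below assumes a made-up population of nine users, each with a monthly usage value at API list prices, and assumes a user subscribes only when that value is at least the flat price. Both the population and the decision rule are simplifying assumptions for illustration, and list-price value is not the same as compute cost.

```python
# Hypothetical population: one number per user, monthly usage valued in USD
# at API list prices. These figures are illustrative, not measured.
USAGE_VALUES = [5, 10, 20, 40, 80, 150, 300, 600, 1200]


def subscribers_at(price, usage_values):
    """Users for whom a flat plan at `price` beats paying per use."""
    return [u for u in usage_values if u >= price]


def average_usage(price, usage_values):
    """Mean usage value among the users who would subscribe at `price`."""
    subs = subscribers_at(price, usage_values)
    return sum(subs) / len(subs) if subs else 0.0


for price in (20, 200, 1000):
    subs = subscribers_at(price, USAGE_VALUES)
    avg = average_usage(price, USAGE_VALUES)
    print(f"price ${price}: {len(subs)} subscribers, average usage ${avg:.2f}")
```

In this toy population, each price increase shrinks the subscriber base, but the average usage of those who remain stays above the new price, which is the structural problem the paragraph describes.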
Throughout 2025, the industry tested every possible patch. In January, Sam Altman admitted on X that ChatGPT Pro’s $200/month plan was losing money because usage was far higher than expected. Raising the price had failed.

OpenAI tried—and failed|Source: X
Mid-year, Cursor switched from per-request to compute-based billing, triggering mass cancellations and forcing its CEO to apologize publicly. Changing the rules midstream had failed. That summer, Anthropic imposed weekly limits on Claude Code, citing users who ran agents continuously and consumed compute worth tens of thousands of dollars per person. Throttling only sparked fury.
Once every patch had proved ineffective, this year’s eight-week collective reckoning followed. Nick Turley, head of ChatGPT at OpenAI, put it bluntly on the BG2 podcast: “In today’s world, offering unlimited plans may be akin to offering unlimited electricity plans.”
03 The Shell Remains—The Core Is Dead
Of course, a seemingly strong counterargument exists: Subscription models are clearly thriving. ChatGPT Plus remains $20/month, Claude Pro continues selling, GitHub’s code completion even retains flat-rate billing. Is “demise” just alarmist rhetoric?
This counterargument deserves serious attention—because the phenomenon it describes is real. But it misidentifies what’s dying.
The soul of subscription models was never the “charge once per month” formality—but the promise of “fixed price, worry-free usage.” You never needed to calculate the cost of each interaction—that was the entire reason subscriptions defeated pay-per-use decades ago.
What’s happening now is: The billing cadence remains—while the promise has been stripped away.
GitHub Pro’s $10 monthly fee delivers $10 in credits, and usage stops when they’re gone. This isn’t a subscription; it’s a prepaid card wearing subscription clothing. Anthropic deducts credits at API rates; OpenAI supports auto-recharge. Subscriptions won’t be canceled. They’ll be hollowed out: the shell remains, the core is dead.

GitHub Copilot’s official announcement transitioning to AI Credits billing|Source: GitHub
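The "prepaid card" behavior is easy to sketch. The class below is a hypothetical illustration of a credit balance that refuses work once the balance is exhausted; it is not GitHub's, Anthropic's, or OpenAI's actual billing logic, and the request costs are assumed values. Amounts are kept in integer cents to avoid floating-point rounding.

```python
class PrepaidCredits:
    """Hypothetical monthly credit balance, tracked in integer cents."""

    def __init__(self, monthly_credit_cents: int):
        self.balance_cents = monthly_credit_cents

    def charge(self, cost_cents: int) -> bool:
        """Deduct the cost if affordable; otherwise deduct nothing and refuse."""
        if cost_cents > self.balance_cents:
            return False
        self.balance_cents -= cost_cents
        return True


# A $10 plan that includes exactly $10.00 in credits.
meter = PrepaidCredits(1_000)

# Three assumed agent runs costing $4.00 each: the third no longer fits.
results = [meter.charge(400) for _ in range(3)]
print(results, meter.balance_cents)
```

The monthly fee still recurs, but nothing in this structure promises worry-free usage: every request is checked against the remaining balance.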
One true enclave remains: pure chat. It still works on flat-rate billing because it’s the last AI use case where consumption volume remains physically bounded by human time. But moats can’t protect enclaves—the industry’s R&D dollars are laser-focused on shifting AI from “you ask, it answers” to “it proactively completes tasks for you.” Chat subscriptions won’t be killed—they’ll be marginalized: staying put while real value and real revenue gradually migrate into pay-per-use territory.
One timing coincidence is hard to ignore: according to TechCrunch (June 2026), Anthropic is preparing for an IPO alongside OpenAI just as Fable 5 launches. For the past three years, subsidies were funded by venture capital; public-market investors won’t accept a P&L where “each additional heavy user deepens losses.” The capital exit timeline ensures this reckoning can’t be postponed indefinitely.
This means different things for different stakeholders. For enterprises, AI spending must now be managed like cloud spend—per The Information, Uber’s CTO wrote internally that the company burned through its entire 2026 AI budget in four months, making budgeting, monitoring, and task-specific model routing mandatory for every team. For individual users, the old dynamic—light users subsidizing heavy ones—has ended. Now, everyone pays their own meter.

Uber’s AI budget overhaul also sparked significant controversy|Source: The Information
Frankly, this isn’t necessarily all bad. With price signals restored, “Is this task worth letting AI run?” becomes a genuine question for the first time—and when an industry starts answering that question seriously, it often marks the shift from burn-rate narratives to sustainable business.
At this point, let me interject: Before meters are installed, today’s subscription model may represent the industry’s most generous moment toward users—use it while it lasts.
The logic lies in SemiAnalysis’s chart. Read it from the user’s perspective, and it’s not a death sentence but a list of benefits still on offer: pay $200 a month, and the platform co-funds up to $14,000 in compute. Subsidies on that scale last appeared during the ride-hailing and food-delivery wars, and we all remember how those ended: once the subsidies were withdrawn, prices never went back.
So run your heavy workloads now. Fable 5’s subscription window closes June 22—rather than waiting for the credit era to arrive and then optimizing meticulously, seize this chance to run those long-running tasks you’ve long wanted to try but deemed too expensive. This isn’t “gaming the system”—it’s simply being a clear-eyed beneficiary amid a pricing anomaly destined for correction.

Turley’s analogy may run deeper than he intended. Electricity truly became infrastructure not when wires reached every home, but when every home got a meter. From that moment onward, nobody debated “Should electricity be flat-rate?”; they debated electricity pricing.
Subscriptions won’t receive obituaries. They’ll simply fade into quiet obscurity on some unremarkable billing date—reduced to a line item labeled “access fee” in your expense report.
Until then—use it while it lasts.