Anthropic’s Triple Crisis: Code Leak, Government Confrontation, and Weaponization

2026.06.16

This company, which touts “safety” as its selling point, is turning the safety narrative into a commercial moat, and what it is truly after is the user data held by Microsoft and others.

2026.06.16 - 05:39:44

Author: Ben Thompson

Translated and edited by TechFlow

TechFlow Intro: Anthropic’s new model Fable was abruptly halted by the U.S. government shortly after its release. Ostensibly over “security leaks,” the incident in fact exposes a dual conflict: between AI labs and the government, and between AI labs and the software industry. This company, which markets itself on “safety,” is turning safety narratives into a commercial moat, and what it truly seeks is user data currently held by Microsoft and others.

I understand the skeptics’ position: they consistently view Anthropic’s public statements—especially those accompanying model releases—as fear-mongering designed for marketing. Two months ago, Anthropic announced Mythos Preview, claiming the model was too dangerous to release publicly, particularly due to its powerful cybersecurity capabilities. Then, two months later, the company publicly launched Fable—the Mythos model with various safety guardrails added.

Based on my limited hands-on experience, Fable is indeed an exceptional model. Objective evaluation has become difficult now that programming performance dominates benchmarking—but subjective impressions remain. I found interacting with Fable extraordinarily compelling; it made other models—including GPT-5.5 and Opus-4.8—feel small and unintelligent. I’ve only felt this way twice before: once with GPT-4, and once with Grok-4—both representing a new generation of foundational models in scale and complexity. I believe Fable stems from a new pretraining effort—the first of its generation.

Thus, I fully accept that Fable/Mythos may indeed be stronger at identifying and exploiting security vulnerabilities, and Anthropic’s cautious rollout makes sense. But the problem with publicly releasing a model is that guardrails can be jailbroken—and evidently, that happened shortly after launch.

Anthropic Faces Off Against the U.S. Government—Again

What followed remains somewhat unclear. In its blog post, Anthropic wrote:

The U.S. government invoked national security authorities to issue an export control order suspending all foreign nationals’ access to Fable 5 and Mythos 5—regardless of whether they are located inside or outside the United States, including Anthropic’s non-U.S. employees. The practical effect of this order is that we were forced to immediately disable Fable 5 and Mythos 5 for all customers to ensure compliance. Access to all other Anthropic models remains unaffected.

We received the government’s directive at 5:21 p.m. Eastern Time today. The letter did not specify the precise nature of the national security concerns. We understand the government believes methods have been discovered to bypass or “jailbreak” Fable 5. We reviewed demonstrations using this specific technique to identify a small number of known, minor vulnerabilities. These vulnerabilities appear relatively straightforward, and we found that other publicly available models could detect them without requiring jailbreaking.

Anthropic then argued that non-universal jailbreaks are inevitable but narrow in scope, and that there is no evidence of a universal jailbreak. Notably, the jailbreaks in question were reportedly flagged by Amazon, a detail worth highlighting given that Amazon is both an Anthropic investor and the primary provider of Anthropic’s inference services. As I write this, Anthropic executives are in Washington, D.C., attempting to resolve what the company insists is a misunderstanding, while White House officials suggest it reflects leadership’s indifference to legitimate national security concerns.

Given the abundance of contested facts, I have little to add about the current dispute—but I’m unsurprised it’s happening. As I explained in “Anthropic and Alignment,” conflict between Anthropic and the U.S. government was inevitable. In that context, those who argue Mythos isn’t yet powerful enough to warrant such aggressive government action miss the point: if it isn’t powerful enough yet, the next iteration—or the one after—will be, especially as models grow increasingly capable of building their successors.

Yet this raises another question—one that appears to validate the skeptics’ view: If Mythos is so dangerous, why release Fable in the first place? Why defy the government and do precisely what you claim you want to prevent? In fact, I find Anthropic’s behavior entirely understandable; what makes the company distinctive is how it justifies these actions—justifications that simultaneously fuel skepticism and lend Anthropic its mystique.

Economic Necessity

In AI’s early years, the greatest economic value flowed to compute providers—for obvious reasons: supply failed to meet surging demand, driving prices upward. The biggest beneficiaries were NVIDIA, TSMC, and memory manufacturers (SK Hynix, Samsung, Micron). Meanwhile, Anthropic and OpenAI collectively lost tens of billions of dollars building frontier models—only for those models to be distilled and commoditized by open-source alternatives, largely originating in China.

This represents the labs’ pessimistic scenario—they may never recoup costs because their differentiation is fleeting, and free alternatives quickly become “good enough.” I consider this reasonable. In a world where models are interchangeable, models become commodities—and most value flows elsewhere. Today, it flows to compute; but over time, as compute becomes abundant, the most valuable position in the value chain will revert to what has always been most valuable: owning user touchpoints.

Thus, frontier labs face an economic imperative to move closer to users—a reality long clear to me. Owning user touchpoints creates meaningful lock-in, and the best way to achieve that is to become the canvas for everything users need to do. That, in turn, means frontier labs are inevitably headed toward conflict with software companies: software owns user touchpoints, and the labs’ long-term interest isn’t merely becoming a commodity input to software—it’s directly replacing software.

Meanwhile, software companies are working in the opposite direction. Satya Nadella articulated his vision for how companies should build atop models in a post on X:

Every company must build what I call human capital and token capital. Human capital includes its employees’ knowledge, judgment, relationships, originality, and pattern recognition; token capital is the AI capability the company builds and owns. Crucially, as token capital grows, human capital doesn’t become less valuable—it becomes more valuable! I believe human agency will drive token capital growth. Humans will set ambitious goals, connect dots across domains, build relationships, and identify the most important patterns. Without human guidance, your compute spins uselessly.

This means the real opportunity lies not in selecting the best model, but in building learning loops atop models—enabling human and token capital to compound. You can outsource a task—or even a job—but you can never outsource learning. A company’s future depends on enabling learning to compound between humans and AI. This demands a new architectural approach, allowing every enterprise to build intelligent agent systems that improve over time—while retaining full control over its intellectual property. Companies should be able to swap “generic” models without losing the institutional expertise embedded in their learning systems. This is the defining “test” of control and sovereignty in the coming era.

Nadella opened this vision with a warning:

What none of us wants is a world where every company in every industry surrenders value to a few all-consuming models. If all value accrues to just a handful of models, the political economy simply won’t tolerate it. Society won’t grant permission for an AI future that hollows out entire industries.

Recall what happened in the first phase of globalization: entire industrial economies were hollowed out through offshoring. Superficially, GDP numbers looked healthy—but displacement was real, and its consequences persist today. Let’s not replicate this dynamic in the AI era—letting a few AI systems capture all economic returns while entire industries watch their knowledge commoditized right under their noses.

The flaw in this analogy is that globalization *did* happen—and industrial economies *were* hollowed out. This may not be a warning but a prophecy; no wonder Nadella is sounding the alarm, as Microsoft could well be one of the victims. Likewise, the economic imperative for model makers is precisely to realize this outcome.

Data Necessity

These models—even Mythos—aren’t there yet. Beyond more compute, they require more and better data. Model improvements increasingly stem from reinforcement learning; some signals can be synthetically generated, but for frontier labs, the most potent lever is real-world usage.

I believe this is the primary reason both OpenAI and Anthropic offer heavily subsidized subscription plans. SemiAnalysis recently estimated that a $200 plan delivers $8,000 worth of Claude tokens and $14,000 worth of Codex tokens. Of course, both compete for user and developer mindshare—but they also compete for access to actual usage data to improve their models.
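To make the scale of that subsidy concrete, the short sketch below works through the arithmetic on the SemiAnalysis figures quoted above. The function names are illustrative, and I am assuming the token values are measured at list API prices; the estimate as quoted here does not state its basis.

```python
# Figures quoted above from the SemiAnalysis estimate.
# Assumption: token values are priced at list API rates.
PLAN_PRICE_USD = 200
TOKEN_VALUE_USD = {"Claude": 8_000, "Codex": 14_000}


def subsidy_multiple(token_value: float, price: float) -> float:
    """Dollars of tokens delivered per subscription dollar paid."""
    return token_value / price


def implied_subsidy(token_value: float, price: float) -> float:
    """Token value the subscriber receives beyond what they paid."""
    return token_value - price


for product, value in TOKEN_VALUE_USD.items():
    multiple = subsidy_multiple(value, PLAN_PRICE_USD)
    subsidy = implied_subsidy(value, PLAN_PRICE_USD)
    print(f"{product}: {multiple:.0f}x the plan price, ${subsidy:,} implied subsidy")

# Claude: 40x the plan price, $7,800 implied subsidy
# Codex: 70x the plan price, $13,800 implied subsidy
```

On these numbers, each subscription dollar buys roughly 40 to 70 dollars of tokens, which is hard to explain as pricing strategy alone.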

Anthropic doubled down on Fable, announcing it would retain all usage data for 30 days—even for enterprise plans previously guaranteed zero-data retention. The company states it won’t use this data for training, but it hasn’t implemented safeguards to guarantee it won’t do so in the future (e.g., storing data with third parties). If this policy change—when Fable resumes—doesn’t trigger mass customer attrition, I suspect using the data is merely a matter of time: it’s far too valuable to their ultimate goal.

Note also the virtuous cycle of moving up to user touchpoints: the more workflows companies complete directly via Claude or Codex, the more data they generate for feedback into training—making their products stronger and more useful, expanding the range of workflows they can serve, and broadening their data access.

Nadella emphasized the importance of this data in his post—but naturally assumed it should remain independent of the model:

Companies must transform workflows, domain knowledge, and accumulated judgment into AI systems that improve with every use. Private evaluations should measure whether the model truly improves business-critical outcomes—not just external benchmarks! Private reinforcement learning environments should strengthen the model on real organizational trajectories. Its knowledge base makes institutional memory queryable and token usage more efficient.

This loop becomes the company’s new intellectual property. I call it a “hill-climbing machine.” Unlike most assets, it compounds. Every improved workflow generates better training signals, accelerating the accumulation of company-specific tacit knowledge. Early builders of this capability gain an unreplicable advantage—regardless of any future advances in individual model capabilities.

Yet what if companies complying with Anthropic’s data policies already achieve superior results? Or what if incumbent firms resist—leaving an opening for new entrants—or even the model makers themselves—to outcompete them in the market? Anthropic is indeed testing the resolve Nadella called for.

Power Claims

The data retention policy around Fable/Mythos isn’t even the most controversial aspect of the release. Instead, Anthropic stated at launch that Fable’s performance would be quietly degraded if used for LLM development. The system card reads:

We have also introduced safeguards related to cutting-edge LLM development. As discussed in Section 6.1 of our February 2026 Risk Report, we are concerned about risks associated with accelerating the overall pace of AI development—though the severity of these risks remains uncertain. Specifically, our concern—as noted at the time—is “accelerating other AI developers’ ability to build powerful AI systems with risks similar to ours—without corresponding safeguards.”

Given recent models’ demonstrated ability to accelerate their own development, we have implemented new interventions to limit Claude’s effectiveness on requests related to cutting-edge LLM development—for example, building pretraining pipelines, distributed training infrastructure, or ML accelerator design. Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction via technical safeguards helps prevent acceleration by those most inclined to violate those terms.

Unlike our interventions in cybersecurity, biochemistry, and distillation attempts, these safeguards are invisible to users. Fable 5 does not fall back to another model. Instead, effectiveness is constrained via prompt rewriting, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions do not affect the vast majority of programming tasks. We estimate they impact ~0.03% of traffic, concentrated among <0.1% of organizations. When active, we expect these interventions to affect model behavior minimally—beyond limiting effectiveness for cutting-edge LLM development. Claude will still respond helpfully to user requests. We will continue refining detection accuracy following this model’s release.

Anthropic retracted this change—Fable will now route LLM-related requests to Opus-4.8 and disclose the routing to users—but I find the original policy highly revealing. On one hand, I don’t blame Anthropic for refusing to aid competitors; on the other, it’s starkly clear that Anthropic believes *no one else* should build frontier LLMs.

This policy becomes even more striking given that it came just two months after Anthropic’s dispute with the Department of Defense: the latter wanted Claude usable for any lawful purpose, while Anthropic sought stricter controls on surveillance and autonomous weapons. Such degradation measures demonstrate both Anthropic’s technical capacity and its willingness to quietly alter its models to enforce policy preferences. In other words, Anthropic proactively validates critics’ gravest concern: that it poses a supply-chain risk.

Yet the broader conclusion from that episode is that Anthropic believes it should hold final authority over how Anthropic’s models are used. And since it believes only it should develop frontier AI, it effectively asserts final authority over AI *in general*. When combined with the company’s claims that AI could perform *all* economic activity, it becomes clear Anthropic’s leadership ultimately seeks power over everything—and everyone.

The Safety Narrative

Of course, Anthropic would never state this so bluntly. Instead, the story is framed around safety:

I expect Anthropic will increasingly expose its model capabilities to end users via endpoints increasingly tailored to specific workflows—even as it begins restricting API access. This substitution of software and restriction of access will be justified in the name of safety—even as Anthropic fulfills its economic imperative to move closer to end users.

Anthropic’s justification for the major revision to its data retention policy is safety. Specifically, the company claims that retaining all user data for 30 days is necessary to prevent the jailbreaks the U.S. government fears. I can easily envision a future where safety imperatives compel the company to train on this data too, the better to defend against malicious use.

Anthropic’s origin story is rooted in its founders’ belief that OpenAI didn’t take safety seriously enough. The company believes only it can control AI—and because it uniquely cares about safety, it feels justified in attempting to control everyone else, including the U.S. government.

The problem with these safety rationales is that, I believe, they work precisely *because* they are not mere rationalizations for Anthropic. The company genuinely believes it alone takes superintelligence seriously, and thus that it alone gives danger sufficient priority. This justifies decision after decision, policy after policy, confrontation after confrontation, each appearing to outsiders as a strange blend of cynicism and naivety.

The contrast with OpenAI is stark. One way to understand how and why OpenAI lost its lead is that, in the years following ChatGPT’s launch, the company was at war with itself: a former research lab was suddenly burdened with being, unexpectedly, a consumer tech company. In resolving this tension, OpenAI hemorrhaged talent to firms like Anthropic.

By contrast, Anthropic enjoys perfect alignment among talent, mission, and business. It can sell researchers the vision of creating machine gods—wrapped in the aura of people who care deeply about danger and possess the intelligence to safeguard humanity against it. And every resulting policy change happens to benefit the business—a coincidence of breathtaking elegance.

I both respect and fear this alignment. I respect it because it’s demonstrably effective; the closest analogue may be Apple, which always wraps self-interested actions in the guise of doing what’s right for users, and often actually does what’s right for them. Anthropic does the same. Yet I fear it: letting people who are certain they know best build a smartphone you can accept or reject is one thing; letting them build a superintelligence potentially rivaling or surpassing the power of large corporations, or even nation-states, is profoundly more alarming. History shows that brilliant people convinced they know what humanity needs have left a trail of wreckage, precisely because they convinced themselves their intentions were good and thereby rationalized actions that weren’t.
