
Meta: Can Afford $100 Billion in Computing Power, but Can't Retain Key Talent
Silicon Valley's AI arms race has never lacked super-buyers waving checks. What it lacks is people who know how to harness that computing power to forge the future.
By Ada, TechFlow
Pang Ruoming hadn’t even settled into his new desk at Meta before he left.
In July 2025, Mark Zuckerberg lured Pang, one of the most sought-after Chinese engineers in AI infrastructure, away from Apple with a multi-year compensation package worth over $200 million. Pang was assigned to Meta's Superintelligence Lab to build the infrastructure for next-generation AI models.
Seven months later, OpenAI poached him.
According to The Information, OpenAI launched a months-long recruitment campaign targeting Pang. Though he reportedly told colleagues he was “very happy working at Meta,” he ultimately chose to leave. As Bloomberg reported, his compensation package at Meta was tied to performance milestones; leaving early meant forfeiting most of his unvested equity.
$200 million couldn’t buy seven months of loyalty.
This isn’t just another case of job-hopping.
One Person’s Departure, A Signal for Many
Pang wasn’t the first to go.
Last week, Mat Velloso—head of the developer platform product team at Meta’s Superintelligence Lab—also announced his departure. He had joined Meta from Google DeepMind in July last year and stayed less than eight months. Earlier still, in November 2025, Turing Award winner and Meta’s Chief AI Scientist Yann LeCun—a 12-year Meta veteran—announced he was leaving to start a company building the “world model” he’d long championed. Russ Salakhutdinov, Geoffrey Hinton’s protégé and Meta’s VP of Generative AI Research, also recently confirmed his exit.
To understand Meta’s AI talent drain, one must first grasp just how damaging Llama 4 truly was.
In April 2025, Meta unveiled its Llama 4 series, Scout and Maverick, with fanfare. Official benchmark results looked stellar: Meta claimed outright dominance over GPT-4.5 and Claude 3.7 Sonnet on core benchmarks like MATH-500 and GPQA Diamond.
Yet this flagship model—designed to embody Meta’s AI ambitions—was quickly exposed in third-party, open-source community blind tests. Its real-world generalization and reasoning capabilities fell dramatically short of its advertised performance. Facing intense community backlash, Chief AI Scientist Yann LeCun eventually admitted the team had “used different model versions to run different test sets, optimizing final scores.”
In rigorous AI academic and engineering circles, this crossed an unforgivable red line. In effect, the team trained Llama 4 to cram for the test: strong on the exam papers it had seen, not genuinely equipped with frontier intelligence. Give it a math test, and it deploys the "math expert" version; give it a coding test, and out comes the "coding expert" version. Each sub-test looks strong, but they are not the same model.
In AI research, this is called "cherry-picking"; in exam terms, it amounts to sending a different substitute to sit each test.
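The arithmetic behind the inflation is easy to see. The toy calculation below uses invented scores (not Llama 4's actual numbers) to show how reporting the best specialist checkpoint per benchmark always beats honestly reporting any single model:

```python
# Toy illustration with invented scores: why submitting a different
# specialist checkpoint per benchmark inflates the reported aggregate.

benchmarks = ["MATH-500", "GPQA", "HumanEval"]

# Hypothetical scores of three specialist checkpoints on each benchmark.
specialists = {
    "math-tuned":   {"MATH-500": 92, "GPQA": 61, "HumanEval": 55},
    "reason-tuned": {"MATH-500": 74, "GPQA": 83, "HumanEval": 58},
    "code-tuned":   {"MATH-500": 70, "GPQA": 59, "HumanEval": 88},
}

# Honest reporting: pick ONE model and report its average everywhere.
honest = max(
    sum(scores.values()) / len(benchmarks) for scores in specialists.values()
)

# Cherry-picked reporting: take the best checkpoint on each benchmark.
cherry = sum(
    max(s[b] for s in specialists.values()) for b in benchmarks
) / len(benchmarks)

print(f"best single model average: {honest:.1f}")   # 72.3
print(f"cherry-picked average:     {cherry:.1f}")   # 87.7
```

No single checkpoint here averages above 73, yet the cherry-picked composite reads almost 88, which is exactly the gap blind tests exposed.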
For Meta—long self-proclaimed “the lighthouse of open source”—this scandal directly destroyed its most valuable asset within the developer ecosystem: trust. The immediate consequence? Zuckerberg “completely lost confidence” in the engineering integrity of Meta’s existing GenAI team, triggering a cascade of executive appointments and the sidelining of core infrastructure units.
He spent $14.3–$15 billion acquiring a 49% stake in data-labeling firm Scale AI and appointed its 28-year-old CEO, Alexandr Wang, as Meta’s Chief AI Officer—establishing the Meta Superintelligence Lab (MSL). Turing Award winner LeCun now reports to this 28-year-old. In October, Meta cut roughly 600 roles across MSL—including members of FAIR, the foundational AI research lab LeCun himself founded.
The flagship Llama 4 Behemoth model, originally slated for summer 2025 release, has been repeatedly delayed—from summer to fall—and is now indefinitely shelved.
Meta has pivoted instead to developing "Avocado," its next-generation text model, and "Mango," its image and video model. Avocado aims to compete directly with GPT-5 and Gemini 3 Ultra. Originally scheduled for delivery at the end of 2025, it has been pushed to Q1 2026 after underwhelming test results and further rounds of training optimization. Meta is even considering releasing Avocado as closed source, abandoning Llama's longstanding open-source tradition.
Meta made two fatal mistakes in its AI model strategy. First, benchmark manipulation—which shattered trust across the developer community. Second, forcing FAIR—a fundamental research unit requiring decade-long patience—into a product organization obsessed with quarterly KPIs. Together, these explain the root cause of today’s talent exodus.
In-House Chips: Another Broken Leg
Talent is fleeing—and so are chips.
According to The Information, Meta scrapped its most advanced in-house AI training chip project last week.
Meta’s chip initiative is called MTIA (Meta Training and Inference Accelerator). Its original roadmap was ambitious: MTIA v4 (“Santa Barbara”), v5 (“Olympus”), and v6 (“Universal Core”) were scheduled for delivery between 2026 and 2028. Olympus was designed as Meta’s first 2nm chiplet-based chip—intended to handle both high-end model training and real-time inference, ultimately replacing NVIDIA in Meta’s training clusters.
Now, that most advanced training chip has been canceled.
Meta hasn’t made zero progress: MTIA has delivered some wins on inference. The MTIA v3 inference chip, codenamed “Iris,” is already deployed at scale across Meta’s data centers—powering recommendation systems for Facebook Reels and Instagram—and reportedly reduced total cost of ownership by 40–44%. But inference and training are entirely different things. Inference runs models; training builds them. Meta can design inference chips—but it cannot yet build training chips capable of rivaling NVIDIA head-on.
This isn’t the first time. In 2022, Meta attempted an in-house inference chip, abandoned it after failing small-scale deployment, and placed a massive order with NVIDIA instead.
Stalled chip development has accelerated Meta’s external procurement frenzy.
$135 Billion in Panic Buying
In January 2026, Meta announced its capital expenditure budget for the year would reach $115–$135 billion—nearly double last year’s $72.2 billion. Most of this money will go toward chips.
Within ten days, three major deals materialized:
On February 17, Meta signed a multi-year, cross-generational strategic partnership with NVIDIA. Meta will deploy “millions” of NVIDIA Blackwell and next-generation Vera Rubin GPUs, plus Grace standalone CPUs. Analysts estimate the deal’s value at several tens of billions of dollars—making Meta the world’s first hyperscaler to deploy NVIDIA’s Grace standalone CPU at scale.
On February 24, Meta signed a chip agreement with AMD valued at $60–$100 billion. Meta will procure AMD's latest MI450-series GPUs and sixth-generation EPYC CPUs. As part of the deal, AMD issued Meta warrants to purchase up to 160 million shares of AMD common stock (roughly 10% of the company) at $0.01 per share, vesting in tranches tied to delivery milestones.
On February 26, according to The Information, Meta signed a multi-year, multi-billion-dollar agreement with Google to lease Google Cloud’s TPU chips for training and running its next-generation LLMs. The two companies are also discussing Meta’s direct purchase and on-premises deployment of TPUs starting in 2027.
A social media company placed orders totaling potentially over $100 billion with three chip suppliers—all within ten days.
This isn’t diversification. It’s panic buying.
The Three-Layer Logic of Compute Anxiety
Why is Meta in such a rush?
First, in-house chips are no longer viable. With its most advanced training chip project canceled, Meta must rely on external procurement to meet AI training demands for the foreseeable future. MTIA inference chips handle mature workloads like recommendation systems—but training frontier models like Avocado (targeting GPT-5) requires NVIDIA-grade hardware—or equivalent.
Second, competitors won't wait. OpenAI has secured massive resources from Microsoft, SoftBank, and Abu Dhabi's sovereign wealth fund. Anthropic has locked in supply commitments of roughly one million chips each: TPUs from Google and Trainium from Amazon. Google's Gemini 3 was trained entirely on TPUs. Without sufficient compute, Meta risks losing even entry-level access to the race.
Third—and perhaps most fundamentally—Zuckerberg is using “purchasing power” to compensate for deficits in “R&D capability.” Llama 4’s failure, core talent attrition, and stalled chip development have collectively weakened Meta’s AI narrative on Wall Street. Signing big deals with NVIDIA, AMD, and Google sends at least one clear signal: “We have money. We’re buying. We haven’t given up.”
Meta’s current strategy is simple: if software fails, throw hardware at it; if you can’t retain people, buy chips instead. But AI competition isn’t a game won by writing checks. Compute is a necessary condition—not a sufficient one. Without top-tier model teams and a clear technical roadmap, even the most expensive chips sit idle in warehouses.
The Buyer’s Dilemma
Looking back at Meta’s three February deals, one subtle detail has escaped most observers.
Meta bought NVIDIA’s current Blackwell and upcoming Vera Rubin chips; AMD’s MI450 and forthcoming MI455X; and leased Google’s current Ironwood TPUs—with plans to buy them outright next year.
Three suppliers. Three entirely distinct hardware architectures and software ecosystems.
This means Meta must constantly juggle NVIDIA's CUDA, AMD's ROCm, and Google's XLA/JAX: three radically different low-level stacks. A multi-vendor strategy helps diversify supply-chain risk and suppress hardware premiums, but it multiplies engineering complexity.
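To make that concrete, here is a minimal, hypothetical sketch of the kind of vendor-dispatch table a cross-platform training framework has to carry. The stack names (CUDA, ROCm, XLA), kernel languages, and collective libraries are real; the `Stack` type and `select_stack` helper are invented for illustration:

```python
# Hypothetical sketch of a backend-dispatch layer. The Stack type and
# select_stack helper are invented; the underlying stacks are real.
from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    name: str         # low-level programming model
    kernel_lang: str  # language custom kernels are written in
    collectives: str  # library used for multi-GPU communication


# One entry per vendor Meta is now buying from.
STACKS = {
    "nvidia": Stack("CUDA", "CUDA C++", "NCCL"),
    "amd":    Stack("ROCm", "HIP", "RCCL"),
    "google": Stack("XLA/JAX", "Pallas", "XLA collectives"),
}


def select_stack(vendor: str) -> Stack:
    """Return the software stack a training job must target for a vendor."""
    try:
        return STACKS[vendor]
    except KeyError:
        raise ValueError(f"no supported stack for vendor {vendor!r}")


# Every custom kernel, collective call, and memory-layout decision must
# be written, tuned, and validated once per entry in this table.
for vendor, stack in STACKS.items():
    print(f"{vendor}: kernels in {stack.kernel_lang}, comms via {stack.collectives}")
```

The point of the sketch is the triplication: each row is not a config flag but a separate body of kernels, communication code, and performance tuning that someone has to build and maintain.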
This is Meta’s most critical soft spot right now: efficiently training trillion-parameter models across three hardware platforms with fundamentally incompatible programming models demands far more than CUDA-savvy engineers—it requires architects capable of building cross-platform training frameworks from scratch.
Fewer than 100 such people likely exist worldwide. Pang Ruoming is one of them.
Spending $100 billion on the world’s most complex hardware stack—while simultaneously losing the very brains needed to wield it—is the most surreal scene in Zuckerberg’s high-stakes gamble.
Zuckerberg’s Bet
Zooming out, Zuckerberg’s AI strategy over the past 18 months eerily mirrors his earlier all-in bet on the metaverse:
Spot the trend, invest heavily, hire aggressively, hit setbacks, pivot hard, then reinvest heavily.
2021–2023 was the metaverse era—costing billions annually and dragging Meta’s stock price from $380 down to $88. 2024–2026 is the AI era—equally reckless in spending, equally frequent in organizational reshuffling, and equally reliant on the “Trust me—I have vision” narrative.
The difference? This time, the AI tailwind is far more tangible than the metaverse ever was. And Meta has deep pockets: its advertising business generates robust cash flow—$59.9 billion in revenue for Q4 2025, up 24% year-on-year.
The problem? Money buys chips, compute, and even people sitting at desks—but it doesn’t buy people who stay.
Pang Ruoming chose OpenAI. Russ Salakhutdinov walked away. LeCun launched his own startup.
Zuckerberg’s current wager is that, with enough chips purchased, enough data centers built, and enough money spent, Meta will eventually find—or cultivate—the people capable of leveraging those resources.
This bet could pay off. Meta remains one of the world's wealthiest tech companies, with over $100 billion in annual operating cash flow, its sturdiest moat. And it continues to hire aggressively from OpenAI, Anthropic, Google, and other rivals. According to Qbit News, nearly 40% of Meta's 44-person Superintelligence team came from OpenAI.
But the cruelty of AI competition lies in its transparency: compute reserves, talent rosters, and model performance are all public. Llama 4's benchmark scandal proves that in this industry, slide decks and PR can't sustain leadership.
Ultimately, the market recognizes only one thing: Is your model good enough?
Position in the Food Chain
As the AI arms race enters 2026, the food chain has begun to crystallize:
At the top sit OpenAI and Google. OpenAI boasts the strongest models, largest user base, and most aggressive funding. Google enjoys full vertical integration—owning its chips, models, and cloud infrastructure. Anthropic follows closely, securing its place in the top tier through Claude’s product strength and dual compute supply lines from Google and Amazon.
Where does Meta stand? It’s spent the most money, signed the most chip contracts, and executed the most frequent reorganizations—yet still hasn’t delivered a frontier model that convinces the market.
Meta's AI story resembles Yahoo in 2005: then the internet's richest company, furiously acquiring and spending, but never building a search engine to rival Google's. Money isn't everything. Zuckerberg needs to decide what Meta actually wants to do in AI, not just chase whatever's hot.
Of course, writing Meta’s obituary is premature. With 3.58 billion monthly active users, $59.9 billion in quarterly revenue, and the world’s largest social dataset, Meta holds assets no competitor can replicate.
If Avocado delivers on schedule in 2026 and regains top-tier status, every dollar spent and every reorganization executed will be recast as "strategic resolve in the face of crisis." But if it falls short again, that $135 billion will have bought nothing but warehouses of silicon humming with electricity.
After all, Silicon Valley’s AI arms race has never lacked super-buyers waving blank checks. What’s scarce is the rare talent that knows how to forge the future from that raw compute.