
GPU rental prices have dropped 30% in three weeks, and the AI value chain is undergoing a major shift—from NVIDIA to memory chips.

For those holding NVIDIA stock or considering investments in AI infrastructure, the key takeaway is that the money flowing into AI isn’t shrinking; it is shifting to new places.
Author: Claude, TechFlow
TechFlow Introduction: The cloud rental price of NVIDIA’s B200 chip has fallen from a peak of $6.11/hour at the end of May to $4.22/hour, a decline of approximately 30% over three weeks. Meanwhile, the semiconductor sector has diverged in an unusual way: the SMH Semiconductor ETF rose 15% over the past month and Micron and SanDisk each surged nearly 60%, while NVIDIA declined roughly 3% over the same period. For investors holding NVIDIA stock or considering AI infrastructure investments, the critical point is that money for AI is not shrinking; it is moving elsewhere.
NVIDIA is still up about 12% year-to-date in 2026, but market attention appears to have shifted away from it.
Over the past month, the VanEck Semiconductor ETF (SMH) surged 15%, while Micron Technology and SanDisk each jumped nearly 60%. NVIDIA not only failed to keep pace—it declined about 3% against this backdrop. Even more telling, the core metric underpinning NVIDIA’s valuation narrative—the cloud rental price of its B200 chip—has simultaneously softened.
According to Ornn, a GPU compute pricing platform, the B200 hourly rental rate hit a three-month high of $6.11 on May 30, then entered a sustained decline, falling to $4.22 by last weekend—a drop of roughly 30%. Rich Privorotsky, Head of the One-Delta Trading Desk at Goldman Sachs, directly addressed this trend last week: the myth of AI “compute scarcity” may be losing its grip.
B200 Rental Prices Fall 30% in Three Weeks—The “Compute Scarcity” Narrative Under Pressure
NVIDIA’s B200 is the dominant compute chip powering today’s hyperscale data centers; its rental price serves as a barometer for AI infrastructure supply and demand. Data from multiple third-party tracking platforms confirm that B200 pricing is softening.
Ornn data shows the B200 hourly rental price declined steadily from its $6.11 high on May 30 to $4.22 by last weekend. AIMultiple’s monthly price index—compiled from 63 cloud service providers—shows a median B200 quote of $6.11/hour, yet “neocloud” vendors have already pushed their floor price down to $3.44. GetDeploying’s tracking of 26 B200 cloud providers reveals even starker figures: an average price of $4.99/hour, with the lowest quoted rate at just $2.25/hour (for a three-year reserved contract).
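The quotes above can be put on a common footing with a quick calculation. The sketch below is illustrative only: the function name and dictionary labels are assumptions of this example, and it measures each quoted price against Ornn’s $6.11 peak without adjusting for differences in contract terms or methodology between sources.

```python
def pct_decline(peak: float, current: float) -> float:
    """Return the drop from `peak` to `current` as a percentage of `peak`."""
    return (peak - current) / peak * 100


ORNN_PEAK = 6.11  # $/hour, Ornn three-month high on May 30

# B200 hourly quotes cited in the article ($/hour)
quotes = {
    "Ornn, last weekend": 4.22,
    "AIMultiple neocloud floor": 3.44,
    "GetDeploying average": 4.99,
    "GetDeploying lowest (3-year reserved)": 2.25,
}

for label, price in quotes.items():
    print(f"{label}: ${price:.2f}/hour, {pct_decline(ORNN_PEAK, price):.1f}% below peak")
```

The first line of output corresponds to the roughly 30% decline reported by Ornn; the other rows compare prices from different sources and contract types, so they indicate dispersion rather than a like-for-like trend.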

Three drivers underlie this price correction: improved yield on TSMC’s 4NP process has reduced B200 production costs; SK Hynix and Micron’s HBM3e supply has noticeably eased in Q2 2026; and more neocloud providers—including RunPod, Lambda, Nebius, and Spheron—have secured B200 inventory and launched spot offerings, intensifying competition and pressuring prices downward.
Downward pressure will intensify in H2. As NVIDIA’s next-generation Blackwell Ultra B300 enters the spot market, part of the B200 capacity will shift from on-demand to spot pricing. B300 spot rates have already dipped as low as $2.45/hour, below every B200 rate cited above except the $2.25/hour three-year reserved contract. Providers including Spheron and Thunder Compute forecast that B200 on-demand pricing could stabilize at $2.50–$3.00/hour by Q4 2026.
For NVIDIA shareholders, weakening rental prices signal margin pressure on NVIDIA’s downstream customers—cloud providers and neocloud platforms—whose purchasing appetite directly dictates NVIDIA’s order cadence.
Semiconductor Sector Divergence: Memory Soars While NVIDIA Lags
The data behind this divergence is striking.
NVIDIA is up ~12% YTD in 2026 but down ~3% over the past month. In contrast, the SMH Semiconductor ETF is up 84% YTD and 15% over the past month. Micron surged nearly 60% over the past month to a record high of ~$1,089 per share, with YTD gains exceeding 700% and a market cap surpassing $1.2 trillion. SanDisk rose nearly 60% over the past month and more than 4,400% over the past 52 weeks.
The market isn’t turning bearish on AI—it’s shifting focus to where bottlenecks are migrating along the AI value chain.
The prior logic ran: “GPU scarcity → NVIDIA pricing power → upstream winners.” Today’s logic is: GPU supply is easing, but AI models’ demand for high-bandwidth memory (HBM) and storage is surging—making memory the new bottleneck.
Micron’s latest quarterly report (Q2 FY2026) showed revenue of $23.8 billion, nearly triple the $8 billion of a year earlier. Following its spin-off from Western Digital, SanDisk posted $5.95 billion in revenue for Q3 FY2026, a 97% YoY increase.
TrendForce’s June 16 report shows memory contract prices soared over 100% in H1 2026, with structural shortages expected to persist through H2. In an interview last week, Apple CEO Tim Cook acknowledged that Apple can no longer absorb rising memory costs. When even Apple, the strongest negotiator in the industry, publicly admits it “can’t take it anymore,” memory vendors’ pricing power becomes unmistakable.
Micron reports its fiscal Q3 earnings tomorrow (June 24) after market close. Consensus expectations point to another record result, making this report a key test of whether the “memory supercycle” can sustain its momentum.
Goldman Sachs Trading Head: Rental Price Is the Core Metric
Rich Privorotsky, Head of the One-Delta Trading Desk at Goldman Sachs, laid out a clear analytical framework last week:
If compute resources were truly scarce, rental prices would remain resilient—and sustained capital expenditure would be justified. But if supply increases and rental prices continue falling, the foundational assumption underpinning valuations across the entire AI hardware stack—“compute shortage”—must be reevaluated.
He further noted this pressure will first manifest on the hardware side. Real beneficiaries are companies selling full systems and monetizing usage—not upstream suppliers selling “picks and shovels.” The greatest risk lies with upstream segments of the hardware and infrastructure stack, whose valuations still rest on assumptions of persistent scarcity.
This message is unambiguous: NVIDIA’s business model centers on chip sales (“picks and shovels”), not usage-based billing. If downstream rental prices fall while NVIDIA’s chip ASPs hold steady, its customers’ margins get squeezed, which ultimately translates into slower order growth.
Citadel Securities’ recent “Tokenomics” report echoes this view: the core constraint on AI adoption has shifted from “model capability” to “cost and compute scarcity”; users are rapidly migrating to cheaper models. The Token Price Index has declined for seven consecutive days—the longest such streak this year.
Seoyoung Kim, Finance Professor at Santa Clara University, put it bluntly: Most buyers don’t know how much compute they’ll need next year; suppliers don’t know how many GPUs to order; and NVIDIA doesn’t know how many to produce. All three are guessing—and when those guesses collectively pivot from “not enough” to “possibly too much,” prices come under pressure.
SpaceX-Google $30B Mega-Contract: The Long-Term Contract Market Remains White-Hot
Rental spot prices are falling—but the long-term contract market tells a different story.
According to an SEC filing submitted by SpaceX on June 5, Google agreed to pay SpaceX $920 million per month from October 2026 through June 2029, leasing approximately 110,000 NVIDIA GPUs plus associated processors, memory, and other components. The total contract value is ~$30 billion. Earlier in May, Anthropic signed a similar agreement with SpaceX, committing $1.25 billion monthly to lease all available compute capacity from SpaceX’s Colossus 1 data center in Memphis—valued at nearly $45 billion in total.
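The headline contract values follow from the monthly payments and the contract term. The sketch below is a rough check, assuming payments are made for every calendar month from October 2026 through June 2029 inclusive; the helper name is hypothetical, and the Anthropic term is not stated in the article, so the code only derives the term implied by the two figures given.

```python
def months_inclusive(start_year: int, start_month: int,
                     end_year: int, end_month: int) -> int:
    """Count calendar months from the start month through the end month, inclusive."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


# Google-SpaceX: $920 million per month, October 2026 through June 2029
google_monthly_bn = 0.92
google_months = months_inclusive(2026, 10, 2029, 6)
google_total_bn = google_months * google_monthly_bn

# Anthropic-SpaceX: $1.25 billion per month, nearly $45 billion in total
anthropic_monthly_bn = 1.25
anthropic_total_bn = 45.0
anthropic_implied_months = anthropic_total_bn / anthropic_monthly_bn

print(f"Google term: {google_months} months, total ${google_total_bn:.2f}B")
print(f"Anthropic implied term: about {anthropic_implied_months:.0f} months")
```

Under these assumptions the Google term works out to 33 months and about $30.4 billion, consistent with the ~$30 billion figure, while the Anthropic figures imply a term of roughly three years.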
These contracts follow SpaceX’s February 2026 merger with xAI, which converted xAI’s previously self-built Colossus supercomputing cluster into a commercial, rentable asset—locking in substantial revenue ahead of its planned IPO (target valuation: $1.75 trillion).
For NVIDIA, this presents a contradictory signal. On one hand, the 110,000-GPU long-term contract confirms large customers are still locking in massive compute capacity at scale. Following the announcement, RBC Capital Markets stated NVIDIA is “in the strongest position among peers,” noting these GPU leasing agreements alleviate near-term concerns about ASICs eroding NVIDIA’s market share.
On the other hand, Google’s need to rent from SpaceX arises precisely because its own in-house capacity cannot keep pace with demand. Google’s 2026 capex stands at $180–190 billion; annualized, the $920 million monthly payment to SpaceX comes to about $11 billion, or roughly 6% of that budget, so it essentially functions as “bridge capacity.” Once hyperscalers’ owned data centers come online in 2027–2028, it remains uncertain whether external rental demand can sustain current volumes.
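The comparison between the SpaceX payments and Google’s capex only makes sense once the monthly figure is annualized. A minimal sketch, assuming twelve payments of $920 million in a full year and using the $180–190 billion capex range quoted above (variable names are illustrative):

```python
monthly_payment_bn = 0.92                  # Google's monthly payment to SpaceX, $ billions
annualized_bn = monthly_payment_bn * 12    # full-year equivalent

capex_low_bn, capex_high_bn = 180.0, 190.0  # Google's 2026 capex range, $ billions

# The share is largest against the low end of the range and smallest against the high end
share_vs_low_pct = annualized_bn / capex_low_bn * 100
share_vs_high_pct = annualized_bn / capex_high_bn * 100

print(f"Annualized payment: ${annualized_bn:.2f}B")
print(f"Share of capex: {share_vs_high_pct:.1f}% to {share_vs_low_pct:.1f}%")
```

The annualized payment is about $11 billion, or roughly 5.8% to 6.1% of the capex range, which is why it reads as supplementary capacity and not a core part of the budget.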
The contract also includes a 90-day notice period for early termination—a clause more typical of deals negotiated when buyers retain flexibility, rather than during periods of “extreme compute scarcity.”
NVIDIA’s Risk Lies Not in Demand—but in Pricing Power
Connecting these threads reveals NVIDIA’s core challenge: profit allocation across the AI value chain is shifting.
On the GPU supply side, TSMC’s yield improvements, broader vendor access to inventory, and the imminent mass rollout of the B300 are collectively easing the extreme shortages seen in 2024–2025. On the demand side, hyperscalers continue large-scale procurement—but procurement behavior has evolved from “pay-any-price scramble” to “price comparison, long-term volume lock-in, and exit options retained.” On the margin front, downstream cloud providers’ rental prices are already falling; if NVIDIA fails to align its chip ASPs downward, margin compression at the intermediate layer will ultimately feed back into weaker order volumes.
The rise of memory chips reflects the flip side of this value-chain migration.
The larger the AI models and the heavier the inference workloads, the more inelastic the demand for high-bandwidth memory becomes. GPUs can boost efficiency via architectural upgrades (e.g., B200’s FP4 precision halves bytes-per-parameter relative to FP8), but memory bandwidth remains a hard physical constraint with no shortcuts. Micron’s HBM capacity is fully sold out for all of 2026, creating a “money can’t buy it” scenario that contrasts starkly with softening B200 rental prices.
Micron’s upcoming earnings report provides the next critical data point. If revenue and guidance again exceed expectations, the narrative of “AI value chain migration from GPUs to memory” will gain further traction. For investors, this isn’t about being bearish on AI; it is about reassessing who is gaining pricing power, and who is losing it, along the AI chain.
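The bytes-per-parameter point can be made concrete with a back-of-the-envelope estimate of how much memory a model’s weights occupy at different precisions. This is a simplified sketch: the 70-billion-parameter model is a hypothetical example not taken from the article, and the estimate covers weights only, ignoring activations, KV cache, and any scaling-factor overhead that low-precision formats carry.

```python
# Storage per parameter at each numeric precision, in bytes
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


hypothetical_params = 70e9  # illustrative model size, an assumption of this example

for precision in BYTES_PER_PARAM:
    gb = weight_memory_gb(hypothetical_params, precision)
    print(f"{precision}: {gb:.0f} GB of weights")
```

Each halving of precision halves the weight footprint, but the data still has to move between memory and compute at whatever bandwidth the memory provides, which is the constraint the paragraph above describes.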













