
Jensen Huang’s Latest Interview: Forcing DeepSeek to Deeply Bind with Huawei Is “Too Scary” for the U.S.
TechFlow Selected

Regarding exports to China, he criticized extreme export-control policies as naïve.
Compiled by: Xiao Xiao, NetEase Intelligent
Jensen Huang, CEO of NVIDIA, recently sat down for an in-depth interview with Dwarkesh Patel, host of a prominent U.S. tech podcast, addressing key topics including the company’s moat, competition from Google’s TPUs, and AI chip exports to China.
He emphasized that NVIDIA’s moat now extends deep into the supply chain—forged through hundreds of billions of dollars in procurement commitments with TSMC and memory suppliers.
On TPU competition, Huang noted that Anthropic is a unique case—not a trend—for ASIC growth. NVIDIA’s accelerated computing spans far beyond AI, covering molecular dynamics, data processing, fluid dynamics, and more; CUDA’s high programmability enables annual performance leaps of 10x to 50x.
He also explained why NVIDIA does not become a hyperscale cloud provider itself. Despite strong cash flow, NVIDIA adheres to its principle of doing only what is essential—and as little as possible—choosing instead to invest in ecosystem partners like CoreWeave, OpenAI, and Anthropic, rather than compete directly with its customers. He candidly admitted that failing to invest in Anthropic earlier at scale was his own misstep. Moreover, he stressed that even if the AI revolution had never occurred, NVIDIA would still be a massive company thanks to accelerated computing in physics, chemistry, and data processing.
Regarding exports to China, he criticized extreme export controls as “naïve.” Huang pointed out that AI compute is the fusion of chips and energy: although restricted by EUV lithography tools, China possesses vast 7nm chip manufacturing capacity. Given that today’s leading large models are still primarily trained on the Hopper architecture, China can fully compensate for per-chip performance gaps by leveraging abundant electricity and scaling up chip cluster sizes.
Beyond that, China’s massive AI research teams are boosting model performance through more efficient computer science. Citing DeepSeek as a warning, Huang stated this progress is anything but trivial. If such outstanding open-source models are forced to optimize deeply—and run best—exclusively on domestic hardware like Huawei’s, it will objectively erode the global advantage of the U.S. tech stack. He believes that voluntarily abandoning the world’s second-largest market will compel China to build an independent foundational computing architecture. As these open-standard-based technologies gradually spread to the Global South, the U.S. risks falling behind in the long-term battle for AI ecosystem standards.
Full transcript of Jensen Huang’s interview:
Is supply-chain control NVIDIA’s biggest moat?
Patel: Many software companies’ valuations are falling because people believe AI will commoditize software. One view is that NVIDIA simply sends design files to TSMC, which manufactures logic chips and switches; SK Hynix, Micron, and Samsung then supply HBM memory for packaging; finally, ODMs in Taiwan assemble rack-scale systems. Fundamentally, NVIDIA does software while others build hardware. If software becomes commoditized, won’t NVIDIA be too?
Huang: Ultimately, someone must convert electrons into tokens. That conversion process is extremely difficult to fully commoditize. Making one token more valuable than another is like making one molecule more valuable than another—it demands immense technical expertise, engineering rigor, scientific insight, and invention. These disciplines remain poorly understood and far from complete. I don’t believe full commoditization will happen.
But we do make that process more efficient. The way you framed your question mirrors my mental model of the company: input is electrons, output is tokens, and NVIDIA sits in between. Our guiding principle is to do only what’s necessary—and as little as possible. “As little as possible” means outsourcing tasks I don’t need to perform myself, turning them into parts of our ecosystem.
Today, NVIDIA may be the company with the largest partner ecosystem—including upstream and downstream supply chains, all PC manufacturers, application developers, and model vendors. AI is like a five-layer cake, and we have our own ecosystem at every layer. We minimize what we do ourselves—but the part we *must* do is extraordinarily difficult, and I don’t believe it will be commoditized.
Nor do I think enterprise software companies will be commoditized. Most software firms today are toolmakers; think of Excel and PowerPoint, or Cadence and Synopsys. My view runs counter to many: the number of AI agents will grow exponentially, and so will the number of tool users. Instances of these tools could explode.
For example, Synopsys’ Design Compiler will be used by numerous agents for layout and design rule checking. Today, engineers are the bottleneck; tomorrow, each engineer will be backed by a swarm of agents. We’ll explore design space in unprecedented ways—using today’s tools. High-frequency tool usage will accelerate software companies’ growth. This hasn’t happened yet because agents aren’t yet proficient enough with tools. Either these software companies build their own agents—or agents will become skilled enough to use those tools effectively. I believe both will occur.
Patel: In your latest filings, you show nearly $100 billion in procurement commitments to foundries, memory suppliers, and packaging providers. Semiconductor research firm SemiAnalysis estimates this figure could reach $250 billion. One interpretation is that NVIDIA’s moat lies in locking up scarce components for years ahead—others may have accelerators, but they can’t access memory or logic chips. Is this your primary moat for the next few years?
Huang: This is one of the things we can do—and others find extremely hard. We’ve made enormous upstream commitments: some explicit, like the ones you mentioned; others implicit—such as advising CEOs on the industry’s size and trajectory, helping them see what I see, prompting their investments.
Why do they invest for me—not for others? Because they know I can buy their entire output and sell it through my downstream channels. NVIDIA’s downstream demand and supply chain are so vast that upstream players willingly invest.
Look at GTC: people marvel at its scale and energy. It’s the entire AI community gathering—to exchange ideas and be seen. I bring them together so downstream sees upstream, upstream sees downstream, and everyone witnesses AI’s progress. They meet AI-native startups and all early-stage companies—allowing them to verify firsthand what I tell them. I spend enormous time—directly and indirectly—ensuring the supply chain, partners, and ecosystem grasp the opportunity before them.
Some say my keynote feels like a lecture—and a grueling one. That’s intentional. I must ensure the entire supply chain, upstream and downstream, and the broader ecosystem understand what’s coming, why it’s coming, when it’s coming, how big it’ll be—and how to think about it systematically, just as I do.
Our moat isn’t static—it’s forward-looking. If we truly scale to a trillion-dollar business in the coming years, we’ll naturally command a matching supply chain. But that requires today’s scale, influence, and rapid business velocity—just as cash flow has velocity, so does the supply chain. If business velocity were slow, no one would build a supply chain for an empty shell. Our current scale is sustained by overwhelming downstream demand. When they see, hear, and realize this is real—that’s what lets us achieve what we’re achieving today.
Patel: Let’s dig deeper into whether upstream can keep up. You’ve doubled revenue year after year—and delivered over triple the compute annually.
Huang: Doubling at this scale is truly astonishing.
Patel: But look at logic chips. You’re TSMC’s largest customer on N3—and one of the largest on N2. SemiAnalysis found AI will consume 60% of N3 capacity this year—and 86% next year. If you already dominate, how do you double—year after year? Are we entering a phase where AI compute growth must slow due to upstream constraints? Do you see solutions? Ultimately—how do we double wafer-fab capacity year after year?
Huang: At any given moment, instantaneous demand can exceed total global supply, upstream and downstream, right down to the number of available plumbers. That has actually happened.
Patel: Plumbers should be invited to next year’s GTC.
Huang: Great idea. But it’s actually a good sign. You want an industry’s instantaneous demand to exceed total supply—the reverse is worse. If a component shortage grows severe, the whole industry rushes to solve it. You hardly hear about CoWoS anymore—because we tackled it aggressively over the past two years, and the situation is now much improved. TSMC now knows CoWoS supply must scale alongside logic and memory demand. They’re expanding CoWoS—and future packaging technologies—at the same pace as logic. That’s excellent, because CoWoS and HBM memory were once niche; now they’re mainstream computing technologies.
We now influence a broader supply chain. I’ve said all this before—the AI revolution began, and I predicted this five years ago. Some believed me and invested—like Sanjay Mehrotra, CEO of Micron, and his team. I remember that meeting clearly—I spelled out exactly what would happen, why, and how things stand today. They doubled down. We collaborated on LPDDR and HBM memory; their massive investment yielded tremendous success. Others joined later—but they’re all here now.
Every bottleneck receives intense focus. We anticipate bottlenecks years in advance—like our recent investments in Lumentum, Coherent, and the silicon photonics ecosystem, reshaping the supply chain. We built the entire supply chain around TSMC—and co-developed the silicon photonics integration platform COUPE, inventing new technologies and licensing patents openly to the supply chain.
We strengthen the supply chain by inventing new technologies, new processes, new test equipment (e.g., double-sided probing), and investing in companies to help them scale. We’re actively shaping the ecosystem—so the supply chain can sustain this scale.
Patel: Some bottlenecks seem easier to solve—like scaling CoWoS.
Huang: We take responsibility for overcoming the hardest ones.
Patel: Which ones?
Huang: Plumbers and electricians. This is where I worry about doomers. They claim jobs will vanish. If we stop people from becoming software engineers, we’ll run out of them. The same prediction was made a decade ago. Some doomers said, “Whatever you do, don’t become a radiologist”—you can still find videos online claiming radiology will be the first profession to disappear, that the world won’t need radiologists anymore. Guess what we’re short of today? Radiologists.
Patel: Some things scale; others don’t. How do you manufacture twice as many logic chips yearly? Ultimately, both memory and logic depend on EUV lithography machines. How do you get twice as many EUVs year after year?
Huang: These capacities scale rapidly—within two or three years. You just need to signal demand to the supply chain. If you can make one, you can make ten; if ten, then a million. Replication isn’t hard.
Patel: How far into the supply chain will you go? Will you go straight to ASML and say, “NVIDIA aims for $2 trillion in annual revenue in three years—we need far more EUVs”?
Huang: Some conversations are direct; others indirect. If you convince TSMC, ASML follows. We identify critical bottlenecks—but once TSMC is convinced, you’ll have enough EUVs within a few years.
My view: no bottleneck lasts longer than two or three years. Meanwhile, we’re improving computational efficiency by 10x, 20x—Hopper to Blackwell is 30x–50x. Because CUDA is flexible, we constantly invent new algorithms and techniques—boosting capacity and efficiency simultaneously. None of this worries me. What does worry me is downstream: energy policy blocking energy expansion. Without energy, you can’t build new industries—or new manufacturing. We need to reindustrialize America: bring back chip manufacturing, computer manufacturing, packaging, EVs, robots, AI factories. None of this happens without energy—and energy takes time. Chip capacity? Solvable in two to three years. CoWoS capacity? Also two to three years.
Will TPUs break NVIDIA’s grip on AI compute?
Patel: Two of the world’s top three models—Claude and Gemini—are trained on Google TPUs. What does that mean for NVIDIA?
Huang: What we do is fundamentally different. NVIDIA builds accelerated computing—not just tensor processors. Accelerated computing applies broadly: molecular dynamics, quantum chromodynamics, data processing, structured/unstructured data, fluid dynamics, particle physics—and yes, AI too.
Accelerated computing is vastly broader. While AI dominates headlines—and is undeniably important and impactful—computing is far wider. NVIDIA reshaped computing—from general-purpose to accelerated. Our market scope dwarfs any TPU or ASIC; we’re the only company accelerating *all* applications. We have a massive ecosystem—every framework and algorithm runs on NVIDIA.
Because our computers are designed for others to operate, any operator can buy our systems. Most in-house systems require you to be the operator—they lack flexibility for external use. Because anyone can deploy and operate our systems, we exist in every cloud—including Google, AWS, Azure, and Oracle.
If you rent infrastructure, you need a broad base of customers across industries as anchor tenants. If you use it yourself, we can help you operate it—as we do with xAI for Musk. And we enable operators across any company or industry: you can build a supercomputer for Lilly for scientific research and drug discovery—and we’ll help operate it across the entire drug-discovery and biosciences domain.
There’s a huge list of applications TPUs *can’t* handle. NVIDIA’s CUDA is an excellent tensor processor—but also handles every stage of data processing, computation, AI, and more. Our market opportunity is vastly larger and broader. Because we support every application worldwide, you can deploy NVIDIA systems anywhere—knowing customers will follow. That’s a completely different landscape.
Patel: Your revenue is staggering—but it doesn’t come from pharma or quantum computing. It comes overwhelmingly from AI—a technology growing at unprecedented speed. So the real question is: what’s best for AI itself? TPUs are essentially massive systolic arrays, superb at matrix multiplication. GPUs are more flexible—ideal for tasks with heavy branching or irregular memory access. But what *is* AI doing? Fundamentally, it’s repeating highly predictable matrix multiplications, over and over. So why reserve chip area for thread schedulers or memory-switching logic—when that space could be fully dedicated to matrix multiplication? TPUs are purpose-built for exactly this exploding workload. What do you think?
Huang: Matrix multiplication is vital—but not the whole story. If you invent a new attention mechanism, try a different decomposition, or create a novel hybrid state-space model (SSM), you need a universally programmable architecture. If you fuse diffusion and autoregressive models, you need universal programmability too. We run *anything* you dream up. That’s our edge. Programmability makes inventing new algorithms far easier.
The ability to invent new algorithms is *why* AI advances so fast. TPUs—and everything else—follow Moore’s Law: ~25% annual improvement. To achieve 10x or 100x yearly leaps, you must fundamentally reinvent algorithms *and* compute every year.
That’s NVIDIA’s core advantage. Blackwell is 50x more energy-efficient than Hopper. When I first said 35x, no one believed me. Later, someone wrote that I’d held back—and it was actually 50x. Moore’s Law alone can’t deliver that. We leverage new models like Mixture-of-Experts (MoE), parallelized and distributed across the system. Without CUDA—and deep kernel-writing capability—this is nearly impossible.
It’s the fusion of programmable architecture and NVIDIA’s extreme co-design prowess. We even offload computation to the network itself—NVLink or Spectrum-X. We simultaneously evolve processors, systems, networking, libraries, and algorithms. Without CUDA, I wouldn’t know where to begin.
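To make the programmability point concrete, here is a minimal PyTorch sketch (an editorial illustration for this compilation, not NVIDIA code): a made-up "gated" attention variant of the kind a researcher might dream up, expressible in a few lines precisely because the platform is general-purpose.

```python
# Illustrative only: a hypothetical attention variant, not any published model.
import torch

def gated_attention(q, k, v, gate):
    # Standard scaled dot-product scores...
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    # ...modified by a per-score multiplicative gate: the kind of ad-hoc
    # experiment a fixed-function matmul pipeline cannot anticipate.
    weights = torch.softmax(scores, dim=-1) * torch.sigmoid(gate)
    return weights @ v

# Toy shapes: batch 2, 8 heads, 16 tokens, head dimension 64.
q, k, v = (torch.randn(2, 8, 16, 64) for _ in range(3))
gate = torch.randn(2, 8, 16, 16)  # one gate value per attention score
print(gated_attention(q, k, v, gate).shape)  # torch.Size([2, 8, 16, 64])
```

A variant like this runs unchanged on a fully programmable accelerator; hardware hard-wired around one fixed attention pattern cannot absorb such changes as easily, which is the edge Huang is describing.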
Patel: This raises an interesting question about NVIDIA’s customers. 60% of your revenue comes from five hyperscale cloud providers. In another era, customers were professors running experiments—they needed CUDA and couldn’t use other accelerators. They just ran PyTorch on CUDA, fully optimized. But hyperscalers *can* write their own kernels. In fact, to squeeze out that last 5% performance on a specific architecture, they *must*. Anthropic and Google run mostly on their own accelerators—TPUs and Trainium. Even OpenAI, which uses GPUs, built Triton—because they need custom kernels. They avoid cuBLAS and NCCL, using their own software stack—which also compiles to other accelerators. If most of your customers *can* and *are* building CUDA alternatives, how central is CUDA to cutting-edge AI running on NVIDIA hardware?
Huang: CUDA is a rich ecosystem. If you develop on *any* computer, choosing CUDA first is wise—because the ecosystem is unmatched. We support every framework. If you write custom kernels, our contributions to Triton are massive—its backend relies heavily on NVIDIA tech.
We gladly help every framework improve. Frameworks abound—Triton, vLLM, SGLang. Now, reinforcement learning frameworks are exploding—verl, NeMo RL. Post-training and RL are booming. So if you’re developing on a specific architecture, CUDA makes the most sense—you know its ecosystem is mature.
You know that if something breaks, the issue is far more likely in your code—not in the mountain of underlying systems. Don’t forget the sheer scale of code involved. When systems fail, ask: “Did I err—or did the computer?” You always hope *you* erred—because only then can you trust the computer. Obviously, we have bugs too. But crucially, our systems have been tested relentlessly—you can build confidently atop them. That’s point one: ecosystem richness, programmability, and capability.
Point two: if you’re a developer, you want massive installed base. You want your software to run on countless machines—not just your own, but clusters you build for others, as a framework developer. NVIDIA’s CUDA ecosystem *is* its greatest asset.
We have hundreds of millions of GPUs deployed—across every cloud: A10, A100, H100, H200, L-series, P-series—in every shape and size. If you’re a robotics company, you want that CUDA stack running natively inside your robot. We’re virtually everywhere. This installed base means once you develop software or models, they run *anywhere*. That value is incalculable.
Finally, we exist in every cloud—making us truly unique. If you’re an AI company or developer, you’re unsure which cloud you’ll partner with—or where you’ll deploy workloads. No problem—we’re everywhere, including your own data center. Ecosystem richness, installed-base breadth, deployment diversity—these combine to make CUDA priceless.
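For readers unfamiliar with the custom-kernel work Patel raises, the sketch below shows what a Triton kernel looks like: a minimal vector-add in the style of Triton's own tutorials, not code from any lab discussed here. Note that it targets an NVIDIA GPU, which is Huang's point about Triton's backend.

```python
# Minimal Triton kernel, adapted from the style of Triton's tutorials.
# Requires an NVIDIA GPU with PyTorch and Triton installed.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                  # each program handles one block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)               # one program per 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
assert torch.allclose(add(x, y), x + y)
```

Writing Python-like kernels this way is how labs squeeze out the last few percent on a given architecture, which is exactly the capability Patel argues makes hyperscalers less dependent on cuBLAS and NCCL.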
Patel: That makes sense. But I’m asking: how important are these advantages to your *largest* customers? CUDA may be hugely valuable to many. But your revenue largely comes from customers who *can* build their own software stacks—especially if AI enters domains verifiable via rigorous reinforcement learning. Then the question becomes: who writes the fastest matrix multiplication and attention kernels on massive clusters? It’s a highly verifiable optimization problem.
Hyperscalers absolutely can write these custom kernels. Of course, NVIDIA’s cost-performance may still be better—so they may still choose NVIDIA. But then the question shifts: is it just about raw hardware specs—and dollars-per-FLOP and dollars-per-bandwidth?
Historically, NVIDIA’s CUDA moat let it sustain >70% gross margins on AI hardware and software. But if your biggest customers can bypass that moat, can you maintain such margins?
Huang: We assign staggering numbers of engineers to these AI labs—working side-by-side to optimize their software stacks. Why? Because no one understands our architecture better than we do. These architectures aren’t generic CPUs. CPUs are like Cadillacs—easy to drive, cruise control, simple. NVIDIA GPUs and accelerators are like F1 cars. Sure, anyone can drive 160 km/h—but pushing to the limit demands deep expertise. And we use AI at massive scale to write kernels ourselves.
I’m confident we’ll remain indispensable for a long time. Our expertise often lifts AI lab partners’ performance by 2x effortlessly. Optimizing a kernel—or the entire stack—commonly boosts model speed by 50%, 2x, or even 3x. Given their massive Hopper and Blackwell clusters, that’s a huge number. Doubling performance equals doubling revenue.
NVIDIA’s compute stack delivers the world’s best total cost of ownership (TCO)—no platform beats us. No one shows me better performance-per-TCO. Dylan’s InferenceMAX benchmark is public—anyone can run it. But TPUs won’t test it—and neither will Trainium. I encourage them to use InferenceMAX to prove their “ultra-low inference cost.” But it’s hard—because no one wants to.
Then there’s MLPerf—I’d love to see Trainium demonstrate its claimed 40% advantage. Or TPUs show their cost advantage. But from first principles, their claimed advantages make no sense. So our success boils down to unbeatable TCO.
Second, you say 60% of our revenue comes from the top five cloud providers—but most of that business is *external*. For example, most NVIDIA chips in AWS serve external customers—not internal ones. Azure’s customers are clearly external—and Oracle’s too. They choose us because of our influence: we bring them the world’s best customers—all built on NVIDIA. And those customers build on NVIDIA because of our reach and versatility.
So the flywheel is: installed base, architectural programmability, ecosystem richness—and thousands of AI companies worldwide. If you’re an AI startup, which architecture do you pick? The richest? That’s us. The largest installed base? That’s us. The most mature ecosystem? That’s us. That’s the flywheel.
Put it all together: our performance-per-dollar is best—and customers’ token cost is lowest. Our performance-per-watt is world-leading. So if a partner builds a 1-gigawatt data center, it should maximize revenue and tokens—directly equaling income. You want maximum tokens for maximum income—and we deliver the highest tokens-per-watt architecture. Plus, if you’re renting infrastructure, we have the world’s largest customer base. That’s why the flywheel spins.
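Huang's tokens-to-income claim is simple arithmetic at bottom. The sketch below makes it explicit with entirely hypothetical efficiency and pricing numbers (none come from the interview), just to show why, at a fixed power budget, tokens-per-watt maps directly onto revenue.

```python
# Back-of-envelope sketch of the tokens-per-watt argument.
# Every figure here is a hypothetical placeholder, not interview data.
site_power_watts = 1e9            # the 1-gigawatt data center from the text
tokens_per_joule = 2.0            # assumed whole-stack efficiency
price_per_million_tokens = 0.50   # assumed selling price, dollars

tokens_per_second = site_power_watts * tokens_per_joule  # watts = joules/sec
annual_tokens = tokens_per_second * 3600 * 24 * 365
annual_revenue = annual_tokens / 1e6 * price_per_million_tokens

print(f"{annual_tokens:.3g} tokens/year -> ${annual_revenue:,.0f}/year")
# With power fixed at 1 GW, revenue scales linearly with tokens-per-joule:
# double the architecture's efficiency and you double the site's income.
```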
Patel: Interesting. I think the crux is the *actual* market structure. Even with other players, imagine a world where thousands of AI companies each hold roughly equal compute share. Reality? Even via the top five clouds, real compute users on AWS are Anthropic, OpenAI, and major foundational labs. These big players have the resources and capability to run diverse accelerators.
If your claims about cost-performance and per-watt performance are true—why did Anthropic just announce a multi-gigawatt TPU deal with Broadcom and Google, shifting most of their compute there? For Google, TPUs dominate their compute. So if these major AI companies shifted from all-NVIDIA to mixed—why, if your advantages are real?
Huang: Anthropic is an outlier—not a trend. Think: without Anthropic, where would TPU growth come from? 100% from Anthropic. Same for Trainium—100% from Anthropic. It’s an open secret. It’s not that ASIC opportunities are booming—it’s that there’s just *one* Anthropic.
Patel: But OpenAI has a deal with AMD—and is building its own Titan accelerator.
Huang: Yes—but everyone acknowledges most of their compute still runs on NVIDIA. We’ll continue collaborating closely. I don’t mind others trying alternatives—if they don’t try, how do they know ours is best? Sometimes you need reminding. We must keep earning our position.
People make bold claims. Look how many ASIC projects got canceled. Just building an ASIC isn’t enough—you must build something *better* than NVIDIA. That’s not easy. In fact, it’s unreasonable—unless NVIDIA has a flaw. But our scale and speed are undeniable—we’re the only company launching new products yearly, with massive leaps each time.
Patel: I suspect their logic is: it doesn’t need to beat NVIDIA—just not fall far behind the 70% margin you charge.
Huang: No—remember, ASICs have high margins too. Say NVIDIA’s margin is 70%; ASICs might be 65%. What did you save?
Patel: You mean like Broadcom’s?
Huang: Yes. You pay someone else. To my knowledge, ASIC margins are very high—they boast about their astonishing ASIC margins.
So why didn’t we invest earlier? Back then, we simply *couldn’t*. I didn’t fully grasp how hard it is to build a foundational AI lab like OpenAI or Anthropic—and how much supplier investment they need. We couldn’t invest billions in Anthropic for compute commitment. Google and AWS *could*. They invested massively upfront—and Anthropic used their compute. We just couldn’t.
My mistake? Not realizing they had *no choice*: no VC would fund $5–10B into an AI lab hoping it becomes Anthropic. That was my error. Even if I’d understood, I doubt we could’ve done it then. But I won’t repeat that mistake.
I’m eager to invest in OpenAI—and help them scale. I believe it’s necessary. Later, when able, Anthropic approached us—I was thrilled to invest and help them scale. We simply couldn’t then. If I could rewind, with today’s NVIDIA scale, I’d absolutely do it.
Why doesn’t NVIDIA become a hyperscale cloud provider?
Patel: For years, NVIDIA has been the AI company that profits—and profits massively. Now you’re investing: reportedly $30B in OpenAI, $10B in Anthropic. Their valuations soared—and will likely keep rising. Since you’ve supplied their compute and seen their trajectory—and since their value was just 1/10th a year or two ago—why not act sooner? You have the cash. Why not launch your own foundational lab—or strike deals earlier at lower valuations?
Huang: We acted as soon as feasible—and as soon as we *could*. If I could, I’d have acted sooner. When Anthropic needed us, we simply *couldn’t*—it wasn’t even on our radar.
Patel: Why not? Cash?
Huang: Yes—scale of investment. We’d never invested externally before—let alone at that magnitude. We didn’t realize we needed to. I assumed they’d raise VC funding like any company. But what they wanted—VCs couldn’t deliver. Neither could OpenAI’s vision. I recognize that now—but didn’t then.
That’s their genius—realizing early it *had* to be done that way. I’m glad they did. Even if it meant Anthropic went elsewhere, I’m glad it happened. Anthropic’s existence benefits the world—and I celebrate that.
Patel: You’re still printing money—more each quarter. With all that cash, what *should* NVIDIA do? One answer is the emerging ecosystem of intermediaries—converting capital expenditure into operational expense for labs to rent compute. Chips are expensive—but generate massive lifetime revenue as AI models improve. Token value rises—but deployment costs stay high. NVIDIA has capital to spend. Reports say you’re backing CoreWeave with up to $6.3B—and already invested $2B. Why not become a cloud provider yourself? Why not be a hyperscaler—and rent compute directly?
Huang: It’s our company philosophy—and I believe it’s wise. Do only what’s necessary—and as little as possible. Meaning: in building our compute platform, if *we* don’t do it, I truly believe *no one will*. If we don’t take the risks we take—if we don’t build NVLink our way, the full software stack, the ecosystem our way, invest 20 years in CUDA—and lose money most of that time—*no one will*.
If we don’t build all CUDA-X libraries for specific domains—we started domain-specific libraries 15 years ago, because if we didn’t, no one would: ray tracing, image generation, early AI, data processing, structured data, vector data. If we didn’t build cuLitho for computational lithography—*no one would*. Accelerated computing advanced *because* we did these things.
So we *should* do that—and commit fully. But there are many clouds—if I don’t build one, someone else will. Hence our “do only what’s necessary—and as little as possible” ethos guides us daily. Every decision I make reflects this lens.
For clouds: without our support, new clouds—AI clouds—wouldn’t exist. Without helping CoreWeave, they wouldn’t exist. Without supporting Nscale, they wouldn’t achieve what they have. Without Nebius, they wouldn’t be where they are. Now, they’re thriving.
It’s a business model. We do what’s necessary—and as little as possible. So we invest in our ecosystem—because I want it to thrive. I want this architecture—and AI—to connect as many industries and countries as possible—building the entire planet on AI—and on the U.S. tech stack. That’s the vision we pursue.
One more thing: many excellent foundational model companies exist—and we strive to invest in *all* of them. Another thing we do. We don’t pick winners—and *must* support everyone. It’s essential to our business—and fun. But we go to great lengths *not* to pick winners—so if I invest in one, I invest in all.
Patel: Why deliberately avoid picking winners?
Huang: First, it’s not our job. Second, when NVIDIA started, there were 60 3D graphics companies—and we were the *only* survivor. If you’d guessed who’d win among those 60, NVIDIA would rank *dead last*.
That was long ago. NVIDIA’s graphics architecture was *fundamentally wrong*—not slightly wrong. We built a completely flawed architecture—developers couldn’t support it—and it would never succeed. We reasoned well from first principles—but reached the wrong solution. Everyone excluded us—but we survived.
So I have enough humility to know: don’t pick winners. Either let them fend for themselves—or support *all*.
Patel: I’m confused: you say you don’t prioritize new clouds just because they’re new—you want to nurture them. Yet you listed new clouds saying they wouldn’t exist without NVIDIA. How do these reconcile?
Huang: First, they must *want* to exist—and ask for our help. When they have a business plan, expertise, and passion—they clearly believe they have capabilities. But if they ultimately need investment to launch, we’ll support them. The sooner they spin the flywheel, the better.
Your question: do we want to be in the financing business? No. Others do financing—we’d rather partner than become financiers. Our goal is to focus on what we do—and keep our business model simple—while supporting our ecosystem.
When organizations like OpenAI need $30B pre-IPO—and we’re convinced they’ll be extraordinary, that the world needs them, and that they’ll thrive—we support them and help them scale. We make these investments *because they need us*. But we don’t aim to do *as much as possible*—we aim to do *as little as possible*.
Patel: This may be obvious—but we’ve endured GPU shortages for years—and shortages worsen as models improve.
Huang: We *do* face GPU shortages.
Patel: Yes. NVIDIA is known for allocating scarce quotas—not just to the highest bidder—but ensuring new clouds survive: giving CoreWeave some, Crusoe some, Lambda some. What’s NVIDIA’s benefit? First—do you agree with this description of market segmentation?
Huang: No—I disagree entirely. Your premise is fundamentally flawed. First: without purchase orders, talk is meaningless. Before receiving orders, what can we do? So step one is working closely with everyone on forecasting—because these products take long lead times—and so do data centers. Forecasting coordination is our top priority.
Second: we forecast with as many people as possible—but eventually, orders *must* be placed. Maybe for whatever reason, you don’t place one—what can I do? At some point, it’s first-come, first-served. Beyond that—if your data center isn’t ready, or certain components aren’t available to activate it—we may serve other customers first. It’s about maximizing our factory throughput—and we may adjust accordingly.
Beyond that—priority is first-come, first-served. You *must* place an order. Sure, there are stories—like that article about Larry Ellison and Musk dining with me, begging for GPUs. That never happened. We did dine—and it was pleasant—but they never begged for GPUs. They just needed to place orders. Once they did, we allocated capacity—no complexity.
Patel: Okay. So there’s a queue—and allocation depends on data-center readiness and order timing. But that still isn’t “highest bidder wins.” Any reason not to do that?
Huang: We *never* do that.
Patel: Why not?
Huang: Because it’s bad business practice. You set a price—and people decide whether to buy. I understand other chip companies raise prices when demand surges—but we don’t. It’s never been our approach. You can trust us. I’d rather be reliable—and be the industry’s bedrock. No guesswork afterward. If I quote a price—it’s that price. Period. If demand spikes, prices stay stable.
Patel: That’s probably why your relationship with TSMC is so strong?
Huang: Yes—NVIDIA and TSMC have done business for nearly 30 years. No formal contracts—but a rough fairness. Sometimes I’m right; sometimes wrong. Sometimes I get a good deal; sometimes not. But overall, the relationship is excellent. I trust them completely—and rely on them utterly.
One thing you can trust NVIDIA on: this year’s Vera Rubin will be incredible. Next year, Vera Rubin Ultra arrives. The year after, Feynman. After that, I haven’t named it yet. You can trust us *every year*. Go find any other ASIC team worldwide—pick any—and ask: “Can I bet my entire business on you delivering *every year*? Can you promise my token cost drops an order of magnitude *every year*, reliably, like clockwork?”
I said similar things to TSMC. Could you say that about *any* historical foundry? No. But today—you *can* say it about NVIDIA. You can trust us *every year*. Want a $1B AI factory? Fine. $100M? Fine. $10M—or one rack? Fine. One GPU? Fine. A $100B order? Fine. We’re the *only* company in the world today you can say that about.
I can say the same to TSMC: one chip—or a billion—fine. We just need planning—and all mature people do that. So NVIDIA became the world’s AI-industry bedrock—a status earned over decades. It’s a massive commitment—and massive dedication. Our company’s stability and consistency matter deeply.
Should AI chips be sold to China?
Patel: Let’s discuss China. I’m not sure I support selling chips to China—but I enjoy playing devil’s advocate. Dario Amodei supports export controls—I asked him why the U.S. and China can’t both have brilliant minds in data centers. Since you’re on the other side, I’ll flip it.
One angle: Anthropic just released the Mythos preview. They haven’t publicly launched it—because it has potent cyberattack capabilities, and the world isn’t ready. They’ll wait until they patch those zero-day vulnerabilities. But they say it discovered *thousands* of high-risk vulnerabilities across all major OSes and browsers. It found one in OpenBSD—a system specifically designed to prevent zero-days—and that bug existed for 27 years.
So if Chinese companies, labs, or the government access AI chips to train a Claude Mythos-level attack model—and run millions of instances with more compute—is that a threat to U.S. companies and national security?
Huang: First, Mythos was trained on fairly ordinary compute—and at an ordinary scale. It’s the excellence of the company doing the training that matters. The type and scale of compute used are *plentiful* in China. Chips *exist* there.
They manufacture over 60% of the world’s mainstream chips—this industry is massive for them. They have some of the world’s finest computer scientists. As you know, Chinese researchers fill AI labs worldwide—accounting for roughly 50% of global AI researchers. So if you’re genuinely concerned—given their assets: abundant energy, vast chip supply, and nearly half the world’s AI talent—what’s the *best* way to build a safer world?
Crushing them—turning them into enemies—is likely not the answer. They’re competitors—and we want the U.S. to win. But dialogue—research dialogue—may be safest. Our current stance toward China in this field is glaringly absent. Dialogue between U.S. and Chinese AI researchers is *critical*. We must jointly define what AI *should not* be used for—this is vital.
Finding vulnerabilities in software is precisely what AI *should* do. Will it find many vulnerabilities? Yes. There are many. AI software has many too. That’s AI’s job—and I’m thrilled AI has reached a level that boosts productivity so dramatically.
One underestimated fact: the ecosystem around cybersecurity, AI cybersecurity, AI safety, and AI privacy is rich. A whole AI startup ecosystem is building the future where thousands of AI agents protect one powerful AI agent—ensuring its safety. That future *will* arrive.
The idea of a lone AI agent roaming freely—unwatched—is crazy. We know this ecosystem must flourish. It turns out this ecosystem needs open source—and open models and open software stacks—so *all* AI researchers and top computer scientists can build equally powerful AI systems—and ensure AI safety. So we *must* keep the open-source ecosystem vibrant—this can’t be ignored. Much of it comes from China—and we shouldn’t stifle it.
Of course, we want the U.S. to have as much compute as possible. Energy is a constraint—but many are solving it; energy mustn’t become a national bottleneck. But we also want *all* global AI developers building on the U.S. tech stack—and contributing AI advances—especially open-source ones—to the U.S. ecosystem. Creating *two* ecosystems—one open, running only on foreign tech stacks; one closed, running on U.S. tech stacks—would be incredibly foolish. I believe it would be catastrophic for the U.S.
Patel: That’s a lot—let me summarize. China has compute—but some estimates say export controls on EUV lithography mean their actual FLOPs are just 1/10th of the U.S.’s. So can they ultimately train a Mythos-level model? Yes. But the issue is: with more FLOPs, U.S. labs reach these capabilities first—Anthropic did.
Also, even if they train such a model, large-scale deployment matters. A hacker with a million instances is far more dangerous than one with a thousand. So inference compute is critical. And they have so many excellent AI researchers—that’s precisely frightening—because what makes engineers more efficient? Compute.
If you talk to any U.S. AI lab, they’ll say compute is their bottleneck. DeepSeek’s founder and Qwen’s leadership have said the same—they’re compute-constrained. So isn’t it better that U.S. companies reach Mythos-level capabilities first—with more compute—and society prepares? While China reaches it later—due to less compute?
Huang: Our goal should *always* be to arrive first—and have more compute. But for your scenario to hold, you must push it to the extreme—that they have *zero* compute. As long as they have *some*, the question becomes: how much is *enough*? Factually, China’s compute is *massive*. You just called it the world’s second-largest computing market. If they concentrate compute on one goal—they’re fully capable.
Patel: But is that true? Some estimate SMIC lags in process nodes.
Huang: Their energy is astonishing, right? AI is a parallel computing problem—isn’t it? Why can’t they use nearly-free energy to aggregate 4x or 10x more chips? They have abundant energy. They have fully vacant, fully powered data centers. Their infrastructure capacity is huge. If they want, they’ll aggregate more chips—even 7nm ones.
Their chip-manufacturing capability is among the world’s largest—the semiconductor industry knows they dominate mainstream chips. They have excess capacity, even overcapacity. So the idea that China can’t access AI chips is pure nonsense. Of course, if China had no compute at all, the U.S. would lead outright—but that’s not reality. They already have massive compute. The threshold you fear—they’ve met it, and surpassed it.
So I think you misunderstand: AI is a five-layer cake—the bottom layer is *energy*. When energy is abundant, it compensates for chip limitations. When chips are abundant, they compensate for energy limits. For example, the U.S. faces energy scarcity—so NVIDIA must push architectures and extreme co-design—delivering outrageous per-watt throughput despite limited chip shipments and tight energy.
But if your wattage is fully abundant—and nearly free—do you care about per-watt performance? You’ll have plenty. You can use older chips. 7nm chips are essentially Hopper. Let me tell you: most models today are trained on Hopper. So 7nm chips are already sufficient. Abundant energy is *their* advantage.
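Huang's substitution argument (trade abundant energy for per-chip performance) also reduces to arithmetic. The sketch below uses illustrative numbers only; the roughly one-third per-chip ratio echoes the H200-versus-910C comparison Patel makes later, and the power figures are assumptions.

```python
# Sketch of "compensate per-chip gaps with quantity" under cheap energy.
# All figures are illustrative assumptions, not interview data.
leading_chip_flops = 1.0     # normalize a leading-edge chip to 1.0
domestic_chip_flops = 0.35   # assume a 7nm-class chip at roughly 1/3
leading_chip_power = 1.0     # normalized power draw per chip
domestic_chip_power = 1.4    # assume the older node burns more power

target_cluster_flops = 1000.0  # total compute needed for a training run

domestic_chips = target_cluster_flops / domestic_chip_flops
leading_chips = target_cluster_flops / leading_chip_flops
energy_penalty = (domestic_chips * domestic_chip_power) / (
    leading_chips * leading_chip_power
)
print(f"{domestic_chips:.0f} domestic chips vs {leading_chips:.0f} leading chips")
print(f"~{energy_penalty:.1f}x the energy for the same total compute")
# If electricity is abundant and nearly free, a ~4x energy penalty is
# tolerable: exactly the trade Huang says China can make.
```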
Patel: But can they manufacture *enough* chips?
Huang: They can. Evidence? Huawei just had its best year ever.
Patel: How many chips did they ship?
Huang: Millions—far more than Anthropic owns.
Patel: But the question is how many logic chips—and how much memory—can SMIC produce?
Huang: Let me give you the facts: they have massive logic-chip capacity—and massive HBM2 memory supply.
Patel: But as you know, training and inference bottlenecks are often bandwidth. So if they use HBM2—I don’t recall the exact numbers—bandwidth may be an order of magnitude lower than in your latest products. That’s huge.
Huang: Huawei is a networking company.
Patel: But that doesn’t change the fact that you need EUV to manufacture the most advanced HBM.
Huang: Completely wrong. You can aggregate them—as we do with NVL72. They’ve demonstrated silicon photonics—linking all compute into one giant supercomputer. Your premise is entirely flawed.
Fact is: their AI development is progressing smoothly. The world’s best AI researchers—despite limited compute—propose brilliantly efficient algorithms. Remember: Moore’s Law improves ~25% yearly. Yet through excellent computer science, we still boost algorithmic performance 10x. Excellent computer science *is* the lever.
Undoubtedly, MoE is a great invention. All those incredible attention mechanisms reduce compute. We must acknowledge: most AI progress comes from algorithmic advances—not raw hardware. If most progress stems from algorithms, computer science, and programming—then isn’t their army of AI researchers their fundamental advantage? We see it. DeepSeek is *not* trivial. If DeepSeek-like breakthroughs appear first on Huawei platforms—that would be terrible for our country.
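The 25%-per-year figure Huang keeps returning to is worth compounding out. A few lines of arithmetic (illustrative, using the rates stated in the interview) show why process scaling alone cannot deliver yearly 10x leaps:

```python
# Compounding the ~25%/year hardware gain Huang cites for Moore's Law.
moore_rate = 1.25
gain, years = 1.0, 0
while gain < 10.0:
    gain *= moore_rate
    years += 1
print(f"{years} years of 25%/yr to pass 10x (reaches {gain:.1f}x)")
# 1.25**10 is about 9.3x and 1.25**11 about 11.6x: roughly a decade per 10x
# from process alone, versus the yearly 10x Huang attributes to algorithms
# plus hardware co-design.
```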
Patel: Why? Because today, open-source models like DeepSeek run on any accelerator. Why won’t that hold tomorrow?
Huang: Suppose it’s optimized for Huawei—and their architecture—that puts us at a disadvantage. You describe a scenario I consider *good news*: a company develops software and an AI model—and it runs best on the U.S. tech stack. I call that good news. You frame it as bad. Let me tell you the *real* bad news: *all* AI models are developed—and run best—on *non-U.S.* hardware.
Patel: I just don’t see evidence of huge differences preventing accelerator switching. U.S. labs run models across all clouds—and all accelerators.
Huang: I *am* the evidence. Take a model optimized for NVIDIA—and try running it on something else.
Patel: But U.S. labs *do* that.
Huang: And they don’t run *better*. NVIDIA’s success is perfect proof. AI models are *built* on our software stack—and run *best* on it. Why is that illogical?
Patel: Anthropic’s models run on GPUs—and on Trainium and TPUs.
Huang: It takes massive work to port them. But go to the Global South—or the Middle East. If, out of the box, all AI models run best on someone else’s tech stack, claiming that benefits the U.S. is absurd.
Patel: But I don’t understand this argument. Suppose a Chinese company launches the next Mythos first. They find all security vulnerabilities in U.S. software—but run it on NVIDIA hardware—and deploy globally. How is that good?
Huang: It’s *not* good. So let’s prevent it.
Patel: Why do you think it’s fully substitutable—if you deny them compute, Huawei will fully replace you? They’re behind, right? Their chips are inferior.
Huang: Evidence exists: their chip industry is massive.
Patel: You can directly compare H200 vs. Huawei 910C—Flops, bandwidth, memory. Huawei’s is roughly 1/2 to 1/3 of H200’s.
Huang: They compensate with quantity.
Patel: So your argument is they have all this ready energy—and need chips to fill it.
Huang: And they excel at manufacturing.
Patel: I believe they may eventually surpass everyone in manufacturing—but the next few years are critical.
Huang: Which years are “critical”?
Patel: The next few years—we’ll have models capable of launching various cyberattacks.
Huang: In that case—if the next few years are critical—we *must* ensure all AI models are built on the U.S. tech stack.
Patel: If built on the U.S. tech stack—how do we prevent them from launching Mythos-level attacks if they gain equivalent capabilities?
Huang: There are no guarantees—period.
Patel: But if we get it first, we can prepare.
Huang: Listen: why sacrifice one layer of the AI industry—by losing an entire market—to benefit another? There are five layers—and *all* must succeed. The layer needing success most is AI *applications*. Why obsess over *one* AI model—or *one* company? For what?
Patel: Because these models achieve incredible attack capabilities—and you need compute to run them.
Huang: Energy, chips, and the AI researcher ecosystem make it possible.
Patel: Okay—let’s step back. China must build enough 7nm capacity. Remember—they’re stuck at 7nm, while you move to 3nm, 2nm, 1.6nm—like Feynman. You’ll be on 1.6nm while they’re on 7nm. They’ll rely on quantity to offset per-chip performance gaps—and they have abundant energy. The more chips you sell them, the more total compute they wield.
Huang: Listen—I just think your stance is too absolute. The U.S. *should* lead. U.S. compute dwarfs any other region by 100x. The U.S. *should* lead. Okay—the U.S. *does* lead now. NVIDIA builds the most advanced tech. We ensure U.S. labs learn first—and buy first. If funds are short, we even invest. The U.S. *should* lead—and we do everything possible to ensure it. Do you agree on point one? We *are* doing it.
Patel: But if their bottleneck is compute—how does shipping chips to China help the U.S. lead?
Huang: No. We have Vera Rubin for the U.S. Vera Rubin is *for the U.S.* Now—am I part of the U.S.? Do you count me as American?
Patel: Yes.
Huang: What about NVIDIA? You consider NVIDIA an American company, right? First—why can’t we adopt a more balanced regulatory approach—so NVIDIA wins globally—instead of ceding the global market? Why hand the world to others?
The chip industry is part of the U.S. ecosystem—part of U.S. tech leadership—part of the AI ecosystem—and part of AI leadership. Why do your policies and beliefs steer the U.S. to abandon such a massive chunk of the global market?
Patel: Amodei put it like this: “It’s like Boeing boasting about selling nuclear bombs to adversaries because the missile casing is Boeing-made.” That’s his answer to the “it supports the U.S. tech stack” argument. Fundamentally, you’re enabling adversaries.
Huang: Equating AI with those things is absurd.
Patel: But AI is like enriched uranium—it has positive and negative uses. We still don’t ship enriched uranium abroad.
Huang: It’s a poor—and illogical—analogy.
Patel: But if this compute can run a model exploiting zero-days across all U.S. software—how is it *not* a weapon?
Huang: First: solve this by dialoguing with researchers—China—and all nations—to ensure tech isn’t used that way. This dialogue *must* happen.
Second: we must ensure U.S. leadership—Vera Rubin, Blackwell—massively available and stacked in the U.S. Our results will reflect this. We have massive compute—and brilliant AI researchers.
Yet we must recognize AI isn’t just a model. AI is a five-layer cake. Every layer matters—and we want the U.S. to win *every* layer—including chips. Abandoning the entire market won’t let the U.S. win the chip-layer tech race—or the full compute-stack race—long term. That’s fact.
Patel: The key question is: how does selling chips to China *help* us win long-term? Tesla sold EVs in China for years. iPhones sell well there too. But that didn’t lock China into the U.S. tech ecosystem. They built their own EVs—and now dominate globally. Smartphones too.
Huang: At the start of our conversation, you acknowledged NVIDIA’s position is unique. You used the word “moat.” For our company, the ecosystem’s richness—centered on developers—is paramount. 50% of AI developers are in China. The U.S. shouldn’t abandon that.
Patel: But the U.S. has many NVIDIA developers—and that hasn’t stopped U.S. labs from adopting other accelerators. In fact, they already do—and that’s fine. Why wouldn’t China be the same—if you sell them NVIDIA chips, just as Google uses TPUs *and* NVIDIA?
Huang: We must keep innovating. You likely know our market share is *growing*—not shrinking. You implicitly assume that even if we compete in China, we’ll inevitably lose. I’m not the kind of person who wakes up thinking I’ll lose. That loser mindset—and loser premise—means nothing to me.
We’re not building cars. Cars are easy to switch brands—today this, tomorrow that. Computing isn’t like that. x86 survives for a reason—and ARM’s stickiness has reasons too. These ecosystems are hard to replace. Switching takes massive time and effort—and most refuse. So our task is to keep nurturing this ecosystem—and pushing tech—so we compete in the market.
You argue from the premise that we’ll inevitably lose—so we should abandon a market. I can’t accept that logic. It makes no sense. I don’t think the U.S. is a loser. Our industry isn’t a loser.
The problem is the extremity of your position. Your argument starts at an extreme: give them *any* compute at a critical moment and we lose *everything*. That’s naïve.
Patel: Let me clarify my point. I don’t posit a critical compute threshold—just that *any marginal compute helps*. More compute means better models.
Huang: Then I just want you to acknowledge the flip side: *any* marginal sale is beneficial to the U.S. tech industry.
Patel: If AI models running on these chips have cyberattack capabilities—or if chips train such models and run more attack instances—it’s not a nuke—but it *enables* a weapon.
Huang: By your logic, apply it to microprocessors and DRAM. You could even apply it to electricity.
Patel: But we *do* impose export controls on advanced DRAM manufacturing tech—and on many chip-making tools for China.
Huang: We sell massive DRAM and CPU volumes to China—and I believe that’s correct.
Patel: This circles back to the core question: *Is AI different?* If you have tech that finds zero-days in software—do we want to minimize China’s ability to reach—and widely deploy—it first?
Huang: We want the U.S. to lead—and we *can* control that.
Patel: If chips are already there—and they’re training that model—how do we control it?
Huang: We have massive compute—and massive AI researchers—and we’re racing as fast as possible.
Patel: Again—we have more nukes than anyone—but we don’t ship enriched uranium anywhere.
Huang: These chips aren’t enriched uranium—and they’re something China can manufacture itself.
Patel: But they buy from you for a reason. We have quotes from Chinese founders saying they’re compute-constrained.
Huang: Because our chips are better. Overall, our chips are superior—no doubt. Without our chips—could Huawei have had its record year? Could a wave of chip companies have gone public? Can you admit that?
Patel: Yes.
Huang: Can you also admit we once held a huge market share there—and no longer do? Can we admit China accounts for roughly 40% of the world’s tech industry? For the U.S. tech industry, abandoning this market harms our nation.