
Jensen Huang’s 2026 GTC Taipei Keynote: The Era of AI Agents Has Arrived—Computation Is Revenue
“Computation is revenue; computation is profit. Without revenue and profit, it’s a loss.”
Compiled & Translated by TechFlow

Guest: Jensen Huang, CEO of NVIDIA
Podcast Source: Bonnie Blockchain
Original Title: 7 Key Takeaways from Jensen Huang’s 2026 GTC Taipei Keynote — A Quick Guide to NVIDIA’s Latest Strategy! [Bonnie Blockchain]
Air Date: June 2, 2026
Executive Summary
In his 2026 GTC Taipei keynote, Jensen Huang positioned NVIDIA’s next-phase strategy around a central thesis: AI has evolved beyond content generation into the era of “working agents.” Tokens are no longer mere technical metrics; they are now units of revenue, profit, and GDP generation. Anchored on this shift, NVIDIA unveiled Vera Rubin, the Vera CPU, an enterprise-grade agent toolkit, a new generation of PCs co-developed with Microsoft, and three physical-AI offerings: Cosmos 3, Alpamayo 2, and Isaac GR00T. Huang emphasized that the computing paradigm for the next decade will be defined by four pillars (models, agent frameworks, tool skills, and runtimes) and will extend across cloud, enterprise, local PCs, robots, factories, satellites, and edge devices. For Taiwan’s supply chain, this signals that AI factories, power efficiency, infrastructure delivery speed, and full-stack collaboration capability will become critical drivers of the next wave of industrial growth.
Key Insights
The Dawn of the AI Agent Era
- "Useful AI has arrived. AI is now a profit generator—and a GDP generator. Behind it lies not just large language models (LLMs), but an entirely new computing paradigm: agents."
- "Agents consist of LLMs plus agent frameworks—the frameworks act like operating systems, connecting memory, tools, reasoning, planning, and action."
- "The breakthrough in agent systems stems from two converging advances: LLMs can now think, reason, plan, and use tools; and agent frameworks can manage memory, coordinate workflows, and orchestrate tools."
- "Every company will become an agent company. Every company will run agents internally—and every company will need its own agent operating system."
Tokens, AI Factories, and Infrastructure Economics
- "Tokens are now profitable revenue units. As AI companies seek to produce more tokens, they build more AI factories—explaining the surge in computing demand across Taiwan."
- "Compute equals revenue. Compute equals profit. Without revenue or profit, it’s a loss."
- "If an AI factory has only 1 gigawatt (GW) of power capacity, that’s its absolute ceiling. Within that constraint, throughput per watt equals revenue—because each token carries economic value."
- "Choosing the wrong architecture solely because chips are cheaper won’t translate into real gains. You must optimize for revenue per watt. Buy more—and earn more."
Vera Rubin and NVIDIA’s Infrastructure Transformation
- "Vera Rubin is not a chip—not even just a GPU—but a complete, end-to-end engineered system."
- "NVIDIA began as a GPU company, evolved into a systems company, and is now transforming further into an infrastructure company—helping customers build AI factories."
- "Vera Rubin is NVIDIA’s most ambitious engineering project to date—40,000 engineers across the company contributed, and Taiwan’s supply chain co-created this system."
- "Grace Blackwell was built for AI—especially inference. Vera Rubin was built for running agents."
Vera CPU and the Computational Needs of Agents
- "All previous CPUs were built for humans. This CPU is built for agents."
- "Agents have zero patience. Their world operates in nanoseconds—not seconds. When using tools or querying databases, agents demand instant responses."
- "The Vera CPU is purpose-built for agents—prioritizing single-thread performance, instructions-per-clock (IPC), per-core bandwidth, and total system bandwidth."
- "This market will inevitably dwarf the last one—because there will be far more agents than humans, and agents are profoundly impatient. This is the NVIDIA Vera CPU."
The Next-Generation Personal Computer
- "The future agent computing paradigm will run in AI clouds, inside enterprises—and on your PC."
- "The next OS will be today’s OS augmented with an LLM. In many ways, the LLM is the modern-day DirectX—a smart extension of the computer itself."
- "Applications will be replaced by agent runtimes. Modern apps will become agents."
- "NVIDIA and Microsoft are jointly redefining the PC—introducing a new generation of Windows machines spanning desktops, laptops, and workstations."
Physical AI, Autonomous Driving, and Robotics
- "Language models are trained on human-perspective data—but robots must understand the world from their own perspective. The biggest challenge in physical AI is data."
- "Cosmos 3 is the frontier foundational model for physical AI—it can understand, reason, generate, simulate in closed-loop, and even serve as policy itself."
- "With AI, compute itself becomes data. Cosmos 3 can train additional AI models—and be enhanced into your proprietary model."
- "Whether cloud agents, PC agents, autonomous driving systems, or humanoid robots—the underlying computing paradigm remains identical: models, frameworks, tool skills, and runtimes."
Jensen Huang Names Taiwan Snacks the AI Supply Chain
Jensen Huang:
The scale of Taiwan’s ecosystem today is truly extraordinary. When most people talk about ecosystems, they first think of our software stack, or the developer ecosystem built atop NVIDIA’s computing platform. But NVIDIA’s ecosystem extends much further: upstream into Taiwan’s supply chain, where everything originates, and downstream into data centers, ultimately reaching end users.
Today, we’ll touch nearly every link in this ecosystem. There are countless individuals and organizations deserving of gratitude. I love this ecosystem—it hosts so many companies and some of my favorite ecosystem partners. Taiwan possesses an exceptionally rich ecosystem—the world’s best supply-chain ecosystem.
The AI Agent Era Has Arrived
Jensen Huang:
Two years ago, when I stood here, I began discussing with you how AI would evolve from generative AI to the next wave—agent AI. Today, we can confidently declare: Agent AI has arrived—and useful AI has arrived.
From an industry standpoint, this means explosive demand for tokens. If AI can truly perform tasks, people will want to produce more of that capability. Tokens are now profitable, revenue-generating units. Because they generate income, AI companies will strive to generate more tokens and construct more AI factories, which explains the surge in computing demand across Taiwan.
This is why everyone is so busy—and why business performance is so strong. Indeed, it’s reflected in the stock prices of some of your companies. The computing paradigm has changed—everything has changed.
First key point: Useful AI has arrived—and AI is now both a profit generator and a GDP generator. Behind it lies an entirely new computing paradigm—not just LLMs, but agents. Nearly everything we discuss today builds upon this foundation.
Let me clarify what I mean. Here is an agent—an agent application. Previously, this space housed applications: code running on top of an operating system. Today, it’s an agent—comprising one or more LLMs, hosted within an agent framework that coordinates work to deliver productive outcomes.
When input enters the system, the agent must understand, observe, reason, act—and use tools. Tools may include spreadsheets, web browsers, data processing engines, or database engines. Every information flow—whether contextual processing, situational understanding, reasoning about next steps, or generating executable plans—requires software coordination.
Thus, at its core, an agent is such a system. It manages short-term (working) memory and long-term memory—just like humans. Memory management therefore becomes critically important. The entire system is called an agent. The LLM handles thinking; the agent framework connects everything—like an operating system.
This is the new computing paradigm—and why agents achieve astonishing results. It’s a major breakthrough: LLMs now excel at thinking, reasoning, planning, and using tools—and agent frameworks now effectively manage memory, coordinate workflows, and invoke tools. As a result, we can now accomplish many things previously impossible.
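To make the structure described above concrete, here is a minimal sketch of an agent as "a model plus a framework." It is an illustration only, not NVIDIA’s implementation: `toy_model`, `calculator`, and the `Agent` class are hypothetical names, and the "model" is a hard-coded stub standing in for an LLM. The framework part is the loop that holds working memory, invokes tools, and records the outcome in long-term memory.

```python
from dataclasses import dataclass, field
from typing import Callable


def toy_model(goal: str, working_memory: list[str]) -> tuple[str, str]:
    """Hypothetical stand-in for an LLM: picks the next action from the context."""
    if not working_memory:
        # Nothing observed yet, so decide to use a tool.
        return ("calculator", "6*7")
    # A tool result is available, so produce the final answer.
    return ("finish", f"{goal} -> {working_memory[-1]}")


def calculator(expression: str) -> str:
    """Hypothetical tool: multiplies two integers written as 'a*b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


@dataclass
class Agent:
    """The framework: connects the model, the tools, and both kinds of memory."""
    model: Callable[[str, list[str]], tuple[str, str]]
    tools: dict[str, Callable[[str], str]]
    long_term_memory: list[str] = field(default_factory=list)

    def run(self, goal: str, max_steps: int = 5) -> str:
        working_memory: list[str] = []  # short-term memory for this task only
        for _ in range(max_steps):
            action, argument = self.model(goal, working_memory)
            if action == "finish":
                self.long_term_memory.append(argument)  # persists across tasks
                return argument
            observation = self.tools[action](argument)  # tool invocation
            working_memory.append(observation)
        raise RuntimeError("step budget exhausted before the agent finished")


agent = Agent(model=toy_model, tools={"calculator": calculator})
result = agent.run("compute 6*7")
print(result)
```

In a real system the stub would be replaced by calls to one or more LLMs, and the framework would add planning, retries, and richer memory management; the control flow, however, keeps the same shape.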
What Is a Token in an AI Factory?
Jensen Huang:
Tokens, DSX, GPUs, CPUs, Vera—we’ve built the next-generation system, Vera Rubin. Vera Rubin is not a chip—not even just a GPU. It starts with a GPU, but goes far beyond it. The entire end-to-end system—that is Vera Rubin.
It integrates GPUs, Vera Rubin NVLink 72, and the Vera CPU I’ll introduce shortly. It also includes the revolutionary Vera storage system, CX9, our DOCA software stack, and an embedded security processor. All data is encrypted, whether at rest, in transit, or in use. AI models are extremely valuable, so the entire system is secured and fully adheres to confidential computing principles.
Each component of this system, taken alone, would constitute a revolution. Vera Rubin is NVIDIA’s most ambitious engineering endeavor to date. All 40,000 of NVIDIA’s engineers contributed to Vera Rubin—and many of you in this room co-created the entire system. Vera Rubin is truly a miracle: not a chip, but a system composed of many interlocking components.
And it goes even further. Long ago, NVIDIA was a GPU company. Over the years, we evolved into a systems company. What you see now is the most complex system we’ve ever designed from scratch. Yet in the end, our customers and partners don’t want to buy computers—they want to build AI factories.
That’s why NVIDIA is transforming again. You’ll see our technologies scaling to full infrastructure dimensions. Our partners now operate at infrastructure scale too: power plants, cooling systems, grid providers, and numerous industrial enterprises—all part of our ecosystem. Ultimately, we’re building a full technology stack—as we once did with GPUs, Grace Blackwell, and NVLink 72—now expanding to full-stack systems enabling customers to build world-class AI infrastructure.
Helping customers build and deploy AI factories has become critically important—simply because compute equals revenue, compute equals profit. Without revenue or profit, it’s a loss.
You must understand this: an AI infrastructure may come online quickly, or it may take months. Its throughput may be high or low. Its elasticity and reliability may be excellent or poor. Its effective lifespan may be long or short. Because these are $50B, $60B, or even $100B investments, how fast the infrastructure ramps and how long it stays productive is extraordinarily important.
That’s why NVIDIA is a strong partner. We possess full integration capability—not just PowerPoint slides, but real infrastructure built end-to-end, all interconnected, and stress-tested at massive scale to ensure robust operation. Thus, our time-to-first-token, time-to-first-inference, and training startup times are faster.
Second, our throughput per watt—and tokens per watt—are world-class. That’s because we integrate everything, design everything from scratch, simulate the entire system, and apply extreme co-design. As shown earlier with the Vera Rubin rack—every design choice targets extraordinary throughput.
If your data center—or your factory—has 1 GW of power, that’s its hard ceiling—its total available generation capacity. Within that 1 GW, throughput per watt equals revenue—because every token generates profit, every token is revenue.
This is the future. Compute equals revenue. Performance per watt equals your revenue. Choosing the wrong architecture solely because chips are cheaper won’t yield real returns—you must optimize for revenue per watt. Buy more—and earn more.
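The power-ceiling argument can be written out as simple arithmetic. The sketch below assumes a linear model (revenue = power × tokens per second per watt × price per token × time), which is a simplification of the reasoning above. Only the 1 GW figure comes from the keynote; the throughput and price numbers are made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts: float,
                         tokens_per_second_per_watt: float,
                         usd_per_million_tokens: float,
                         utilization: float = 1.0) -> float:
    """Yearly revenue of a power-capped AI factory under a simple linear model."""
    tokens_per_year = (power_watts * tokens_per_second_per_watt
                       * utilization * SECONDS_PER_YEAR)
    return tokens_per_year / 1_000_000 * usd_per_million_tokens


POWER_CAP_WATTS = 1e9  # the 1 GW ceiling from the text

# Hypothetical architectures: B delivers twice the tokens per watt of A.
arch_a = annual_token_revenue(POWER_CAP_WATTS,
                              tokens_per_second_per_watt=1.0,
                              usd_per_million_tokens=2.0)
arch_b = annual_token_revenue(POWER_CAP_WATTS,
                              tokens_per_second_per_watt=2.0,
                              usd_per_million_tokens=2.0)
ratio = arch_b / arch_a
print(f"A: ${arch_a:,.0f}/yr  B: ${arch_b:,.0f}/yr  ratio: {ratio:.1f}x")
```

Because power is fixed, revenue scales directly with tokens per watt in this model. That is the sense in which a cheaper chip with lower efficiency can cost more in forgone revenue than it saves in purchase price.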
I stand before you today to announce: Vera Rubin is now in full production. The supply chain we’ve built for Vera Rubin is twice the scale of Grace Blackwell’s. Previously, assembling one Grace Blackwell rack took two hours—now it takes five minutes. So not only is capacity higher, but production throughput is dramatically faster—and we need all of it to meet surging demand.
This ecosystem is extraordinary. To support Grace Blackwell—and ramp Vera Rubin—millions of square feet of new capacity have come online. Thank you all. Vera Rubin is in full production. Thank you.
Introducing the Vera Rubin System
Jensen Huang:
Vera Rubin wasn’t built just for AI. Vera Rubin wasn’t built just to run AI—it was built to run agents. It is an agent-native system. Consider its complexity—and understand why agents represent the final frontier of computer science. It took decades to realize their potential and make them truly useful. A computer capable of running them must therefore be the world’s most advanced.
This is Vera Rubin. Let’s bring it onstage.
This is Vera Rubin, Vera Rubin NVLink 72. It’s part of the next-generation system, and I’ll share more at the next GTC. Today, we have much more to cover. This is the Vera CPU rack: 256 CPUs, fully liquid-cooled. I’ll introduce Vera shortly. This is the Vera BlueField system, which handles storage processing and security. And of course, there’s our Mellanox networking, with the world’s first co-packaged optics (CPO). This is Vera Rubin: a breathtaking convergence of technologies.
When we built Hopper, it was for pretraining—the dominant workload and most critical load at the time. Later, when we built Grace Blackwell, people said: “Jensen, NVIDIA excels at pretraining—so inference must be simple.” Do you remember? Many claimed: “Inference is simple—we can do it too.”
But you know: inference equals money. Models are incredibly complex—and achieving excellence simultaneously in ultra-low latency, rapid interaction, and high throughput is extremely difficult. That’s why we created NVLink 72.
Today, NVIDIA’s token cost is the lowest globally—not 10% lower, but orders of magnitude lower. This stems entirely from our extreme co-design, our deep understanding of inference’s computational model—and our creation of NVLink 72.
With Vera Rubin, we’ve moved beyond inference. Now it’s inference within agent systems. This is Vera Rubin. No cables. No hoses. No fans. Last time I showed it, cables were everywhere.
Vera CPU: The CPU for AI Agents
Jensen Huang:
The Vera CPU is built for the AI era. Until now, all CPUs were built for humans. We were users—and tenants. Humans interact with CPUs in a world measured in seconds—we rent CPU cores in the cloud, and more cores mean more rentable resources. The usage patterns and economics of legacy CPUs differ entirely from those of agents.
Agents have zero patience. Their world operates in nanoseconds—not seconds. When using tools, agents demand the fastest possible response; when querying databases, they expect instantaneous results. Each moment an agent waits blocks the next step—and the next—and the next. Therefore, our CPU must be as low-latency and interactive as possible.
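A small sketch shows why waiting compounds. It assumes a strictly sequential loop in which each step must finish its tool call before the next step can begin; the step count and latencies are hypothetical, chosen only to show the effect.

```python
def chain_time_seconds(steps: int,
                       think_seconds: float,
                       tool_latency_seconds: float) -> float:
    """Wall-clock time for a sequential think -> tool call -> think loop."""
    return steps * (think_seconds + tool_latency_seconds)


# Hypothetical workload: 1,000 dependent steps, 1 ms of model time per step.
STEPS = 1000
slow = chain_time_seconds(STEPS, think_seconds=0.001, tool_latency_seconds=0.050)
fast = chain_time_seconds(STEPS, think_seconds=0.001, tool_latency_seconds=0.0005)
print(f"50 ms tools: {slow:.1f} s   0.5 ms tools: {fast:.1f} s")
```

With dependent steps, per-call latency is multiplied by the length of the chain, so the tool and database path can dominate total time even when the model itself is fast.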
This is why we built the Vera CPU for the AI era. Within our system, it serves three roles. First, it’s used for thinking in Vera Rubin racks—each rack already contains two CPUs. You know we’re manufacturing and selling millions of Vera Rubin racks—and have already sold millions of Grace Blackwell systems. NVIDIA is now among the world’s largest CPU manufacturers.
One of the two CPUs in a Vera Rubin rack coordinates and manages GPUs, handles KV cache, and runs various software across the rack. We also deploy Grace BlueField for security and isolation. The Vera compute portion handles the agent framework—coordinating AI models, tool usage, and database access.
This data server is the Vera BlueField—the world’s fastest storage server and storage system. It’s essential because agents access memory at extreme speeds. Storage servers and CPUs now sit on the most expensive critical path in the data center.
That path is expensive for good reason: the economics of AI factories center on tokens—and tokens are created here. Naturally, you want to manufacture and generate as many tokens as possible. Economic value concentrates here—and CPUs and storage systems must never be bottlenecks.
Hence, the Vera CPU places enormous pressure on CPU architecture—driving us to build an entirely new architecture from scratch. This is a CPU the world has never seen—called Vera. It’s built for agents. All previous CPUs were built for humans—this CPU is built for agents.
First, Vera’s instructions-per-clock (IPC) must be exceptional—because we need to shrink latency and processing time. We prioritize single-thread performance—not raw throughput. Single-thread performance must be world-class—the best. So Vera’s IPC is extremely high—the highest globally: fetching, decoding, and executing 10 instructions per clock cycle.
Second, bandwidth for data moving in and out of the CPU must be world-class. This includes both per-core bandwidth and total system bandwidth. As I noted earlier, agent systems are inherently decoupled and distributed. When computation is decoupled and distributed, networking becomes critical—so we must move data between CPU cores, between CPU and storage, and between CPU and GPU as fast as possible.
Bandwidth around the system—and inside CPU cores—must be world-class, because CPU cores communicate with each other at extremely high bandwidth. They aren’t rented core-by-core—they collaborate as a unified whole. Vera’s cross-sectional bandwidth is astonishing. It’s the first system supporting PCIe Gen 6—and pioneers LPDDR5, delivering 1.2–2 TB/s bandwidth—2–3x that of top-tier CPUs.
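The two requirements above, high IPC and high bandwidth, map onto two textbook relations: execution time = instructions / (IPC × clock frequency), and transfer time = bytes / bandwidth. The sketch below applies them with hypothetical workload sizes and clock speed. The 10-wide figure and 1.2 TB/s are taken from the text, but the IPC a real program sustains depends on the workload and is usually below the peak width.

```python
def compute_seconds(instructions: float, ipc: float, clock_hz: float) -> float:
    """Single-thread execution time from the classic CPU performance equation."""
    return instructions / (ipc * clock_hz)


def transfer_seconds(num_bytes: float, bandwidth_bytes_per_second: float) -> float:
    """Time to move a block of data at a given sustained bandwidth."""
    return num_bytes / bandwidth_bytes_per_second


# Hypothetical task: 3 billion instructions on a 3 GHz core.
narrow = compute_seconds(3e9, ipc=5, clock_hz=3e9)   # assumed 5 instructions/clock
wide = compute_seconds(3e9, ipc=10, clock_hz=3e9)    # the 10/clock figure from the text

# Hypothetical transfer: 1.2 TB of data at 0.6 TB/s versus 1.2 TB/s.
move_slow = transfer_seconds(1.2e12, 6e11)
move_fast = transfer_seconds(1.2e12, 1.2e12)

print(f"compute: {narrow:.2f} s -> {wide:.2f} s")
print(f"transfer: {move_slow:.1f} s -> {move_fast:.1f} s")
```

Both terms sit on an agent’s critical path, which is why the text treats single-thread speed and bandwidth as joint requirements rather than trade-offs.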
This is the CPU for agents. This market will inevitably dwarf the last—because there will be far more agents than humans, and agents are profoundly impatient. This is the NVIDIA Vera CPU.
The Most Important Computing Paradigm for the Next Decade
Jensen Huang:
This is truly the most important slide. Its core conclusion is: this is the application model—and computing model—for the next decade. Agents, agent frameworks, and LLMs coordinated by those frameworks—every company will run them. Every company will become an agent company—every company will run agents internally—and every company will discover agents require their own operating system.
Every company asks us: How do we run agents securely? How do we build agents for our workloads? So we offer the NVIDIA Enterprise AI Agent Toolkit. You’ve watched me build it publicly, step by step.
Nearly everything NVIDIA does is transparent. If you revisit my GTC keynotes from five or ten years ago, you’ll see I’ve been speaking about today’s topics for years—because we’ve been preparing for this moment.
To build agent-as-a-service—or operational agents—enterprises need four things. First, models—ideally smarter, cheaper, and faster LLMs. Second, a framework to coordinate the entire system. Third, tools with skills—models want to use tools. Earlier, I showcased CUDA-X libraries—they’ll become powerful tools for agents. Fourth, a runtime—the OS that binds everything together.
This is the NVIDIA Agent Toolkit. It includes modifiable models—NVIDIA’s world-class open-source models. I’ll show more. You can run agents from anywhere—powerful ones like Claude Code or Codex—inside a framework called Open Shell, enabling highly secure internal deployment.
This Shell protects agents—enforcing security policies at all times. Privacy is protected. Permissions and privileges are explicitly assigned. Identity is safeguarded. Thus, Open Shell is being adopted globally. NVIDIA Open Shell is open-source—and you’ll see widespread adoption by companies including Red Hat, Canonical, and Microsoft.
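The idea of a runtime that checks identity and explicitly assigned permissions before every tool call can be sketched generically. This is not Open Shell’s API, which the keynote does not describe; `Policy`, `GuardedRuntime`, `PolicyViolation`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


class PolicyViolation(Exception):
    """Raised when an agent attempts something its policy does not allow."""


@dataclass(frozen=True)
class Policy:
    agent_id: str                  # the identity this policy is bound to
    allowed_tools: frozenset[str]  # permissions are listed explicitly


@dataclass
class GuardedRuntime:
    """Hypothetical runtime that enforces a policy on every tool call."""
    policy: Policy
    tools: dict[str, Callable[[str], str]]
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def call(self, agent_id: str, tool_name: str, argument: str) -> str:
        if agent_id != self.policy.agent_id:
            self.audit_log.append((agent_id, tool_name, "denied: unknown identity"))
            raise PolicyViolation(f"unknown agent {agent_id!r}")
        if tool_name not in self.policy.allowed_tools:
            self.audit_log.append((agent_id, tool_name, "denied: tool not permitted"))
            raise PolicyViolation(f"{tool_name!r} is not permitted for {agent_id!r}")
        self.audit_log.append((agent_id, tool_name, "allowed"))
        return self.tools[tool_name](argument)


runtime = GuardedRuntime(
    policy=Policy(agent_id="report-agent", allowed_tools=frozenset({"read_db"})),
    tools={
        "read_db": lambda table: f"rows from {table}",
        "delete_db": lambda table: f"deleted {table}",
    },
)
rows = runtime.call("report-agent", "read_db", "orders")
print(rows)
```

Here a call to `delete_db` raises `PolicyViolation` even though the tool exists, because the policy never granted it. Denying by default and logging every decision is one common way to make an agent’s privileges auditable.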
This is a critical runtime—and it’s fully optimized for NVIDIA’s ubiquitous AI platform. You can run Open Shell on any cloud, on-premises, or even on-device. Now you have tools and libraries agents can use, models you can modify or deploy directly, and agent frameworks. These frameworks can now run locally—or anywhere else.
One of my favorite agent use cases is chip design—the most critical work at NVIDIA. So naturally, we partnered with Cadence to build a chip-design super-agent. It’s orchestrated by Codex or Claude Code, accepts inputs like RTL, architecture diagrams, schematics, or specifications—and helps fix what needs fixing. We co-developed several super-agents—optimized for NVIDIA’s runtime using Nemotron.
NVIDIA is committed to building open models—empowering you and all of us to create our own agents. Today, we announce Nemotron 3 Ultra—the next-generation open model, and remarkably intelligent. Nemotron models provide not just the model—but all the data we used to train it.
We have a strong partner alliance, and you can see all the partners listed here. We collaborate and contribute data collectively. Through these partnerships, everything from models to training scripts to data is fully open-sourced for you. This is the best form of open models, and the world’s strongest open-model policy. The goal is simple: take everything, build upon it, improve it, and make it your own.
Nemotron 3 Ultra is 5x faster and 30% cheaper—and fully open. We’re deeply committed. This is Nemotron 3—and we’re already developing Nemotron 4. It’s precisely this complete toolkit—models, frameworks, tool skills, and runtimes—that enables every enterprise worldwide—like Cadence with its super-agent—to build its own agents.
NVIDIA’s Next-Generation Personal Computers
Jensen Huang:
Microsoft and NVIDIA will jointly reinvent the PC. This will be the new PC. Tomorrow night—our tomorrow night—I’ll join Satya to discuss the work we’ve advanced together over the past three years. Microsoft and NVIDIA spent years thoroughly rethinking how PCs operate—preparing for this moment.
As I mentioned earlier, the agent computing paradigm will run in AI clouds, inside enterprises—and on your PC. What happens when your PC hosts an autonomous agent? It assists you—and understands you. You can speak to it—and it sees you. You can ask it to read documents or conduct research. It does much more—I’ll demonstrate shortly.
The new OS is, of course, the traditional OS augmented with an LLM. In many ways, the LLM is the modern DirectX. It has inputs and outputs, understands prompts, interprets computer vision, generates video and audio—and serves as the modern intelligent extension of the PC—and of computing itself.
Above this, as I noted earlier, applications will be replaced by agent runtimes—and modern applications will be agents.
Ladies and gentlemen—the NVIDIA RTX Spark laptop. Thank you. My pockets are full. Here it is—the world’s most astonishing chip. This is N1X, co-developed with MediaTek. I think I just saw Rick. This is N1X—a beautiful chip. Frankly, it took 33 years to build.
Why? Because 100% of NVIDIA’s software stack runs natively here. Want digital biology? Done. Seismic processing? Done. Astrophysics? Done. Everything CUDA-related—physics, biology, genomics, AI—works flawlessly. All computer graphics? Done.
Every application NVIDIA has ever created—and every application Windows has ever run—has been meticulously optimized by Microsoft and NVIDIA to run perfectly on this machine. And now, it runs agents too. This is an incredible computer—and I’m deeply proud of it.
This machine can host a local Nemotron 3 Ultra model or a Nemotron 3 Super model, and it can connect to cloud-based models like Claude Code, Codex, or any other model reachable over the internet. It works, and it achieves astonishing results. RTX Spark is a reinvention of the laptop, but in truth, Microsoft and NVIDIA are reinventing the entire PC.
Today, we announce a new product line: three revolutionary Windows machines—covering desktops, laptops, and workstations. They’re 100% Windows-compatible, 100% CUDA-enabled, and 100% powered by NVIDIA AI Tensor Cores. Everything you run on any NVIDIA platform worldwide—runs here.
We’ve prepared a roadmap. This is a brand-new product family. Each architectural generation will feature desktops, laptops, and workstations—and the next generation will too. I’m thrilled—and honored—that 100% of the global PC industry has joined us to reinvent the PC. This is a new product line—and a new beginning.
Cosmos 3: The Foundational Model for Physical AI
Jensen Huang:
In language modeling, we train on English and other languages scraped from the internet—written and read by humans. But to create data for AI robots, we must capture data from the robot’s own perception and perspective. Most video data in the world is third-person—not first-person.
Therefore, data is the hardest problem for agent systems, robotic systems, and physical AI. You’ve seen us climb this ladder. We started with teleoperation, which is essentially human demonstration. This is not unlike the breakthrough of human feedback in reinforcement learning. Then we leveraged simulation, where Omniverse shines. This parallels verifiable rewards in reinforcement learning.
We use these systems to bootstrap AI models—specifically physical AI models. Eventually, we learn from third-person perspectives—and reproject them into first-person views. Through this bootstrapping process, we arrive at a world foundational model—one that understands the physical world from any perspective you desire: third-person, first-person, outside-in, inside-out. This is truly a breakthrough.
Today, we announce Cosmos 3. Cosmos 3 is the frontier of physical AI. We lead in language models—and many research them. But in physical AI—we are unequivocally the world’s strongest. I’m immensely proud of what our team achieved.
This is the foundational model for all your work. Whether you’re building robots, factory robots, or robots working in factories—if it involves the physical world, you now have a partner: Cosmos 3. It can understand and reason, generate, simulate in closed-loop—and even serve as policy itself. It leads global benchmarks. I’m deeply proud of Cosmos—and today, we announce Cosmos 3.
Previously, data + compute = AI. Now that we have AI, compute itself becomes data. So, use Cosmos 3 to train a broad suite of AI models. Cosmos is an outstanding open-model system—identical to Nemotron. We open-source the model, the data, and even the training methodology—so you can enhance it yourself—and make Cosmos your proprietary model.
Alpamayo 2: Autonomous Driving Inference
Jensen Huang:
Today, we announce Alpamayo 2, an open model for autonomous vehicles. We’re partnering with global automakers. The brands already adopting NVIDIA Hyperion and building NVIDIA Hyperion cars account for roughly 80% of global vehicle production.
There will be vast numbers of NVIDIA Hyperion systems—capable of running Alpamayo, and any other autonomous driving stack. We also connect to mobility services—globally, ~97% of mobility services are integrating with us. So when we deploy Alpamayo on Hyperion’s runtime and Halos OS, we connect to these global services.
Isaac GR00T: Humanoid Robots
Jensen Huang:
NVIDIA Isaac GR00T is our humanoid robotics stack, encompassing models, data generation, simulation, runtimes, and operating systems. Together, these make up the Isaac GR00T platform.
You’ll see every system follows the same pattern: whether cloud-based agent systems, PC-based agent systems, robotic systems for autonomous vehicles, or humanoid robotic systems—the pattern is identical.
Of course, in each case, we build everything end-to-end. We vertically integrate, fully co-design, apply extreme co-design—and then open it up for anyone to use any part as needed. Want to use something? We’ll even help you modify it.
Yet one thing remains missing: robotics needs a reference platform. These systems are too complex—packed with motors and sensors—and too fragile. We need a way to deliver reference platforms—just as we did for PCs, DGX, cloud, and autonomous vehicles—now for robotics too.
Today, we announce NVIDIA Isaac GR00T—a fully integrated humanoid robotics reference platform. Each hand has 25 degrees of freedom; the robot body has 31; it stands 6 feet tall and weighs 150 pounds. Just like me—except the first number is smaller, the second larger, and otherwise quite similar.
This platform runs the new Thor chip—and our full software stack, data generation stack, simulation stack, and runtime. Everything is integrated into one robot platform—available to all. We built it for higher education and university researchers—because building such a platform themselves is simply too difficult.
Recap and Summary
Jensen Huang:
The computer industry has been utterly transformed over the past six months. This change occurred because agents have finally been realized—and converged with the latest frontier models—making AI truly capable of useful work.
This computing pattern repeats endlessly: an agent comprises a model and a framework, uses tools with skills, and runs on a runtime. The runtime depends on whether it’s in the cloud, on-premises, on a PC, or in a robot—but the computing pattern is identical.
You’ll choose frameworks based on preference, and models based on preference. You’ll adapt them for proprietary use. You’ll build super-agents and rent them to others to help get work done. This agent platform, this agent paradigm, is exactly what the NVIDIA Enterprise AI Agent Toolkit supports. For you, it’s a great way to engage with AI. For us, it’s a massive growth opportunity.
Vera Rubin is now in full production. Grace Blackwell was built for AI—especially inference. Vera Rubin was built for running agents. It’s now in full production. It’s far more than a GPU—it’s a fully decoupled, distributed agent-processing system.
NVIDIA has truly become an infrastructure company. Not just a GPU company—not just a systems company—but an infrastructure company. Our mission is to help you maximize revenue and profit—and do it as quickly as possible.
In the agent world, this new computing paradigm means CPUs must be built for agents—not for humans. Agent-specific CPUs have unique requirements. Our NVIDIA Vera is revolutionary. I’m thrilled by its ramp and order traction—it will be NVIDIA’s fastest and most successful product launch in history.
NVIDIA and Microsoft have jointly created a new PC product line. This is a new beginning. Of course, the same agent processing paradigm—the same agent computing paradigm—I described earlier will run across devices. I mentioned PCs—but soon it will appear in robots, satellites, base stations, factories, cloud, on-prem, and edge devices. This agent AI system—and agent computing paradigm—will replicate across all computing platforms. Our understanding of the personal computer may well change.