
From Computing Power Race to Algorithmic Innovation: The New AI Paradigm Led by DeepSeek
TechFlow Selected

"We can only see a short way into the future, but far enough to discover that there is much work to be done."
Author: BadBot, IOBC Capital
Last night, DeepSeek released an update to its V3 model on Hugging Face—DeepSeek-V3-0324, featuring 685 billion parameters and significant improvements in coding capabilities, UI design, and reasoning performance.
At the recently concluded 2025 GTC conference, Jensen Huang gave high praise to DeepSeek. He emphasized that the market’s previous belief—that DeepSeek's efficient models would reduce demand for NVIDIA chips—is mistaken. Future computing demands will only grow, not shrink.
As a star product representing algorithmic breakthroughs, what exactly is DeepSeek’s relationship with NVIDIA’s computing supply? First, I’d like to discuss the significance of computing power and algorithms in industry development.

Symbiotic Evolution of Compute and Algorithms
In AI, advances in compute enable more complex algorithms by providing the foundation to process larger datasets and learn more intricate patterns. Conversely, algorithmic optimization allows for more efficient use of compute, enhancing resource utilization.
This symbiotic relationship between compute and algorithms is reshaping the AI industry landscape:
Technical Diversification: Companies like OpenAI pursue ultra-large compute clusters, while others like DeepSeek focus on algorithmic efficiency, forming distinct technical schools.
Industry Chain Restructuring: NVIDIA dominates AI compute through its CUDA ecosystem, while cloud providers lower deployment barriers via elastic compute services.
Resource Allocation Shifts: Enterprises are balancing investments between hardware infrastructure and efficient algorithm development.
Rise of Open-Source Communities: Open-source models like DeepSeek and LLaMA enable shared innovation in algorithms and compute optimization, accelerating technological iteration and diffusion.
DeepSeek’s Technological Innovations
DeepSeek’s success is inseparable from its technological innovations. I’ll explain them in simple terms so most readers can understand.
Model Architecture Optimization
DeepSeek adopts a hybrid architecture combining Transformer with Mixture of Experts (MoE), enhanced by Multi-Head Latent Attention (MLA). This setup functions like a super team: the Transformer handles routine tasks, while MoE acts as specialized expert subgroups, each excelling in specific domains. When a particular problem arises, the most qualified expert processes it, greatly improving efficiency and accuracy. The MLA mechanism enables the model to flexibly focus on different critical details during information processing, further boosting overall performance.
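The routing idea behind MoE can be sketched in a few lines. This is a toy dense-gating version with made-up shapes; `moe_forward`, `gate_w`, and the expert matrices are all illustrative, and production MoE layers (DeepSeek's included) use learned routers, load balancing, and far more experts:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, top_k=2):
    """Route each input to its top-k experts and mix their outputs."""
    logits = x @ gate_w                             # (batch, n_experts) routing scores
    top = np.argsort(logits, axis=-1)[:, -top_k:]   # indices of the top-k experts
    out = np.zeros_like(x)
    for b in range(x.shape[0]):
        sel = logits[b, top[b]]
        weights = np.exp(sel - sel.max())
        weights /= weights.sum()                    # softmax over the selected experts only
        for w, e in zip(weights, top[b]):
            out[b] += w * (x[b] @ experts[e])       # weighted sum of expert outputs
    return out

d, n_experts, batch = 8, 4, 3
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # each expert is a tiny linear layer
gate_w = rng.normal(size=(d, n_experts))
x = rng.normal(size=(batch, d))
y = moe_forward(x, experts, gate_w)
```

Note that each input activates only `top_k` of the four experts, which is exactly why MoE scales parameter count without scaling per-token compute.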
Training Method Innovation
DeepSeek introduced the FP8 mixed-precision training framework—a smart resource allocator that dynamically selects appropriate computational precision based on training phase requirements. It uses higher precision when accuracy is crucial and lowers precision where acceptable, saving computational resources, accelerating training speed, and reducing memory usage.
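The precision-for-speed trade can be illustrated with a small simulation. The `fake_quantize` helper below is purely illustrative: it truncates the mantissa to mimic an FP8-style (e4m3) format, and the "critical" flag stands in for the framework's per-phase precision choice. This is a sketch of the idea, not DeepSeek's actual training framework:

```python
import numpy as np

def fake_quantize(x, mantissa_bits):
    """Round x to a reduced-mantissa float, simulating low-precision storage."""
    m, e = np.frexp(x)                    # x = m * 2**e, with 0.5 <= |m| < 1
    scale = 2.0 ** mantissa_bits
    return np.ldexp(np.round(m * scale) / scale, e)

def mixed_precision_matmul(a, b, critical=False):
    """Use full precision on accuracy-critical steps, ~FP8 mantissa elsewhere."""
    if critical:
        return a @ b                      # keep full precision where it matters
    return fake_quantize(a, 3) @ fake_quantize(b, 3)  # e4m3-style 3-bit mantissa

rng = np.random.default_rng(1)
a, b = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
exact = a @ b
approx = mixed_precision_matmul(a, b)
err = np.abs(exact - approx).max()        # small but nonzero approximation error
```

The design point is that most of a training step tolerates this small error, so storing and moving tensors at 8 bits roughly halves memory traffic versus FP16.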
Improved Inference Efficiency
During inference, DeepSeek employs Multi-token Prediction (MTP) technology. Traditional methods predict one token at a time, step by step. MTP, however, predicts multiple tokens simultaneously, significantly speeding up inference and lowering costs.
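The gap between step-by-step decoding and multi-token prediction can be shown with a toy generator. `toy_model` is a stand-in for a real language model, and real MTP verifies the speculated tokens against the base model, which this sketch omits:

```python
def generate_one_by_one(prompt, model, n_tokens):
    """Classic autoregressive decoding: one forward pass per token."""
    seq, passes = list(prompt), 0
    while len(seq) - len(prompt) < n_tokens:
        seq.append(model(seq, k=1)[0])
        passes += 1
    return seq, passes

def generate_multi_token(prompt, model, n_tokens, k=4):
    """MTP-style decoding: each forward pass proposes k tokens at once."""
    seq, passes = list(prompt), 0
    while len(seq) - len(prompt) < n_tokens:
        remaining = n_tokens - (len(seq) - len(prompt))
        seq.extend(model(seq, k=k)[:remaining])
        passes += 1
    return seq, passes

def toy_model(seq, k):
    """Stand-in 'model': emits the next k integers after the last token."""
    return [seq[-1] + i + 1 for i in range(k)]

a, p1 = generate_one_by_one([0], toy_model, 8)   # 8 forward passes
b, p2 = generate_multi_token([0], toy_model, 8)  # 2 forward passes
```

Same output sequence, a quarter of the forward passes: that ratio is where the inference speedup and cost reduction come from.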
Breakthroughs in Reinforcement Learning Algorithms
DeepSeek’s reinforcement learning algorithm, GRPO (Group Relative Policy Optimization), enhances model training. Reinforcement learning is akin to assigning a coach to the model, who guides learning through rewards and penalties. Traditional approaches such as PPO require training a separate value (critic) model, which consumes substantial extra compute; GRPO instead estimates advantages by comparing groups of sampled responses against each other—improving model performance while minimizing unnecessary computation and achieving a balance between capability and cost.
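GRPO's group-relative scoring step can be illustrated in a few lines. The reward numbers are made up, and this shows only the advantage normalization, not the full policy-gradient update:

```python
import statistics

def group_relative_advantages(rewards):
    """Score each sampled answer against its own group.

    Rather than training a separate value model, GRPO samples a group of
    responses per prompt and normalizes their rewards within the group:
    above-average answers get positive advantages, below-average negative.
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # guard against zero spread
    return [(r - mean) / std for r in rewards]

# Four sampled answers to one prompt, scored by a reward model (toy numbers):
adv = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because the baseline comes from the group itself, the critic network (and its memory and compute cost) disappears entirely.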
These innovations are not isolated techniques but form a cohesive technical system that reduces compute demands across the entire pipeline—from training to inference. Powerful AI models can now run on consumer-grade GPUs, dramatically lowering the barrier to AI adoption and enabling more developers and enterprises to participate in AI innovation.
Impact on NVIDIA
Many believe DeepSeek bypasses the CUDA layer, thus freeing itself from dependence on NVIDIA. In reality, DeepSeek directly optimizes algorithms using NVIDIA’s PTX (Parallel Thread Execution) layer. PTX is an intermediate representation language between high-level CUDA code and actual GPU instructions. By operating at this level, DeepSeek achieves finer-grained performance tuning.
The impact on NVIDIA is twofold: On one hand, DeepSeek becomes even more tightly integrated with NVIDIA’s hardware and CUDA ecosystem, and the lowered AI application barrier could expand the overall market size. On the other hand, DeepSeek’s algorithmic optimizations may shift market demand for high-end chips—AI models that once required H100 GPUs might now run efficiently on A100s or even consumer-grade graphics cards.
Significance for China’s AI Industry
DeepSeek’s algorithmic optimization offers a pathway for technological breakthroughs within China’s AI sector. Amid restrictions on advanced chip imports, the “software compensating for hardware” approach reduces reliance on cutting-edge foreign semiconductors.
Upstream, efficient algorithms ease pressure on compute demand, allowing compute service providers to extend hardware lifecycle and improve ROI through software optimization. Downstream, optimized open-source models lower the threshold for AI application development. Numerous small and medium enterprises—without access to massive compute resources—can now build competitive applications based on DeepSeek, fostering a wave of vertical AI solutions.
Profound Implications for Web3 + AI
Decentralized AI Infrastructure
DeepSeek’s algorithmic advances provide new momentum for Web3 AI infrastructure. Its innovative architecture, high efficiency, and low compute requirements make decentralized AI inference feasible. The MoE architecture is naturally suited for distributed deployment—different nodes can host different expert networks, eliminating the need for any single node to store the full model. This drastically reduces per-node storage and compute demands, enhancing flexibility and efficiency.
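A minimal sketch of the expert-placement idea, assuming a simple round-robin assignment; the node names are hypothetical, and a real deployment would also need routing, replication, and fault tolerance:

```python
def shard_experts(n_experts, nodes):
    """Assign expert subnetworks to nodes round-robin, so no node holds the full model."""
    placement = {node: [] for node in nodes}
    for e in range(n_experts):
        placement[nodes[e % len(nodes)]].append(e)
    return placement

placement = shard_experts(8, ["node-a", "node-b", "node-c"])
```

Each node now stores at most three of eight experts, which is the property that shrinks per-node storage and compute in a decentralized setting.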
The FP8 training framework further lowers the bar for high-end compute, enabling more devices to join the network. This not only reduces entry barriers for decentralized AI computing but also increases the network’s collective computational capacity and efficiency.
Multi-Agent Systems
Efficient, low-cost models also make coordinated multi-agent systems practical in Web3 scenarios, for example:
Smart Trading Strategy Optimization: Collaborative agents—including real-time market data analysis, short-term price forecasting, on-chain transaction execution, and result monitoring agents—work together to maximize user returns.
Automated Smart Contract Execution: Agents for contract monitoring, execution, and outcome verification operate in tandem, enabling automation of more complex business logic.
Personalized Portfolio Management: AI analyzes users’ risk preferences, investment goals, and financial conditions to identify optimal staking or liquidity provision opportunities in real time.
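A minimal sketch of such an agent pipeline, with entirely hypothetical agents passing a shared state object; real systems would wrap model calls, on-chain reads, and transaction signing behind each step:

```python
def market_data_agent(state):
    """Toy analysis step: flag an asset trading below its fair value."""
    state["signal"] = "buy" if state["price"] < state["fair_value"] else "hold"
    return state

def execution_agent(state):
    """Toy execution step: turn a buy signal into an order object."""
    state["order"] = {"side": "buy", "qty": 1} if state["signal"] == "buy" else None
    return state

def monitor_agent(state):
    """Toy monitoring step: record what the earlier agents decided."""
    state["log"] = f"signal={state['signal']}, order={'sent' if state['order'] else 'none'}"
    return state

def run_pipeline(state, agents):
    for agent in agents:          # each agent reads and extends the shared state
        state = agent(state)
    return state

result = run_pipeline({"price": 95.0, "fair_value": 100.0},
                      [market_data_agent, execution_agent, monitor_agent])
```

The point of the structure is separation of concerns: each agent can be upgraded, audited, or hosted independently while the pipeline contract stays fixed.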
“We can only see a short way ahead, but enough to realize there is much work to be done.” DeepSeek exemplifies how algorithmic innovation can overcome compute constraints, carving out a differentiated development path for China’s AI industry. By lowering application barriers, driving AI–Web3 convergence, reducing reliance on high-end chips, and empowering financial innovation, DeepSeek is helping reshape the digital economy. The future of AI is no longer just a race in compute; it is a race in the co-optimization of compute and algorithms. On this new track, innovators like DeepSeek are redefining the rules with Chinese ingenuity.