
τ Scaling: Huawei’s New Growth Engine for the Post-Moore Era
TechFlow Selected

From “shrinking size” to “compressing time”
For the past 60 years, the semiconductor industry has advanced by shrinking transistor dimensions—Moore’s Law—making transistors smaller, denser, and cheaper.
But this path has now hit a wall:
- Diminishing returns below the 7nm node
- Astronomical photolithography equipment costs
- Design costs for a single advanced-node chip exceeding $1 billion
- Cost per transistor that is now rising rather than falling
Huawei’s semiconductor team validated a new direction over six years, across 381 mass-produced chips:
Stop competing on size—start competing on time.
They introduced the τ Scaling theory:
Treat “time” as the core optimization metric, compressing characteristic time τ across the entire stack—from transistor switching (picoseconds) to datacenter task completion (seconds)—spanning 12 orders of magnitude.
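The "12 orders of magnitude" figure follows directly from the two endpoints quoted above. A minimal check, assuming the endpoints are taken as exactly one picosecond and one second (the text gives them only as rough scales):

```python
import math

# Endpoints quoted in the text, treated here as round numbers (an assumption).
TRANSISTOR_SWITCH_S = 1e-12  # transistor switching: about one picosecond
DATACENTER_TASK_S = 1.0      # datacenter task completion: about one second


def orders_of_magnitude(short_s: float, long_s: float) -> float:
    """Return log10 of the ratio between two time scales given in seconds."""
    return math.log10(long_s / short_s)


span = orders_of_magnitude(TRANSISTOR_SWITCH_S, DATACENTER_TASK_S)
print(f"span: {span:.0f} orders of magnitude")
```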
In simple terms:
Instead of racing to shrink transistors, the race is now to cut latency and raise efficiency.
I. What Exactly Is τ Scaling?
τ denotes the characteristic latency, or time constant, at each of four layers:
- Transistor: switching speed
- Circuit: signal propagation delay
- Chip: compute and memory access latency
- System: end-to-end communication and synchronization time
The goal is to compress τ holistically across the full stack—optimizing process, circuit design, architecture, and system with one unified metric—ending siloed, disjointed optimization.
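One way to picture a single metric across the four layers is to express every layer's contribution in seconds on the same critical path. The sketch below is only an illustration of that idea. The `Layer` class, the additive model, and every number in it are hypothetical placeholders, not figures or methods from Huawei.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    tau_s: float           # characteristic time of one event at this layer (seconds)
    events_on_path: float  # events of this kind on the critical path (hypothetical)

    def contribution_s(self) -> float:
        """Time this layer adds to the task under a crude additive model."""
        return self.tau_s * self.events_on_path


# Placeholder values chosen only so the example runs; none come from the article.
LAYERS = [
    Layer("transistor", 1e-12, 1e9),
    Layer("circuit", 1e-10, 1e8),
    Layer("chip", 1e-7, 1e6),
    Layer("system", 1e-5, 1e5),
]


def total_time_s(layers: list[Layer]) -> float:
    """Sum every layer's contribution, all expressed in the same unit."""
    return sum(layer.contribution_s() for layer in layers)


def dominant_layer(layers: list[Layer]) -> Layer:
    """Return the layer that contributes the most time to the task."""
    return max(layers, key=Layer.contribution_s)


worst = dominant_layer(LAYERS)
print(f"total: {total_time_s(LAYERS):.3f} s, dominated by the {worst.name} layer")
```

With everything measured in seconds, it becomes possible to compare a process change against an interconnect change directly, which is the point of a unified metric. In practice the layers are nested rather than simply additive, so a real model would be more involved.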
II. Mobile Deployment: LogicFolding
Without upgrading process nodes, chips are vertically stacked using ultra-precise hybrid bonding to distribute critical paths across multiple layers—essentially “adding floors” to the chip.
- Transistor density: increased from 155 → 238 million/mm²—a 55% gain
- Energy efficiency: improved by 41%; peak frequency up nearly 13%
- SRAM frequency: increased by over 40%
- Kirin 2026 achieves 3.1 GHz; target: 4 GHz by 2029
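The figures in this list can be sanity-checked with simple arithmetic. The sketch below assumes that "Kirin 2026" refers to a 2026 baseline year and that the frequency target is approached at a constant compound rate. Both are assumptions, not statements from the article.

```python
def relative_gain(before: float, after: float) -> float:
    """Fractional increase from `before` to `after`."""
    return (after - before) / before


def annual_growth(start: float, end: float, years: int) -> float:
    """Constant compound yearly rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1


# Transistor density endpoints quoted above, in million transistors per mm^2.
density_gain = relative_gain(155, 238)

# Frequency endpoints quoted above; the 2026 baseline year is an assumption.
freq_growth = annual_growth(3.1, 4.0, 2029 - 2026)

print(f"density gain from quoted endpoints: {density_gain:.1%}")
print(f"implied frequency growth: {freq_growth:.1%} per year")
```

The quoted density endpoints work out to a gain of about 53.5%, slightly under the stated 55%, which suggests that the endpoints, the percentage, or both were rounded. The frequency target implies growth of roughly 9% per year under the constant-rate assumption.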
III. AI Datacenter Deployment: End-to-End Latency Compression
In AI clusters, 80% of energy consumption and 70% of cost stem from data movement—the key is compressing communication time.
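Why data movement is the lever can be seen with an Amdahl-style estimate: if only the data-movement share shrinks, the remainder sets a floor on the total. The sketch below takes the 80% and 70% shares from the text. The reduction factors are hypothetical, and it assumes that energy and cost fall in proportion to the data-movement share being reduced.

```python
ENERGY_SHARE = 0.80  # share of energy attributed to data movement (from the text)
COST_SHARE = 0.70    # share of cost attributed to data movement (from the text)


def remaining_fraction(moved_share: float, reduction_factor: float) -> float:
    """Fraction of the original total left when only the data-movement
    share is divided by `reduction_factor` (a hypothetical parameter)."""
    if not 0.0 <= moved_share <= 1.0:
        raise ValueError("moved_share must be between 0 and 1")
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return (1.0 - moved_share) + moved_share / reduction_factor


for factor in (2, 10):
    energy = remaining_fraction(ENERGY_SHARE, factor)
    cost = remaining_fraction(COST_SHARE, factor)
    print(f"{factor}x less data movement: energy {energy:.0%}, cost {cost:.0%} of original")
```

Under this simple model, halving data movement leaves 60% of the original energy, while halving everything else would leave 90%. That asymmetry is the argument for targeting communication first.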
1. Unified Bus
Eliminates multi-layer protocols, cutting remote-access latency from tens of microseconds to about 100 nanoseconds, roughly 500× lower.
2. Hi-ONE Optical Interconnect
8 Tb/s per module; replaces copper with fiber, extending reach from 1 meter to 100 meters—enabling support for 10,000-GPU clusters.
3. 3D Folding
Solves the 2.5D packaging bottleneck—where die area scales rapidly but interconnect bandwidth lags—by relocating memory, power delivery, and optical I/O onto vertical surfaces, scaling them in lockstep with compute.
- Prediction: AI hardware integration density will increase >100× by 2035
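The latency and bandwidth figures in this section can be related with back-of-the-envelope arithmetic. In the sketch below, the baseline latencies stand in for "tens of microseconds" and the 1 TB payload is hypothetical. Neither comes from the article, and the transfer-time estimate ignores protocol overhead and propagation delay.

```python
NS = 1e-9  # one nanosecond, in seconds
US = 1e-6  # one microsecond, in seconds

UNIFIED_BUS_LATENCY_S = 100 * NS  # "~100 nanoseconds" from the text
MODULE_RATE_BPS = 8e12            # "8 Tb/s per module" from the text


def speedup(before_s: float, after_s: float) -> float:
    """How many times shorter `after_s` is than `before_s`."""
    return before_s / after_s


def transfer_time_s(payload_bytes: float, link_bits_per_s: float) -> float:
    """Ideal time to push a payload through a link at its full rate."""
    return payload_bytes * 8 / link_bits_per_s


# "Tens of microseconds" covers a range; these baselines are assumptions.
for before_us in (10, 50, 90):
    ratio = speedup(before_us * US, UNIFIED_BUS_LATENCY_S)
    print(f"{before_us} us -> 100 ns: about {ratio:.0f}x lower latency")

# Hypothetical 1 TB payload over a single module.
print(f"1 TB over one module: {transfer_time_s(1e12, MODULE_RATE_BPS):.2f} s")
```

The stated 500× figure corresponds to a baseline of about 50 microseconds. Other baselines in the "tens of microseconds" range would give anywhere from about 100× to 900×.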
IV. Reintegration of Logic and Memory
Historically, CPUs and memory evolved separately. In the AI era, data movement is more critical than computation—logic and memory must be tightly integrated in 3D. Industry influence is shifting toward memory and packaging players.
V. Remaining Challenges
- EDA tools must adapt to 3D stacking design
- Wafer-to-wafer process variation and vertical interconnect losses require optimization
- New energy-efficiency and benchmarking standards must be developed
Conclusion
The era of Moore’s Law—defined by dimensional scaling—is over. The era of time scaling has begun.
We no longer need to obsess over cutting-edge lithography machines. With 3D stacking, system architecture innovation, and interconnect optimization, sustained gains in performance and energy efficiency remain fully achievable.
This will be the semiconductor industry’s defining roadmap for the next decade.