
a16z "Disciple" Kuzco Practical Guide II: From Solo Operation to Cluster Deployment

About half a month of preparation time remains before Epoch Two begins.
By J1N, Techub News
Introduction: From Epoch One to Two
Kuzco is a decentralized computing network dedicated to providing AI large language model (LLM) inference power through mining. It was selected for a16z's Crypto Startup Accelerator (CSX) fall program launched in New York on September 9, receiving at least $500,000 in funding and operational support from a16z. The accelerator program has now concluded.
On November 16, Kuzco announced that its first phase (Epoch One) incentive program will end on November 18, 2024. All operations will be paused, data snapshots will be permanently stored, and final point rankings will be published on a new leaderboard.
According to official disclosure, Epoch One launched on March 6, 2024, peaking with over 8,000 devices running Meta’s Llama-3 8B AI model, collectively generating over one trillion tokens of inference output.
The team also revealed plans to release fundraising details and project roadmap updates in the coming weeks, while announcing that the second phase (Epoch Two) will begin on December 9. Epoch Two will introduce several enhancements, including improved throughput and reliability for NVIDIA hardware, incentives for users to connect high-end GPUs such as A100 and H100, and expanded support for image generation and multimodal vision-language models (VLMs).
With two weeks remaining before the launch of Epoch Two, this article explores:
- Personal mining practices and outcomes, transitioning from single-device setups to cluster deployment.
- The full process of securing funding through research and hands-on experience, then building high-spec mining rigs.
- Hardware configuration alignment with project requirements, addressing common investor questions.
Review of Epoch One: Solo Mining
Configuration
My hardware setup includes the following RTX-series GPUs: 2060, 2070S, 3080, 4060, 4060Ti, and four 4070S units, plus two Apple devices (M2 and M3). These are distributed across multiple desktops, laptops, and one dedicated mining rig.
Cost
Notably, these GPUs were bought year by year for gaming rather than specifically for mining. When calculating costs, I therefore exclude the hardware purchase price and count only the electricity actually consumed. As an example, I use the mining rig described in my previous article, “a16z Disciple Kuzco Practical Guide: How to Efficiently Mine AI Compute Power?”
Rig specifications:
- Motherboard: Z490 (later replaced with industrial-grade board)
- CPU: 10th Gen i9
- GPUs: 2060, 2070S, 3080, 4060Ti, 4070S

DIY mining rig
The figure below shows the electricity consumed by this rig in October and November—totaling 564 kWh—and approximately 600 million KZO Points earned. Across all machines combined, about 1.1 billion points were accumulated. Actual electricity costs vary depending on local rates; this serves only as a reference.


Far right of the chart: total of 1 billion points earned
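As a rough illustration of the electricity figures above, the sketch below turns the rig's 564 kWh and roughly 600 million points into a cost and a points-per-kWh yield. The electricity rate is a hypothetical placeholder, not a figure from this article; substitute your local price.

```python
def electricity_cost(kwh: float, rate_per_kwh: float) -> float:
    """Energy cost in whatever currency rate_per_kwh is quoted in."""
    return kwh * rate_per_kwh


def points_per_kwh(points: float, kwh: float) -> float:
    """Points earned for each kWh consumed."""
    return points / kwh


RIG_KWH = 564               # from the article: October plus November
RIG_POINTS = 600_000_000    # from the article: approximate
ASSUMED_RATE = 0.10         # hypothetical price per kWh, replace with your own

cost = electricity_cost(RIG_KWH, ASSUMED_RATE)
yield_per_kwh = points_per_kwh(RIG_POINTS, RIG_KWH)

print(f"Electricity cost at the assumed rate: {cost:.2f}")
print(f"Points per kWh: {yield_per_kwh:,.0f}")
```

Because no value has been set for the points, this only expresses the result as points per unit of energy; it says nothing about profit.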
Preparing for Epoch Two: Cluster Deployment
Based on insights shared in my earlier article and extensive firsthand experience assembling, debugging, and deploying mining environments, I successfully secured financial backing, which was fully reinvested into building high-performance mining rigs to scale up compute capacity and improve operational efficiency.

Transitioning from solo DIY to clustered deployment
High-Spec Machine Configuration and Selection Logic
Leveraging practical experience from Epoch One, I optimized every component—including motherboard, CPU, GPU, PSU, platform, and networking—to achieve better compatibility, stability, security, and efficiency. I also prioritized hardware with strong resale value in the secondhand market, effectively lowering overall investment risk and offering future participants a more cost-effective path forward.

Motherboard
I chose an industrial-grade motherboard instead of mainstream options like B85, based on a comprehensive evaluation of performance, stability, and cost-effectiveness.
In terms of performance, running Kuzco’s Llama-3 model requires launching multiple Docker containers simultaneously, placing significant demand on CPU resources. CPUs compatible with the B85 chipset cannot meet this requirement.
Moreover, industrial motherboards offer clear advantages in long-term stable operation, heat resistance, and manufacturer warranty, along with higher liquidity in the used market, making them the optimal choice.
GPU
I selected the RTX 4070S as the primary GPU for several key reasons:
AI computing performance advantage: The step from the 30 series to the 40 series brings a much larger gain in AI workloads than in gaming. This is because AI computation relies heavily on CUDA core count, where 40-series GPUs have a substantial lead over their predecessors.
Energy efficiency advantage: I ran detailed tests on several GPUs to measure the average output per watt of power (Tokens/W, higher is better):
- 4060Ti (160W): 0.125 Tokens/W
- 3080 (330W): 0.22 Tokens/W
- 4090 (450W): 0.26 Tokens/W
- 4070S (220W): 0.38 Tokens/W
The results show that the 4070S achieves the best balance between performance and power consumption. Its superior energy efficiency directly reduces electricity costs, making it the most cost-effective option.
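The comparison can be reproduced with a few lines of Python. This is a minimal sketch that ranks the four cards by the Tokens/W figures listed above. The `implied_throughput` helper rests on an assumption the article does not state: that each Tokens/W value was computed against the wattage shown in parentheses, so multiplying the two recovers a relative throughput figure.

```python
# (rated watts, Tokens/W) taken from the test results listed in the article
GPU_RESULTS = {
    "4060Ti": (160, 0.125),
    "3080": (330, 0.22),
    "4090": (450, 0.26),
    "4070S": (220, 0.38),
}


def rank_by_efficiency(results):
    """Return GPU names ordered from most to least efficient."""
    return sorted(results, key=lambda name: results[name][1], reverse=True)


def implied_throughput(results):
    """Watts times Tokens/W for each card.

    Assumption: the efficiency figure was measured against the listed wattage.
    """
    return {name: watts * eff for name, (watts, eff) in results.items()}


ranking = rank_by_efficiency(GPU_RESULTS)
throughput = implied_throughput(GPU_RESULTS)

for name in ranking:
    watts, eff = GPU_RESULTS[name]
    print(f"{name}: {eff} Tokens/W at {watts}W, implied output {throughput[name]:.1f}")
```

On these numbers the 4070S delivers roughly 46% more output per watt than the 4090 (0.38 against 0.26), even though the 4090's implied total output is higher.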
Secondhand market pricing and liquidity: As a mid-to-high-end GPU, the 4070S maintains strong resale value and market liquidity, further reducing ownership costs and enabling flexible future upgrades.
CPU
As previously mentioned, running Kuzco’s Llama-3 model involves launching multiple Docker instances, resulting in heavy CPU utilization—especially under multi-GPU configurations, where CPU usage can reach 80%–90%. Thus, multi-core, multi-thread processing capability becomes critical. A high-performance, multi-threaded, and stable CPU not only supports smooth multitasking but also ensures consistent mining stability and efficiency.


A 13th-gen i5 reaches over 70% utilization under full GPU load
Network Environment

The soft router is the square box shown in the image
Network conditions are equally crucial in mining. Even with top-tier GPUs, poor network optimization can severely degrade compute performance. Based on my testing, insufficient bandwidth may reduce effective compute output by up to 70%, while low-quality network nodes might prevent connection to the Kuzco network altogether—both unacceptable outcomes. To address this, I adopted a soft router solution, which is easy to configure and runs efficiently with minimal intervention once set up. In theory, it can support unlimited device connections. Readers are encouraged to research specific implementation methods according to their own needs.
Power Supply

Classic Great Wall 2000W "nuclear" PSU
Special attention must be paid to peak power draw when selecting PSUs. That’s why, despite the rated total power of seven 4070S GPUs being only 1,540W, I still opted for dual 2000W PSUs, totaling 4,000W. This is not wasteful—it’s essential for system stability and safety.
GPUs exhibit transient peak power draws during operation—brief spikes reaching 1.5x or more of their rated power—before returning to normal levels. If the PSU cannot handle these peaks, it may trigger automatic shutdowns or even damage the GPUs, posing a fatal threat to mining operations.

4070S power consumption profile
For instance, although the 4070S has a TDP of 220W, its peak power can exceed 400W. For seven such cards, combined peak draw could surpass 3,000W. Hence, using dual 2000W PSUs ensures reliable operation. Users with multiple 4090s should pay particular attention: each 4090 has a TDP of 450W, but peak power can spike to 770W. In multi-GPU setups, even two PSUs may fall short, often requiring three to maintain stability.

4090 power consumption profile
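The sizing logic above can be sketched as a small calculation: add up the rated power, add up the transient peaks, and divide the peak total by the capacity of one PSU. The per-card figures come from this article; the six-card 4090 rig is a hypothetical example, and the sketch counts GPUs only, ignoring CPU, motherboard, and fans.

```python
import math


def psu_plan(gpu_count: int, rated_w: int, peak_w: int, psu_w: int):
    """Return (total rated W, total peak W, PSUs needed to cover the peak).

    Worst case: every GPU spikes at the same moment. GPU draw only.
    """
    rated_total = gpu_count * rated_w
    peak_total = gpu_count * peak_w
    psus_needed = math.ceil(peak_total / psu_w)
    return rated_total, peak_total, psus_needed


# Seven 4070S cards: 220W TDP, peaks of about 400W, 2000W PSUs
plan_4070s = psu_plan(gpu_count=7, rated_w=220, peak_w=400, psu_w=2000)

# Hypothetical six-card 4090 rig: 450W TDP, peaks up to 770W
plan_4090 = psu_plan(gpu_count=6, rated_w=450, peak_w=770, psu_w=2000)

print("7x 4070S:", plan_4070s)
print("6x 4090: ", plan_4090)
```

With peaks of exactly 400W, the seven 4070S cards total 2,800W, which one 2000W unit cannot cover and two can. Since real peaks can exceed 400W and the rest of the system also draws power, the actual margin is smaller than this GPU-only figure suggests.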
Additional Notes
Topics such as BIOS settings, hardware compatibility, and remote management are not covered in depth here. Numerous free tutorials are available online, and following established guides typically resolves most issues. It’s recommended to consult relevant resources tailored to your specific configuration for efficient problem-solving.
Risks and Returns
To answer the most frequently asked question: How much can you mine per day? Frankly, there’s no definitive answer—risk and reward always coexist. Let me share one clear insight: In both crypto and traditional industries, if a project allows precise daily profit calculation, chances are you’ve already missed the big gains. Unless you possess exclusive advantages—such as extremely low electricity rates or access to deeply discounted mining equipment—you likely won’t gain a meaningful edge. And such advantages aren't accessible to everyone.
I deliberately chose hardware with strong resale liquidity to minimize investment risk and financial pressure. In Kuzco mining, costs consist mainly of hardware depreciation and electricity, so the maximum potential loss is bounded by those two items. Without a cost advantage in at least one of them, the investment is hard to justify. It must be emphasized that early-stage mining has no predictable returns, but that uncertainty is precisely what gives pioneer miners their upside potential.
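The worst case described above can be written as a simple formula: hardware depreciation (purchase price minus resale price) plus electricity spent, assuming the points earned turn out to be worth nothing. Every number in the example call is hypothetical and serves only to show the shape of the calculation.

```python
def max_loss(purchase_price: float, resale_price: float,
             kwh_used: float, rate_per_kwh: float) -> float:
    """Upper bound on loss if mining rewards end up worthless."""
    depreciation = purchase_price - resale_price
    electricity = kwh_used * rate_per_kwh
    return depreciation + electricity


# Hypothetical figures for one GPU over one mining period
strong_resale = max_loss(purchase_price=600.0, resale_price=450.0,
                         kwh_used=400.0, rate_per_kwh=0.25)

# Same card and usage, but a weaker secondhand price
weak_resale = max_loss(purchase_price=600.0, resale_price=300.0,
                       kwh_used=400.0, rate_per_kwh=0.25)

print(f"Max loss with strong resale value: {strong_resale:.2f}")
print(f"Max loss with weak resale value:   {weak_resale:.2f}")
```

The two calls differ only in resale price, which shows why hardware that holds its secondhand value lowers the downside directly.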
Subjectively speaking, this sector holds immense market potential. On one hand, Kuzco has secured backing from a16z; on the other, demand for LLMs is rapidly expanding. Consider how widely LLMs are used today: the companies behind OpenAI’s ChatGPT, Meta’s Llama, and Musk’s xAI continue to attract massive investment, clearly signaling the industry’s growth trajectory.
For ordinary individuals, direct entry into the AI field is difficult. High technical barriers and enormous resource demands for training AI models place participation out of reach for most people. However, by joining the Kuzco AI compute network, everyday users can participate in this high-growth domain at manageable cost, contribute to AI infrastructure, and earn rewards in return.
Additionally, Bitcoin is nearing the $100,000 mark, up from about $16,000 in 2022, and at that level it carries significant downside risk. Investing directly in AI-related tokens carries similar volatility risks. In contrast, participating in an AI compute network offers a more stable alternative: costs are transparent and controllable, allowing relatively low-risk exposure to the rapid growth of the AI industry. Under current conditions, this represents one of the most feasible pathways for average individuals to enter the AI space.