
Where is the way out for homogeneous AI infrastructure?
TechFlow Selected

This study aims to explore which areas of artificial intelligence are most important for developers, and which areas in Web3 and artificial intelligence might represent the next emerging opportunity.
Authored by: IOSG Ventures
Special thanks to feedback from Zhenyang@Upshot, Fran@Giza, Ashely@Neuronets, Matt@Valence, Dylan@Pond.
Before sharing new insights, we're excited to announce that we participated in the first round of financing for RedPill, totaling $5 million. We’re thrilled about this collaboration and look forward to growing together with RedPill!

TL;DR
As the convergence of Web3 and AI becomes a focal point in the crypto space, AI infrastructure in crypto is booming. However, there are still few actual applications leveraging AI or built specifically for AI, and homogenization among AI infrastructure projects is becoming apparent. Our recent participation in RedPill’s first funding round has provided deeper insights.
- Main tools for building AI Dapps include decentralized OpenAI access, GPU networks, inference networks, and agent networks.
- GPU networks are even hotter than during the "Bitcoin mining era" because: the AI market is larger and growing rapidly and steadily; AI supports millions of applications daily; AI requires diverse GPU models and server locations; the technology is more mature than before; and the customer base is broader.
- Inference networks and agent networks share similar infrastructure but focus on different aspects. Inference networks primarily let experienced developers deploy their own models, and running non-LLM models does not necessarily require GPUs. Agent networks focus more on LLMs: developers don't need to bring their own models, emphasizing instead prompt engineering and connecting different agents. Agent networks always require high-performance GPUs.
- AI infrastructure projects promise significant potential and continue rolling out new features.
- Most crypto-native projects remain in the testnet phase, suffering from poor stability, complex configuration, and limited functionality, and still need time to prove their security and privacy.
- Assuming AI Dapps become a major trend, many untapped domains remain, such as monitoring, RAG-related infrastructure, Web3-native models, decentralized agents with built-in crypto-native APIs and data, and evaluation networks.
- Vertical integration is a clear trend. Infrastructure projects aim to offer one-stop solutions that simplify development for AI Dapp builders.
- The future will be hybrid: some inference will happen on the frontend, while key computations occur on-chain, balancing cost and verifiability.

Source: IOSG
Introduction
The fusion of Web3 and AI is one of the most watched topics in the current crypto landscape. Talented developers are building AI infrastructure for the crypto world, aiming to bring intelligence into smart contracts. Building AI dApps is an extremely complex task: developers must manage data, models, computing power, operations, deployment, and integration with blockchains. To address these needs, Web3 founders have developed early-stage solutions such as GPU networks, community-based data labeling, community-trained models, verifiable AI inference and training, and agent marketplaces.
Despite this flourishing infrastructure, there are still few real-world applications utilizing AI or built specifically for AI. When developers search for tutorials on building AI dApps, they find few resources related to crypto-native AI infrastructure; most only cover calling the OpenAI API from the frontend.

Source: IOSG Ventures
Current applications fail to fully leverage blockchain's decentralization and verifiability, but this will soon change. Most AI-focused crypto infrastructure projects have launched testnets and plan to go live within the next six months.
This report details the main tools available in AI infrastructure within the crypto domain. Let's prepare for crypto's GPT-3.5 moment!
1. RedPill: Decentralized Access to OpenAI
The previously mentioned RedPill, in which we invested, serves as a great entry point.
OpenAI offers several world-class models like GPT-4-vision, GPT-4-turbo, and GPT-4o—ideal choices for building advanced AI Dapps.
Developers can integrate OpenAI API into dApps via oracles or frontend interfaces.
RedPill aggregates various developers’ OpenAI API keys under a single interface, delivering fast, affordable, and verifiable AI services globally—democratizing access to top-tier AI model resources. RedPill’s routing algorithm directs developer requests to individual contributors. API calls are executed through its distribution network, bypassing potential restrictions from OpenAI and solving common problems faced by crypto developers, such as:
- TPM (Tokens Per Minute) limits: new accounts face usage caps that cannot meet the demands of popular, AI-dependent dApps.
- Access restrictions: certain models restrict access for new accounts or for users in specific countries.
By simply changing the hostname while using the same request code, developers can access OpenAI models affordably, scalably, and without limitations.
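In practice, the "hostname swap" can be as small as changing the base URL of an otherwise identical OpenAI-style request. A minimal sketch; the RedPill hostname below is a placeholder, not the real endpoint:

```python
import json

# Hypothetical RedPill hostname; the real endpoint comes from RedPill's docs.
REDPILL_BASE = "https://api.redpill.example/v1"
OPENAI_BASE = "https://api.openai.com/v1"

def chat_request(base_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completion request.
    Only the base URL differs between direct OpenAI access and RedPill."""
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

direct = chat_request(OPENAI_BASE, "sk-...", "gpt-4o", "hello")
routed = chat_request(REDPILL_BASE, "rp-...", "gpt-4o", "hello")
# Same request code; only the hostname changed.
assert direct["body"] == routed["body"]
```

Because the request body and headers are unchanged, existing OpenAI client code keeps working after the switch.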


2. GPU Networks
Besides using OpenAI APIs, many developers choose to host models themselves. They can rely on decentralized GPU networks such as io.net, Aethir, Akash—popular platforms enabling developers to build GPU clusters and deploy powerful internal or open-source models.
These decentralized GPU networks harness computing power from individuals or small data centers, offering flexible configurations, wider server location options, and lower costs—enabling developers to experiment with AI affordably. However, due to their decentralized nature, such networks still face limitations in functionality, usability, and data privacy.

Over the past few months, demand for GPUs has surged beyond the previous Bitcoin mining boom. Key reasons include:
- Growing customer base: GPU networks now serve AI developers, who are numerous, loyal, and less affected by cryptocurrency price volatility.
- More diverse models and specs than mining-specific hardware, better meeting varied needs: large models require high VRAM, while smaller tasks have suitable GPU options. Decentralized GPUs also enable proximity to end users, reducing latency.
- Maturing technology: GPU networks leverage high-speed blockchains like Solana for settlement, Docker virtualization, and Ray computing clusters.
- Better ROI outlook: the AI market is expanding, with abundant opportunities for new apps and models; the expected return on H100 GPUs is 60-70%, whereas Bitcoin mining is more competitive and constrained by limited output.
- Bitcoin mining firms like Iris Energy, Core Scientific, and Bitdeer are now supporting GPU networks, offering AI services, and actively purchasing AI-optimized GPUs like the H100.
Recommendation: For Web2 developers who do not require strict SLAs, io.net offers a simple, user-friendly experience and excellent cost-effectiveness.
3. Inference Networks
This is the core of native crypto AI infrastructure. These networks will support billions of AI inference operations in the future. Many AI layer1s or layer2s provide developers with native on-chain AI inference capabilities. Market leaders include Ritual, Valence, and Fetch.ai.
These networks differ in several key aspects:
- Performance (latency, computation time)
- Supported models
- Verifiability
- Pricing (on-chain gas cost, inference cost)
- Developer experience
3.1 Goal
Ideally, developers should be able to easily access custom AI inference services from anywhere and in any form, with minimal friction during integration.
Inference networks provide all essential foundational support needed by developers: on-demand proof generation and verification, inference computation, relay and validation of inference data, Web2 and Web3 interfaces, one-click model deployment, system monitoring, cross-chain operations, synchronized integration, and scheduled execution.

Source: IOSG Ventures
With these capabilities, developers can seamlessly integrate inference services into existing smart contracts. For example, when building a DeFi trading bot, it could use machine learning models to identify optimal buy/sell moments for specific trading pairs and execute strategies on underlying platforms.
In an ideal scenario, all infrastructure would be cloud-hosted. Developers upload their trading strategy models in standard formats like Torch, and the inference network stores and serves them for both Web2 and Web3 queries.
Once model deployment is complete, developers can directly invoke model inference via Web3 API or smart contract. The inference network continuously executes these trading strategies and feeds results back to the base smart contract. If managing large community funds, verification of inference results may be required. Upon receiving verified results, the smart contract executes trades accordingly.
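The trading-bot flow above can be sketched end to end. Everything here is mocked: the client class, the `infer`/`verify` method names, and the model ID are illustrative stand-ins, not a real inference network's API:

```python
# Mocked sketch of: invoke a deployed model, verify the result, then act.

class MockInferenceClient:
    """Stand-in for an inference network's Web3/Web2 API."""

    def infer(self, model_id, features):
        # Pretend the deployed model maps momentum to a trade signal.
        signal = "buy" if features["momentum"] > 0 else "sell"
        return {"signal": signal, "proof": "0xabc"}

    def verify(self, result):
        # Verification matters when the strategy manages pooled funds.
        return result.get("proof") is not None

def run_strategy(client, features):
    result = client.infer("trading-model", features)
    if not client.verify(result):
        raise RuntimeError("unverified inference result")
    return result["signal"]

assert run_strategy(MockInferenceClient(), {"momentum": 0.4}) == "buy"
```

On a real network the `verify` step would check a proof on-chain before the smart contract executes the trade.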

Source: IOSG Ventures
3.1.1 Asynchronous vs Synchronous
Theoretically, asynchronous inference offers better performance, but it can hurt developer experience.
With async execution, developers must first submit tasks to the inference network’s smart contract. Once completed, the result is returned by the contract. This splits logic into two parts: inference invocation and result handling.

Source: IOSG Ventures
Nested inference calls and extensive control logic make things worse.

Source: IOSG Ventures
Asynchronous programming makes integration with existing smart contracts difficult, requiring extra code, error handling, and dependency management.
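A toy simulation of that two-part split, with the network and callback mechanics mocked in plain Python (no real chain or model involved):

```python
import queue

# Minimal simulation of the async inference pattern described above.
# The class and method names are illustrative, not a real API.

class InferenceNetwork:
    def __init__(self):
        self.pending = queue.Queue()

    def submit(self, model_id, inputs, callback):
        # Part 1: the dApp submits a task and returns immediately.
        self.pending.put((model_id, inputs, callback))

    def process(self):
        # Later, an off-chain node runs the model and invokes the callback.
        while not self.pending.empty():
            model_id, inputs, callback = self.pending.get()
            result = sum(inputs) / len(inputs)  # stand-in for real inference
            callback(result)

results = []
net = InferenceNetwork()
net.submit("price-model", [1.0, 2.0, 3.0], results.append)  # invocation
assert results == []   # nothing yet: logic is split into two parts
net.process()          # the result arrives in a separate, later step
assert results == [2.0]
```

The two asserts make the pain point concrete: between `submit` and the callback, the calling contract must park its state and handle errors itself.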
Conversely, synchronous programming is more intuitive for developers but introduces challenges in response time and blockchain design. For instance, if input data like block timestamp or price changes rapidly, the data might be stale by the time inference completes—potentially causing transaction rollback. Imagine making a trade based on outdated pricing.
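The staleness problem can be sketched as a guard that refuses to act on inference computed against outdated input; the freshness threshold and timestamps below are illustrative:

```python
# Illustrative guard mirroring the rollback behaviour described above.

def execute_if_fresh(quote_price, quote_time, now, infer, max_age=2.0):
    """Run a slow synchronous inference, then refuse to act if the
    quoted input went stale in the meantime."""
    decision = infer(quote_price)
    if now - quote_time > max_age:
        raise ValueError("input stale; transaction would roll back")
    return decision

infer = lambda p: "buy" if p < 100 else "hold"

# Fresh input: the inference result is still usable.
assert execute_if_fresh(95.0, quote_time=10.0, now=11.0, infer=infer) == "buy"

# Stale input: reject rather than trade on outdated pricing.
try:
    execute_if_fresh(95.0, quote_time=10.0, now=15.0, infer=infer)
    assert False, "should have rejected stale input"
except ValueError:
    pass
```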

Source: IOSG Ventures
Most AI infrastructures adopt async processing, but Valence is actively working to solve these issues.
3.2 Reality Check
In reality, many new inference networks are still in testing phases, such as Ritual Network. According to public documentation, current functionality is limited (e.g., verification and proof features not yet live). Instead of providing full cloud infrastructure for on-chain AI computation, they currently offer frameworks for self-hosted AI computation with results passed to chain.
Here’s an architecture for running AIGC NFTs: diffusion models generate NFTs uploaded to Arweave, and the inference network mints the NFT on-chain using the Arweave address.

Source: IOSG Ventures
This process is highly complex—developers must self-deploy and maintain most infrastructure, including customized Ritual nodes, Stable Diffusion nodes, and NFT smart contracts.
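In outline, the flow looks roughly like the following; all three services are mocked, and the function names are hypothetical stand-ins for a Stable Diffusion node, an Arweave gateway, and an NFT contract call:

```python
# Mocked sketch of the AIGC-NFT flow: generate -> upload -> mint.

def generate_image(prompt):
    """Stand-in for a Stable Diffusion node."""
    return b"\x89PNG" + prompt.encode()

def upload_to_arweave(data):
    """Stand-in for an Arweave upload; returns a content address."""
    return "ar://" + format(hash(data) & 0xFFFFFFFF, "08x")

def mint_nft(owner, token_uri):
    """Stand-in for the on-chain mint using the Arweave address."""
    return {"owner": owner, "tokenURI": token_uri}

image = generate_image("a cat on the moon")
uri = upload_to_arweave(image)
nft = mint_nft("0xYourAddress", uri)
assert nft["tokenURI"].startswith("ar://")
```

Even in this compressed form, three separately operated services appear, which is exactly the deployment burden the text describes.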
Recommendation: Current inference networks are quite complex for integrating and deploying custom models, and most do not yet support verification. Using AI in the frontend offers a relatively simpler alternative. If you absolutely need verification, ZKML provider Giza is a solid choice.
4. Agent Networks
Agent networks allow users to easily customize autonomous agents—entities or smart contracts capable of independently executing tasks, interacting with each other, and engaging with blockchain networks without direct human intervention. These primarily target LLM technology. For example, they could offer a GPT chatbot deeply knowledgeable about Ethereum. Currently, such bots have limited tooling, and developers cannot build complex applications atop them yet.

Source: IOSG Ventures
But in the future, agent networks will equip agents with more tools—not just knowledge, but also external API calling and task execution abilities. Developers will be able to connect multiple agents to build workflows. For example, writing a Solidity smart contract could involve specialized agents: protocol design agent, Solidity coding agent, code security review agent, and deployment agent.

Source: IOSG Ventures
We coordinate cooperation among these agents using prompts and scenarios.
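A toy version of that contract-writing workflow, with each "agent" reduced to a plain function standing in for an LLM-backed agent on a real network:

```python
# Toy pipeline wiring the four agents mentioned above.

def design_agent(spec):
    return f"// design for: {spec}"

def coding_agent(design):
    return design + "\ncontract Token {}"

def review_agent(code):
    # Trivial "security review": check the code looks like a contract.
    return ("contract " in code, code)

def deploy_agent(ok, code):
    if not ok:
        raise RuntimeError("review failed; refusing to deploy")
    return f"deployed: {hash(code) & 0xFFFF:#06x}"

design = design_agent("ERC-20 token")
code = coding_agent(design)
ok, reviewed = review_agent(code)
receipt = deploy_agent(ok, reviewed)
assert receipt.startswith("deployed:")
```

On an agent network, the same wiring would be expressed through prompts and scenario configuration rather than direct function calls.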
Examples of agent networks include Flock.ai, Myshell, Theoriq.
Recommendation: Most current agent functionality is relatively limited. For specific use cases, Web2 agents, backed by mature orchestration tools like LangChain and LlamaIndex, still offer better service.
5. Differences Between Agent Networks and Inference Networks
Agent networks focus more on LLMs and provide LangChain-like tooling to integrate multiple agents. Typically, developers don't need to develop ML models themselves; the agent network abstracts away model development and deployment. They simply link the necessary agents and tools. Often, end users interact directly with these agents.
In contrast, inference networks serve as the underlying infrastructure for agent networks, offering lower-level access. End-users typically don’t interact directly with inference networks. Developers must deploy their own models—not limited to LLMs—and can access them either off-chain or on-chain.
Agent and inference networks aren't entirely separate products. We’re already seeing vertically integrated offerings that combine both capabilities due to shared infrastructure dependencies.

6. Emerging Opportunities
Beyond model inference, training, and agent networks, there are many promising new frontiers in Web3:
- Datasets: How can blockchain data be transformed into machine-learning-ready datasets? ML developers need specific, thematic datasets. For example, Giza provides high-quality DeFi datasets tailored for ML training. Ideal datasets should go beyond tabular data to include graph data representing interactions in the blockchain world. Current efforts are lacking here. Projects like Bagel and Sahara incentivize individuals to create new datasets while ensuring data privacy.
- Model Storage: Large models pose storage, distribution, and versioning challenges, all critical for on-chain ML performance and cost. Pioneering projects like Filecoin, Arweave, and 0g are making progress.
- Model Training: Distributed and verifiable model training remains challenging. Gensyn, Bittensor, Flock, and Allora have made notable advances.
- Monitoring: Since model inference occurs both on- and off-chain, new infrastructure is needed to help Web3 developers track model usage and detect anomalies or biases. With proper monitoring tools, ML developers can adjust and optimize model accuracy in real time.
- RAG Infrastructure: Distributed RAG requires new infrastructure with high-performance storage, embedding computation, and vector databases, all while preserving data privacy. This differs significantly from current Web3 AI setups, where RAG often relies on third parties like Firstbatch and Bagel.
- Web3-Native Models: Not all models suit Web3 contexts. Most require retraining for applications like price prediction or recommendations. As AI infrastructure matures, we expect more Web3-native models serving AI applications. For instance, Pond is developing blockchain GNNs for price forecasting, recommendations, fraud detection, and anti-money laundering.
- Evaluation Networks: Evaluating agents without human feedback is hard. As agent-creation tools spread, countless agents will flood the market, and users need a system that showcases each agent's capabilities and identifies which performs best in a given scenario. Neuronets is one player in this space.
- Consensus Mechanisms: PoS may not be ideal for AI tasks, given computational complexity, verification difficulty, and lack of determinism. Bittensor created a novel "intelligent" consensus mechanism that rewards nodes contributing to ML models and outputs.
7. Future Outlook
We observe a clear trend toward vertical integration. By building a foundational compute layer, networks can support multiple ML tasks—including training, inference, and agent services—aiming to provide Web3 ML developers with comprehensive, one-stop solutions.
Currently, on-chain inference remains costly and slow, but it offers strong verifiability and seamless integration with backend systems like smart contracts. We believe the future lies in hybrid architectures: some inference will run on the frontend or off-chain, while critical, decision-making inference happens on-chain. This pattern already exists in mobile computing, which runs lightweight models locally on-device while offloading complex tasks to large LLMs in the cloud.