
Interview with Kaito CEO: Building the Ultimate GPT for Web3
TechFlow Selected

TechFlow: Sunny
Kaito: Yu Hu

“Under this new data distribution paradigm, I firmly believe the opportunities brought by Web3 will completely transform the economic logic of tech giants monopolizing data.”
-- Yu Hu
How can users efficiently access Web3 information? For ordinary users, platforms like Twitter, Discord, Telegram, and media websites are the primary sources. More analytically capable users may opt for on-chain data explorers, governance forums, podcasts, or research reports. Web3 information is more fragmented than Web2, scattered across multiple crypto-native social applications and blockchains—akin to a treasure hunt—diverging sharply from search paths typified by Google.
Every industry evolves from chaos to order. Before traditional search engines emerged, information was extremely fragmented; users had to become experts knowing exactly which websites to visit for specific information. Google revolutionized this by enabling average users to efficiently index the entire internet, while large language models have further elevated information indexing to a new dimension.
Where does Web3 information retrieval stand today? Compared to just a few years ago, we’ve made significant progress: before Etherscan, Dune, and Nansen existed, searching for blockchain information was like finding a needle in a haystack. Yet even now, while the traditional world has moved beyond search engines into the era of large language models, Web3’s information indexing remains stuck in a pre-search engine era due to the lack of native search support. Users must still be information experts, knowing precisely where to find metrics such as TVL, daily active users, protocol revenue, community sentiment, and governance proposals. Kaito’s founder and CEO, Yu Hu, believes that years from now, when we look back, this primitive state will seem almost unimaginable.
As early as 2020, Yu Hu identified the core pain point in Web3 information indexing: extreme fragmentation, lack of organization, and incompatibility with traditional search engines like Google. He realized his personal need reflected an industry-wide demand. Consequently, he quit his job without hesitation and fully committed to building a Web3 search engine. As Yu put it: “I aim to lead Web3 information indexing from the pre-search engine era into the search engine era, and ultimately into the large language model era—delivering a brand-new, highly efficient way to index information for all industry participants and the next billion Web3 users.”
Kaito’s search engine leverages the Auto GPT framework and multiple ChatGPT backends to build an agent network capable of handling diverse tasks including search, information processing, data cleaning, and annotation. Its goal is to deliver higher-quality Web3 information services and actively explore user co-creation to enhance experience and expand economic returns.
In an in-depth conversation with Yu, we discussed how AI-powered large language models can empower Web3 users and explored the future of building a community-driven, decentralized AI search engine. Speaking as a media outlet, we also examined ways to integrate traditional media with artificial intelligence to improve the authenticity and uniqueness of information.
Key Highlights
- Under this new data distribution paradigm, I firmly believe the opportunities brought by Web3 will completely transform the economic logic of tech giants monopolizing data.
- In the Web2 era, most information resides on the internet. In the Web3 world, much of the information exists on blockchains, an information architecture entirely different from the internet. Crawling blockchain data requires setting up nodes, unlike Google’s universal crawler system.
- We hope to deeply co-create with users in the future. If users spot inaccurate information on our platform, we want a feedback mechanism allowing them to participate and collectively improve information quality.
- In the Web3 environment, we value data ownership and thus want users involved in data processing and product co-creation. The more users engage, the stronger our models become.
- Search engines and media share a fundamental upstream-downstream relationship: media outlets are among the information sources a search engine partners with. This is the essential dynamic.
Falling Down the Crypto Rabbit Hole
TechFlow: How did you transition from being a top student at Cambridge to a fund manager at Citadel, then a CryptoPunk holder, and finally founding a startup focused on Web3 and AI?
Yu:
My background is in business and economics, and I worked in traditional finance for about ten years, first in investment banking, then in hedge funds, eventually doing secondary market investing at firms like Citadel. Around 2017, I encountered cryptocurrency and became deeply interested in this emerging technology—not only because it introduced new tech but also because it represented a new asset class—so I began researching it in my spare time.
During the DeFi summer of 2020, I dedicated substantial time to research. DeFi stood out as relatively foundational because, unlike other areas, you could see all key metrics—TVL, revenue, etc.—enabling fundamental analysis. I conducted extensive research and looked for opportunities.
At the time, I deeply felt that information dissemination in crypto was chaotic and fragmented, reminiscent of the information asymmetry I’d seen in traditional financial markets. In traditional finance, there are many excellent tools to help retrieve information.
But in blockchain, even basic search engines couldn’t effectively retrieve blockchain-related information—like content on Twitter, Discord, and other social platforms. This made information gathering incredibly painful.
In 2021, I purchased a CryptoPunk, a milestone NFT that marked a pivotal moment for the industry. I became confident in Web3’s future, and that confidence continues today.
After weighing industry developments and personal interests, I decided at the end of 2021 to resign and launch a product aimed at solving information retrieval challenges—helping people like myself. That was the origin of my entrepreneurial journey.
TechFlow: What were your research focuses between 2017 and 2021? What key insights did you gain? How has your perspective on the industry evolved since 2017?
Yu:
For me, the biggest insight has been adopting a long-term mindset throughout this journey.
- My earliest insight was about different modes of financial interaction, given my finance background. This sparked deeper thinking about differing valuations of ownership concepts within fundamental frameworks.
I believe this is a profound idea because it has evolved into a foundational attribute across industries.
- In 2020 and 2021, I began reflecting on the technological dividends behind the rise of tech giants over the past two decades, companies like Google and Facebook.
But my deeper thought was: if this model continues, in 50 or 100 years, the tech landscape might be completely transformed—with data ownership becoming the most critical factor.
We currently use products like Google, Instagram, and Facebook for free, but the real value lies in the massive data behind these platforms. Users don’t fully grasp the value of their data, which is entirely controlled by tech companies.
Under this new data distribution paradigm, I firmly believe the opportunities brought by Web3 will completely transform this economic logic.
Data ownership will return to users, and new products will emerge through community co-creation. These innovations will reshape our worldview and redefine the logic and relationship between data and users.
Scale and Characteristics of Web3 Information: Decentralization and Interoperability
TechFlow: At Kaito, how do you integrate and achieve interoperability of Web3 information? How does this differ from Web2 approaches?
Yu:
Let me start by briefly introducing Kaito’s two core products.
- One is a professional search platform for institutional users, serving professionals such as researchers, journalists, and industry builders who spend significant time reviewing relevant information.
- The other is a search engine for the broader consumer market, akin to a Web3 version of Google.
Our integration process mainly consists of three aspects.
- First, sourcing: We identify which data is relevant to Web3. For example, we filter related content from platforms like Twitter and Discord, then technically integrate them.
- Second, organizing: We clean and annotate this data, transforming unstructured data into structured formats. We label data in our own database and may also leverage AI and large models for understanding.
- Third, making data readable: We determine how the data is presented to users. This could take various forms: search results, news feeds, charts, or chat interfaces. The ultimate goal is tight user engagement, making the data easier to work with.
These three steps are key to integrating data and delivering actionable value.
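The three steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not Kaito’s implementation: the `Record` type, the `WEB3_KEYWORDS` list, and the three function names are hypothetical, and simple keyword matching stands in for whatever relevance filtering and AI-based annotation a production system would use.

```python
from dataclasses import dataclass, field

# Hypothetical keyword list standing in for real relevance filtering.
WEB3_KEYWORDS = {"tvl", "defi", "nft", "dao", "airdrop", "staking"}


@dataclass
class Record:
    source: str  # e.g. "twitter" or "discord"
    text: str
    tags: list[str] = field(default_factory=list)


def source_step(raw: list[Record]) -> list[Record]:
    """Step 1 (sourcing): keep only records that mention a Web3 keyword."""
    return [r for r in raw if any(k in r.text.lower() for k in WEB3_KEYWORDS)]


def organize_step(records: list[Record]) -> list[Record]:
    """Step 2 (organizing): annotate each record with the keywords it matched."""
    for r in records:
        r.tags = sorted(k for k in WEB3_KEYWORDS if k in r.text.lower())
    return records


def present_step(records: list[Record]) -> list[str]:
    """Step 3 (making data readable): render records as search-result lines."""
    return [f"[{r.source}] {r.text} (tags: {', '.join(r.tags)})" for r in records]


# Made-up sample messages, for illustration only.
raw = [
    Record("twitter", "Protocol X TVL hits a new high after staking launch"),
    Record("discord", "Anyone up for lunch?"),
]
results = present_step(organize_step(source_step(raw)))
```

Keeping each step as its own function mirrors the separation Yu describes: the presentation layer could be swapped for a news feed, chart, or chat interface without touching sourcing or annotation.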
There are three main differences between Web3 and Web2 information.
- First, information dissemination is fundamentally different. In the Web3 era, information is inherently more decentralized and chaotic. In Web2, official media channels dominate; in Web3, even official accounts (e.g., FTX) often take a back seat to community-run accounts during major events. On platforms like Discord, information spreads in a far more decentralized manner.
- Second, the infrastructure hosting information differs. In Web2, most information resides on the internet. In Web3, much information lives on blockchains, a fundamentally different information architecture from the internet. Crawling blockchain data requires running nodes rather than relying on a generic crawler like Google’s.
- Third, the mode of information interaction differs. In Web2, processes like data cleaning and labeling are highly centralized, handled by large teams at companies like Google or behind ChatGPT. In Web3, much of this can be co-created with users, who are incentivized to contribute. For instance, communities and developers can together create novel search engines, delivering new search experiences for the Web3 ecosystem.
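To make the second difference concrete: blockchain data is read by querying a node rather than by fetching web pages. The sketch below assumes an Ethereum-style node that exposes the standard JSON-RPC method `eth_getBlockByNumber`. The helper names and the sample response are made up for illustration, and no network call is made, so it runs offline.

```python
import json


def build_block_request(block_number: int, request_id: int = 1) -> str:
    """Build a JSON-RPC request for one block, including full transactions."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBlockByNumber",
        "params": [hex(block_number), True],  # block numbers are hex-encoded
        "id": request_id,
    }
    return json.dumps(payload)


def parse_block(response_text: str) -> dict:
    """Decode the hex-encoded fields of a block response into plain values."""
    block = json.loads(response_text)["result"]
    return {
        "number": int(block["number"], 16),
        "timestamp": int(block["timestamp"], 16),
        "tx_count": len(block["transactions"]),
    }


# Made-up sample response, shaped like a node's reply, so this runs offline.
sample = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"number": "0x10", "timestamp": "0x5f5e100", "transactions": [{}, {}]},
})
request_body = build_block_request(16)
block = parse_block(sample)
```

In practice the request body would be POSTed to a node endpoint that the indexer operates or rents, which is the extra infrastructure a generic web crawler does not need.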
TechFlow: What is the current scale of public and private data in Web3? How do you expect this to evolve in the future?
Yu:
Based on our own data collection, we process around a million institutional messages daily. Including public data, this number could reach tens of millions. Once private data—from Telegram, Discord, etc.—is added, it would certainly exceed hundreds of millions. That’s our daily data volume. Looking ahead, we anticipate continued growth, driven by increasing user numbers and expanding information sources, whether from blockchains or centralized entities.
Moreover, the nature of information may also shift. Currently, most data relates to transactions, but as blockchain applications broaden, information from other domains will rapidly increase.
AI Empowering Web3
TechFlow: How do you leverage large language models to build your AI workflow system? Also, how do you handle various data sources and determine the best answer?
Yu:
Currently, we use the Auto GPT architecture, deploying multiple ChatGPT models on the backend to form an agent-based work system.
- Each agent handles different tasks. When a user submits a search query, our first agent analyzes its semantics and intent to determine which data source should be queried. We may have multiple agents, each specialized in a different domain, such as searching Twitter, Discord, or research databases.
- These agents communicate and collaborate to find the best answer, which we then evaluate for relevance. Within this framework, we currently use ChatGPT as the underlying large model, but we’re also exploring fine-tuning our own models or training them from scratch.
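A minimal sketch of the routing idea described above, under loud assumptions: keyword matching stands in for the LLM-based intent analysis, and `SOURCE_HINTS`, `route_query`, `source_agent`, and `answer` are hypothetical names, not part of Kaito’s system or of any agent framework.

```python
# Hypothetical keyword hints per data source; a real router would use an LLM.
SOURCE_HINTS = {
    "twitter": {"tweet", "sentiment", "trending"},
    "discord": {"community", "announcement", "discord"},
    "research": {"report", "tokenomics", "analysis"},
}


def route_query(query: str) -> list[str]:
    """Router agent: pick the sources whose hint words appear in the query."""
    words = set(query.lower().split())
    matched = [name for name, hints in SOURCE_HINTS.items() if words & hints]
    return matched or list(SOURCE_HINTS)  # no match: ask every source


def source_agent(name: str, query: str) -> dict:
    """Stand-in for a specialized agent; returns a placeholder answer."""
    return {"source": name, "answer": f"results for '{query}' from {name}"}


def answer(query: str) -> list[dict]:
    """Fan the query out to the routed agents and collect their answers."""
    return [source_agent(name, query) for name in route_query(query)]


routed = route_query("latest sentiment and research report on Arbitrum")
```

A relevance-evaluation step, as mentioned in the interview, would then rank the collected answers before anything is shown to the user.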
ChatGPT is a pre-trained model. It can answer questions from the knowledge captured in its training data, but it fails when queries fall outside that scope. Our integration instead uses its semantic understanding and logical reasoning to work with real-time information supplied in the prompt; this is known as in-context learning.
We still have many optimizations to make, hence the need for an agent network. Some complex queries may require advanced models like GPT-4, while simpler ones can be handled by lighter models, similar to how humans deploy different cognitive resources depending on task complexity. Dense reading material demands intense focus, while simple questions get quick answers.
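The two ideas in the preceding paragraphs, placing fresh information in the prompt and matching model size to query complexity, might look roughly like this. The complexity rule, the model labels `"large-model"` and `"small-model"`, and both function names are assumptions made for illustration; the interview does not describe how Kaito actually decides.

```python
def pick_model(query: str, long_query_words: int = 12) -> str:
    """Send long or multi-part queries to the larger model (hypothetical rule)."""
    is_complex = len(query.split()) > long_query_words or " and " in query.lower()
    return "large-model" if is_complex else "small-model"


def build_prompt(query: str, snippets: list[str]) -> str:
    """In-context learning: put freshly retrieved text into the prompt itself."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# The snippet below is a made-up placeholder, not real data.
prompt = build_prompt(
    "What is the TVL of Protocol X?",
    ["Protocol X TVL: placeholder figure from today's index"],
)
model = pick_model("What is the TVL of Protocol X?")
```

Because the model reads the snippets at query time, it can answer about events that happened after its training data was collected, without any retraining.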
Similarly, at the database level, we’ll run operations based on demand-side networks. In the future, we may decentralize database management to enable more efficient scaling.
In data cleaning, labeling, and processing, we hope to co-create value with users, as data is crucial for any AI company.
In the Web3 context, we emphasize data ownership and want users actively involved in data processing and product development. The more users engage, the stronger our models become. Better user experience attracts more users, creating a virtuous cycle where everyone shares in the economic benefits—this is our vision of co-creation.
TechFlow: How does Kaito help users combat misinformation in the blockchain industry and ensure data quality?
Yu:
Currently, we focus on three key areas.
First, we filter information sources. For example, on Twitter, we use social graph analysis to filter users and eliminate spam.
Second, we prioritize transparency of sources. When users interact with large language models like ChatGPT, they usually don’t know how answers are generated. We, however, annotate every search result with its source, helping users assess credibility. This is a key technical enhancement over traditional LLMs.
Finally, we aim for user co-creation. If users encounter irrelevant or false information on our platform, we provide feedback mechanisms so they can participate in improving overall quality.
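These three measures can be illustrated together in a short sketch. Everything here is hypothetical: the follow graph, the trusted seed set, the threshold of two trusted followers, and the function names are assumptions chosen to show the shape of the idea, not Kaito’s actual scoring.

```python
from collections import Counter

# Hypothetical follow graph: account -> set of accounts that follow it.
FOLLOWERS = {
    "analyst_a": {"fund_1", "dev_2", "fund_3"},
    "spam_bot": {"bot_9"},
}
TRUSTED = {"fund_1", "dev_2", "fund_3"}  # assumed seed set of credible accounts


def is_credible(author: str, min_trusted: int = 2) -> bool:
    """Keep authors followed by at least `min_trusted` trusted accounts."""
    return len(FOLLOWERS.get(author, set()) & TRUSTED) >= min_trusted


def search(posts: list[dict]) -> list[dict]:
    """Filter by author credibility and attach the source to every result."""
    return [
        {"text": p["text"], "source": f"twitter:{p['author']}"}
        for p in posts
        if is_credible(p["author"])
    ]


flags = Counter()  # user feedback: result source -> number of "inaccurate" reports


def report_inaccurate(result: dict) -> None:
    """Record one user report against the source of a result."""
    flags[result["source"]] += 1


posts = [
    {"author": "analyst_a", "text": "Governance vote passes"},
    {"author": "spam_bot", "text": "Free tokens, click here"},
]
results = search(posts)
report_inaccurate(results[0])
```

Accumulated reports could later feed back into the credibility filter, which is one way the co-creation loop described above might close.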
Decentralization Empowering AI
TechFlow: What is your view on AI’s potential in the Web3 era, particularly regarding self-learning and data sharing? Also, which core features of blockchain could impact AI’s future development?
Yu:
OpenAI and blockchain aren’t directly related. OpenAI is a major product of the AI field: its models are trained on vast datasets and documents, with human labeling done by hired workers, all entirely centralized operations. But Web3 potentially opens a new paradigm, one that’s truly disruptive. Right now, some people haven’t fully grasped Web3’s disruptive core.
Potential of Artificial Intelligence
Recently, AI pioneer Geoff Hinton stated: “humanity is just a 'passing phase' in the evolution of intelligence.” Despite its power, ChatGPT is still just a small fraction of what AI can become. Long-term, the AI industry holds enormous potential.
Unlike humans, multiple copies of the same AI model can instantly share newly learned knowledge—a key latent advantage of AI.
AI hasn’t yet fulfilled its mission, but I believe its future is boundless. A critical assumption here is that AI development won’t slow down—there’s strong momentum driving it forward. Even if one country slows AI research, others will continue advancing. Thus, AI’s trajectory will be steady, enduring, and unstoppable.
Attributes of Blockchain
In this process, what is blockchain’s essence? Fairness, trustworthiness, stability, and individual control. I believe these are blockchain’s most important attributes, because any centralized organization could pose great risks. This explains why Musk is frustrated that OpenAI has become “Closed AI.”
To some extent, I understand his concern. But within this framework, if we can impose certain constraints—say, from data owners or other angles—we could establish a robust negative feedback mechanism for the entire system.
This thinking may be more philosophical and abstract, but I believe there are testable directions. Web3 is becoming increasingly important in the wake of AI’s rise—that’s a recent realization for me.
Time Will Reveal True Decentralization
In Web3, I believe there aren’t many centralized elements—it’s fundamentally decentralized. Negative examples in the industry, such as last year’s FTX collapse, media reports, or USDC depegging, all stem from centralization.
But truly fully decentralized systems—like Bitcoin and Ethereum—operate very stably, governed by strong service principles, which is crucial. From these foundations, fair community co-creation and similar innovations naturally emerge.
Relationship Between Traditional Media and AI Search Engines
TechFlow: How will AI search engines impact the media industry? Do you think AI can replace media, enabling everyone to produce high-quality content?
Yu:
Search engines and media have a fundamental upstream-downstream relationship: media outlets are among the information sources a search engine partners with. This is the essential dynamic.
Even before large language models, search engines existed. Platforms like Toutiao already used AI for shallow tasks—generating briefs, summaries, and curation—showing AI was already in use.
But I believe certain things won’t—or are unlikely to be—replaced, such as exclusive content: interviews, investigative reports. These remain media’s unique value.
Privacy Protection and Data Co-Creation for Private Blockchain Data
TechFlow: Can you share your thoughts on privacy protection for on-chain data and behavior? What strategies does Kaito have to address these challenges?
Yu:
I believe this topic is critically important.
On this issue, we are a neutral engine: we index any publicly available information, whether on the internet or blockchain. However, for private or protected data, we currently do not—and will not in the future—collect it, as such information isn’t universally accessible.
In the process of co-creating data, we strongly wish to stand at the intersection of blockchain and AI, working with users to jointly generate new data value. Our goal is to solve problems in the blockchain space, so we’re essentially an AI company. Our team mostly comes from mature tech firms with AI backgrounds, and they’re confident in Web3’s future. Many early members come from the Web3 community, united in building our product.
Regarding the balance between AI and blockchain, I don’t think there’s a fixed equilibrium in our development. We apply AI technologies to solve needs in our passion-driven vertical—whether information indexing, distribution, or other areas—all to serve specific industries. We adopt new technologies to deliver useful services more efficiently, embedding them into respective sectors.
Traditional Business Models vs. Community-Co-Created Economic Models
TechFlow: So what community co-building methods is your team considering? What incentives do you offer users?
Yu:
I think the most straightforward approach is to give economic value to all user-contributed data, operating within regulatory and compliance boundaries. On our platform, every user’s search, browsing, and behavior helps optimize our models, enhancing user experience. Through positive incentives, we encourage everyone to actively participate in co-creation. That’s our goal.
Currently, Kaito has two business models. One is the institutional version, using a traditional subscription model to offer paid services. The other is the public version, completely free but possibly offering premium features, similar to ChatGPT. Additionally, we provide data API services, supporting other decentralized protocols in the ecosystem—another revenue stream.
TechFlow: Are you considering other incentive mechanisms or using token payments to attract users?
Yu:
I think these are actually two separate questions.
First, do we need a token, and what role would it play in the ecosystem?
I believe tokens have value. That said, a project can function without one: as long as it delivers excellent products and achieves self-sustaining profitability through data or advertising, it can still succeed.
But for us, what’s more exciting is establishing a co-creation concept with the community early on. In that framework, we believe a token is necessary.
Exactly how we implement it, and what our future plans look like, will depend on our development. Whether subscriptions or other payment models can be supported via tokens is another discussion. For us, the benefits of tokens are clear: they’re simpler than other methods, both in efficiency and commercial value.
Also, from a business standpoint, we can bypass third-party intermediaries, avoiding dependence on companies like Stripe. However, we face a challenge: in the entire industry, we haven’t found reliable third-party service providers that make it easy to interface with all compliant platforms, including government and tax authorities.
That might be an issue we currently face.
TechFlow: Then regarding token-based community governance, have you recently come across any interesting economic models?
Yu:
Recently, major changes have occurred in the crypto industry, especially in token governance. Previously, many tokens were purely governance-focused, but now more tokens capture economic value—for example, DYDX. This has raised concerns—people want to achieve a state where communities capture economic value, not just governance rights.
TechFlow Exclusive
TechFlow: Last question: can you share any exclusive content, such as upcoming milestones, development plans, or exciting new features or partnerships we can look forward to?
Yu:
We plan to enable users to use the search engine in entirely new ways—such as analyzing price chart screenshots or interacting with off-chain information sources, achieving multimodal search.
Our vision is to offer everyone a completely different, best-in-class, most convenient way to access all relevant information. We believe the potential is enormous, and we’ll gradually refine and realize this vision. We envision future search engines delivering radically different, transformative experiences. The industry is still in its early stages—we’re exploring how to better combine large language models with search engines to deliver simple yet disruptive user experiences.
Advancements in search engines have already caused revolutionary changes. Over a decade ago, travelers relied on paper guides like Lonely Planet instead of Google Maps. Now, thanks to search engines, accessing information is effortless. Yet, we believe future search engines will bring even more exciting and transformative changes—beyond our current imagination.














