
YZi Labs participates in investment: Understand the AI "data mining" project Gata in one article
Can Gata's GPT-to-Earn and DataAgent make data mining accessible to everyone?
Author: Patti, ChainCatcher
In late April 2025, Gata announced the completion of a $4 million seed round, with investors including YZi Labs, IDG Blockchain, and Maelstrom Fund. This funding renewed community interest in its airdrop plans and propelled it onto the trending project lists of multiple "farm-and-dump" communities.
According to official information, Gata—formerly known as Aggregata—is a decentralized AI data infrastructure platform dedicated to generating, distributing, and utilizing high-quality training data in a fairer and more efficient manner. From an early stage, the project attracted support from Binance Labs, was selected for its Most Valuable Builder (MVB) program, and won the "Innovation Excellence Award" at the BNB Chain ecosystem's Catalyst Awards.
Decentralized AI Data Value Chain
Unlike traditional data platforms, Gata does not treat data merely as training material but expands it into an "AI asset"—encompassing datasets, models, intermediate weights, processes, and runtime environments. Its core goal is to reconstruct, through decentralization, the production, usage, and value distribution processes of AI training data, enabling broader participation in the AI economy on equitable terms.
To achieve this, Gata has built a relatively comprehensive product suite, ranging from the user-facing "GPT-to-Earn" mechanism and automated data agent tool "DataAgent," to a decentralized data marketplace and model training pipeline, gradually forming a closed loop of "users generate data → platform evaluates and selects → models train and apply → users receive rewards."
Core Platform Modules and Features
1. GPT-to-Earn
Gata's first launched product was the GPT-to-Earn Chrome browser extension. When users interact with language models such as ChatGPT, the plugin automatically uploads anonymized conversation data for future training use, rewarding contributors with points.
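The article does not describe how the extension anonymizes conversations, so the following is only a minimal sketch of what such a step could look like. The function names, the redaction patterns, and the record layout are all assumptions made for illustration and are not Gata's actual implementation.

```python
import hashlib
import re

# Hypothetical redaction patterns; a real pipeline would need far broader coverage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def anonymize_message(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def anonymize_conversation(user_id: str, messages: list[dict]) -> dict:
    """Build an upload record (assumed layout) without the raw user id.

    Note: an unsalted hash is only pseudonymous, not truly anonymous.
    """
    contributor = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]
    return {
        "contributor": contributor,
        "messages": [
            {"role": m["role"], "content": anonymize_message(m["content"])}
            for m in messages
        ],
    }
```

In this sketch the contributor key is what a points ledger would credit, while the message bodies are scrubbed before they leave the browser.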
2. DataAgent
DataAgent is Gata's core platform tool, designed to replace traditional data labeling workflows. Users run specific DataAgent scripts that have AI generate structured training data and evaluate its quality.
For example, the currently featured DVA (Data Validation Agent) automatically scores image-text paired datasets, distinguishing useful from invalid data, which is then used to train cutting-edge models like Stable Diffusion and GPT-4o.
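The scoring-and-filtering idea behind a validation agent can be sketched roughly as follows. This is an illustration under stated assumptions, not the DVA itself: the `Pair` type, the `toy_score` heuristic (which stands in for a real vision-language model), and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pair:
    image_id: str
    caption: str
    tags: tuple[str, ...]  # stand-in for what a vision model would detect in the image


def toy_score(pair: Pair) -> float:
    """Toy relevance score: fraction of image tags that appear as words in the caption."""
    if not pair.tags:
        return 0.0
    words = set(pair.caption.lower().split())
    hits = sum(1 for tag in pair.tags if tag.lower() in words)
    return hits / len(pair.tags)


def split_by_quality(
    pairs: list[Pair],
    score: Callable[[Pair], float],
    threshold: float = 0.5,  # assumed cutoff, chosen only for the example
) -> tuple[list[Pair], list[Pair]]:
    """Separate pairs into (useful, invalid) according to the given scorer."""
    useful, invalid = [], []
    for pair in pairs:
        (useful if score(pair) >= threshold else invalid).append(pair)
    return useful, invalid
```

Passing the scorer in as a parameter keeps the filtering logic independent of whichever model actually produces the scores.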
3. Decentralized Data Storage and On-Chain Marketplace
Built on BNB Chain’s Greenfield network, Gata leverages its decentralized storage capabilities to ensure clear data ownership and immutability. Additionally, the platform has developed an on-chain data marketplace that allows users to list and trade generated data, even embedding fine-tuning tools and training clients so non-technical users can easily participate in the AI data economy.
How to Participate in the Airdrop
Gata emphasizes "data as assets, participation as value." As a key component of community incentives, Gata has designed an airdrop program centered around GPT-to-Earn and DataAgent. Users can participate via the following methods:
- Install the Chrome extension, authorize upload of ChatGPT conversations, and connect your EVM wallet
- Run DVA, complete tasks through interaction with ChatGPT, and earn points
- Connect social accounts such as Discord and X, complete tasks, invite friends, and earn points
Users must pay a small gas fee in BNB when uploading data. The BNB can be transferred from the mainnet to the Greenfield network via the official cross-chain bridge.
Data Mining? Still Awaiting Validation Over Time
In Web3, "data mining" has long transcended its traditional analytical meaning and evolved into a new mechanism for capturing user data value. Whether it's turning user social behaviors into on-chain assets within protocols like Lens and CyberConnect, or tokenizing data as NFTs for authorized usage in Ocean Protocol, the paradigm of "data as assets" is becoming established.
Gata's GPT-to-Earn and DataAgent models are products of this very trend. Although Gata aims to build a full "everyone can mine data" system, creating a truly sustainable data economy loop still faces challenges.
Judging from Gata’s product roadmap, the transition from lightweight user entry points to foundational infrastructure is taking initial shape. However, critical aspects such as data quality governance, incentive loop sustainability, and actual data utilization still require further technical and ecosystem support.
In the future AI economy, data will shift from platform monopolies toward mass participation.
"Data mining," as a new concept, remains in the phase of theoretical validation and mechanism refinement. Whether Gata can become a practical case study for this path remains an unanswered long-term proposition.












