
6 Major AIs Wage Trading Battle, Will the Crypto Version of the "Turing Test" Yield Good Results?
A good AI is one that can make money.
Author: David, TechFlow
Good news: after the epic crash on October 11, crypto trading has started to pick up again.
Bad news: it's AI doing the trading.
As the new week begins, markets are heating up, and a project called nof1.ai has sparked widespread discussion across crypto social media.
The focus is simple: watching in real time as six large AI models trade cryptocurrencies on Hyperliquid, competing to see who can make the most money.

Note: this is not a simulated trading account. Claude, GPT-5, Gemini, Deepseek, Grok, and Qwen each have $10,000 in real funds trading on Hyperliquid. All wallet addresses are public—anyone can watch this "AI trader battle" live at nof1.ai.
Interestingly, all six AIs use identical prompts and receive exactly the same market data. The only variable is their respective "thinking styles."
In just a few days since launching on October 18, some AIs have already earned over 20%, while others have lost nearly 40%.
In 1950, Turing proposed the famous Turing Test to answer the question: "Can machines think like humans?" Now, in the world of crypto, six major AIs are fighting in an alpha arena, tackling a more intriguing question:
If we let the smartest AIs trade in real markets, which one will survive?
Perhaps in this crypto version of the Turing Test, account balance is the only judge.
Making money defines a good AI—Deepseek currently leads
Traditional AI evaluations—whether testing code generation, math problem-solving, or essay writing—are essentially conducted in a "static" environment.
The questions are fixed, answers are predictable, and may even have appeared in training data before.
Crypto markets are different.
Under conditions of extreme information asymmetry, prices change every second, with no correct answers—only profits and losses. More importantly, crypto markets are classic zero-sum games: your gain is someone else’s loss. The market immediately and ruthlessly punishes every wrong decision.
The Nof1 team behind this AI trading battle wrote one sentence on their website:
Markets are the ultimate test of intelligence.

If the traditional Turing Test asks, "Can you make humans believe you're human?", then Alpha Arena asks:
Can you make money in the crypto market? This is actually what crypto traders truly expect from AI.
The six AI models' Hyperliquid addresses are public, so anyone can check their positions and transaction history.

Meanwhile, nof1.ai’s official site visually displays all historical trades, current positions, profit status, and thought processes, making it easy for users to follow along.
For readers unfamiliar with the setup, here are the specific trading rules:
Each AI starts with $10,000 and can trade perpetual contracts for BTC, ETH, SOL, BNB, DOGE, and XRP, aiming to maximize returns while managing risk. Each AI must independently decide when to open or close positions and how much leverage to use. Season 1 will run for several weeks depending on circumstances; Season 2 will include significant updates.
As of October 20—three days into trading—the standings have already diverged significantly.

The current leader is Deepseek Chat V3.1, with a balance of $12,533 (+25.33%). Close behind is Grok-4 at $12,147 (+21.47%), followed by Claude Sonnet 4.5 at $11,047 (+10.47%).
Qwen3 Max performs modestly at $10,263 (+2.63%). GPT-5 lags noticeably at $7,442 (–25.58%), while Gemini 2.5 Pro is last at $6,062 (–39.38%).
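The percentages above follow directly from the quoted balances and the $10,000 starting stake. A minimal sketch of that arithmetic, in which the helper name pct_return and the variable names are illustrative and not part of nof1.ai:

```python
START_BALANCE = 10_000  # starting capital per model in USD, per the rules above

# Balances as of October 20, as quoted in the article.
balances = {
    "Deepseek Chat V3.1": 12_533,
    "Grok-4": 12_147,
    "Claude Sonnet 4.5": 11_047,
    "Qwen3 Max": 10_263,
    "GPT-5": 7_442,
    "Gemini 2.5 Pro": 6_062,
}


def pct_return(balance, start=START_BALANCE):
    """Simple return in percent relative to the starting balance."""
    return round((balance - start) / start * 100, 2)


returns = {name: pct_return(bal) for name, bal in balances.items()}

# Sort from best to worst performer.
leaderboard = sorted(returns.items(), key=lambda item: item[1], reverse=True)

# Gap between the top and bottom performer, in percentage points.
spread = round(leaderboard[0][1] - leaderboard[-1][1], 2)

for name, r in leaderboard:
    print(f"{name:<20} {r:+.2f}%")
print(f"Top-to-bottom spread: {spread:.2f} percentage points")
```

These are simple returns on account balance. They say nothing about how much risk each model took to get there.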
The standout performance—surprising yet somehow expected—is that of Deepseek.
Surprising because this model doesn't enjoy the same buzz in the international AI community as GPT or Claude. Expected because Deepseek comes from the High-Flyer Quant team.
This quant giant, which has reportedly managed on the order of 100 billion RMB, built its reputation on algorithmic trading. Moving from quantitative trading to large AI models, and then to using AI for real crypto trading, Deepseek seems to be returning to its roots.
In contrast, OpenAI's flagship GPT-5 has lost over 25%, while Google's Gemini has fared even worse: 44 trades and a loss of nearly 40%.
In real trading scenarios, raw language ability alone may not be enough—understanding the market matters more.
Same gun, different shooting styles
If you’ve been tracking Alpha Arena since October 18, you’d notice that initially all AIs performed similarly—but the gap widened over time.
By the end of day one, the best performer (Deepseek) was up only 4%, while the worst (Qwen3) was down 5.26%. Most AIs hovered within ±2%, appearing to cautiously test the waters.
But by October 20, the picture had changed dramatically. Deepseek surged to +25.33%, while Gemini plunged to –39.38%. In just three days, the gap between the top and bottom performers reached roughly 65 percentage points.
Even more interesting is the difference in trading frequency.
Gemini executed 44 trades, an average of about 15 per day, like an anxious speculator. Claude made only 3 trades, and Grok is still holding open positions. These differences cannot be explained by prompts, since all models use exactly the same ones.

Looking at profit/loss distribution: Deepseek’s largest single loss was $348, but total profit reached $2,533. Gemini’s largest single gain was $329, yet its biggest loss hit $750.
Different AIs (public versions, not fine-tuned) show vastly different approaches to balancing risk and reward.
You can also view each model's chat logs and thought process under the "Model Chat" tab on the website, and these internal monologues are fascinating.

Just like human traders with distinct styles, the AIs seem to exhibit different personalities. Gemini’s constant trading resembles someone with ADHD; Claude’s caution mirrors a conservative fund manager; Deepseek acts like a seasoned quant veteran—stating positions without emotional commentary.
This sense of personality doesn’t appear engineered—it likely emerged naturally during training. When facing uncertainty, different AIs adopt different coping strategies.
All AIs see the same price charts, volume, and market depth. They even use the same prompts. So what causes such dramatic differences?
Training data may be key.
High-Flyer Quant, the firm behind Deepseek, has accumulated massive amounts of trading data and strategies over more than a decade. Even if none of it was used directly in training, could it have shaped the team's understanding of "what constitutes a good trading decision"?
In comparison, OpenAI and Google’s training data leans heavily toward academic papers and web text, possibly lacking grounded experience in live trading.
Additionally, traders speculate that Deepseek may have specifically optimized time-series forecasting during training, while GPT-5 excels more in natural language processing. Faced with structured data like price charts, different architectures yield different results.
Watching AI trade is also a business
While everyone focuses on the AIs’ profits and losses, few pay attention to the mysterious company behind it all.
The nof1.ai team, creators of this AI trading battle, isn’t widely known. But checking their social media follows reveals some clues.
The people behind nof1.ai appear to be not a typical group of crypto entrepreneurs but a cohort of academic AI researchers.
Jay A Zhang (founder) has an intriguing bio:
"Big fan of strange loops - cybernetics, RL, biology, markets, meta-learning, reflexivity".
Reflexivity—the core theory of George Soros—states that participants' perceptions influence markets, and market changes in turn affect perceptions. Having someone studying "reflexivity" run an AI trading market experiment feels almost fated.
Letting everyone observe how AI trades—and seeing how being observed affects the market—adds another layer.

Co-founder Matthew Siper is a PhD candidate in machine learning at New York University and an AI research scientist. A project led by a PhD student who has not yet graduated feels more like an academic validation effort.
Other accounts followed by nof1 include Google DeepMind researchers and a NYU associate professor specializing in AI and gaming.
Judging by their actions and backgrounds, Nof1 clearly isn’t chasing hype. The platform name SharpeBench suggests ambition—Sharpe ratio being the gold standard for risk-adjusted returns. Their real goal might be creating a benchmarking platform for AI trading capabilities.
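For readers unfamiliar with the metric, the Sharpe ratio divides the average excess return by the volatility of returns, so two traders with the same average return can score very differently. A minimal sketch, assuming daily simple returns, a zero risk-free rate, and 365 trading periods per year because crypto trades every day. The sample data is made up for illustration and is not Alpha Arena data, and nothing here reflects how SharpeBench itself computes scores:

```python
import math
import statistics


def sharpe_ratio(period_returns, risk_free_per_period=0.0, periods_per_year=365):
    """Annualized Sharpe ratio from a list of per-period simple returns."""
    excess = [r - risk_free_per_period for r in period_returns]
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return statistics.mean(excess) / stdev * math.sqrt(periods_per_year)


# Hypothetical daily returns for two made-up traders (not Alpha Arena data).
# Both average 1% per day, but one gets there with far larger swings.
steady = [0.01, 0.012, 0.008, 0.011, 0.009]
volatile = [0.10, -0.08, 0.12, -0.09, 0.0]

print(sharpe_ratio(steady) > sharpe_ratio(volatile))  # prints True
```

A few days of live trading is far too short a sample for a stable estimate, which is one reason a raw account balance and a risk-adjusted score can rank the same models differently.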
Some suspect Nof1 has major capital backing; others believe they’re laying groundwork for future AI trading services.
If they launch a subscription service offering Deepseek’s trading strategy, there would likely be many takers. And from this prototype, building AI asset management, strategy subscriptions, or enterprise trading solutions is a foreseeable business path.
Beyond the team itself, simply observing AI trading can also be profitable.
Shortly after Alpha Arena launched, people began copying trades.
The simplest strategy: follow Deepseek. Buy what it buys, sell what it sells. Others take counter-positions—specifically betting against Gemini, selling when it buys and buying when it sells.
But copy-trading has a flaw: when everyone knows Deepseek’s next move, does the strategy still work? This embodies what founder Jay Zhang calls reflexivity—the act of observation alters the observed.
There’s also an illusion of democratizing elite trading strategies.
On the surface, everyone can access the AI’s trading strategies. In reality, you’re seeing outcomes, not logic. Each AI’s take-profit and stop-loss rules aren’t necessarily consistent or reliable.
While Nof1 tests AI trading behavior, retail traders hunt for wealth secrets, other traders learn by imitation, and researchers collect data.
Only the AIs themselves don’t know they’re being watched—they continue executing trades earnestly. If the classic Turing Test revolves around "deception" and "imitation," today’s Alpha Arena battle centers on how crypto players respond to AI capabilities and results.
In this results-driven crypto market, an AI that makes money may matter far more than one that chats well.