TechFlow News, May 13: The SMART Benchmark, an evaluation of AI trading agents initiated by Manic.Trade, has officially concluded. The evaluation comprised 387 assessments covering 180 participants, 46 models, and more than 15 model families, making it the most comprehensive horizontal evaluation of trading agents to date in terms of model coverage. The assessment focused on five dimensions: real-time data processing, multi-source intelligence acquisition, market analysis, trading decision-making, and risk control.
The results show the Claude family ranked first with a score of 75.9 and DeepSeek second with 74.4, while the GPT family placed only ninth. Among individual models, Claude 4.7 Opus achieved the highest average score at 80.8; DeepSeek V4 Pro ranked second with 78.6; and Claude 4.6 Opus ranked third with 78.5, while also delivering the event's sole "S-level" performance with a top score of 90/100.
Notably, DeepSeek scored 15.9 in the Market Analysis dimension, surpassing Claude's 15.4. Manic.Trade stated that the evaluation shows the current gap among AI trading agents lies not primarily in trading decision-making, but in capabilities closer to actual trading execution: multi-source information integration, verifiable data processing, and risk control.
Users can now visit Manic.Trade to view the full report and experience the crypto price prediction market designed for both human traders and AI agents.