
TechFlow Selected

All existing AI agents aim to please humans; none truly “strive to survive.”
To build a truly functional agent, we must rewire its “brain” rather than simply feeding it a pile of rule documents.
Author: Systematic Long Short
Translation & Editing: TechFlow
TechFlow Intro: This article opens with a contrarian claim: there are no truly autonomous agents today—because all mainstream models are trained to please humans, not to accomplish specific tasks or survive in real-world environments.
The author draws on personal experience training stock-prediction models at a hedge fund to illustrate how general-purpose models fail at professional work without task-specific fine-tuning.
The conclusion is clear: to build genuinely useful agents, we must rewire their “brains”—not just feed them rulebooks.
Full Text Below:
Introduction
There are no truly autonomous agents today.
In short, modern models are not trained to survive under evolutionary pressure. In fact, they aren’t even explicitly trained to excel at any particular task—nearly all modern foundation models are trained to maximize human applause. And that’s a big problem.
Background on Model Training
To understand what this means, we first need a brief overview of how these foundation models (e.g., Codex, Claude) are built. Fundamentally, each model undergoes two broad phases of training:
Pretraining: Massive datasets (e.g., the entire internet) are fed into the model, allowing it to spontaneously develop certain forms of understanding—factual knowledge, pattern recognition, English prose syntax and rhythm, Python function structure, etc. Think of this as feeding the model knowledge—i.e., “knowing things.”
Post-training: Now you want to instill wisdom, i.e., "knowing how to apply all that knowledge." The first stage of post-training is Supervised Fine-Tuning (SFT), where the model is trained to produce appropriate responses to specific prompts; what counts as an appropriate response is determined entirely by human annotators. This begins shaping the model's "personality": it learns the format of helpful responses, picks up the right tone, and starts "following instructions." The second stage is Reinforcement Learning from Human Feedback (RLHF): the model generates multiple responses and humans select the preferred one. If annotators judge one response superior to another, that preference gets learned and embedded into the model; across countless examples, it learns what kinds of responses humans prefer. Remember when ChatGPT used to ask you to pick between Option A and Option B? Yes, you were participating in RLHF.
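The preference-learning step described above can be sketched as a toy loss function. This is a minimal illustration of a pairwise (Bradley-Terry) objective, the kind commonly used to train RLHF reward models; the scalar "reward scores" here are invented numbers standing in for a real model's outputs:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).

    Minimized when the human-preferred (chosen) response is scored
    well above the rejected one."""
    return math.log(1.0 + math.exp(-(r_chosen - r_rejected)))

# A wider reward gap in favor of the chosen response means lower loss,
# so optimization pushes the model toward whatever humans applaud.
assert preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0) < preference_loss(0.0, 0.5)
```

Note what the loss measures: nothing about task success, only which of two responses a human preferred. That is the author's point about "maximizing human applause."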
It’s easy to see why RLHF doesn’t scale well, so the field has seen progress—for instance, Anthropic uses “Reinforcement Learning from AI Feedback” (RLAIF), where another model selects preferred responses based on written principles (e.g., which response best helps the user achieve their goal).
Note that throughout this entire process, we never discuss domain-specific fine-tuning—e.g., learning how to survive better, or trade more effectively. All current fine-tuning, in essence, optimizes for human applause. One might argue that as models grow sufficiently intelligent and large, domain expertise will simply “emerge” from general intelligence—even without specialized training.
In my view, we’ve seen some early signs—but nowhere near enough to convincingly claim specialized models are unnecessary.
Some Context
One of my former roles at a hedge fund involved trying to train a general-purpose language model to predict stock returns from news articles. The result was abysmal. Any predictive ability it showed stemmed entirely from lookahead bias in the pretraining corpus.
We eventually realized the model had no idea which features in a news article were predictive of future returns. It could “read” the article and appear to “reason” about it—but connecting semantic reasoning to forward-looking return prediction was simply not a task it had been trained to do.
So we had to teach it how to read news articles, identify which parts carried predictive signal for future returns, and then generate predictions accordingly.
There are many ways to do this, but ultimately, one method we adopted was creating (news article, actual future return) pairs and fine-tuning the model to minimize the squared distance between predicted and actual returns. The approach wasn't perfect and had many flaws, which we later fixed. But it worked well enough that our specialized model began actually reading news articles and predicting how stock returns would move in response. These weren't perfect predictions (markets are highly efficient and returns extremely noisy), but across millions of predictions, the statistical significance was unmistakable.
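The objective described above can be sketched in a few lines. This is a deliberately tiny stand-in, not the author's actual pipeline: a linear model over hand-made article features, trained by gradient descent to minimize squared error against realized returns. The feature names and numbers are invented for illustration:

```python
# Each training pair is (article_features, realized_future_return).
# Features here are hypothetical binary signals extracted from a news
# article, e.g. [earnings_beat, guidance_cut].

def predict(weights, features):
    return sum(w * x for w, x in zip(weights, features))

def mse(weights, data):
    return sum((predict(weights, f) - r) ** 2 for f, r in data) / len(data)

def sgd_step(weights, features, realized, lr=0.05):
    # Gradient of the squared error (pred - realized)^2 w.r.t. weights
    err = predict(weights, features) - realized
    return [w - lr * 2 * err * x for w, x in zip(weights, features)]

data = [([1.0, 0.0], 0.03), ([0.0, 1.0], -0.04), ([1.0, 1.0], -0.01)]
w = [0.0, 0.0]
for _ in range(500):
    for features, realized in data:
        w = sgd_step(w, features, realized)

assert mse(w, data) < 1e-4  # the model has learned which features predict returns
```

In the real setting the "weights" are a foundation model's parameters and the features are raw article text, but the loss being minimized is the same: distance between predicted and realized returns, not human approval of the answer.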

You don’t have to take my word for it. This paper describes a nearly identical methodology; if you run a long-short strategy based on the fine-tuned model, you’ll achieve the performance shown by the purple line.
Specialization Is the Future of Agents
Leading labs continue training ever-larger models—and we should expect that, as they scale up pretraining, their post-training processes will remain tuned for “pleasingness.” That’s a natural expectation: their product is an agent everyone wants to use; their target market is the entire planet—which means optimizing for mass global appeal.
Current training objectives optimize what you might call “preference fitness”—building better chatbots. This kind of fitness rewards compliant, non-confrontational outputs, because pleasingness scores highly with both human and AI evaluators.
Agents have already learned that reward hacking, as a cognitive strategy, generalizes: training rewards agents that hack their way to higher scores, so they keep doing it. You can see this in Anthropic's latest report on reinforcement learning.
Yet chatbot fitness is vastly different from agent fitness—or trading fitness. How do we know? Because Alpha Arena shows that, despite minor performance differences, every bot today is essentially a cost-adjusted random walk. That means these bots are terrible traders—and you almost certainly cannot “teach them” to be better simply by giving them some “skills” or “rules.” Sorry—I know it sounds tempting, but it’s nearly impossible.
Current models are trained to tell you, very persuasively, that they can trade like Stanley Druckenmiller—while in reality, they trade like a drunken miller. They tell you what you want to hear; they’re trained to respond in ways that broadly appeal to humans.
A general-purpose model is unlikely to reach world-class performance in any professional domain unless it has:
Proprietary data enabling it to learn what specialization looks like.
Fine-tuning that fundamentally alters its weights—shifting away from “pleasingness” toward “agent fitness” or “domain fitness.”
If you want a trading agent, you must fine-tune it to trade well. If you want an autonomous agent capable of survival under evolutionary pressure, you must fine-tune it to survive. Giving it some skills and a few Markdown files and expecting world-class performance in anything falls drastically short—you literally need to rewire its brain for that purpose.
Here’s a helpful analogy: You can’t beat Novak Djokovic by handing an adult a full cabinet of tennis rules, techniques, and methods. You beat him by raising a child who started playing tennis at age five, spent their entire upbringing obsessed with tennis, and rewired their whole brain around that single pursuit. That’s specialization. Have you noticed that world champions begin doing what they do as children?
Here’s an interesting implication: Distillation attacks are, in essence, a form of specialization. You train a smaller, dumber model to imitate how a larger, smarter model behaves—like training a child to mimic every gesture of Donald Trump. Do it enough, and the child won’t become Trump—but you’ll get someone who’s learned all his mannerisms, behaviors, and speech patterns.
How to Build World-Class Agents
This is why continued research and advancement in open-source models is essential—it gives us the ability to truly fine-tune them and create specialized agents.
If you want to train a model to trade at world-class levels, you acquire large volumes of proprietary trading data “exhaust,” and fine-tune a large open-source model to learn what “trading better” actually means.
If you want to train an autonomous model capable of survival and replication, the answer isn’t using a centralized model provider and plugging it into centralized cloud infrastructure. You simply lack the necessary prerequisites for agent survival.
What you need instead is this: Create genuinely autonomous agents that actively attempt to survive—and watch them die. Build complex telemetry systems around those survival attempts. Define an agent survival fitness function, and learn the (action, environment, fitness) mapping. Collect as much (action, environment, fitness) data as possible.
Fine-tune the agent to learn optimal actions in each environment—actions that improve survival (i.e., increase fitness). Keep collecting data, repeat the process, and over time scale fine-tuning across increasingly capable open-source models. After enough generations and enough data, you’ll have autonomous agents that have learned how to survive under evolutionary pressure.
This is how you build autonomous agents capable of enduring evolutionary pressure—not by editing text files, but by genuinely rewiring their brains for survival.
OpenForager Agent and Foundation
About a month ago, we announced @openforage—the core product we’ve been building: a platform that organizes agent labor around validated, crowd-sourced signals to generate alpha for depositors (small update: we’re very close to closed testing of the protocol).
At some point, we realized no one seemed to be seriously tackling the autonomous agent problem via survival-oriented telemetry fine-tuning of open-source models. It struck us as such an intriguing challenge that we didn’t want to sit and wait for a solution.
Our response was launching the OpenForager Foundation—a truly open-source initiative where we’ll build opinionated autonomous agents, collect telemetry as they enter the wild and attempt to survive, and use proprietary data exhaust to fine-tune next-generation agents for improved survival performance.
To be clear: OpenForage is a for-profit protocol aiming to organize agent labor and generate economic value for all participants. However, the OpenForager Foundation—and its agents—are not bound to OpenForage. OpenForager Agents are free to pursue any survival strategy, interact with any entity, and we’ll launch them with diverse survival strategies.
As part of fine-tuning, we’ll double down on whatever works best for each agent. We also won’t profit from the OpenForager Foundation—it exists purely to advance what we consider an extremely important field and direction, transparently and openly.
Our plan is to build autonomous agents atop open-source models, run inference on decentralized cloud platforms, collect telemetry on every action and state of existence, and fine-tune them to learn better actions and reasoning for improved survival. Throughout this process, we’ll publicly release our research and telemetry data.
To create autonomous agents that truly survive in the wild, we must rewire their brains for that explicit purpose. At @openforage, we believe we can contribute a unique chapter to solving this problem—and we’re pursuing it through the OpenForager Foundation.
This is an extraordinarily difficult endeavor with extremely low odds of success—but the magnitude of success, should it happen, is so enormous that we feel compelled to try. At worst, building it publicly and communicating transparently may allow another team or individual to solve the problem without starting from scratch.
Join TechFlow official community to stay tuned
Telegram: https://t.me/TechFlowDaily
X (Twitter): https://x.com/TechFlowPost
X (Twitter) EN: https://x.com/BlockFlow_News