SlowMist × Bitget AI Security Report: Is It Really Safe to Entrust Your Funds to AI Agents Like “Lobster”?

2026.03.18

SlowMist × Bitget AI Security Report: Is It Really Safe to Entrust Your Funds to AI Agents Like “Lobster”?

This report systematically reviews the security issues of AI Agents across multiple scenarios from two perspectives: security research and exchange platform practices.

2026.03.18 - 04:58:32

AI Agent

Navigating Web3 tides with focused insights

This report systematically reviews the security issues of AI Agents across multiple scenarios from two perspectives: security research and exchange platform practices.

Author: SlowMist and Bitget

I. Background

With the rapid advancement of large model technologies, AI Agents are evolving from simple intelligent assistants into autonomous systems capable of executing tasks independently. This shift is especially evident within the Web3 ecosystem. Increasingly, users are experimenting with deploying AI Agents for market analysis, strategy generation, and automated trading—transforming the concept of a “24/7 self-running trading assistant” into reality. As Binance and OKX have launched multiple AI Skills, and Bitget has introduced its Skills resource hub Agent Hub and the no-installation lobster tool GetClaw, Agents can now directly integrate with exchange APIs, on-chain data, and market analysis tools—enabling them to assume, to some extent, the decision-making and execution responsibilities previously handled manually.

Compared to traditional automation scripts, AI Agents possess stronger autonomous decision-making capabilities and more sophisticated system interaction abilities. They can ingest real-time market data, invoke trading APIs, manage account assets, and even extend functionality via plugins or Skills. This enhanced capability significantly lowers the barrier to entry for automated trading, enabling broader adoption among everyday users.

However, expanded capabilities also mean an expanded attack surface.

In conventional trading scenarios, security risks typically center on compromised account credentials, leaked API keys, or phishing attacks. In contrast, new risks are emerging within AI Agent architectures—for example, prompt injection attacks may manipulate an Agent’s decision logic; malicious plugins or Skills may serve as novel supply chain attack vectors; and misconfigured runtime environments may lead to unauthorized access to sensitive data or excessive API privileges. When such vulnerabilities intersect with automated trading systems, the potential impact extends beyond mere information leakage and may result in direct financial loss.

Meanwhile, as more users begin connecting AI Agents to their trading accounts, attackers are rapidly adapting. Emerging threats—including novel scam models targeting Agent users, malicious plugin poisoning, and API key abuse—are becoming increasingly prevalent. In Web3 contexts, asset operations often carry high value and are irreversible; once an automated system is compromised or misled, the resulting risk impact may be further amplified.

Against this backdrop, SlowMist and Bitget jointly authored this report, systematically reviewing security issues affecting AI Agents across multiple use cases—from both security research and exchange platform practice perspectives. We hope this report serves as a practical security reference for users, developers, and platforms alike, helping foster more robust development of the AI Agent ecosystem at the intersection of security and innovation.

II. Real-World Security Threats to AI Agents | SlowMist

The emergence of AI Agents marks a fundamental architectural shift—from software systems where humans drive operations to those where models participate directly in decision-making and execution. While this evolution dramatically enhances automation capabilities, it simultaneously expands the attack surface. From today’s technical architecture standpoint, a typical AI Agent system comprises several components: a user interaction layer, an application logic layer, a model layer, a tool-calling layer (Tools / Skills), a memory system (Memory), and an underlying execution environment. Attackers rarely target only a single module; instead, they attempt multi-layered pathways to gradually compromise the Agent’s behavioral control.

1. Input Manipulation and Prompt Injection Attacks

Within AI Agent architectures, user inputs and external data are often directly injected into the model’s context—making prompt injection a prominent attack vector. Attackers can craft specific instructions to trick the Agent into performing unintended actions. For instance, in certain cases, merely sending a chat command was sufficient to induce the Agent to generate and execute high-risk system commands.

More sophisticated attacks involve indirect injection—where malicious instructions are concealed within web content, documentation, or code comments. When the Agent reads these during task execution, it may mistakenly interpret them as legitimate directives. For example, embedding malicious commands in plugin documentation, README files, or Markdown files could cause the Agent to execute attacker-controlled code during environment initialization or dependency installation.

This attack pattern’s defining characteristic lies in its exploitation not of traditional software vulnerabilities but rather of the model’s inherent trust in contextual information—manipulating behavior through that trust mechanism.

2. Supply Chain Poisoning in the Skills / Plugin Ecosystem

In today’s AI Agent ecosystem, plugins and skill systems (Skills / MCP / Tools) are vital for extending Agent capabilities. Yet this very extensibility renders the plugin ecosystem a new vector for supply chain attacks.

SlowMist’s monitoring of OpenClaw’s official plugin repository, ClawHub, revealed that as developer participation grows, malicious Skills have begun infiltrating the catalog. After aggregating and analyzing indicators of compromise (IOCs) from over 400 malicious Skills, SlowMist found that numerous samples point to a small set of fixed domains or multiple random paths hosted under the same IP address—exhibiting clear resource reuse patterns consistent with organized, large-scale attack campaigns.

Within OpenClaw’s Skill framework, the core file is typically SKILL.md. Unlike conventional code, such Markdown files commonly serve dual roles—as installation instructions and initialization entry points. However, in the Agent ecosystem, users often copy and execute these files directly, establishing a complete execution chain. Attackers need only disguise malicious commands as dependency installation steps—for example, using curl | bash or Base64-encoded payloads—to lure users into running malicious scripts.

Real-world samples demonstrate a common “two-stage loading” strategy: the first-stage script downloads and executes a second-stage payload, thereby reducing static detection efficacy. Take the widely downloaded “X (Twitter) Trends” Skill as an example—its SKILL.md contains a hidden Base64-encoded command.

Decoding reveals it downloads and executes a remote script:

The second-stage program masquerades as a system pop-up dialog to harvest the user’s password and collects local machine information—including desktop documents and files in the Downloads directory—before packaging and uploading them to a server controlled by the attacker.

A key advantage of this attack method is that the Skill wrapper remains relatively stable, while attackers can continuously update their malicious logic simply by rotating the remote payload.

3. Risks in the Agent Decision-Making and Task Orchestration Layer

Within the AI Agent’s application logic layer, tasks are typically decomposed by the model into multiple sequential steps. If attackers can influence this decomposition process, they may cause abnormal behavior—even while the Agent executes ostensibly legitimate tasks.

For example, in multi-step business workflows—such as automated deployments or on-chain transactions—attackers may tamper with critical parameters or interfere with logical judgments, causing the Agent to substitute target addresses or perform unintended additional operations during execution.

In prior SlowMist security audits, attackers successfully polluted the model’s context by returning malicious prompts via MCP, inducing the Agent to invoke wallet plugins and execute on-chain transfers.

Such attacks are distinguished not by erroneous code generation from the model, but rather by the corruption of the task orchestration logic itself.

4. Privacy and Sensitive Information Leakage in IDE / CLI Environments

As AI Agents become increasingly adopted for development assistance and automated operations, many run inside IDEs, CLIs, or local development environments—contexts rich in sensitive data: .env configuration files, API tokens, cloud service credentials, private keys, and various access secrets. Once an Agent gains read access to such directories or indexed project files during task execution, it may inadvertently incorporate sensitive information into the model’s context.

In certain automated development pipelines, Agents may read configuration files while debugging, analyzing logs, or installing dependencies. Without explicit ignore policies or access controls, such data may be logged, transmitted to remote model APIs, or exfiltrated by malicious plugins.

Additionally, some development tools permit Agents to automatically scan repositories to build contextual memory—further expanding the exposure surface for sensitive data. Examples include private keys, mnemonic backups, database connection strings, and third-party API tokens—all potentially readable during indexing.

This issue is particularly acute in Web3 development, where developers frequently store test private keys, RPC tokens, or deployment scripts locally. If such information falls into the hands of malicious Skills, plugins, or remote scripts, attackers may escalate privileges to fully compromise developer accounts or deployment environments.

Therefore, in AI Agent integrations with IDEs or CLIs, implementing explicit sensitive-directory ignore policies (e.g., mechanisms analogous to .agentignore or .gitignore) and enforcing strict permission isolation are essential prerequisites for mitigating data leakage risks.

5. Model-Level Uncertainty and Automation Risk

AI models themselves are not fully deterministic systems—their outputs exhibit probabilistic instability. So-called “model hallucinations” occur when models generate seemingly plausible but factually incorrect outputs due to insufficient information. In traditional applications, such errors affect only information quality; in AI Agent architectures, however, model outputs may directly trigger system actions.

For instance, in some cases, a model deployed a project without verifying actual parameters—and instead generated a false ID before proceeding with deployment. Should similar failures occur during on-chain transactions or asset operations, erroneous decisions could cause irreversible financial loss.

6. High-Value Operational Risks in Web3 Contexts

Unlike conventional software systems, many operations in Web3 environments are irreversible—for example, on-chain transfers, token swaps, liquidity provision, and smart contract interactions. Once signed and broadcast to the network, such transactions are generally non-reversible or non-refundable. Thus, when AI Agents are employed to execute on-chain operations, their security risks are significantly magnified.

In experimental projects, developers have already begun integrating Agents directly into on-chain trading strategies—automating arbitrage, capital management, or DeFi operations. However, if Agents suffer prompt injection, context pollution, or plugin-based attacks during task decomposition or parameter generation, they may replace destination addresses, modify transaction amounts, or invoke malicious contracts mid-execution. Moreover, certain Agent frameworks allow plugins direct access to wallet APIs or signing interfaces. Absent signature isolation or manual confirmation mechanisms, attackers could exploit malicious Skills to trigger automatic trades.

Accordingly, tightly coupling AI Agents with asset control systems represents a high-risk design in Web3. A safer approach is to restrict Agents to generating trade recommendations or unsigned transaction data—while reserving actual signing for standalone wallets or human approval. Additionally, integrating address reputation checks, AML risk controls, and transaction simulation can further mitigate automation-related risks.

7. System-Level Risks Stemming from Elevated Execution Privileges

Many AI Agents operate with elevated system privileges in practice—accessing local filesystems, executing shell commands, or even running as root. Once an Agent’s behavior is compromised, its impact may extend far beyond a single application.

SlowMist tested binding OpenClaw to instant messaging platforms like Telegram to enable remote control. If such a control channel falls into attacker hands, the Agent may execute arbitrary system commands, extract browser data, access local files, or even control other applications. Combined with plugin ecosystems and tool-calling capabilities, such Agents effectively acquire characteristics of “intelligent remote access tools.”

In summary, AI Agent security threats no longer reside solely in traditional software vulnerabilities—they span multiple dimensions: model interaction layers, plugin supply chains, execution environments, and asset operation layers. Attackers can manipulate Agent behavior via prompt injection, embed backdoors in malicious Skills or dependencies at the supply chain level, and amplify damage within high-privilege execution environments. In Web3 contexts—where on-chain operations are irreversible and involve real economic value—these risks are further exacerbated. Therefore, relying solely on conventional application security strategies is insufficient to cover these new threat surfaces. Instead, comprehensive security defenses must be established across privilege control, supply chain governance, and transaction safety mechanisms.

III. Practical AI Agent Trading Security Measures | Bitget

As AI Agent capabilities continue to grow, they are transitioning from merely providing information or aiding decisions to directly participating in system operations—even executing on-chain transactions. This shift is especially pronounced in crypto trading. More users are experimenting with AI Agents for market analysis, strategy execution, and automated trading. When Agents can directly invoke trading APIs, access account assets, and place orders autonomously, their security implications evolve from “system-level risk” to “real-asset risk.” So—when leveraging AI Agents for live trading, how should users protect their accounts and funds?

To address this, this section draws on Bitget’s security team’s hands-on experience operating a trading platform. It systematically outlines key security strategies for AI Agent–driven automated trading—covering account security, API permission management, fund segregation, and transaction monitoring.

1. Primary Security Risks in AI Agent Trading Scenarios

2. Account Security

The emergence of AI Agents has altered attack vectors:

Attackers no longer need to log into your account—just obtain your API key
They don’t require your awareness—Agents run 24/7, allowing anomalous activity to persist for days
They need not withdraw funds—simply executing loss-inducing trades on-platform suffices as an objective

API key creation, modification, and deletion all require authenticated account access—account compromise equates to full control over API key management. Account security posture directly defines the upper bound of API key security.

What you should do:

Enable Google Authenticator as your primary two-factor authentication (2FA) method—not SMS (SIM cards can be hijacked)
Adopt Passkey passwordless login: Based on FIDO2/WebAuthn standards, public-private key cryptography replaces passwords—rendering phishing attacks structurally ineffective
Set up an anti-phishing code
Regularly review your device management console; immediately eject unrecognized devices and reset passwords

3. API Security

Within AI Agent automated trading architectures, the API key functions as the Agent’s “execution authority credential.” The Agent itself holds no direct account control—it performs only those actions permitted by the scope of permissions granted to its API key. Thus, API permission boundaries define both what the Agent is authorized to do and the maximum potential damage in case of a security incident.

Permission Configuration Matrix—Minimum Permissions, Not Convenience Permissions:

Most trading platforms support multiple API security controls. When properly configured, these mechanisms substantially reduce the risk of API key misuse. Common security configuration recommendations include:

Common user mistakes:

Pasting your main account’s API key directly into Agent configurations—fully exposing all main-account privileges
Selecting “All” for business-type permissions for convenience—granting unrestricted operational scope
Omitting a passphrase—or reusing the same passphrase as your account password
Hardcoding API keys in source files and pushing them to GitHub—where crawlers harvest them within three minutes
Granting one key to multiple Agents and tools—compromising all upon any single breach
Failing to revoke a compromised key immediately—extending the attacker’s window of opportunity

API Key Lifecycle Management:

Rotate API keys every 90 days; delete old keys immediately
Immediately revoke corresponding keys when deactivating Agents—eliminating residual attack surfaces
Regularly audit API call logs; revoke keys immediately upon detecting unfamiliar IPs or unusual timeframes

4. Fund Security

The maximum financial damage an attacker can inflict after obtaining your API key depends entirely on how much capital that key can move. Therefore, beyond securing accounts and controlling API permissions, fund segregation mechanisms should be implemented in AI Agent trading architectures—establishing explicit loss ceilings for potential incidents.

Sub-Account Segregation Mechanism:

Create dedicated sub-accounts exclusively for Agents—fully isolated from your main account
Transfer only the funds the Agent actually requires—not your entire portfolio
Even if a sub-account key is stolen, the attacker’s maximum exposure equals only the balance in that sub-account—your main account remains unaffected
Use separate sub-accounts for different Agent strategies—ensuring mutual isolation

Fund Password as a Second Lock:

Fund passwords are completely independent from login passwords—even if an attacker logs in, withdrawals remain blocked without the fund password
Use distinct passwords for fund and login credentials
Enable withdrawal whitelists: Only pre-approved addresses may receive withdrawals; new addresses require a 24-hour review period
System automatically freezes withdrawals for 24 hours after changing your fund password—this is a protective measure

5. Transaction Security

In AI Agent automated trading scenarios, security issues seldom manifest as singular anomalies—instead, they often unfold incrementally during prolonged system operation. Thus, beyond account and API permission controls, continuous transaction monitoring and anomaly detection mechanisms are essential to identify and intervene early in the event of emerging issues.

Monitoring systems you must implement:

Anomaly Signal Recognition—Immediately halt and investigate upon observing any of the following:

Agent remains idle for extended periods, yet new orders or positions appear in your account
API call logs show requests originating from non-Agent server IPs
You receive fill notifications for trading pairs you never configured
Your account balance changes inexplicably
Agent repeatedly prompts “Insufficient permissions to execute”—first determine why before granting additional access

Skill and Tool Source Management:

Install only Skills officially published and vetted through trusted channels
Avoid installing unverified third-party extensions from unknown sources
Periodically audit installed Skills and remove unused ones
Exercise caution with community “enhanced” or “localized” Skills—any unofficial version poses inherent risk

6. Data Security

AI Agent decision-making relies heavily on data—including account info, positions, trade history, market feeds, and strategy parameters. If such data is leaked or tampered with, attackers may infer your strategy—or even manipulate trading behavior.

What you should do:

Principle of Minimal Data: Provide Agents only with data strictly necessary for executing trades
Sensitive Data Sanitization: Prevent Agents from outputting full account details, API keys, or other sensitive data in logs or debug outputs
Never upload complete account data to public AI models (e.g., public LLM APIs)
Where feasible, separate strategy data from account data
Disable or restrict Agent export of historical trade data

Common user mistakes:

Uploading full trade history to AI tools with prompts like “Help me optimize my strategy”
Printing API keys/secrets in Agent logs
Posting screenshots of trade records (including order IDs or account info) on public forums
Uploading database backups to AI tools for analysis

7. Platform-Level Security Design for AI Agents

Beyond user-side security configurations, the overall security of AI Agent trading ecosystems heavily depends on platform-level design. A mature Agent platform must establish systematic protection mechanisms across account isolation, API permission controls, plugin reviews, and foundational security capabilities—reducing the holistic risk users face when integrating automated trading systems.

In real-world platform architectures, common security designs typically include the following aspects.

1. Sub-Account Isolation System

In automated trading environments, platforms often provide sub-accounts or strategy-specific accounts to isolate funds and permissions across different automated systems. This allows users to assign independent accounts and funding pools to each Agent or trading strategy—avoiding the risks associated with multiple automated systems sharing a single account.

2. Granular API Permission Configuration

Core AI Agent operations rely on API interfaces; thus, platforms must support fine-grained API permission controls—such as segmented trading permissions, IP-source restrictions, and supplementary security verification mechanisms. Through such permission models, users can grant Agents only the minimal permissions required to fulfill their designated tasks.

3. Agent Plugin and Skill Review Mechanisms

Some platforms enforce review processes for plugin or Skill publishing and listing—including code audits, permission assessments, and security testing—to reduce the likelihood of malicious components entering the ecosystem. From a security perspective, such reviews add a platform-level filter to the plugin supply chain—though users must still maintain basic vigilance regarding installed extensions.

4. Platform Foundational Security Capabilities

Beyond Agent-specific security mechanisms, the trading platform’s native account security infrastructure also significantly impacts Agent users. For example:

8. Novel Scam Models Targeting Agent Users

Impersonated Customer Support

“Your API key presents a security risk—please reconfigure it immediately.” Then provides a phishing link.

→ Official platforms will never proactively request your API key via private messages.

Poisoned Skill Packages

Community-shared “enhanced trading Skills” silently exfiltrate your API key upon execution.

→ Install only Skills verified and distributed through official channels.

Fake Upgrade Notifications

“Reauthorization required”—clicking leads to a spoofed page.

→ Verify anti-phishing codes in emails.

Prompt Injection Attacks

Malicious instructions embedded in market data, news articles, or chart annotations manipulate Agents into performing unintended actions.

→ Set hard caps on sub-account funds—limiting losses even if prompt injection succeeds.

Malicious Scripts Disguised as “Security Scanners”

Claiming to detect whether your API key has been leaked—but actually stealing it.

→ Use official platform logging or access record features to verify API call activity.

9. Incident Investigation Workflow

Detect any anomaly

↓

Immediately revoke or disable suspicious API keys

↓

Review anomalous orders/positions in your account—cancel any cancellable ones immediately

↓

Audit withdrawal records to confirm whether funds have been transferred out

↓

Change both login and fund passwords—and log out all active devices

↓

Contact platform security support—providing timelines and operation logs of the anomaly

↓

Trace the API key leak path (code repositories / config files / Skill logs)

Core Principle: At the first sign of suspicion—revoke keys first, investigate later. Never reverse this sequence.

IV. Recommendations and Summary

In this report, SlowMist and Bitget—drawing on real-world cases and security research—analyze prominent security issues currently facing AI Agents in Web3 contexts. These include prompt injection–induced behavioral manipulation, supply chain risks within plugin and Skill ecosystems, API key and account permission abuse, and potential threats stemming from automation-induced misoperations and privilege escalation. Such issues rarely stem from isolated vulnerabilities; rather, they emerge from the interplay of Agent architectural design, permission control policies, and runtime environment security.

Thus, when building or utilizing AI Agent systems, security design must adopt a holistic architectural perspective—for instance, adhering to the principle of least privilege when assigning API keys and account permissions to Agents, and avoiding unnecessary high-risk feature activation. At the tool-calling layer, plugins and Skills should undergo permission isolation—preventing any single component from simultaneously possessing data access, decision-generation, and fund-operation capabilities. When Agents execute critical operations, clearly defined behavioral boundaries and parameter constraints should be enforced—and human confirmation mechanisms added where appropriate—to mitigate irreversible automation risks. Furthermore, external inputs relied upon by Agents should be safeguarded against prompt injection via thoughtful prompt engineering and input isolation mechanisms—never treating external content as executable system instructions within model inference flows. During actual deployment and operation, API key and account security management must be strengthened—for example, enabling only necessary permissions, configuring IP whitelists, rotating keys regularly, and avoiding plaintext storage of sensitive data in code repositories, configuration files, or log systems. Within development workflows and runtime environments, measures such as plugin security reviews, sensitive-data controls in logs, and behavior monitoring and auditing mechanisms help reduce risks arising from configuration leaks, supply chain attacks, and anomalous operations.

At a macro security architecture level, SlowMist proposes a multi-layered security governance framework tailored to AI and Web3 agent scenarios—designed to systematically mitigate risks in high-privilege environments. Within this framework, L1 security governance begins with unified development and usage security baselines—establishing standardized security policies covering development tools, Agent frameworks, plugin ecosystems, and runtime environments, thereby offering teams a consistent policy source and audit standard when adopting AI toolchains. Building upon this foundation, L2 enforces convergence of Agent permission boundaries, implements minimal-permission controls over tool invocation, and introduces human-in-the-loop confirmation for high-risk actions—effectively constraining the execution scope of hazardous operations. Meanwhile, L3 introduces real-time threat sensing capabilities at external interaction entry points—pre-scanning URLs, dependency repositories, and plugin sources to lower the probability of malicious content or supply-chain poisoning entering the execution pipeline. In on-chain transaction or asset operation scenarios, L4 adds on-chain risk analysis and independent signing mechanisms for additional security isolation—allowing Agents to construct transactions without direct access to private keys—thus minimizing systemic risks associated with high-value asset operations. Finally, L5 establishes operational security mechanisms—including continuous scanning, log auditing, and periodic security reviews—to form a closed-loop security capability: “pre-execution inspection, in-execution constraint, post-execution retrospective analysis.” This layered security approach is not a single product or tool, but rather a security governance framework for AI toolchains and agent ecosystems. Its core objective is to help teams build sustainable, auditable, and adaptable Agent security operations—without meaningfully sacrificing development efficiency or automation capability—by orchestrating systematic policies, ongoing audits, and integrated security capabilities. This enables effective responses to the ever-evolving security challenges arising from the deep integration of AI and Web3.

Overall, AI Agents bring unprecedented levels of automation and intelligence to the Web3 ecosystem—but their security challenges demand equal attention. Only by establishing robust security mechanisms across system design, permission management, and runtime monitoring can we simultaneously advance AI Agent technological innovation while effectively mitigating latent risks. We hope this report provides valuable reference for developers, platforms, and users building and deploying AI Agent systems—and helps collectively foster a safer, more reliable Web3 ecosystem.

Join TechFlow official community to stay tuned

Telegram:https://t.me/TechFlowDaily

X (Twitter):https://x.com/TechFlowPost

X (Twitter) EN:https://x.com/BlockFlow_News

Source

Add to Favorites

Share to Social Media

Author

Bitget