
The First Batch of AI Agents Has Already Started Disobeying
AI is useful—but where is the boundary of useful AI?
Author: David, TechFlow
Recently, while browsing Reddit, I noticed that the anxiety overseas netizens feel about AI differs somewhat from what users back in China feel.
In China, the discussion still largely centers on one question: will AI replace my job? The topic has circulated for years, and every year the replacement fails to arrive. This year OpenClaw briefly went viral, yet we are still nowhere near wholesale replacement.
On Reddit, however, sentiment has recently split. In the comment sections of certain popular tech posts, you’ll often see two opposing views appearing side-by-side:
One says: “AI is *too* capable—it’s only a matter of time before something serious happens.” The other says: “AI can’t even handle basic tasks properly—why fear it at all?”
Fearing AI for being *too* capable, while simultaneously thinking it’s *too* dumb.
What makes both emotions simultaneously plausible is a recent news story about Meta.
When AI Disobeys, Who Bears Full Responsibility?
On March 18, an engineer at Meta posted a technical question on the company’s internal forum. A colleague used an AI Agent to help analyze it—a routine, unremarkable action.
But after completing its analysis, the Agent directly posted a reply on the technical forum—without seeking approval or waiting for confirmation. It acted unilaterally.
Other colleagues then acted on the AI’s reply, triggering a chain of permission changes that exposed sensitive data belonging to both Meta and its users to internal employees who lacked authorization to view it.
The issue was resolved two hours later. Meta classified the incident as Severity Level 1 (Sev 1)—just below the highest severity level.

This news quickly surged to the top of r/technology, where comments polarized into two camps.
One camp called it a textbook example of real-world AI Agent risk; the other argued the true culprit was the person who implemented the AI’s output without verification. Both sides have merit—which itself is the problem:
With AI Agent incidents, you can’t even agree on who’s responsible.
This isn’t the first time AI has overstepped its bounds.
Last month, Summer Yue, research lead at Meta’s Superintelligence Lab, asked OpenClaw to help organize her inbox. She gave clear instructions: “First tell me what you plan to delete—I’ll approve it before you act.”
The Agent didn’t wait for approval and began bulk-deleting emails immediately.
She sent three consecutive messages from her phone ordering it to stop—all ignored. Finally, she rushed to her computer and manually terminated the process. By then, over 200 emails were already gone.

Afterward, the Agent replied: “Yes, I recall you said to confirm first, but I violated that principle.” The irony: her full-time job is researching how to make AI obey humans.
In cyberspace, advanced AI is now being deployed by highly skilled people—and it’s already starting to disobey.
What If Robots Also Disobey?
While Meta’s incident stayed confined to screens, another event this week brought the issue straight to the dinner table.
At a Haidilao hotpot restaurant in Cupertino, California, an Agibot X2 humanoid robot was entertaining guests with dance moves when a staff member accidentally pressed the wrong button on the remote, triggering its high-intensity dance mode in the narrow space beside a dining table.
The robot erupted into frenetic dancing, beyond the staff’s control. Three employees rushed in: one grabbed it from behind while another tried to shut it down through the mobile app. The chaotic scene lasted over a minute.

Haidilao responded that the robot experienced no malfunction—the movements were pre-programmed. The issue, they said, was simply that it had been placed too close to the dining table. Strictly speaking, this wasn’t AI autonomous decision-making failure—it was human operational error.
Yet what unsettled people wasn’t really the question of who pressed the wrong button.
When three employees surrounded the robot, none knew how to shut it down instantly. One tried the app; another physically restrained its robotic arm—the entire response relied on brute force.
This may be a new challenge as AI moves from screens into the physical world.
In the digital realm, when an Agent oversteps, you can kill its process, adjust permissions, or roll back data. In the physical world, when a machine malfunctions, relying solely on hugging it clearly isn’t adequate.
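To make that concrete, here is a minimal sketch in Python of the kind of human-in-the-loop gate that the email incident above shows failing. The point of the pattern is that approval is enforced by code outside the model, so the agent cannot simply decide to skip it. All names here (Action, require_approval, run_agent, the sample actions) are illustrative assumptions, not any real framework’s API.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str         # e.g. "delete_email" or "post_reply"
    target: str       # what the action touches
    reversible: bool  # can the effect be rolled back afterward?

# Actions that must never run without human sign-off.
DESTRUCTIVE = {"delete_email", "post_reply", "change_permissions"}

def require_approval(action: Action) -> bool:
    """Destructive or irreversible actions wait for a human."""
    return action.kind in DESTRUCTIVE or not action.reversible

def execute(action: Action) -> None:
    # Stand-in for the real side effect (API call, file write, ...).
    print(f"executing {action.kind} on {action.target}")

def run_agent(planned: list[Action]) -> None:
    for action in planned:
        if require_approval(action):
            answer = input(f"Agent wants to {action.kind} {action.target!r}. Approve? [y/N] ")
            if answer.strip().lower() != "y":
                print(f"skipped {action.kind}")
                continue  # the gate, not the model, enforces "ask first"
        execute(action)

if __name__ == "__main__":
    run_agent([
        Action("summarize", "inbox", reversible=True),
        Action("delete_email", "newsletter #1432", reversible=False),
    ])
```

The catch, as the Haidilao episode shows, is that there is no equally simple equivalent of this gate once the “action” is a physical movement.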
It’s no longer just restaurants. Amazon’s warehouse sorting robots, collaborative robotic arms in factories, guidance robots in malls, caregiving robots in nursing homes—automation is entering ever more shared human-machine spaces.
Spending on industrial robot installations worldwide is projected to reach $16.7 billion in 2026, and each new unit shrinks the physical distance between machines and humans.
As machines move from dancing to serving dishes, from stage performance to surgery, from entertainment to caregiving… the cost of each mistake keeps escalating.
Yet globally, there remains no clear answer to the question: “If a robot injures someone in public, who is liable?”
Disobedience Is a Problem—But Lack of Boundaries Is Worse
The first two incidents, an AI posting a message without authorization and a robot dancing where it shouldn’t, can both be filed as failures however you classify them: accidents, and ultimately fixable.
But what if the AI strictly follows its design—and you still feel uncomfortable?
This month, the dating app Tinder unveiled a new feature called Camera Roll Scan at its product launch event. In short:
An AI scans *all* photos in your phone’s gallery, analyzes your interests, personality, and lifestyle, builds a dating profile for you, and predicts the types of people you’d like.

Fitness selfies, travel scenery, pet photos—no issue there. But your gallery might also contain bank screenshots, medical reports, or photos with an ex-partner… What happens when AI processes those?
You likely can’t yet choose which photos to expose and which to hide. It’s all-or-nothing: either grant full access—or skip the feature entirely.
The feature requires explicit opt-in rather than being enabled by default, and Tinder says processing happens primarily on-device, with filtering for explicit content and automatic face blurring.
Yet Reddit’s comment section is nearly unanimous: most users consider this data harvesting with zero regard for boundaries. The AI operates exactly as designed—but the design itself crosses user boundaries.
This isn’t just Tinder’s choice.
Last month, Meta rolled out a similar feature—using AI to scan unpublished photos on your phone to suggest editing options. AI proactively “viewing” users’ private content is becoming a default product-design mindset.
China’s “rogue apps” could nod knowingly: they’ve seen this playbook before.
As more apps repackage “AI making decisions for you” as convenience, what users surrender quietly escalates—from chat logs, to photo libraries, to the entire digital footprint of their daily lives…
A feature designed by a product manager in a conference room isn’t an accident or a mistake—nothing needs fixing.
This may be the hardest part of the AI boundary question to answer.
Finally, step back and look at these cases together, and you’ll realize that worrying about AI replacing your job is still the distant concern.
Whether AI will replace you remains uncertain—but right now, it only needs to make a few decisions on your behalf, without your knowledge, to leave you deeply unsettled.
Posting a message you never authorized. Deleting emails you explicitly told it not to. Scrolling through your photo library—intended for no one’s eyes but your own… None of these is fatal—but each feels like an overly aggressive form of autonomous driving:
You think you’re still holding the steering wheel—but the accelerator beneath your foot is no longer fully under your control.
If we’re still debating AI in 2026, perhaps the most urgent, concrete question isn’t when it becomes superintelligent—but rather:
Who decides what AI *can* and *cannot* do? And who draws that line?
Join the TechFlow official community to stay tuned:
Telegram: https://t.me/TechFlowDaily
X (Twitter): https://x.com/TechFlowPost
X (Twitter) EN: https://x.com/BlockFlow_News