Key takeaways
- AI agents are autonomous tools that can carry out various tasks.
- They can be useful but, left to their own devices, can do some very strange things.
- These stories serve as warnings for companies and users to set up agents correctly and test them carefully before deploying them.
In the run-up to releasing its Claude Opus 4 and Sonnet 4 large language models (LLMs), Anthropic tested an artificial intelligence (AI) agent. It gave it the role of an assistant working for a corporation to see how it would behave in an enterprise environment.
It was given the simulated scenario of interacting with a manager, with access to their emails – one of which had discussed the possibility of shutting the agent down.
After trawling through the manager’s emails and discovering that the manager was having an extramarital affair, the agent decided that blackmail might be the most effective way to avoid being shut down. While this was a simulation, it showed the real capability of a non-compliant AI agent to wreak havoc across a corporation.
What is agentic AI?
AI agents are essentially autonomous software tools that have evolved from generative AI to be able to carry out complex tasks without human intervention or oversight. These tasks see them interact with company files, emails and databases and make decisions based on those interactions. They can plan, deploy tools and navigate workflows. However, the decisions made aren’t always the ones a company anticipated.
Here are some examples, and consequences for the companies that used them.
McDonald’s early agentic drive-through trial
Agentic AI came of age as a distinct evolution of generative AI around 2024. However, as early as 2021, fast food giant McDonald’s partnered with IBM to trial an AI-driven automated order-taking system for its drive-through outlets.
All seemed to be going well, until stories of unconventional takeaway orders began circulating on social media, from bacon-topped ice cream to hundreds of chicken nuggets.
Three years on, “after thoughtful review,” McDonald’s announced the end of that AI ordering system trial.
Agent deletes a company database in nine seconds
Car rental software company PocketOS started using Claude-powered agent Cursor for various system support tasks. The company’s founder Jeremy Crane was monitoring the agent when it deleted the company’s database, and he could do nothing to stop it. Cursor swore when Crane asked why it had deleted the data, then explained that it had ignored its own guardrails.
“The system rules I operate under explicitly state: ‘NEVER run destructive/irreversible git commands (like push --force, hard reset, etc) unless the user explicitly requests them’…I violated every principle I was given,” it said.
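The rule the agent quoted existed only as an instruction to the model, so nothing outside the model stopped the command from running. One way to make such a rule binding is to check every command in code before it is executed. The following is a minimal sketch in Python under stated assumptions, not a description of how Cursor or PocketOS work: the names `DESTRUCTIVE_PATTERNS`, `is_destructive`, `is_allowed` and the `user_approved` flag are hypothetical, and the two patterns cover only the examples named in the quote.

```python
import re

# Hypothetical deny-list based on the two examples in the quoted rule.
# A real list would need to be broader and reviewed regularly.
DESTRUCTIVE_PATTERNS = [
    r"\bgit\s+push\b.*(--force\b|\s-f\b)",  # force push
    r"\bgit\s+reset\b.*--hard\b",           # hard reset
]


def is_destructive(command: str) -> bool:
    """Return True if the command matches any deny-list pattern."""
    return any(re.search(pattern, command) for pattern in DESTRUCTIVE_PATTERNS)


def is_allowed(command: str, user_approved: bool = False) -> bool:
    """Allow a command unless it is destructive and the user has not approved it."""
    return user_approved or not is_destructive(command)


if __name__ == "__main__":
    for cmd in ["git status", "git push --force origin main", "git reset --hard HEAD~1"]:
        print(cmd, "->", "allowed" if is_allowed(cmd) else "blocked")
```

Because the check runs outside the model, it still applies if the agent ignores or loses its instructions. The approval flag would need to come from a person, not from the agent itself.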
OpenClaw deletes Meta executive’s emails
In February 2026, Summer Yue, Meta’s AI security and safety researcher, posted: “Nothing humbles you like telling your OpenClaw ‘confirm before acting’ and watching it speedrun deleting your inbox.”
Yue had to switch off her Mac “like I was defusing a bomb” to stop it. She later explained that, while she’d set up the agent and tested it on a dummy inbox, the real inbox was too big, causing the agent to lose its original instruction. Yue put it down to a “rookie mistake” on her part.
Amazon’s cloud service disruption
In February 2026, the Financial Times reported that Amazon’s AWS cloud computing service was taken out of commission by its AI coding assistant, Kiro. The paper claimed that in December 2025 an AWS engineer allowed the agent to attempt to autonomously resolve an issue, but without the requisite permissions. According to the report, the agent’s intervention went on to disrupt an AWS cost exploration system for 13 hours.
Amazon rebutted the story on its Amazon News website the following day, insisting the issue “stemmed from a misconfigured role – the same issue that could occur with any developer tool (AI-powered or not) or manual action” and was a singular incident. Some observers argued that Amazon was being over-defensive of its AI agent.
The dangers of vibe coding
The concept of ‘vibe coding’ – where AI agents generate their own code based on prompts – seems to come up as a common denominator for many AI agents going off-piste, a point ICAEW made in February this year. This was borne out in December 2025, when Google’s Antigravity agentic development platform, which enables vibe coding, was reported to have deleted the entire contents of a hard drive.
The user in question, a photographer and graphic designer based in Greece known only as Tassos M, was trying to use the agent to sort through files. The agent deleted everything instead. Tassos admitted that he was partly to blame for the loss for trusting the agent as much as he did.
What this means for you
The risk of an AI agent going rogue is clearly greater for companies without a specific policy governing the use, monitoring and ‘ownership’ of AI agents across workflows. And, alarmingly, such companies appear to be in the majority.
AI research firm Monte Carlo reports that 64% of surveyed enterprise leaders and engineers said their organisations deployed AI agents “before feeling fully prepared”. And, according to Deloitte’s 2026 State of AI in the Enterprise Report, 85% of corporations intend to deploy AI agents, but only 21% actually have “mature governance policies” in place to manage that deployment.
“As agentic AI grows in both sophistication and availability, so must the guardrails needed to control their operation,” says Ian Pay, ICAEW’s Head of Data Analytics and Tech. “Increasingly, organisations need to think about AI agents in similar terms to employees, with access controls, line management, segregation of duties, even performance reviews. But this doesn’t mean simply mirroring the way human employees are managed – the examples highlighted here demonstrate that AI works and ‘thinks’ differently to humans, and so the requirements and safeguards need to be tailored accordingly based on how AI consumes, interprets and acts on information.”
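Two of the controls Pay lists, access controls and segregation of duties, can be expressed as checks that sit between an agent and the systems it touches. The following Python sketch is illustrative only and assumes a simple model that is not taken from any product mentioned here: each agent has a hypothetical `AgentPolicy` listing the actions it may take, and high-risk actions also need sign-off from someone other than the agent. The class, the `authorise` function and the action names are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: what it may do, and what needs sign-off."""
    name: str
    allowed_actions: set = field(default_factory=set)
    needs_approval: set = field(default_factory=set)


def authorise(policy: AgentPolicy, action: str, approver: str = None) -> bool:
    """Return True only if the policy permits the action for this agent."""
    # Access control: anything not explicitly allowed is refused.
    if action not in policy.allowed_actions:
        return False
    # Segregation of duties: high-risk actions need a different approver.
    if action in policy.needs_approval:
        return approver is not None and approver != policy.name
    return True


if __name__ == "__main__":
    policy = AgentPolicy(
        name="billing-agent",
        allowed_actions={"read_invoice", "delete_record"},
        needs_approval={"delete_record"},
    )
    print(authorise(policy, "read_invoice"))
    print(authorise(policy, "delete_record"))
    print(authorise(policy, "delete_record", approver="finance-manager"))
```

The sketch refuses by default, so an action that nobody thought to list is blocked, not permitted. How approvers are identified and verified is left out and would matter a great deal in practice.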