Workflow OS

Three Ways AI Agents Break Your Work, and How to Prevent Each One

TL;DR

AI agents fail in three predictable ways: they run up costs, they destroy data, or they loop until your machine gives out. Each has a known cause and a known prevention. Three controls — version control, diff review, and session boundaries — stop the majority of incidents before they happen.

It took a full system reboot before the laptop responded again.

The agent had been given a refactor. It hit an error, tried to fix it, hit another error, tried that fix, and entered the loop. By the time anyone noticed, sixteen processes were running, memory was at 100%, and the machine was unresponsive. Two hours were already gone — not because of the agent's work, but because of the recovery — and the only artifact that survived was the prompt that started it.

That session is the reason this article exists. AI agents fail in patterned, predictable ways. See a pattern once and you can prevent the next ten occurrences.

Across different people, tools, and use cases, we keep seeing the same three agent failures. Every serious agent incident we hear from operators fits one of them. Each has a clear cause. Each has a clear prevention. The controls are not difficult. They are just unfamiliar, so people skip them until they have learned the hard way.

The three failure modes

The failures look different on the surface, but they usually fall into three buckets: runaway cost, data loss, and runaway loops.

| Failure | What it costs you | How often we see it |
| --- | --- | --- |
| Runaway cost | Money. Sometimes thousands of dollars before anyone notices | Low frequency, high severity |
| Data loss | Work. Documents, code, configurations — gone or overwritten | Medium frequency, very high severity |
| Runaway loop | Time. Often a forced reboot and lost session state | Medium frequency, high severity |

The rest of this piece walks through each one, with the actual incidents we have seen and the controls that prevent them.

Failure 1: runaway cost

How it happens

Cost failures are not just about API calls.

| Source | What triggers it |
| --- | --- |
| Per-token API billing | Direct use of Anthropic, OpenAI, or similar APIs without a spend cap |
| Cloud resource creation | An agent that can spin up cloud infrastructure, doing so without supervision |
| Paid tool integrations | An agent calling premium services or third-party APIs you forgot were billable |
| Retry storms | An agent hitting an error and retrying hundreds of times, each retry billed |

The pattern that surprises people most is the retry storm: the agent keeps retrying the same failed task, and every retry triggers another paid action. A single misconfigured loop can burn through a month's budget in an afternoon.
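The fix for retry storms is a hard cap on attempts. Here is a minimal sketch of that guard as a shell wrapper; `retry_with_cap` is our own name, and the command you pass it stands in for whatever billable action is being retried.

```shell
# Sketch of a retry cap: a failing billable call stops after
# MAX_RETRIES attempts instead of retrying forever, so a retry
# storm cannot run up the bill unattended.
MAX_RETRIES=3

retry_with_cap() {
  attempts=0
  until "$@"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$MAX_RETRIES" ]; then
      echo "giving up after $attempts failed attempts" >&2
      return 1
    fi
    sleep "$attempts"   # back off a little between retries
  done
}

# Demo: a command that always fails stops after 3 attempts.
retry_with_cap false || echo "capped"
```

The same idea applies inside any agent configuration that exposes a retry limit: set the ceiling before the first run, not after the first bill.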

What we have seen

We have not had a runaway API bill ourselves, and the reason is structural: we use agent tools that operate inside fixed-rate subscription plans. The "you hit your daily limit" message that occasionally pauses our agents is not a bug — it is a hard ceiling that doubles as a cost guardrail.

What we have heard second-hand is consistent: operators start with raw API access, delegate something ambitious, and wake up to an expensive mistake.

How to prevent it

| Tier | Control | Effect |
| --- | --- | --- |
| Required | Set a monthly spend cap on every billable account | Hard ceiling on the worst case |
| Recommended | Start on subscription plans rather than usage-based pricing | Predictable monthly cost, built-in rate limits |
| Recommended | Cap individual agent sessions at two to three hours | Limits the damage of a single bad run |
| Advanced | Require explicit approval before any cloud resource is created | Removes the second-order cost path |
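The session cap in the table above can be enforced mechanically rather than by memory. A minimal sketch using coreutils `timeout` — the agent command here is a placeholder (`sleep 2` stands in for whatever starts your agent session), and the 1-second cap is demo scale; use 2h or 3h in practice.

```shell
# Enforce a session time cap with coreutils `timeout`.
AGENT_CMD="sleep 2"   # placeholder for your agent's launch command
timeout 1 $AGENT_CMD  # demo-scale cap; write `timeout 2h ...` for real runs
if [ $? -eq 124 ]; then  # timeout exits 124 when the limit was hit
  echo "session hit the time cap; review the diff before resuming"
fi
```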

Where to set spend caps

| Service | Path |
| --- | --- |
| Anthropic API | Console → Settings → Spend Limits |
| OpenAI API | Platform → Settings → Billing → Usage Limits |
| AWS / Google Cloud | Console → Budgets / Billing Alerts |
| Third-party APIs | Each provider's dashboard, usually under Billing |

Set these once, then review them only when your usage changes. They cost nothing when nothing goes wrong, and they save you the entire downside when something does.

Failure 2: data loss

This is the most frequent failure mode, the most damaging when it happens, and the one the first-time setup checklist at the end of this article is most concerned with.

How it happens

Two causes account for almost every incident we have seen.

The first is vague instructions. "Clean this up" or "fix the references" gives the agent room to interpret. The interpretation is not always wrong, but it is always broader than what you intended.

The second is context drift. Long sessions accumulate context the agent uses to decide what to touch. As the session grows, the agent loses track of which files were in scope and which were not. By hour three, "update the homepage" can mean any file the agent has seen recently.

What we have seen

Here are a few incidents we have lived through, or heard directly from operators we trust:

A configuration file silently rewritten — a custom build configuration we had tuned over weeks, "simplified to a standard structure" by an agent that decided our setup looked unusual. The agent had not been asked to touch the configuration. It just decided the configuration was getting in the way of what it was asked to do.

A small visual tweak that became a full stylesheet rewrite — a request to "adjust this color slightly" that ended with the entire stylesheet replaced from scratch. The new file looked fine on its own. It also threw away every spacing decision, every override, and every comment in the original.

A multi-file edit when one was asked for — a single-file change request that produced commits across six files. Some of the secondary changes were arguably improvements. Others quietly broke things that took hours to find.

In every case, the recovery was version control. Without it, the work would have been gone.

How to prevent it

| Tier | Control | Effect |
| --- | --- | --- |
| Required | Commit to version control before any agent run | Recovery is always possible |
| Required | Review the diff after every agent run | You catch the unintended changes immediately |
| Recommended | Work on a separate branch for risky tasks | Mistakes never reach the main line |
| Advanced | Add explicit "do not delete files" rules to standing agent instructions | Hard limit at the agent level |

A reusable safety pattern

For files in version control:

```shell
# Before the agent run
git status
git add -A
git commit -m "before agent change"

# → delegate the task

# After the agent run
git diff --stat   # shows which files changed
git diff          # shows what changed in each
```

If you use git, take a full snapshot before the agent starts. Use git add -A so newly created files are included too — git commit -am can miss those. If the agent creates files you did not ask for, you want them captured in the snapshot.
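If you run this pattern often, it is worth wrapping in two helper functions. This is a sketch under our own naming — `agent_snapshot` and `agent_review` are invented here, not part of any tool:

```shell
# Snapshot before delegating: includes untracked files, so anything
# the agent later creates or deletes shows up against this baseline.
agent_snapshot() {
  git add -A && git commit -m "before agent change: ${1:-unlabelled}"
}

# Review after the run: new files, changed files, then the full diff.
agent_review() {
  git status --short    # includes files the agent created (untracked)
  git diff --stat HEAD  # which tracked files changed
  git diff HEAD         # what changed in each
}
```

Usage is `agent_snapshot "refactor homepage"` before the run and `agent_review` after it.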

For documents in Notion, Google Docs, or similar tools without git, the equivalent is to open the version history before the agent works on the document, and confirm a snapshot of the current state exists. Most modern document tools auto-snapshot, but "auto" is not the same as "I have personally verified."

Failure 3: runaway loop

How it happens

The agent encounters an error. It tries to fix the error. The fix produces a new error. The agent tries to fix that one. Each iteration spins up new processes, new file reads, new memory allocations. Nothing is failing loud enough to stop the loop, and nothing is succeeding well enough to break out of it.

If the agent has the ability to spawn long-running processes — terminals, builds, watchers — those accumulate too. Within an hour, your machine can have dozens of stuck processes consuming memory and CPU.
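When you suspect an accumulation is underway, two portable commands show it quickly. The "agent" pattern in the second command is an assumption — substitute the process name of the tool you actually run:

```shell
# Top 10 processes by memory (`ps aux` column 4 is %MEM on
# both Linux and macOS).
ps aux | sort -rnk 4 | head -n 10

# Count processes whose command line matches your agent tool.
# "agent" is a placeholder pattern; adjust it to your tool's name.
pgrep -fc agent || echo "no matching processes"
```

If the count keeps climbing between checks, the loop is already running.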

What we have seen

This is the one that produced the laptop reboot at the top of this article.

The agent had been asked to do a substantial refactor. It hit an error, attempted a fix, hit a different error, attempted another fix, and entered the loop. New terminal processes kept opening faster than old ones closed. Memory crossed 16 GB and the machine froze.

That incident was the one that taught us most of the controls below.

How to prevent it

| Tier | Control | Effect |
| --- | --- | --- |
| Required | Cap sessions at two to three hours, then start fresh | Memory and context drift cannot accumulate indefinitely |
| Recommended | Keep your task manager visible during agent runs | Spikes show up before the freeze |
| Recommended | Apply the three-tries rule | Loops cannot run more than three iterations on the same problem |
| Advanced | Decompose large tasks into reviewable steps | Reduces the surface area where loops can start |

What to watch in the task manager

| Metric | Danger threshold | What to do |
| --- | --- | --- |
| Memory | Above 80% | End the session, restart fresh |
| CPU | Pinned at 100% for several minutes | Identify the runaway process and terminate it |
| Disk | Sustained 100% activity | Check for runaway writes — these often precede a freeze |
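The memory threshold is also easy to script. A minimal Linux-only sketch (it reads /proc/meminfo; macOS would need vm_stat instead), using the same 80% line as the table above:

```shell
# Current memory use as a whole-number percentage, from
# MemTotal and MemAvailable in /proc/meminfo (Linux only).
mem_pct() {
  awk '/^MemTotal/ {t=$2} /^MemAvailable/ {a=$2} \
       END {printf "%d", (t-a)*100/t}' /proc/meminfo
}

if [ "$(mem_pct)" -gt 80 ]; then
  echo "memory above 80%: end the agent session and restart fresh"
fi
```

Run it from a watch loop or a cron entry during long agent sessions if you do not want to keep the task manager open.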

The three-tries rule

| Attempt | Response |
| --- | --- |
| First | Let the agent attempt a fix |
| Second | Tell it to try a different approach |
| Third | Stop. Roll back, then debug manually |

This single rule prevents most of the runaway loops we have seen. It costs nothing to apply and saves entire afternoons.
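For script-driven workflows, the rule can be made literal. A sketch under our own naming — the command you pass to `three_tries` stands in for "let the agent attempt a fix", and the rollback assumes you took a git snapshot before the run:

```shell
# The three-tries rule as an explicit loop: stop after the third
# failure, roll back to the pre-run snapshot, debug manually.
three_tries() {
  for attempt in 1 2 3; do
    if "$@"; then
      echo "fixed on attempt $attempt"
      return 0
    fi
    echo "attempt $attempt failed" >&2
  done
  echo "three attempts failed; rolling back to debug manually" >&2
  git checkout -- .   # restore tracked files to the last commit
  return 1
}
```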

The first-time setup

If you are about to delegate to an AI agent for the first time, six steps cover the safety floor:

| # | Action | Time | Failure mode it prevents |
| --- | --- | --- | --- |
| 1 | Run git init on the project | 30 seconds | Data loss |
| 2 | Commit the initial state with git add -A && git commit -m "initial" | 10 seconds | Data loss |
| 3 | Set a monthly spend cap on any usage-billed API account | 3 minutes | Runaway cost |
| 4 | Add a "confirm before deleting files" rule to your agent's standing instructions | 1 minute | Data loss |
| 5 | Pin your task manager to the taskbar | 10 seconds | Runaway loop |
| 6 | Memorize the three-tries rule | | Runaway loop |

A short setup pass prevents most of the incidents that make people give up on agent workflows too early.
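The scriptable steps — 1, 2, and 4 — can be collapsed into one sketch. Steps 3 and 5 are account and OS settings, so they stay manual. The standing-rules filename below is an assumption: agent tools differ (AGENTS.md, CLAUDE.md, and so on), so use whichever your tool reads.

```shell
#!/bin/sh
# One-shot safety-floor setup for a project directory.
# Pass the directory as $1; defaults to a throwaway temp
# dir so the demo is safe to run anywhere.
set -e
PROJECT_DIR="${1:-$(mktemp -d)}"
cd "$PROJECT_DIR"

git init -q .                            # step 1: version control
# Placeholder identity, set only if git has none configured yet
git config user.email >/dev/null 2>&1 || git config user.email "you@example.com"
git config user.name  >/dev/null 2>&1 || git config user.name  "setup-script"

git add -A                               # step 2: snapshot everything
git commit -qm "initial" --allow-empty   # works even in an empty dir

# Step 4: standing-rules file — the filename is tool-specific.
printf 'Confirm with me before deleting any file.\n' >> AGENTS.md
echo "setup complete in $PROJECT_DIR"
```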

The shorter version

If you only do three things, do these:

| # | Control | Failure mode it prevents | Time |
| --- | --- | --- | --- |
| 1 | Commit before every agent run | Data loss | 10 seconds |
| 2 | Review the diff after every agent run | Data loss, runaway cost | 30 seconds |
| 3 | Cap each session at two to three hours | Runaway loop, runaway cost | |

AI agents are capable and fast, but brittle when the instructions are vague or the session has no boundaries. The right way to think about them is not as a tool but as a contractor: useful when the brief is clear and the boundaries are explicit, expensive when neither is true.

The good news is that the controls are simple. Most of them are thirty seconds of work. The agents that become genuinely useful are the ones running inside guardrails you set once and rarely have to think about again.

Note. Some links in this article are affiliate links. We only recommend tools we actively use. Tool reviews reflect our own field experience; the linked vendors have no editorial input.