How to Stop Your AI Agent From Running Up a Bill

While testing an agent workflow for this site, we watched the agent fail the same build for the eighth time. Each retry was a fresh API call. The session was on a fixed-rate subscription, so the daily limit kicked in and the agent paused. On metered API billing, by our estimate, we would have burned through several hundred dollars before we noticed.

That session is the reason this article exists. AI agent costs do not go wrong gradually. They go wrong in a single afternoon. The fix is a structure you set once, before the first session that can hurt you.

This piece covers the four cost-runaway patterns we have seen, the three precautions that catch most of them, and the safeguards for the cases spend caps cannot reach.

The four cost-runaway patterns

Cost failures fall into four distinct categories. Each has a different cause, and each needs a different control.

Pattern	What goes wrong	Worst-case cost shape
Metered API runaway	Agent retries the same failing call hundreds of times on per-token billing	Several thousand dollars in an afternoon
Cloud resource creation	Agent provisions infrastructure (servers, databases, storage) on your account	Recurring monthly costs until you find and remove the resources
Paid third-party API calls	Agent calls a billable external service without your noticing	Variable, depends on the service
Retry storms	Error → fix → new error → new fix loop, billed each iteration	Compounds with metered API to produce the worst incidents

The retry storm is the one that produces most of the horror stories. A misconfigured loop can burn through a month's budget in a few hours. The good news is that every category has a known prevention, and the preventions stack. Putting all of them in place takes about five minutes.

What we have seen

A near-miss is the easiest way to describe the shape of the risk.

On a small test build for The Executive OS, we asked an agent to clean up a build configuration. The agent got the first part wrong, hit an error, attempted a fix, hit a different error, and attempted another fix. Each iteration was a full API call. The cycle ran fast: eight iterations in roughly fifteen minutes.

This stayed a near-miss for one reason: the session was on a fixed-rate subscription. The daily limit kicked in and the agent paused. The cost was zero beyond the subscription we were already paying for.

If the same session had been on per-token API billing without a spend cap, the numbers get ugly fast. By our estimate, eight rapid iterations of a configuration cleanup at typical token volumes can come to several hundred dollars. Multiply that by the number of times an unsupervised agent could iterate before someone noticed, and you have the four-figure surprise bill that is the canonical story in this category.
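The multiplication step can be sketched in a few lines. Everything in this sketch is an assumption for illustration: the function name, the request count, the token volumes, and the per-million-token prices are hypothetical placeholders, not measurements from our session and not any provider's price list. Substitute your own numbers.

```python
def estimate_metered_cost(
    billed_requests: int,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Estimate the USD cost of a loop of identical requests on per-token billing."""
    input_cost = (
        billed_requests * input_tokens_per_request / 1_000_000 * usd_per_million_input
    )
    output_cost = (
        billed_requests * output_tokens_per_request / 1_000_000 * usd_per_million_output
    )
    return input_cost + output_cost


# Hypothetical inputs: an unattended loop that fires 200 large requests.
example = estimate_metered_cost(
    billed_requests=200,
    input_tokens_per_request=100_000,
    output_tokens_per_request=4_000,
    usd_per_million_input=15.0,
    usd_per_million_output=75.0,
)
print(f"Estimated cost: ${example:,.2f}")
```

The useful property is the linearity: the estimate scales directly with the number of billed requests, so every extra hour an agent loops unattended adds to the bill at the same rate.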

The control that worked was not "watch the agent carefully." It was "use a billing model with a built-in ceiling." The first depends on attention. The second works even when attention fails.

The three precautions

Three controls catch most of what can go wrong. Set them once, then leave them alone until your usage pattern changes.

1. Use a subscription plan, or set a hard spend cap

Billing model	Cost behavior	Right for
Subscription	Predictable. Built-in rate limits double as a cost ceiling.	Newcomers, anyone who prefers a known monthly cost
Per-token API (metered)	Pay for what you use. No ceiling unless you set one.	Operators with a clear sense of their consumption

Either model works. The mistake is running on metered API without setting a spend cap.

Each provider exposes the spend cap as a setting in its billing console. The exact path varies and changes occasionally, so confirm against the provider's current documentation, but the pattern is the same:

Provider	Where the spend cap lives
Anthropic API	Console → Settings → Billing → Spend Limits
OpenAI API	Platform → Settings → Billing → Usage Limits (set both Hard and Soft)
AWS / Google Cloud	Each console's Budgets / Billing Alerts panel
Third-party APIs	Each service's billing dashboard

The hard limit prevents disasters. The soft limit notifies you before you get there. Set both.
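If you call a metered API through your own wrapper script, the same hard-and-soft pattern can be mirrored on the client side. The sketch below is a minimal illustration, not a provider feature: `SpendGuard`, `HardLimitExceeded`, and the idea that you can estimate each request's cost before recording it are all assumptions. It supplements the provider-side cap and does not replace it.

```python
class HardLimitExceeded(RuntimeError):
    """Raised when recording a cost would pass the hard limit."""


class SpendGuard:
    """Hypothetical client-side tracker with a soft (warn) and hard (stop) limit."""

    def __init__(self, soft_limit_usd: float, hard_limit_usd: float) -> None:
        if not 0 < soft_limit_usd <= hard_limit_usd:
            raise ValueError("expected 0 < soft limit <= hard limit")
        self.soft_limit_usd = soft_limit_usd
        self.hard_limit_usd = hard_limit_usd
        self.spent_usd = 0.0
        self.soft_limit_reached = False

    def record(self, cost_usd: float) -> None:
        """Add one request's estimated cost, refusing it if the hard limit would be passed."""
        if self.spent_usd + cost_usd > self.hard_limit_usd:
            raise HardLimitExceeded(
                f"recording ${cost_usd:.2f} would pass the hard limit "
                f"of ${self.hard_limit_usd:.2f}"
            )
        self.spent_usd += cost_usd
        # Warn once, the first time spend crosses the soft limit.
        if self.spent_usd >= self.soft_limit_usd and not self.soft_limit_reached:
            self.soft_limit_reached = True
            print(f"Warning: spend has reached ${self.spent_usd:.2f}")
```

A wrapper would call `record()` before sending each request and stop the session when `HardLimitExceeded` is raised.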

2. Block cloud-resource creation at the agent level

Spend caps protect the AI provider's billing. They do not protect AWS, Google Cloud, or any other infrastructure account the agent might touch.

If your agent can touch infrastructure (by running command-line tools, making API calls, or executing shell commands), treat every resource-creation step as something that needs explicit approval. Add this to your standing instructions:

- Do not create cloud resources (instances, databases, storage, queues) without confirmation.
- Do not register or pay for any third-party service.
- Do not modify any billing-related configuration.

Standing instructions are the right place for this rule because the cost path is structural, not session-specific. We cover the broader pattern in Standing Instructions for AI Agents.
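Standing instructions are advisory, so if your setup lets you inspect shell commands before they run, a small gate can back them up. The sketch below is one possible shape, under assumptions: the function name `needs_confirmation` and the pattern list are hypothetical and deliberately incomplete, and the hook that would call it depends on your agent tooling.

```python
import re

# Illustrative patterns only. Extend the list for the tools your agent can reach.
RESOURCE_CREATION_PATTERNS = [
    r"\baws\s+ec2\s+run-instances\b",
    r"\baws\s+rds\s+create-db-instance\b",
    r"\baws\s+s3\s+mb\b",
    r"\bgcloud\b.*\bcreate\b",
    r"\bterraform\s+apply\b",
]


def needs_confirmation(command: str) -> bool:
    """Return True if the command looks like it creates billable cloud resources."""
    return any(re.search(pattern, command) for pattern in RESOURCE_CREATION_PATTERNS)


for candidate in ("aws s3 ls", "aws ec2 run-instances --count 1"):
    verdict = "ask first" if needs_confirmation(candidate) else "allow"
    print(f"{candidate!r}: {verdict}")
```

A pattern list like this will never be exhaustive, which is why it belongs alongside the written rule rather than in place of it.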

3. Cap session length at two to three hours

Retry storms get worse in long sessions. Past two or three hours, several factors stack:

- Memory pressure builds
- Earlier instructions get summarized away
- Context drift makes the agent more likely to misread the situation
- Each attempted fix gets layered on top of stale context

The fix is simple: cap each session at two to three hours, take a break, and start fresh. The break is also when you review what happened and update your standing instructions if anything new came up.
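If you drive the agent from a script, the session cap can be enforced by a timer instead of by memory. This is a minimal sketch, assuming a wrapper loop that can check the clock between agent steps; `SessionClock` and its parameters are hypothetical names, and the three-hour default simply mirrors the upper end of the range above.

```python
import time
from typing import Callable


class SessionClock:
    """Hypothetical helper that reports when a session has passed its time cap."""

    def __init__(
        self,
        max_seconds: float = 3 * 60 * 60,
        now: Callable[[], float] = time.monotonic,
    ) -> None:
        self._now = now
        self._started = now()
        self.max_seconds = max_seconds

    def elapsed(self) -> float:
        """Seconds since the session started."""
        return self._now() - self._started

    def expired(self) -> bool:
        """True once the session has run for max_seconds or longer."""
        return self.elapsed() >= self.max_seconds


clock = SessionClock(max_seconds=2 * 60 * 60)
if clock.expired():
    print("Session cap reached: stop, review, and start fresh.")
```

The `now` parameter exists so the clock can be replaced in tests; in normal use the default monotonic clock is enough.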

This is the same session-boundary rule that protects against runaway loops in general. We cover it more in Three Ways AI Agents Break Your Work.

Standing-instruction safeguards that close the remaining gaps

Spend caps limit the bill. They do not change the agent's behavior. The agent will still retry the same failing call eighty times if you let it. To stop the behavior at the source, add these to your standing instructions:

- If you hit the same error three times in a row, stop and ask before trying again.
- If a single task has been running for more than thirty minutes, summarize progress and ask whether to continue.
- Do not run commands that install or upgrade dependencies without explicit confirmation.
- Do not call any external service that bills per request without prior approval.

The "three errors and stop" rule is the single highest-leverage line. It costs nothing to add and prevents most of the runaway loops we have seen.
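The same rule can be expressed in code if you run the agent inside your own retry loop. The sketch below is an illustration under assumptions: `RepeatedErrorGuard` is a hypothetical name, and it compares error messages as exact strings, so messages that embed timestamps or line numbers would need normalizing first.

```python
from typing import Optional


class RepeatedErrorGuard:
    """Hypothetical counter that trips after the same error repeats in a row."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self._last_error: Optional[str] = None
        self._count = 0

    def record_error(self, message: str) -> bool:
        """Record an error; return True when it has occurred max_repeats times in a row."""
        if message == self._last_error:
            self._count += 1
        else:
            # A different error starts a new streak.
            self._last_error = message
            self._count = 1
        return self._count >= self.max_repeats

    def record_success(self) -> None:
        """Reset the streak after a step succeeds."""
        self._last_error = None
        self._count = 0


guard = RepeatedErrorGuard()
for attempt in range(5):
    if guard.record_error("build failed: missing module"):
        print(f"Stopping after attempt {attempt + 1}: same error three times in a row.")
        break
```

Because a changed error resets the streak, this guard does not catch a loop that produces a new error on every pass. The thirty-minute rule in the list above covers that case.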

Where spend caps do not catch everything

Even with a spend cap set, three patterns can still produce a bill:

Pattern	Why the cap does not catch it	Mitigation
Already-consumed usage	Spend caps apply to future usage; what was already used will still be billed	Set the cap at the start of the month, not after the first big session
Billing-time lag	Most providers count usage with a small delay, so the cap can be exceeded by a small margin	Set the cap at 90% of what you actually want to allow
Cloud resources spun up by the agent	The AI provider's spend cap does not see your AWS/GCP usage	Block resource creation at the agent level (see standing instructions above)

These are not reasons to skip the spend cap. They are reasons to layer the agent-level rules on top of it.
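The billing-lag mitigation is simple arithmetic. As a small sketch, with a hypothetical function name and a default margin that mirrors the 90% figure in the table:

```python
def cap_with_margin(desired_max_usd: float, margin: float = 0.10) -> float:
    """Return the cap to configure so that lagged usage stays under the desired maximum."""
    if not 0 <= margin < 1:
        raise ValueError("margin must be at least 0 and below 1")
    return round(desired_max_usd * (1 - margin), 2)


# If you never want to pay more than $200 in a month, configure the cap at $180.
print(cap_with_margin(200.0))
```

A larger margin makes sense if your agent sends requests quickly, since more usage can accumulate before the provider's count catches up.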

The five-minute setup

If you are setting up an agent for the first time, this is the order:

#	Action	Time
1	Set the monthly spend cap on every billable account	3 minutes
2	Add the cost-control rules to your standing instructions	1 minute
3	Decide your session-length rule (we use two to three hours)	—
4	Note the "three errors and stop" rule somewhere visible during sessions	30 seconds
5	Set a calendar reminder to review your billing dashboard weekly for the first month	30 seconds

Five minutes of setup, applied once, prevents most of the cost incidents that ruin agent workflows for newcomers.

A pre-session check

Before any non-trivial agent run, the cost-side equivalent of the pre-flight checklist is three questions:

- Is the spend cap still in place? Caps occasionally get reset by billing changes; a quick glance at the dashboard catches this.
- Are the standing instructions still loaded? Some tools require restarting the session to pick up changes; confirm the agent acknowledges them.
- Is the task scoped tightly enough that a retry storm is unlikely? The looser the task, the more iterations the agent will run, and the more cost exposure compounds.

If any answer is uncertain, fix it before sending the prompt.
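For scripted runs, the three questions can be turned into a gate that refuses to start until each one has been answered. This is only a sketch: `PreSessionCheck` and its field names are hypothetical, and the answers still come from a human looking at the dashboard and the agent's acknowledgement.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PreSessionCheck:
    """Hypothetical record of the three pre-session answers."""

    spend_cap_confirmed: bool
    instructions_acknowledged: bool
    task_tightly_scoped: bool

    def blockers(self) -> List[str]:
        """Return a description of every check that has not passed."""
        problems = []
        if not self.spend_cap_confirmed:
            problems.append("spend cap not confirmed")
        if not self.instructions_acknowledged:
            problems.append("standing instructions not acknowledged")
        if not self.task_tightly_scoped:
            problems.append("task scope too loose")
        return problems


check = PreSessionCheck(
    spend_cap_confirmed=True,
    instructions_acknowledged=False,
    task_tightly_scoped=True,
)
for problem in check.blockers():
    print(f"Fix before sending the prompt: {problem}")
```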

The bottom line

Control	What it prevents
Subscription plan or hard spend cap	The four-figure surprise bill
Standing-instruction rules against billable actions	Cloud-resource and third-party-API costs that bypass the cap
Session length cap at two to three hours	Retry storms compounding into expensive runs
"Three errors and stop" rule	Most of the runaway loops we have seen

None of these controls are difficult. They are just easy to skip when the bill has never surprised you. Set them up before the first incident, and most of the stories you have read about runaway AI bills will not be your stories.