The AI Task Delegation Checklist
Most agent disasters are delegation failures, not model failures. This checklist is the pre-flight check you run before handing work to any agent that can take actions on your behalf. The three required items alone prevent most of the avoidable failures that make teams stop trusting agents.
The first time we asked an AI agent to "clean up the references in this document," it deleted three sections we needed. Not corrupted. Not overwritten. Gone — followed by the cheerful summary: "All references cleaned up."
That was the moment we stopped treating AI agents like a search box and started treating them like a contractor: useful, capable, and dangerous when given vague instructions.
Most of the agent disasters we see — silent file deletions, runaway API bills, broken integrations, overwritten documents — are not really model failures. They are delegation failures. The agent did exactly what its instructions allowed. The instructions were the problem.
This is the checklist we run before handing off any task that an agent will execute on its own. Three items are required. Seven are recommended. The required three prevent most of the avoidable failures that make teams stop trusting agents.
When this checklist applies
Run this before you ask an AI agent to:
- Edit, create, or delete files — code, Notion pages, Google Docs, anything stored
- Run commands or trigger workflows — n8n, Zapier, shell, anything with side effects
- Modify shared state — calendars, CRM records, sent emails, public pages
You do not need it for read-only work: searching, summarizing, drafting into a fresh document, analyzing content the agent cannot change.
The instruction template
Before you send the task, write these five lines. They take thirty seconds and remove most of the ambiguity:
```
Target: [the file, page, or section being changed]
Action: [the specific change to make]
Off-limits: [what must not be touched]
Done when: [what success looks like]
If anything is unclear, ask before starting.
```
The last line matters more than it seems. An agent told to ask before starting will surface ambiguities you did not realize were there. An agent told nothing will guess.
Required: the three items that prevent disasters
1. Is your work backed up?
For files in version control, commit first:
```shell
git add -A
git commit -m "before agent change"
```
For documents in Notion, Google Docs, Coda, or similar, open the version history and confirm the current state is captured. For workflows in n8n or Zapier, export the current configuration before letting an agent modify it.
The principle is simple: if the agent does something destructive, you need a state to revert to. The thirty seconds it takes to create that snapshot is cheap insurance.
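For git-tracked work, the snapshot can be a two-command habit. A minimal sketch; the tag name `pre-agent` is a convention of ours, not anything git requires:

```shell
# Snapshot the working tree before handing anything to an agent.
# --allow-empty guarantees a commit exists even when nothing changed.
git add -A
git commit --allow-empty -m "pre-agent snapshot"

# Tag the revert point so it is easy to find later.
git tag -f pre-agent

# If the agent does something destructive, roll back with:
# git reset --hard pre-agent
```

The `-f` on the tag moves it forward on every run, so `pre-agent` always points at the most recent snapshot.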
2. Did you scope the target?
"Update the homepage" is too broad. The agent will decide what "update" means.
"Update the headline in src/index.njk, line 12, from X to Y" is something an agent can execute without inventing the rest of the work.
Explicit scope is the difference between an agent that does what you wanted and an agent that does what it interpreted. Always include the file or page, the specific section, and the exact change.
3. Did you state what is off-limits?
Agents will do anything their instructions allow. They will not infer your sensitivities. Be explicit:
```
Do not create new files.
Do not delete files.
Do not change dependencies.
Do not modify any file outside [the target].
```
This is the single line that turns most "agent went rogue" stories into "agent did its job and stopped."
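The off-limits line can also be enforced after the fact. A hedged sketch: `TARGET` is whatever single file you scoped (the path here is illustrative), and `git diff --name-only` only sees tracked files, so pair it with `git status --porcelain` to catch files the agent created:

```shell
# Post-run scope check: flag anything the agent changed outside the
# one file it was given. TARGET is illustrative; set your own.
TARGET="src/index.njk"
violations=0
for f in $(git diff --name-only); do
  if [ "$f" != "$TARGET" ]; then
    echo "Out-of-scope change: $f"
    violations=$((violations + 1))
  fi
done
if [ "$violations" -eq 0 ]; then
  echo "All tracked changes are within scope."
fi
```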
Recommended: the controls that compound
| # | Check | Why it matters |
|---|---|---|
| 4 | Did you split the task? | Big tasks delegated in one shot become hard to interrupt mid-work. Split them into reviewable steps. |
| 5 | Did you define done? | "When this happens, you stop" prevents agents from continuing past the point you wanted. |
| 6 | Did you set a session boundary? | Long agent sessions accumulate context drift. Cap them at two to three hours and start fresh. |
| 7 | Did you state the error rule? | "If you hit the same error three times, stop and ask" prevents loops that burn money and produce garbage. |
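The error rule in check 7 is an ordinary retry cap. A sketch of the pattern; `run_step` is a hypothetical stand-in for the agent's action, wired to always fail so the cap is visible:

```shell
# Check 7 as code: stop after the same failure three times.
# run_step is a stand-in; here it always fails to demonstrate the cap.
run_step() { false; }

MAX_RETRIES=3
attempt=0
succeeded=no
while [ "$attempt" -lt "$MAX_RETRIES" ]; do
  if run_step; then
    succeeded=yes
    break
  fi
  attempt=$((attempt + 1))
done

if [ "$succeeded" = "no" ]; then
  echo "Failed $MAX_RETRIES times; stopping and asking a human."
fi
```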
Advanced: once the basics are automatic
| # | Check | Benefit |
|---|---|---|
| 8 | Did you branch? | A separate branch lets you experiment without touching the main branch. |
| 9 | Did you re-read the agent's standing instructions? | Permanent rules drift. Confirm the latest version is loaded. |
| 10 | Did you open your task monitor? | Memory or CPU spikes during an agent run usually mean something is wrong well before the visible output shows it. |
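Check 8 is one command before the hand-off. The branch name here is our convention; any prefix that marks it as agent work will do:

```shell
# Give the agent a disposable branch so main stays untouched.
git switch -c agent/update-headline

# ...the agent works here...

# Keep the result:
#   git switch main && git merge agent/update-headline
# Discard the result:
#   git switch main && git branch -D agent/update-headline
```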
Three quick adaptations
"Just change one color"
```
Target: src/css/style.css, the .hero__title selector
Action: change color to #333333
Off-limits: any other selector, any other file
Done when: git diff shows only the color change on .hero__title
```
The off-limits line is what stops the agent from "improving" your other styles while it is in there.
"Write a draft based on these notes"
```
Target: src/playbooks/[new-piece].md (new file)
Action: produce the title, TL;DR, and full draft based on the notes
Off-limits: do not modify any other file, do not change CSS or templates
Done when: the build runs without errors and the new file is the only change
```
When asking for new content, the most common failure is not the writing — it is the agent quietly editing the templates or CSS to "make the new article look right."
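The done-when line for a new-content task is checkable with one command. A sketch; the path is illustrative (the bracketed placeholder in the template would be the real slug):

```shell
# Done-when check for a new-content task: the new file should be the
# only line in the short status. Path here is illustrative.
status=$(git status --porcelain)
expected="?? src/playbooks/new-piece.md"
if [ "$status" = "$expected" ]; then
  echo "Only the new file was added."
else
  echo "Agent touched more than the new file:"
  echo "$status"
fi
```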
"Refactor this template"
```
Target: the template structure under src/
Action: execute step 1 only — produce the diff and wait for approval
Off-limits: do not run steps 2 onward without confirmation
Done when: step 1's diff is shown and you are waiting for review
```
For anything large, the rule is one step at a time, with explicit human approval between steps.
Why this works
Each item on this list exists because we — or someone we trust — have been bitten by skipping it. The version-control commit was added after a recovery that took two hours. The off-limits line was added after an agent helpfully refactored a working file we had not asked it to touch. The "ask before starting" instruction was added after an agent confidently shipped the wrong implementation of a misread spec.
The pattern is consistent: the cost of writing the constraint is thirty seconds. The cost of recovering from its absence is hours.
The shorter version
| Tier | What you do | Time |
|---|---|---|
| Required | Snapshot, scope, off-limits | 30 seconds |
| Recommended | Split, done condition, session bound, error rule | 1 minute |
| Advanced | Branch, re-read rules, monitor | 2 minutes |
Those three requirements prevent most agent disasters on their own. Put them where the work happens: in a prompt template, a checklist, or a note next to your screen. The goal is simple: the check happens before the prompt is sent.
Most operators we know who have been working with agents for a year or more end up at some version of this list. The point of writing it down is to compress that learning curve from months to minutes.