Automation and Its Triggers
After this you can decide which steps of a workflow fire on a trigger without you watching, and which keep a human checkpoint, then structure the automated ones so a retry can never double-act.
Automation is one specific move: you stop pressing the button. You give an agent a trigger (a schedule, an inbound event, the previous step finishing) and it acts on its own. The work of running it well happens before that, in deciding which actions earn the right to fire unwatched and which do not. Get that line wrong in either direction and the system fails in a predictable way. Too many checkpoints and you have rebuilt a manual process with extra AI-flavored steps in it, slower than doing it yourself. Too few and you ship errors at machine speed, which is the more expensive mistake because nobody is there to catch the first one before the hundredth.
The instinct most people bring to that line is to gate on the model's confidence: let it run when it is sure, ask a human when it is not. That sounds right and it is wrong. Confidence is not correctness, since a model is routinely and fluently confident about a hallucinated value, and more to the point the cost of a mistake has almost nothing to do with how sure the model was. What actually determines whether you can afford to be wrong is what happens after the action lands. So the gate goes there instead: on reversibility times blast radius, not on uncertainty. A reversible, low-stakes action should auto-execute even when the model is unsure, because the downside is a cheap undo. An irreversible, high-stakes action like sending the email, charging the card, deleting the table, or posting publicly should gate even when the model is certain, because you cannot un-send and confidence buys you nothing once it has gone out.
What makes the automated half actually safe is a second, less obvious split, and it is the one beginners skip. Inside any AI action there are really two things happening: a decision, the model call that chooses what to do, and an action, the side effect that does it. The decision is non-deterministic, so asking the same model the same thing twice can give you two different plans. The action touches the world: it sends, writes, charges. In a normal deterministic pipeline you can retry a failed step freely, because retrying reproduces the same intent. In an AI pipeline that is no longer true, and this is the trap. If a step fails after it has already written something, and you retry the whole step, the model call re-rolls and may now decide something different from what already landed. You have not recovered the run. You have double-written, or written something that contradicts the half that already happened.
The fix is to keep the two apart. Record the decision once, keyed on a hash of its input, so that a retry replays the recorded decision instead of sampling a fresh one. This is the same idempotency-key pattern payment systems have used for years, applied to the model call. Then make the action idempotent on its own terms: an upsert on a stable id rather than a blind insert, a dedup token, write-once semantics. Idempotent just means applying it twice has the same effect as applying it once. With both halves in place a retry is boring, since it re-applies the same decision to the same effect and nothing moves a second time. Without them, "just add a retry" is not resilience. It is a correctness hazard you have wired into the part of the system running fastest and least watched.
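The split described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `decision_log`, `sent`, `fake_model`, and the `lead_id` field are all hypothetical names, the stores are in-memory dicts where a real system would use a durable database, and the random choice stands in for a non-deterministic model call.

```python
import hashlib
import json
import random

decision_log = {}  # hypothetical store: input hash -> recorded decision
sent = {}          # hypothetical write-once store keyed on a stable id


def input_key(payload: dict) -> str:
    """Hash a canonical form of the input so equal inputs share one key."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fake_model(payload: dict) -> str:
    """Stand-in for a model call: the same input can yield different plans."""
    return random.choice(["offer_demo", "send_case_study", "ask_question"])


def decide_once(payload: dict, model=fake_model) -> str:
    """Sample a decision only on first sight; a retry replays the record."""
    key = input_key(payload)
    if key not in decision_log:
        decision_log[key] = model(payload)
    return decision_log[key]


def apply_action(record_id: str, decision: str) -> bool:
    """Write-once side effect: True if it landed now, False if already there."""
    if record_id in sent:
        return False
    sent[record_id] = decision
    return True


def run_step(payload: dict) -> bool:
    """Safe to retry: same decision, and the action cannot land twice."""
    decision = decide_once(payload)
    return apply_action(payload["lead_id"], decision)
```

Calling `run_step` twice with the same payload returns `True` and then `False`: the second call replays the stored decision and finds the write already in place, so nothing moves a second time.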
Where it breaks
The gate heuristic assumes you can actually tell whether an action is reversible, and sometimes you cannot until it is too late. "Reply to a known thread" looks reversible right up until the reply goes to a customer and reshapes a deal, because the blast radius was hiding in the recipient, not the action. When you are unsure which side of the line something sits on, treat it as irreversible. The asymmetry favors caution: a missed gate is far more expensive than a redundant one.
Idempotency has its own blind spot. It makes a single action safe to repeat, but it does nothing for a chain. Five idempotent steps strung together still compound their per-step failure rates, and a ten-step chain at ninety-five percent per step lands around sixty percent end to end, so the answer to fragility is usually fewer steps and a verification gate between them, not more retries on the steps you have.
Triggers themselves rot too. A signal that justified firing on Monday, like a pricing-page visit or an inbound form, can be stale by the time the agent acts on it, and nothing in the trigger stops the action from running on a world that has since moved. An automated send to a lead who already churned is still a send. The audit trail matters here for the same reason: once a fleet of actions is firing on triggers, if you did not log which signal caused which action you cannot tell what is working, and you cannot tell what went wrong.
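The ten-step figure is plain compounding, and it is easy to check. The sketch below assumes each step succeeds independently with the same probability, which is a simplification; the function name `chain_success` is illustrative.

```python
def chain_success(per_step: float, steps: int) -> float:
    """End-to-end success rate of a chain, assuming independent steps."""
    return per_step ** steps


print(round(chain_success(0.95, 10), 3))  # 0.599
print(round(chain_success(0.95, 5), 3))   # 0.774
```

Halving the chain from ten steps to five lifts the end-to-end rate from about 60% to about 77% without touching per-step reliability, which is why cutting steps tends to pay off before adding retries does.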
Before you let any step run on a trigger, put each action through this gate. Paste it, fill the two lines, and let the rule route the action:
Action: <the one thing this step does to the world, e.g. "send the email", "write the row">
1. Reversibility: if this fires wrong, what's the undo?
<cheap-and-local / slow-and-manual / none>
2. Blast radius: who/what is affected if it's wrong?
<just me, a draft / one internal record / a customer, a charge, the public>
Route:
- undo is cheap AND blast radius is small -> automate. No checkpoint.
- undo is none OR blast radius reaches outside -> gate. Human approves before it fires.
- unsure which side -> treat as irreversible. Gate it.
If automated: the model's DECISION is recorded once on a hash of its input, and the
ACTION is idempotent (upsert on a stable id / dedup token), so a retry re-applies the
same decision and never double-acts.
The two lines do the routing, and the last paragraph is the part people drop. An action you decided to automate is not safe just because you decided it. It is safe when a retry of it cannot fire twice. If you cannot say how the decision is recorded and how the side effect dedups, the step is not ready to run unwatched yet, however reversible it looked.
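The routing rule in the template can also be written as a small function, which makes it harder to skip under pressure. This is a sketch under stated assumptions: the enum names are hypothetical, "one internal record" is treated as a small blast radius, and any combination the template does not name (for example a slow, manual undo) falls through to the gate.

```python
from enum import Enum


class Undo(Enum):
    CHEAP = "cheap-and-local"
    SLOW = "slow-and-manual"
    NONE = "none"
    UNSURE = "unsure"


class Radius(Enum):
    SELF = "just me, a draft"
    INTERNAL = "one internal record"
    OUTSIDE = "a customer, a charge, the public"
    UNSURE = "unsure"


def route(undo: Undo, radius: Radius) -> str:
    """Automate only when the undo is cheap AND the blast radius is small."""
    small = radius in (Radius.SELF, Radius.INTERNAL)  # assumption: internal counts as small
    if undo is Undo.CHEAP and small:
        return "automate"
    # Everything else gates: no undo, outside reach, unsure, or an unnamed case.
    return "gate"
```

Model confidence is deliberately not a parameter: the function routes on consequences alone, so a certain model still gates on an irreversible action.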
Worked example
Illustrative. A constructed pipeline to show the gate and the decision/action split, not a real system.
You are automating an outbound step: a trigger fires when a lead matches your profile, a model drafts a tailored email, and the email goes out. The tempting build is one step, trigger to model to send, running unattended. It puts an irreversible, outside-the-building action (a cold email to a stranger) on the automated side of the line, and it fuses the decision and the action so a hiccup after send re-rolls the draft. This is roughly the shape of a documented autonomous-outreach agent that sent 240-plus cold emails and made $0. No human ever saw a message before it left, so a deliverability problem that should have been caught on email one instead compounded across all 240. The send was the most irreversible action in the pipeline and it had the fewest eyes on it.
Reworked against the gate, the same pipeline splits at the consequence line:
You: Trigger fires on a matching lead. The model researches the account and drafts the email, both reversible and both automated. The draft lands in a review queue. The send is irreversible and reaches a stranger, so it gates: I approve the batch before anything leaves. The draft decision is recorded on a hash of the lead id, and the send dedups on lead_id + campaign_id, so re-running a failed batch re-sends nothing that already went.
The split is not "automate less." Research and drafting, the mechanical 80%, run fully unattended; only the irreversible 20% keeps a person. And the part that does fire automatically fires safely, because the decision is pinned and the action is write-once. The cost is a review queue and two small engineering habits; the return is that no irreversible action ever fires unseen, or twice.
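The gated send in the reworked pipeline might look something like the sketch below. Everything here is hypothetical: `sent_keys` stands in for a durable dedup store, `deliver` for whatever actually sends the email, and `approved` for the human decision on the batch. A real system would also need to claim the key atomically with the send, or pass it to the provider as a dedup token, which this in-memory version does not attempt.

```python
sent_keys = set()  # hypothetical dedup store; durable in a real system


def send_batch(drafts, approved, deliver):
    """Send an approved batch, skipping anything that already went out."""
    if not approved:
        return []  # the gate: nothing leaves without a human yes
    delivered = []
    for draft in drafts:
        key = (draft["lead_id"], draft["campaign_id"])
        if key in sent_keys:
            continue  # already sent on an earlier run, do not re-send
        deliver(draft)
        sent_keys.add(key)  # recorded only after the send succeeds
        delivered.append(key)
    return delivered
```

If `deliver` raises partway through, re-running the same approved batch picks up at the first unsent lead and skips every lead whose key is already recorded.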
One last lever sits on the trigger itself, not the action. A weak trigger fires the automated half against the wrong target, and no amount of idempotency rescues a precisely executed irrelevant action. Stacking signals before firing, so the agent waits for three independent reasons to act rather than one, is the cheapest quality gain available: accounts with three or more stacked signals convert at roughly 2.4 times the rate of a single-signal account. Automation is only as good as the trigger that starts it, so spend as much care on what makes the agent act as on what it does once it has.
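A stacked trigger with a freshness check could be sketched as follows. The threshold of three comes from the text; the seven-day `max_age`, the signal dict shape (`kind`, `seen_at`), and the use of distinct signal kinds as a proxy for "independent" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def should_fire(signals, now, min_signals=3, max_age=timedelta(days=7)):
    """Fire only when enough distinct, still-fresh signal kinds have stacked up."""
    fresh_kinds = {
        s["kind"] for s in signals
        if now - s["seen_at"] <= max_age  # stale signals do not count
    }
    return len(fresh_kinds) >= min_signals
```

Two visits to the same pricing page count as one reason, and a signal older than the window counts as none, so the agent waits until three current, different reasons are on record before it acts.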