Concurrency Discipline: Single-Writer and Reconciliation

After this you can run several agents at once without them corrupting shared state. You give every shared file exactly one writer, make everyone else read it or queue a change for later, and reconcile those queued changes at a single deliberate point, so parallel work can finish correctly instead of quietly losing data.

Understand

The first time you run agents in parallel, the appeal is obvious and the hazard is invisible. You hand three sub-agents three sections of the same report and ask each to write its part into report.md. They run at the same time. Each one reads the file, adds its section, and saves. The trouble is that "reads the file, adds its section, saves" is three steps, and while agent A is between reading and saving, agent B reads the same starting version. Each one builds on the starting version it read, so whichever saves last leaves behind only its own section and the earlier two are overwritten. No error fires. The file looks complete. Two thirds of the work is gone, and you find out only if you happen to notice.

This is a lost update. Two non-atomic read-modify-write sequences interleave, the later save lands on top of the earlier one, and the outcome is last-writer-wins: only the final save survives. It is a standard hazard in any system with shared mutable state, and parallel agents reproduce it easily because the request that triggers it looks completely ordinary.
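
The interleaving described above can be reproduced deterministically. The following is a minimal sketch, not a real agent run: the function names are hypothetical, and the "agents" are plain function calls ordered so that every read happens before any save.

```python
import tempfile
from pathlib import Path


def read_step(path: Path) -> str:
    """Step 1 of read-modify-write: take a snapshot of the file."""
    return path.read_text(encoding="utf-8")


def save_step(path: Path, snapshot: str, section: str) -> None:
    """Steps 2 and 3: add a section to the (possibly stale) snapshot and save."""
    path.write_text(snapshot + section + "\n", encoding="utf-8")


def simulate_lost_update(path: Path, sections: list[str]) -> str:
    path.write_text("", encoding="utf-8")
    # Every agent reads the same starting version before anyone saves.
    snapshots = [read_step(path) for _ in sections]
    # Each save overwrites the file with "my snapshot + my section".
    for snapshot, section in zip(snapshots, sections):
        save_step(path, snapshot, section)
    return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        final = simulate_lost_update(
            Path(tmp) / "report.md", ["## Part A", "## Part B", "## Part C"]
        )
        print(repr(final))  # only the last section survives
```

No call raises, and the final file is well-formed, which is why the loss goes unnoticed unless something checks the content.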

The discipline that prevents it is almost aggressively simple. For any file that more than one agent might touch, exactly one of them is allowed to write it. Everyone else either reads it, or hands their proposed change to the one writer and lets it do the writing. The workspace these lessons come from states the rule as single-writer-per-file, and applies it most visibly to its shared planning documents: the main session is the only thing that writes them, and any sub-agents it spawns may read those documents for context but are forbidden from editing them. The sub-agents' contributions get folded in by the single writer at one defined moment, not written directly by each agent the instant it finishes. The stated reason is exactly the failure above, that concurrent writers to one canonical file produce torn snapshots and last-writer-wins data loss, so the system removes the possibility rather than hoping to time around it.
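
One way to picture the rule is a main session that owns the planning document while sub-agents only read it and hand back proposals. This is a sketch under assumptions: `main_session`, `sub_agent`, and the in-memory queue are illustrative stand-ins, not the workspace's actual mechanism.

```python
import queue
import tempfile
import threading
from pathlib import Path


def main_session(plan_path: Path, tasks: dict[str, str]) -> str:
    """Spawn one sub-agent per task, then fold their proposals into the plan."""
    proposals = queue.Queue()

    def sub_agent(name: str, task: str) -> None:
        # Sub-agents may READ the shared plan for context...
        _context = plan_path.read_text(encoding="utf-8")
        # ...but they hand their change to the writer instead of saving it.
        proposals.put((name, task))

    workers = [threading.Thread(target=sub_agent, args=item) for item in tasks.items()]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # The one defined moment: every sub-agent has finished, so drain the queue.
    collected = []
    while True:
        try:
            collected.append(proposals.get_nowait())
        except queue.Empty:
            break

    # Only this function ever opens the plan for writing.
    with plan_path.open("a", encoding="utf-8") as plan:
        for name, text in sorted(collected):
            plan.write(f"- [{name}] {text}\n")
    return plan_path.read_text(encoding="utf-8")
```

Note that nothing here stops a sub-agent from calling `write_text` on the plan; the sketch encodes the convention, it does not enforce it.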

Many writers versus one: why concurrent writers lose data and how a single writer plus queued changes removes the race entirely.

Reading is the easy half. The hard half is what happens when the shared thing genuinely does need contributions from many concurrent sessions, not just one writer plus some readers. You cannot always funnel everything through a single live writer, because sometimes the writers are running in parallel by design and none of them is "the" owner. The answer is to stop writing the shared thing during the parallel window and write proposals instead. Each session appends its intended changes to its own private file, one nobody else touches, so there is no contention because there is no shared target yet. Then, after the parallel work finishes, one serial pass reads all those proposal files and reconciles them into the real shared document. The workspace does this for its shared index files: when a burst of parallel sessions is running, a commit-time check notices the pattern and defers updates to the shared indexes, each session queues its index edits into its own deltas file, and a later single merge session folds them all in. Concurrency during the work, serialization at the merge. The shared state is still only ever written by one thing at a time; that one thing is simply a later reconciler rather than a live owner.
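
A rough sketch of the queue-then-merge shape, assuming a JSON index and one JSON-lines deltas file per session. The file naming (`<session>.deltas.jsonl`), the delta format, and the conflict policy (later files in sorted order win) are all assumptions made for illustration; the workspace's real formats are not described here.

```python
import json
import tempfile
from pathlib import Path


def queue_delta(deltas_dir: Path, session_id: str, key: str, value: str) -> None:
    """During the parallel window: append to this session's private file only."""
    deltas_dir.mkdir(parents=True, exist_ok=True)
    private = deltas_dir / f"{session_id}.deltas.jsonl"
    with private.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"key": key, "value": value}) + "\n")


def merge_deltas(deltas_dir: Path, index_path: Path) -> dict:
    """After the window: one serial pass folds every queued delta into the index."""
    index = {}
    if index_path.exists():
        index = json.loads(index_path.read_text(encoding="utf-8"))

    delta_files = sorted(deltas_dir.glob("*.deltas.jsonl"))
    for private in delta_files:
        for line in private.read_text(encoding="utf-8").splitlines():
            if line.strip():
                delta = json.loads(line)
                index[delta["key"]] = delta["value"]

    # Write the shared index first, and clear the queue only once that succeeded.
    index_path.write_text(json.dumps(index, indent=2, sort_keys=True), encoding="utf-8")
    for private in delta_files:
        private.unlink()
    return index
```

The shared index is written in exactly one place, `merge_deltas`, and only when it is run as a single serial step.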

Reconcile after, don't write during: the windowed-deferral pattern, where parallel sessions queue private deltas and a single serial pass merges them.

What makes this worth internalizing rather than memorizing is that it is one rule wearing different costumes. Whether the shared resource is a planning doc with one owner, a message channel where each message is written once by its sender and never edited by the reader, or an index reconciled from queued deltas, the invariant underneath is identical. One writer per resource at any moment, everyone else reads or queues. Once you see that, you stop treating each new shared-state problem as novel and start asking the same first question every time: who owns this file, and how does everyone else's input reach it without two of them writing at once.

One invariant, several instances: the same single-writer rule applied across three different shared surfaces.

Where it breaks

This whole approach has a property worth being honest about: in the system described here, there is no real concurrency control underneath it. No locks, no leases, no compare-and-swap, no transactions. The safety rests entirely on human discipline plus a hook that recognizes a filename pattern. That works because the actual concurrency is low and human-paced, sessions a person starts and stops, not a thousand automated writers hammering a file. Drop genuinely simultaneous automated writers onto the same design and it would not hold, because nothing is actually enforcing mutual exclusion. Know that you are buying safety with convention, and that convention scales only as far as the discipline behind it.

The convention is also brittle in a specific way. Because the deferral triggers on a filename pattern, a mistyped name or a missing date means the check does not recognize the parallel window, deferral never switches on, and you are silently back to contention with no warning that your safety net failed to deploy. Pattern-matching is a fragile way to gate something important, and the failure is quiet.
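
To make the quiet failure concrete, here is a small sketch of a pattern-gated check. The naming convention in the regular expression is invented for this example; the actual pattern the hook recognizes is not given in this lesson.

```python
import re

# Hypothetical convention: session-YYYY-MM-DD-<label>.md marks a parallel session.
PARALLEL_SESSION = re.compile(r"session-\d{4}-\d{2}-\d{2}-[a-z0-9]+\.md")


def should_defer(filename: str) -> bool:
    """Pattern gate: deferral switches on only if the name matches exactly."""
    return PARALLEL_SESSION.fullmatch(filename) is not None


def should_defer_strict(filename: str, parallel_window_open: bool) -> bool:
    """One possible hardening: fail loudly when a window is known to be open."""
    matched = should_defer(filename)
    if parallel_window_open and not matched:
        raise ValueError(f"{filename!r} does not match the session pattern")
    return matched


if __name__ == "__main__":
    print(should_defer("session-2024-05-01-a.md"))  # recognized
    print(should_defer("sesion-2024-05-01-a.md"))   # typo: silently not deferred
    print(should_defer("session-a.md"))             # missing date: silently not deferred
```

The plain gate returns `False` for both malformed names without any signal, which is the silent fallback to contention described above. The strict variant only helps if something other than the filename knows a parallel window is open.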

Reconciliation introduces its own single point of failure at the other end. The merge step is serial and usually manual, which means it is a bottleneck and it is skippable. If that merge is delayed, botched, or forgotten, every queued change just never lands, and because each session "succeeded" at writing its own deltas file, nothing looks wrong until someone notices the shared document is missing a week of updates.
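
A skipped merge can at least be made visible. The sketch below lists deltas files that have sat unmerged longer than a chosen threshold; the `*.deltas.jsonl` naming and the threshold are assumptions for illustration, not part of the system described.

```python
import time
from pathlib import Path
from typing import Optional


def overdue_deltas(
    deltas_dir: Path, max_age_seconds: float, now: Optional[float] = None
) -> list[str]:
    """Return names of non-empty deltas files older than the allowed age."""
    now = time.time() if now is None else now
    overdue = []
    for private in sorted(deltas_dir.glob("*.deltas.jsonl")):
        stat = private.stat()
        # A non-empty file that nobody has touched for a while means queued
        # changes are still waiting for a merge that has not happened.
        if stat.st_size > 0 and now - stat.st_mtime > max_age_seconds:
            overdue.append(private.name)
    return overdue
```

Run on a schedule or at session start, a check like this turns "nothing looks wrong" into a list of specific files still waiting on the merge.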

And the in-between states will catch a careless reader. A derived or shared file sitting between sync points may not reflect the latest work, and an agent that reads it as current truth acts on stale state. The design tolerates these staleness windows on purpose, but tolerating them is not the same as the reader knowing they exist.

Finally, weigh whether you need any of this at all. Coordination machinery built for heavy concurrency is pure overhead on work that never actually runs more than one writer at a time, and provisioning it ahead of real need is a cost, not a precaution.

Do it now

Before you parallelize anything that writes, run the shared resources past this. The goal is to leave the parallel window with zero files that two agents could write at once.

Paste this
SINGLE-WRITER CHECK — for every file/record the parallel work touches:
1. Name the ONE writer.                  Who is allowed to write this? (exactly one)
2. Everyone else: read or queue.          Readers read. Contributors write a PRIVATE file, never the shared one.
3. Name the reconcile point.              Where/when does one serial pass merge the queued changes in?
4. Name the stale window.                 Between now and that merge, who might read this and be wrong?
If any shared file has two potential writers and no reconcile point, do NOT run it in parallel yet.
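
If the plan for a parallel run is written down as data, the last line of the check can be applied mechanically. This is a sketch only; `SharedResource` and its fields are hypothetical names that mirror the four questions above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedResource:
    path: str
    writers: list[str]                      # 1. who may write this
    reconcile_point: Optional[str] = None   # 3. where queued changes get merged
    stale_readers: list[str] = field(default_factory=list)  # 4. who may read stale


def blocking_problems(resources: list[SharedResource]) -> list[str]:
    """Return reasons not to start the parallel run yet (empty means go)."""
    problems = []
    for res in resources:
        if not res.writers:
            problems.append(f"{res.path}: no writer named")
        elif len(res.writers) > 1 and not res.reconcile_point:
            problems.append(
                f"{res.path}: {len(res.writers)} potential writers and no reconcile point"
            )
    return problems


if __name__ == "__main__":
    plan = [
        SharedResource("plan.md", writers=["main"]),
        SharedResource("index.md", writers=["s1", "s2"], reconcile_point="merge session"),
        SharedResource("report.md", writers=["a", "b", "c"]),
    ]
    for problem in blocking_problems(plan):
        print(problem)
```

The check only reports what the plan declares. It cannot tell whether the agents will actually stay within it.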

The concrete pattern that satisfies it: instead of N agents writing shared.md, have agent k write part-k.md (its own file, no contention), then one final step concatenates or merges the parts into shared.md. You have converted a write race into a read-then-merge, which has no race because only the merge step writes the shared file.
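
The part-k pattern can be sketched in a few lines. Threads stand in for agents here, and the helper names (`write_part`, `merge_parts`, `run_parallel`) are illustrative assumptions. The point is only that each worker touches its own file and a single later step writes shared.md.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_part(workdir: Path, k: int, text: str) -> Path:
    """Agent k writes only part-k.md, a file no other agent touches."""
    part = workdir / f"part-{k}.md"
    part.write_text(text.rstrip("\n") + "\n", encoding="utf-8")
    return part


def merge_parts(workdir: Path, shared_name: str = "shared.md") -> str:
    """The single writer of the shared file: read every part, write once."""
    parts = sorted(
        workdir.glob("part-*.md"), key=lambda p: int(p.stem.split("-")[1])
    )
    merged = "\n".join(p.read_text(encoding="utf-8") for p in parts)
    (workdir / shared_name).write_text(merged, encoding="utf-8")
    return merged


def run_parallel(workdir: Path, sections: dict[int, str]) -> str:
    # The parallel window: workers run concurrently but never share a target.
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda kv: write_part(workdir, *kv), sections.items()))
    # The pool has shut down, so all parts exist before the serial merge runs.
    return merge_parts(workdir)
```

The order of sections in shared.md comes from the part numbers, not from which worker finished first, so timing no longer affects the result.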

Worked example

Illustrative. A constructed pair to show the hazard and the fix, not a real run.

You ask three agents to draft three sections of a design doc in parallel, each writing into design.md:

Agent A: read design.md (empty), wrote "## Goals", saved.
Agent B: read design.md (empty, before A's save landed), wrote "## Approach", saved.
Agent C: read design.md (empty), wrote "## Risks", saved.

Result: design.md contains only "## Risks". Goals and Approach were overwritten. No error was raised.

Each agent did exactly what it was told and the document still lost two thirds of itself, because all three started from the empty version and the last save won. Now the single-writer version:

Agent A: wrote goals.md
Agent B: wrote approach.md
Agent C: wrote risks.md
Merge step: read all three, wrote design.md = goals + approach + risks.

Result: design.md has all three sections. Nothing raced, because only the merge step ever wrote the shared file.

The same three agents, two designs: the one structural change that turns a lossy write race into a safe merge.

Splitting the work so that only the merge step writes the shared file leaves nothing for two agents to race over. The protection lives in the layout rather than in the agents being careful or the timing happening to work out, and the agents themselves are no different between the two runs. When you parallelize anything that writes, design the shared target so a collision cannot occur, instead of trusting that it will not.