braid recover
Resumes from an interrupted operation (add, remove, replace) by opening LUKS devices, mounting the pool, rebuilding pool.json from live pool state when appropriate, running owed maintenance when the btrfs balance state is idle, and clearing the pending-operation journal only after the safe recovery path completes.
When to use it
- After a crash, power failure, or interrupted braid command.
- When `braid status` or other commands show “pending operation – run `braid recover`”.
- Only available when `pending-op.json` exists.
Basic example
sudo braid recover
You’ll be prompted for the pool passphrase. Output shows the recovery process:
Recovering from interrupted "add" operation (started 2026-03-15T14:30:00Z)...
pool.json written from completed add membership.
pool.json written from committed add membership.
pending-op.json cleared. Recovery complete.
Before the pool.json lines, a real run prints either per-disk LUKS-open and mount rows (if the pool was offline) or a single “pool already mounted at ...” row (if it was already mounted). When RAID1 balance is owed and the btrfs balance state is idle (no paused balance), it prints a RAID1 soft-balance replay row pair after the committed line and before the final “pending-op.json cleared” line. If the balance state is paused, running, or unknown, recover fails before the replay row and does not clear the journal.
Important
If recover refuses owed RAID1 replay because btrfs balance state is paused, running, or unknown, it leaves `pending-op.json` in place. Inspect btrfs manually before clearing recovery state.
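One way to do that manual inspection is to read the output of `btrfs balance status` and map it onto the same four states. The following is a minimal sketch, not braid's own detection logic; the function name and the `MOUNT_POINT` placeholder are assumptions made for illustration.

```shell
# Classify the text printed by `btrfs balance status <mountpoint>`.
# Hypothetical helper: illustrates the idle / running / paused / unknown
# states named above, and is not how braid itself checks the balance.
classify_balance_state() {
  case "$1" in
    "No balance found on "*)      echo idle ;;
    "Balance on "*" is running"*) echo running ;;
    "Balance on "*" is paused"*)  echo paused ;;
    *)                            echo unknown ;;
  esac
}

# Canned example, so the sketch runs without btrfs installed:
classify_balance_state "No balance found on '/mnt/pool'"

# Against a real pool (MOUNT_POINT is a placeholder you must set):
#   classify_balance_state "$(sudo btrfs balance status "$MOUNT_POINT" 2>&1)"
```

Anything other than `idle` means recover will keep refusing the owed RAID1 replay, so resolve the balance first and then rerun `braid recover`.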
Common variations
Non-interactive (passphrase from stdin):
echo -n 'my-passphrase' | sudo braid recover --passphrase-stdin
Passphrase from a file:
sudo braid recover --passphrase-file /root/passphrase.txt
Recover with a missing disk (degraded mode):
sudo braid recover --allow-degraded
Preview what would happen:
sudo braid recover --dry-run
Flags
| Flag | Effect |
|---|---|
| `--passphrase-stdin` | Read passphrase from stdin instead of TTY prompt |
| `--passphrase-file <path>` | Read passphrase from a file instead of TTY prompt (conflicts with `--passphrase-stdin`) |
| `--allow-degraded` | Allow mounting with missing devices (redundancy is reduced until you replace the missing device) |
| `--dry-run` | Show what would be done without making changes |
| `--progress auto\|always\|never` | Control progress display (default: auto) |
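Because `--passphrase-stdin` and `--passphrase-file` conflict, a wrapper script that assembles flags from several sources may want to catch the clash before invoking braid. This is a minimal sketch under that assumption; `check_passphrase_flags` is a hypothetical helper, not part of braid.

```shell
# Hypothetical preflight for wrapper scripts: fail early when both
# passphrase sources are requested, since the flags conflict.
check_passphrase_flags() {
  seen_stdin=0
  seen_file=0
  for arg in "$@"; do
    case "$arg" in
      --passphrase-stdin) seen_stdin=1 ;;
      --passphrase-file)  seen_file=1 ;;
    esac
  done
  if [ "$seen_stdin" -eq 1 ] && [ "$seen_file" -eq 1 ]; then
    echo "error: choose one of --passphrase-stdin or --passphrase-file" >&2
    return 1
  fi
  return 0
}

# Example: a valid combination passes the check.
check_passphrase_flags --passphrase-file /root/passphrase.txt --dry-run \
  && echo "flags ok"
```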
What happens under the hood
- Loads `pending-op.json` (refuses if absent – nothing to recover).
- Chooses the mount membership from the journal phase. Existing-pool add and remove-missing `PoolMutation` phases mount from the pre-operation membership. Add, remove-missing, and replace post-maintenance phases mount from the committed target membership. Replace `PoolMutation`, bootstrap add `PoolMutation` (the first disk, whose pre-operation membership is empty), and `Remove` mount from the admission membership (pre-operation snapshot plus target-only members) – for replace this matters because the kernel may still be completing `dev_replace`.
- Opens LUKS devices and mounts the pool (or reuses the existing mount if already mounted). Exception: a `Replace::PoolMutation` journal on an externally-mounted pool is refused (see Safety checks); replace post-maintenance recovery on an already-mounted pool is allowed.
- For `Replace::PoolMutation` only, if a kernel-resumed btrfs replace is in progress, waits for it to finish.
- For `Replace::PoolMutation` only, if the pool was just mounted by this recover run, performs a full relock-and-remount cycle (`umount`, `btrfs device scan --forget`, close LUKS, reopen, remount) to ensure the kernel’s in-memory device topology matches the on-disk state.
- Probes the live pool to discover actual membership.
- For interrupted existing-pool add `PoolMutation`, first runs a non-destructive Add target reconciliation pass: any journaled add target whose underlying disk is physically present and LUKS-openable is opened, scanned, and followed by a live-pool re-probe. Targets that turn out to be live pool members are adopted into the recovered `pool.json` without `wipefs` or `btrfs device add`.
- For add `PoolMutation`, replays only journaled targets that are not already live. `RecoverableBraidLabeled` targets are replayed via `wipefs --all --types btrfs` plus `btrfs device add -f` after LUKS UUID and visible-FSID checks. `FreshLuks` targets that are physically present are replayed from the journaled format options, skipping format if the disk already has the expected LUKS label; if the journal carried `enroll_key_file`, the keyfile is re-enrolled, then the LUKS header is backed up, the mapper is opened, and `btrfs device add` runs without `-f`. `FreshLuks` targets that are physically absent or carry an unexpected LUKS label make recover fail and leave `pending-op.json` in place so the disk can be reattached or replaced and recovery rerun. See Pending LUKS header backups – copy each `.luksheader` off-system and delete the local copy.
- For add `PostAddBalanceRaid1`, does not format, enroll, back up headers as target prep, wipe, or add disks. It only validates the committed live pool and runs the owed RAID1 balance when btrfs balance state is idle; a paused, running, or unknown balance state fails closed with the journal preserved.
- For replace and remove-missing `PoolMutation`, detects whether the primary btrfs membership mutation committed. If it did not commit, recover restores/keeps the pre-operation `pool.json`, clears the journal, and tells you to rerun the original command. It does not rerun `btrfs replace start` or `btrfs device remove`.
- For replace and remove-missing post-maintenance phases, validates committed live membership, repairs `pool.json` if needed, and finishes only owed maintenance such as resize or, when btrfs balance state is idle, soft RAID1 balance; it does not rerun the primary btrfs membership mutation. A paused, running, or unknown balance state before owed RAID1 replay fails closed with `pending-op.json` preserved.
- Resolves `/dev/disk/by-id/` paths from live LUKS UUIDs, using btrfs devid only for missing or null-underlying bindings (not from the journal’s by-id path, which may be stale).
- Writes or repairs `pool.json` only after the journal phase allows it and live membership is complete.
- Clears `pending-op.json` only after membership is complete and any owed balance work is done.
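The mount-membership choice described in the steps above is effectively a lookup from operation and journal phase to one of three memberships. The sketch below restates that table as a shell function; the `operation`, `phase`, and `bootstrap` argument spellings are hypothetical labels chosen for illustration and are not braid's journal schema.

```shell
# mount_membership_for OPERATION PHASE [BOOTSTRAP]
# Prints which membership recover would mount from, per the rules above.
# Argument spellings are illustrative labels, not braid's journal format.
mount_membership_for() {
  op="$1"
  phase="$2"
  bootstrap="${3:-no}"
  case "$op:$phase" in
    add:PoolMutation)
      # Bootstrap add (first disk) has an empty pre-operation membership.
      if [ "$bootstrap" = "yes" ]; then
        echo admission
      else
        echo pre-operation
      fi ;;
    remove-missing:PoolMutation)
      echo pre-operation ;;
    replace:PoolMutation|remove:*)
      echo admission ;;
    add:post-maintenance|remove-missing:post-maintenance|replace:post-maintenance)
      echo committed-target ;;
    *)
      echo "unrecognized journal phase: $op:$phase" >&2
      return 1 ;;
  esac
}

# Example: an interrupted replace mounts from the admission membership.
mount_membership_for replace PoolMutation
```

Keeping the unknown case as a hard failure mirrors the fail-closed behaviour the steps describe: when the phase cannot be identified, nothing is mounted or rewritten.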
Safety checks
- Refuses if no `pending-op.json` exists.
- Refuses if another braid operation is in progress (pool lock `/run/braid-pool.lock` is held) – retry once it finishes.
- Refuses to adopt live pool members outside the recovery admission membership for the current journal phase (guards against devices added outside braid). Most phases admit the pre-operation snapshot plus target-only members; `Replace::PostReplaceMaintenance` admits only the committed target membership because btrfs preserves the old device’s devid on the replacement after commit.
- Hard-fails if a live pool device has no `/dev/disk/by-id/` symlink (recovery can’t guess a stable identifier).
- Detects interrupted bootstrap add (first disk, no filesystem yet) and gives specific wipe-and-retry instructions instead of a confusing mount error.
- Refuses to overwrite `pool.json` or clear `pending-op.json` if the post-mount probe at the configured mount point sees the pool unmounted or with zero btrfs devices. The mount may have been removed externally between recover’s mount step and membership probe; `pool.json` and `pending-op.json` are both preserved – investigate the mount, then re-run `braid recover`.
- For existing-pool add recovery, refuses to clear the journal while any journaled add target is missing from the live pool.
- Returned-disk replay may need a pool passphrase even when the pool is already mounted, because the mapper for the journaled target may still be closed.
- Without `--allow-degraded`, refuses to mount if devices are missing (exit code 2 for degraded-refused, distinguishing it from other errors).
- Refuses to recover `Replace::PoolMutation` when the pool is already mounted (admin-mounted, circumventing braid’s pending-op preflight on `unlock`). The kernel may have resumed an interrupted `dev_replace` on that mount session, leaving stale in-memory device state that recover cannot scrub without unmounting – which it will not do on a mount it does not own. Remediation: `sudo braid lock; sudo braid recover`.
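Since a degraded-refused mount has its own exit code (2), an unattended script can tell it apart from other failures. The following is a minimal sketch; it assumes exit code 0 means success (the usual convention, not stated above), and `describe_recover_exit` is a hypothetical helper.

```shell
# describe_recover_exit CODE
# Maps a `braid recover` exit code to a next step. Only code 2 is
# documented above; treating 0 as success is an assumption.
describe_recover_exit() {
  case "$1" in
    0) echo "recovery complete" ;;
    2) echo "degraded mount refused: reattach the missing disk, or rerun with --allow-degraded" ;;
    *) echo "recover failed (exit $1): read the output before retrying" ;;
  esac
}

# Example with a canned exit code, so the sketch runs without braid:
describe_recover_exit 2

# Against a real pool:
#   sudo braid recover --passphrase-file /root/passphrase.txt
#   describe_recover_exit "$?"
```

The sketch deliberately does not retry with `--allow-degraded` on its own, because accepting reduced redundancy is a decision for the operator.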
Related commands
- status – shows pending operation state and prompts you to recover
- discover – rebuild UUID-keyed pool.json from LUKS labels and UUIDs (when there’s no journal)
- unlock – normal unlock (when no journal exists)