Decision: Systemd Lifecycle State Machine

Principle: Resilient by default

Context

braid needs systemd integration for three things: interactive unlock, unattended unlock, and clean shutdown (LUKS close before power-off). The module must not generate data-pool fileSystems or boot.initrd.luks.devices entries — those create hard boot dependencies on the data pool (see 003-resilient-boot.md). Instead, the CLI owns LUKS open/close and btrfs mount/unmount at runtime, and a thin systemd layer provides the entry points and shutdown hook.

Units

                          ┌───────────────────────┐
                          │  braid-pool.target    │  entry point
                          │  wants + after        │
                          └──────────┬────────────┘
                                     │ (soft dep)
                          ┌──────────▼────────────┐
                          │  braid-unlock.service │  interactive passphrase
                          │  oneshot              │
                          └──────────┬────────────┘
                                     │ (CLI marks online on success)
                          ┌──────────▼────────────┐
                          │  braid-online.service │  lifecycle owner
                          │  ExecStart=/bin/true  │
                          │  ExecStop=braid lock  │  --systemd-stop
                          │  oneshot, RAE         │
                          └───────────────────────┘

  braid-auto-unlock.service          (alternative unlock path, boot-time)
  wantedBy multi-user.target          activates braid-online via same CLI path

  mnt-storage.mount                   (auto-generated by systemd from /proc/mounts)

  braid-monitor.timer -> braid-monitor.service -> braid-alert.service -> braid-beep.service
  ConditionPathIsMountPoint            (health polling, skipped when pool not mounted)
                                \----> braid-alert-advisory.service

  braid-scrub.timer -> braid-scrub.service
  braid-online.service -> braid-scrub-resume-trigger.service -> braid-scrub.service
  BindsTo + After braid-online.service    (lifecycle-bound periodic scrub)
  Persistent=true                          (catch-up on activation)

RAE = RemainAfterExit = true

braid-pool.target — entry point

Public handle for “bring pool online.” User runs systemctl start braid-pool.target.

wants (not requires) braid-unlock.service — soft dependency. Unlock failure does not fail the target, and the target cannot block boot because nothing requires it.
after braid-unlock.service — ordering only.
Does not want or require braid-online.service. The CLI activates that separately after confirming the mount succeeded.

braid-unlock.service — interactive passphrase unlock

Single orchestrator: opens all LUKS devices and mounts the btrfs pool in one shot. Guarantees exactly one passphrase prompt (avoids relying on systemd-ask-password cache behavior across multiple LUKS units).

Type = oneshot — runs once, returns to inactive on completion. ConditionPathIsMountPoint (below) prevents re-run while mounted; the inactive state allows systemctl start braid-pool.target to re-unlock after a prior braid lock.
ConditionPathIsMountPoint = !${mountPoint} — skips if pool already mounted.
Calls systemd-ask-password --timeout=0 --id=braid | braid unlock --passphrase-stdin.

braid-auto-unlock.service — unattended USB keyfile unlock

Optional (only created when braid.autoUnlock.enable = true). Runs at boot, unlocks from a USB keyfile without interactive prompt.

wantedBy = [ "multi-user.target" ] — starts automatically at boot.
after = [ "local-fs.target" ] — waits for /run to exist.
ConditionPathIsMountPoint = !${mountPoint} — skips if pool already mounted.
No RemainAfterExit — intentional. If USB is absent at boot (service exits 0 on skip), a later systemctl start braid-auto-unlock can re-run when the USB is inserted.
Mounts USB read-only, validates keyfile path (symlink defense), runs braid unlock --key-file, always unmounts USB after (never leaves keyfile accessible).
Always exits 0 — failures are logged to the journal but never reported as unit failure, because auto-unlock must not block boot under any circumstance.

braid-online.service — lifecycle owner

State-ownership service. Its only purpose is to mark “pool is online” and run the bounded braid lock stop path on stop.

ExecStart = /bin/true — no work. Exists for its ExecStop hook.
ExecStop = braid lock --systemd-stop --deadline-secs <n> – unmounts the pool and closes all LUKS devices on shutdown or manual stop, waiting a bounded time (below TimeoutStopSec) for the stop coordinator and the pool lock. In this mode, braid permits a running or paused btrfs balance: a running balance is explicitly paused before unmount, an already-paused balance proceeds to unmount, and every other exclusive operation is refused. Because the blocking btrfs balance userspace process can briefly hold the mount fd after its parent dies, the systemd-stop path retries a transiently busy umount for longer than plain braid lock does.
RemainAfterExit = true — persists “active” state.
ConditionPathIsMountPoint = ${mountPoint} – systemd skips activation when the pool is not mounted (systemctl start returns 0 but the unit stays inactive). Defense-in-depth: the CLI’s mountpoint -q check is the primary gate, but this condition prevents direct systemctl start from leaving the unit active while unmounted.
TimeoutStopSec = 300s – raises the stop timeout from the 90s default so a slow braid lock is not SIGKILL’d mid-operation.
Not in any dependency chain. Neither the target nor unlock services want/require it. Activated exclusively by the CLI after mountpoint -q confirms the pool is mounted.

mnt-storage.mount — readiness contract

Auto-generated by systemd from /proc/mounts when the btrfs pool is mounted. Consumer services bind to this unit.

braid-monitor.timer + braid-monitor.service — health polling

Periodic oneshot (default: every 5 minutes). It detects disk-health errors and, on a best-effort basis, proactive RAID1 chunk-pair ENOSPC risk: it checks btrfs device stats for errors and btrfs device usage for capacity risk.

This is the canonical exit-code table for braid monitor. ADR 014 owns the severity→beep semantics (which causes are Warning vs Critical) and defers the numbers to here:

Exit	Meaning	Wrapper action
0	Healthy, pool-offline, or pool-lock-contended cycle	nothing
1	Critical alert active (btrfs device errors, missing device, SMART, latched `ScrubFailed`, or fail-closed `ComputationError`)	start `braid-alert.service`, which pulls in `braid-beep.service` when beeping is enabled
3	Warning-only alert active (ENOSPC risk)	start `braid-alert-advisory.service` (bounded `alertCommand` only, no beep)
2	pre-`cmd_monitor` setup failure (pool-lock I/O, config load); never emitted by `cmd_monitor` itself	log `braid monitor failed` to the journal

ConditionPathIsMountPoint — skipped cleanly when pool is not mounted (no dependency-failure noise from timer). No After or BindsTo on mnt-storage.mount — those directives force systemd to load the unit, which doesn’t exist before the first unlock.
The audible beep is reserved for exit 1 / Critical. The wrapper routes exit 3 to the non-beeping advisory unit before the ≥ 2 failure branch, so a Warning is never misread as a monitor failure and never trains the operator to mute the channel ADR 014 built for a dying disk. A Warning-only ENOSPC advisory is not one-shot: it re-fires on later monitor cycles until acked, and again after each reminder interval elapses — that cadence is owned by ADR 014.
braid monitor fails closed: probe/parse/stats/mountinfo failures, acked-stats.json baseline read/parse failures, and alert-latch read/quarantine failures latch AlertCause::ComputationError and exit 1, so the wrapper above starts the beeper. See ADR 014 fail-closed contract for the cause taxonomy.
Scoped fail-open exception. The best-effort btrfs device usage ENOSPC probe is the single documented exception to the fail-closed mandate: a probe/parse/marker-load failure there skips only the EnospcRisk cause and never latches ComputationError, so device-error / missing-device alerting in the same cycle is untouched. ADR 014’s pure-detector section names this carve-out; the probe mechanism lives here.
The gate and the fail-closed path are independent mount checks, so the gate cannot mask a real alert. ConditionPathIsMountPoint resolves through statx(STATX_ATTR_MOUNT_ROOT) (then name_to_handle_at(2), then /proc/self/fdinfo) – a kernel VFS query, never a parse of /proc/self/mountinfo text. The fail-closed path above instead parses that text and latches ComputationError on a malformed line, duplicate target, or read error. On a genuinely-mounted pool statx reports a mount root regardless of any text anomaly, so the service runs and the beep fires – the protective beep is never gated away. The gate only short-circuits a statx-confirmed-offline pool; the sole beep it suppresses is braid’s conservative ComputationError on an offline pool with anomalous mountinfo text, which is not a disk-health alert.
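
The exit-code routing in the table above can be sketched as a small pure function. This is a minimal illustration, not the wrapper's actual code: the names WrapperAction, wrapper_action, and unit_to_start are hypothetical, and treating every code other than 0, 1, and 3 as the failure branch is an assumption that generalizes the ≥ 2 rule.

```rust
/// Action the monitor wrapper takes for a given `braid monitor` exit code.
/// All names here are hypothetical; only the exit codes and unit names
/// come from the table above.
#[derive(Debug, PartialEq)]
enum WrapperAction {
    Nothing,
    StartAlert,
    StartAdvisory,
    LogFailure,
}

fn wrapper_action(exit_code: i32) -> WrapperAction {
    match exit_code {
        // Healthy, pool-offline, or pool-lock-contended cycle.
        0 => WrapperAction::Nothing,
        // Critical: the only path that can reach the beeper.
        1 => WrapperAction::StartAlert,
        // Warning-only: matched before the generic failure arm so it is
        // never misread as a monitor failure.
        3 => WrapperAction::StartAdvisory,
        // Exit 2 and, as an assumption of this sketch, any other code.
        _ => WrapperAction::LogFailure,
    }
}

/// Unit the wrapper would start for an action, if any.
fn unit_to_start(action: &WrapperAction) -> Option<&'static str> {
    match action {
        WrapperAction::StartAlert => Some("braid-alert.service"),
        WrapperAction::StartAdvisory => Some("braid-alert-advisory.service"),
        WrapperAction::Nothing | WrapperAction::LogFailure => None,
    }
}
```

Matching exit 3 ahead of the catch-all arm is the point of the sketch: the order of the arms, not the numeric order of the codes, keeps a Warning away from the failure path.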

braid-scrub.timer + scrub service + resume trigger – lifecycle-bound scrub

Periodic scrub (default: monthly). Uses a timer-lifecycle pattern distinct from the monitor’s ConditionPathIsMountPoint-only approach.

Timer is wantedBy, BindsTo, and After braid-online.service. Starts when pool comes online, stops when pool goes offline.
Persistent=true + AccuracySec=1d. When the timer activates (pool unlock), systemd compares the last-trigger stamp against OnCalendar. If a scrub was overdue during the offline period, it fires immediately.
braid-scrub.service is the only foreground scrub runner. It is Type=simple; its internal braid scrub-resume-or-start --mount <mount> ExecStart resumes saved scrub progress first, then starts a fresh scrub only when btrfs reports nothing resumable.
braid-scrub.service uses a shared ExecStop cancel script – same pattern as the nixpkgs btrfs scrub service. This cancels in-flight scrub on lock or shutdown through btrfs scrub cancel, leaving btrfs-progs’ /var/lib/btrfs/scrub.status.<fsid> progress file available for the next resume.
Failure alerting (onFailure + clean-teardown contract). braid-scrub.service declares onFailure = [ braid-scrub-failed.service ] (gated on monitor.enable, so there is no dangling unit reference when the monitor does not exist) and SuccessExitStatus = [ 3 ]. A genuinely failed scrub fails the unit and fires onFailure, which writes the scrub-failed flag and starts the beeper (see ADR 014). Two carve-outs make this safe:
- A cancelled scrub is not a failure. braid lock/suspend/shutdown cancel the in-flight scrub, and btrfs exits 1 for a cancel – indistinguishable from a genuine failure by exit code or by scrub status (btrfs sets canceled = !!ret, so a fatal error also renders as aborted). So the ExecStop script writes a cancel-request marker (/var/lib/braid/scrub-cancel-requested) before issuing the cancel, and scrub-resume-or-start keys off it: marker present at the post-exit check -> exit 0 (clean, resumable); marker absent -> exit non-zero (genuine failure -> onFailure). The runner removes any stale marker at entry, and that cleanup is fail-closed – if it cannot guarantee a clean slate (an un-removable marker), the run errors out and alerts rather than risk reading a later genuine exit 1 as a cancel. Scrub status is never consulted; the marker is the sole discriminator. lock/suspend/shutdown therefore never leave the unit failed.
- Corruption found is not an execution failure. btrfs exits 3 when a scrub completes but finds uncorrectable errors. SuccessExitStatus=3 declares that a service success, so corruption routes to the monitor’s BtrfsDeviceErrors device-stats poll (ADR 014), never to onFailure. This also fixes a latent bug where such a scrub silently left the unit failed. Only exit 3 is whitelisted; genuine failures (exit 1, no marker) still fail the unit.
braid-scrub-resume-trigger.service is the pool-online predicate-and-poke path. It is Type=oneshot, wantedBy, BindsTo, and After braid-online.service; it runs internal braid scrub-needs-resume --mount <mount> and starts braid-scrub.service with systemctl start --no-block only when saved progress is resumable.
The scrub service and resume trigger use BindsTo + After braid-online.service. On shutdown or systemctl stop braid-online.service, systemd stops them before braid lock runs.
ConditionPathIsMountPoint on the scrub service and trigger is defense-in-depth.
Serialization via single runner. Only braid-scrub.service ever runs btrfs scrub; both activation paths (timer and trigger) issue systemctl start braid-scrub.service, and systemd coalesces overlapping starts for the same unit. A completed scrub-resume-or-start run satisfies both an overdue timer fire and a pool-online resumable state, with no flock and no /run/braid-scrub.lock.
No pool lock. Distinct from the single-runner serialization above, the scrub subcommands also take LockPolicy::None, so a scheduled scrub never holds /run/braid-pool.lock: braid does not serialize it against a pool mutator, leaving real btrfs conflicts to the kernel. See Pool lock mutual exclusion for why that is safe (a balance overlaps; a replace is the kernel’s documented rejection case).
Conflicts + Before shutdown.target and sleep.target on the scrub service. The short-lived resume trigger also uses Conflicts + Before sleep.target so suspend setup wins cleanly against pool-online activation.
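
The cancel-marker and SuccessExitStatus carve-outs above amount to a small decision table. The sketch below is an illustration under stated assumptions, not the scrub-resume-or-start implementation: ScrubOutcome, classify_scrub_exit, runner_exit_code, and fires_on_failure are hypothetical names, checking exit 3 before the marker is an assumed ordering, and passing the btrfs code through on failure is an assumed choice of non-zero value.

```rust
/// Outcome of one scrub run as the runner would classify it.
#[derive(Debug, PartialEq)]
enum ScrubOutcome {
    Completed,       // btrfs exited 0
    CorruptionFound, // btrfs exited 3: scrub finished, uncorrectable errors found
    Cancelled,       // non-zero exit with the cancel-request marker present
    Failed(i32),     // non-zero exit with no marker: genuine failure
}

/// The marker is the sole discriminator between cancel and failure;
/// scrub status is never consulted.
fn classify_scrub_exit(btrfs_exit: i32, cancel_marker_present: bool) -> ScrubOutcome {
    match (btrfs_exit, cancel_marker_present) {
        (0, _) => ScrubOutcome::Completed,
        // Assumption: exit 3 is classified before the marker is checked.
        (3, _) => ScrubOutcome::CorruptionFound,
        (_, true) => ScrubOutcome::Cancelled,
        (code, false) => ScrubOutcome::Failed(code),
    }
}

/// Exit code the runner hands back to systemd.
fn runner_exit_code(outcome: &ScrubOutcome) -> i32 {
    match outcome {
        ScrubOutcome::Completed | ScrubOutcome::Cancelled => 0,
        ScrubOutcome::CorruptionFound => 3,
        // Assumption: the btrfs code is passed through unchanged.
        ScrubOutcome::Failed(code) => *code,
    }
}

/// With SuccessExitStatus = [ 3 ], only codes other than 0 and 3 fail the
/// unit and therefore trigger onFailure.
fn fires_on_failure(unit_exit: i32) -> bool {
    unit_exit != 0 && unit_exit != 3
}
```

Under these assumptions a cancel during braid lock yields exit 0 and leaves the unit clean, while the same btrfs exit code without a marker fails the unit and reaches braid-scrub-failed.service.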

braid-alert.service + braid-beep.service – notification

braid-alert.service is the Critical alert orchestrator. It is a Type=oneshot, RemainAfterExit=true latch so repeated monitor cycles do not re-run notification work until braid ack stops it. It starts on monitor exit 1, from the SMART hook, and from braid-scrub-failed.service so a failed scrub alerts immediately without waiting for the next monitor cycle.

When beeping is enabled, braid-alert.service wants both braid-pcspkr-load.service and braid-beep.service. The pcspkr loader remains a separate hardened oneshot with CAP_SYS_MODULE; module loading is not folded into the alert orchestrator. The orchestrator also runs the operator’s alertCommand, if configured, through ${pkgs.runtimeShell} -c under a bounded timeout -k 5s wrapper. The bound is braid.monitor.alertCommandTimeoutSec (default 60 seconds). The command runs as root and is intentionally not sandboxed: examples may need network access, /home, or other host resources, and a systemd sandbox would silently break real notifiers. The timeout wrapper matters because oneshot services have no default start timeout; an unbounded notifier could otherwise leave the alert latch stuck in activating.
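
As a rough illustration of the bounded wrapper described above, the following sketch assembles the argument vector for an alertCommand invocation. The unit itself is generated by the Nix module, so this Rust function (alert_command_argv, a hypothetical name) only models the shape of the command line; expressing the bound as a seconds suffix passed to timeout is an assumption.

```rust
/// Build the argv for a bounded alertCommand run:
/// `timeout -k 5s <bound>s <shell> -c <command>`.
/// `runtime_shell` stands in for ${pkgs.runtimeShell}.
fn alert_command_argv(runtime_shell: &str, alert_command: &str, timeout_secs: u64) -> Vec<String> {
    vec![
        "timeout".to_string(),
        // Send SIGKILL 5 seconds after the initial signal if the
        // notifier ignores it.
        "-k".to_string(),
        "5s".to_string(),
        // The bound itself, e.g. the 60 second default.
        format!("{}s", timeout_secs),
        runtime_shell.to_string(),
        "-c".to_string(),
        // Passed as one argument so the shell, not this code, parses it.
        alert_command.to_string(),
    ]
}
```

Keeping the operator's command as a single argument to -c means the wrapper adds a time bound without reinterpreting the command string.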

braid-beep.service owns the persistent PC speaker loop. It is Type=simple, uses exponential backoff, and calls the same braid-beep-probe wrapper that braid doctor discovers through /etc/braid/notifier-config.json. It declares BindsTo=braid-alert.service without After=. The missing After= is deliberate: the beep starts in parallel with the orchestrator, so a slow or hung alertCommand cannot delay the audible alarm. A first probe before pcspkr is loaded is harmless because the loop suppresses probe errors and retries after 5 seconds. BindsTo still propagates braid ack’s explicit systemctl stop braid-alert.service to the beep loop, so no Rust change is needed.

The beep loop is hardened with the shared sandbox baseline plus a small capability set (CAP_SETUID, CAP_SETGID) needed by setpriv to drop to nobody:beep. It deliberately does not use PrivateDevices, because that would hide /dev/input/* and the PC Speaker evdev node. It uses Restart=always, RestartSec=5, and StartLimitIntervalSec=0 so an independent loop death self-heals while an explicit cascaded stop from ack still wins. No extra sleep or shutdown edges are declared: normal service defaults already add the shutdown ordering, and the beep loop owns no pool or LUKS resource.

braid-alert-advisory.service is the Warning/exit-3 path. It is also a oneshot+RAE latch, but it never wants the beep unit and has no BindsTo relationship. It reuses the same bounded alertCommand wrapper as braid-alert.service, so a hung Warning notifier cannot wedge the blocking systemctl start braid-alert-advisory.service call inside the timer-driven braid-monitor.service.

Rust dispatch as synchronization layer

The wrapper (braid-wrapper.sh) is a pure exec shim: it sets the module-controlled PATH and execs the Rust binary. Synchronization lives in Rust dispatch (cli/src/main.rs), which owns the pool lock, braid-online.service lifecycle updates, and shutdown stop coordination. See 026-pool-lock-rust-owned.md.

modules/braid/cli.nix emits systemd_lifecycle = true for module-managed installs. Standalone CLI deployments omit it; those configs still get mount permission fixups but do not touch braid-online.service.

After every unlock, add, or recover attempt:

Rust dispatch acquires /run/braid-pool.lock, loads config and membership, and snapshots braid-online.service ActiveState only when systemd_lifecycle = true.
CLI opens LUKS + mounts pool when the command reaches its mount step. (recover self-mounts when recovering from an interrupted operation.)
Before dispatch returns, success or failure, Rust runs mark_online while the pool lock is still held.
mark_online checks mountpoint -q; pre-mount failures short-circuit here.
Rust sets permissions (root:poolAccessGroup 2770) if poolAccessGroup is configured.
When systemd_lifecycle = true, Rust starts braid-online.service only when the initial snapshot was inactive or failed.
If activation fails: prints WARNING to stderr, then preserves the command’s original exit result. Pool is mounted and usable; only the shutdown hook is missing.
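
The post-dispatch decisions above can be sketched as a pure function over the state dispatch has already collected. This is a simplified model, not the mark_online implementation: PostStep, PostDispatch, and mark_online_steps are hypothetical names, and representing systemd_lifecycle = false as a missing snapshot is an assumption of the sketch.

```rust
/// Follow-up work performed while the pool lock is still held.
#[derive(Debug, PartialEq)]
enum PostStep {
    SetPermissions,
    StartOnlineUnit,
}

/// Inputs gathered by dispatch. `online_snapshot` is the ActiveState of
/// braid-online.service taken at the start of the window, or None when
/// systemd_lifecycle is not set (standalone CLI).
struct PostDispatch<'a> {
    mounted: bool, // result of the mountpoint -q check
    pool_access_group: Option<&'a str>,
    online_snapshot: Option<&'a str>,
}

fn mark_online_steps(input: &PostDispatch<'_>) -> Vec<PostStep> {
    let mut steps = Vec::new();
    // Pre-mount failures short-circuit: nothing to fix up or activate.
    if !input.mounted {
        return steps;
    }
    if input.pool_access_group.is_some() {
        steps.push(PostStep::SetPermissions);
    }
    // Start the lifecycle owner only from a settled, not-running state.
    if matches!(input.online_snapshot, Some("inactive") | Some("failed")) {
        steps.push(PostStep::StartOnlineUnit);
    }
    steps
}
```

A failed StartOnlineUnit step would only produce the stderr warning described above; it does not change the command's exit result.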

On lock:

1. Plain braid lock acquires /run/braid-stop-coordinator.lock, then /run/braid-pool.lock.
2. When systemd_lifecycle = true, Rust stops braid-scrub.timer, braid-scrub-resume-trigger.service, then braid-scrub.service (timer first prevents re-trigger; trigger before service prevents the trigger from queuing a fresh start of the service being stopped; service last cancels in-flight scrub).
3. When systemd_lifecycle = true, Rust iterates systemctl show -P BoundBy braid-online.service and stops each remaining bound consumer (samba, nfs, future). The scrub units already handled in step 2 are skipped. For user-initiated braid lock, this mirrors the cascade systemd performs on shutdown.
4. CLI unmounts pool + closes LUKS.
5. Plain braid lock writes done\n to /run/braid-stop-coordinator.lock.
6. When systemd_lifecycle = true, Rust checks the mount is gone and runs systemctl stop braid-online.service synchronously so the command returns only after the lifecycle owner is inactive. The synchronous stop runs only when the post-cleanup mountpoint check confirms the mount is gone; if the check itself fails, Rust warns and skips the stop, leaving the unit active for the operator to retry. The recursive ExecStop reentry polls the coordinator, observes done\n, and exits 0.
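
The ordering of the scrub-unit stops and the bound-consumer stops above can be sketched as follows. This is an illustrative model only: lock_stop_order and SCRUB_UNITS are hypothetical names, and the input is assumed to be the space-separated unit list that systemctl show -P BoundBy prints.

```rust
/// Scrub units in the order they must be stopped: timer first (no
/// re-trigger), then the resume trigger (no fresh start queued), then
/// the service itself (cancels an in-flight scrub).
const SCRUB_UNITS: [&str; 3] = [
    "braid-scrub.timer",
    "braid-scrub-resume-trigger.service",
    "braid-scrub.service",
];

/// Compute the full stop order for `braid lock` pre-steps from the
/// BoundBy property of braid-online.service.
fn lock_stop_order(bound_by: &str) -> Vec<String> {
    let mut order: Vec<String> = SCRUB_UNITS.iter().map(|u| u.to_string()).collect();
    for unit in bound_by.split_whitespace() {
        // Scrub units were already handled above, so skip them here.
        if !SCRUB_UNITS.iter().any(|scrub| *scrub == unit) {
            order.push(unit.to_string());
        }
    }
    order
}
```

For example, a BoundBy value listing the scrub timer, samba-smbd.service, and nfs-server.service yields the three scrub units followed by the two consumers, in the order the property reported them.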

On system shutdown:

systemd stops braid-online.service (if active); its BindsTo+After cascade stops the scrub units and any full-triad consumer first. ExecStop then re-runs the same scrub-stop + BoundBy iteration as the “On lock” steps 2-3. For the scrub units and any consumer that follows the documented WantedBy+BindsTo+After triad, the cascade has already stopped them, so these re-issued stops are no-ops. A consumer that declares BindsTo without After has no stop-ordering guarantee and may still be active when ExecStop runs, so the explicit blocking stop here is what frees the mount. Running the pre-steps unconditionally covers both cases, keeping teardown code-owned and independent of cascade ordering.
ExecStop = braid lock --systemd-stop --deadline-secs <n> waits for an in-flight plain braid lock to finish through the stop coordinator, or waits for the pool lock up to the configured deadline.
Lock dispatch loads membership from pool.json; if pool.json is absent or corrupt, it warns and proceeds with empty membership because mapper cleanup still requires per-candidate LUKS UUID verification.
CLI unmounts and closes LUKS. If sysfs reports a running btrfs balance, --systemd-stop first runs btrfs balance pause so the kernel persists the paused balance before LUKS close; if sysfs reports an already-paused balance, teardown proceeds directly to unmount. Next-boot braid recover fails closed on that persisted paused balance and preserves pending-op.json for manual inspection instead of resuming it. Plain braid lock still refuses all active exclusive operations. The systemd-stop path also retries transient umount EBUSY longer than user lock so a surviving btrfs balance process can release its mount fd during shutdown.
Drives are safe to power off.
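
The difference between plain braid lock and the systemd-stop mode when an exclusive operation is present can be summarized as a decision table. The sketch below is a simplified model with hypothetical names (ExclusiveOp, LockMode, Teardown, teardown_plan); treating a paused balance as refused under plain braid lock is an assumption drawn from "refuses all active exclusive operations".

```rust
/// Exclusive-operation state as read from sysfs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExclusiveOp {
    Idle,
    BalanceRunning,
    BalancePaused,
    Other, // any other exclusive operation
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum LockMode {
    Plain,       // user-invoked braid lock
    SystemdStop, // braid lock --systemd-stop
}

#[derive(Debug, PartialEq)]
enum Teardown {
    Unmount,
    PauseBalanceThenUnmount,
    Refuse,
}

fn teardown_plan(mode: LockMode, op: ExclusiveOp) -> Teardown {
    match (mode, op) {
        (_, ExclusiveOp::Idle) => Teardown::Unmount,
        // Pause first so the kernel persists the paused balance before
        // LUKS close.
        (LockMode::SystemdStop, ExclusiveOp::BalanceRunning) => Teardown::PauseBalanceThenUnmount,
        // Already paused: nothing left to quiesce.
        (LockMode::SystemdStop, ExclusiveOp::BalancePaused) => Teardown::Unmount,
        (LockMode::SystemdStop, ExclusiveOp::Other) => Teardown::Refuse,
        // Assumption: plain lock refuses paused balances as well.
        (LockMode::Plain, _) => Teardown::Refuse,
    }
}
```

Only the shutdown path tolerates a balance, because during shutdown there is no later opportunity to retry the lock.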

Pool lock mutual exclusion

Pool mutators, alert-state mutators, key enrollment, lock, and discover --write (unlock, add, recover, remove, remove-missing, replace, enroll, lock, discover --write, ack, monitor) acquire an exclusive flock on /run/braid-pool.lock in Rust dispatch before reading pool state. unlock, add, recover, remove, remove-missing, replace, enroll, lock, and discover --write are non-blocking fail-fast commands: if the lock is already held by another braid process, the CLI exits 1 immediately with the message “braid: another braid operation is already in progress”, and the user must retry once the active operation completes. Bare discover is read-only and does not acquire the lock. ack waits up to 10 seconds before returning a retry message. monitor exits 0 silently on contention so a skipped timer cycle does not start alert notification. The lock is held through post-processing (permissions, braid-online activation/deactivation). Under the held lock, unlock re-checks whether the pool is already mounted and exits cleanly if an earlier lock holder has already mounted it; other mutators operate on the current state read under the lock. See Principle 12.
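
The per-command contention behavior above can be sketched as a lookup. LockPolicy::None and lock_policy are named in this document; the other variant names, the function signature, and the fallback for unlisted commands are assumptions of this sketch. The bounded wait of braid lock --systemd-stop is not modeled here.

```rust
use std::time::Duration;

/// How a command behaves with respect to /run/braid-pool.lock.
/// Only `None` is a name taken from the source; the rest are hypothetical.
#[derive(Debug, PartialEq)]
enum LockPolicy {
    /// Never takes the pool lock.
    None,
    /// Non-blocking: exit 1 immediately when the lock is held.
    FailFast,
    /// Wait up to the given duration, then return a retry message.
    Wait(Duration),
    /// Exit 0 silently on contention.
    SkipQuietly,
}

fn lock_policy(command: &str, discover_write: bool) -> LockPolicy {
    match command {
        "unlock" | "add" | "recover" | "remove" | "remove-missing" | "replace" | "enroll"
        | "lock" => LockPolicy::FailFast,
        // Only the writing form of discover mutates state.
        "discover" if discover_write => LockPolicy::FailFast,
        "ack" => LockPolicy::Wait(Duration::from_secs(10)),
        "monitor" => LockPolicy::SkipQuietly,
        // Bare discover and the scrub subcommands; treating every other
        // command the same way is an assumption of this sketch.
        _ => LockPolicy::None,
    }
}
```

Keeping the policy in one table makes the asymmetry explicit: interactive mutators fail fast, the timer-driven monitor skips quietly, and ack is the only command that waits.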

Periodic scrub is deliberately exempt. The scrub subcommands – scrub-cancel, scrub-needs-resume, and scrub-resume-or-start – take the LockPolicy::None discipline in cli/src/main.rs#lock_policy, so braid-scrub.service never acquires /run/braid-pool.lock. A long monthly scrub can therefore run while a pool mutator holds the lock. If scrub instead took the pool lock, every non-blocking mutator would be rejected for the scrub’s entire multi-hour duration via the fail-fast another braid operation is already in progress path above – the pool lock is built for short mutations that briefly exclude each other, not a multi-hour hold.

Not holding the lock is safe because braid defers the real conflict check to the kernel. Scrub is not in btrfs’ exclusive_operation set, so it does not hold the exclop lock and a balance can overlap a running scrub. The one documented conflict is replace, which reuses btrfs’ scrub machinery: the kernel – not braid – refuses to start a replace while a scrub is in progress, returning “scrub is in progress” (the kernel’s SCRUB_INPROGRESS result). braid classifies that on the stderr substring in cli/src/pool.rs#replace_error and turns it into a recovery hint pointing the operator at btrfs scrub cancel (and braid status). tests/repro/btrfs-replace-rejected-during-scrub.py pins both halves: the kernel rejection and the classified hint.
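
A minimal sketch of the substring classification described above, assuming a helper shaped like the one below. The function name replace_hint and the hint wording are hypothetical; only the matched substring, and the commands the hint points at, come from this document.

```rust
/// Map btrfs replace stderr to a recovery hint when the kernel rejected
/// the replace because a scrub was running. Returns None for any other
/// error so the caller can report it unchanged.
fn replace_hint(stderr: &str) -> Option<&'static str> {
    if stderr.contains("scrub is in progress") {
        // Hypothetical wording; the real hint text is not shown here.
        Some("a scrub is running: cancel it with `btrfs scrub cancel`, check `braid status`, then retry")
    } else {
        None
    }
}
```

Classifying on the kernel's own message keeps braid out of the business of predicting conflicts: it reacts to the rejection instead of trying to prevent it with a lock.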

Lock acquisition site

For non-dry-run pool mutators, alert-state mutators, key enrollment, lock, and discover --write, the operation lock is acquired in cli/src/main.rs dispatch before config load, pool.json load, journal read, identity probes, subprocess health probes, or interactive prompts. The shell wrapper must not acquire /run/braid-pool.lock; it execs the Rust binary and leaves critical-section ownership to dispatch.

A command started during another mutator could otherwise read stale state, then acquire the lock after the first command finishes and act on old inputs. Late acquisition also regresses the fail-fast UX – users see prompts and probes complete before being told the operation is contended.

The pool lock is the first real execution boundary. Do not model it after the sleep inhibitor’s late-acquisition pattern: the inhibitor protects against suspend mid-operation and can wait until the irreversible window; the pool lock protects against state-staleness and must precede any read of pool state.

ExecStop bounded-wait pattern

When a unit’s ExecStop= invokes a CLI that needs a contended resource (e.g. braid-online.service ExecStop=braid lock colliding with an in-flight mutator that holds the pool lock), the ExecStop path gets a distinct bounded-wait variant – not a fail-fast call. “ExecStop fails fast; in-flight work finishes and a later stop attempt succeeds” is not a valid design: during shutdown there is no later stop attempt. systemctl poweroff can leave the resource (mounted btrfs / open LUKS) in an inconsistent state, and the “in-flight mutator finishes before TimeoutStopSec” claim is not guaranteed.

Current pattern: braid-online.service runs braid lock --systemd-stop --deadline-secs ${braid.lockSystemdStopDeadlineSecs}. The module default is 270 seconds and an assertion requires it to be strictly less than braid-online.service TimeoutStopSec (300 seconds). That deadline bounds only stop-coordinator and pool-lock acquisition; once lock cleanup reaches btrfs balance pause or umount, any kernel wait to quiesce btrfs has no userspace timeout and is bounded only by the unit’s TimeoutStopSec (300 seconds). The systemd-stop path also has a longer transient-busy umount retry (60 attempts at 500ms) because btrfs-progs holds the mount fd while blocked in BTRFS_IOC_BALANCE_V2 and can survive the Rust parent briefly during shutdown. Regular braid lock stays fail-fast for user invocations; the bounded-wait path is documented and tested as a distinct mode.
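
The numbers above can be checked with a short sketch: the deadline assertion (270 < 300) and a bounded retry loop of 60 attempts at 500 ms, which waits at most 30 seconds in total. The constant and function names are hypothetical, the module's real assertion lives in Nix rather than Rust, and whether the real loop sleeps after its final attempt is not specified here.

```rust
use std::thread;
use std::time::Duration;

const TIMEOUT_STOP_SECS: u64 = 300;
const DEFAULT_DEADLINE_SECS: u64 = 270;
const UMOUNT_RETRY_ATTEMPTS: u32 = 60;
const UMOUNT_RETRY_DELAY: Duration = Duration::from_millis(500);

/// Mirror of the module assertion: the acquisition deadline must be
/// strictly below TimeoutStopSec. Returns the remaining headroom.
fn check_deadline(deadline_secs: u64, timeout_stop_secs: u64) -> Result<u64, String> {
    if deadline_secs < timeout_stop_secs {
        Ok(timeout_stop_secs - deadline_secs)
    } else {
        Err(format!(
            "deadline {}s must be strictly less than TimeoutStopSec {}s",
            deadline_secs, timeout_stop_secs
        ))
    }
}

/// Retry an operation that may be transiently busy. `try_umount` returns
/// true on success and false on a busy result. Returns the 1-based
/// attempt that succeeded, or None once the attempts are exhausted.
fn retry_while_busy(
    attempts: u32,
    delay: Duration,
    mut try_umount: impl FnMut() -> bool,
) -> Option<u32> {
    for attempt in 1..=attempts {
        if try_umount() {
            return Some(attempt);
        }
        // No sleep after the final attempt (an assumption of this sketch).
        if attempt < attempts {
            thread::sleep(delay);
        }
    }
    None
}
```

The deadline covers only lock acquisition, so the retry loop and any kernel-side quiesce still run inside whatever remains of the unit's TimeoutStopSec.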

systemctl start/stop inside held-resource windows

systemctl start <unit> on an already-active oneshot+RemainAfterExit unit is a no-op at the work level, but it still queues a job. If a stop job for the same unit is already in flight (because someone else invoked systemctl stop), the start queues behind the stop. If that stop’s ExecStop= is itself blocked on a resource the caller holds, the result is a deadlock.

This is load-bearing for any CLI that both holds a resource and uses systemctl start/stop on a unit whose ExecStart=/ExecStop= touches that resource (e.g. Rust dispatch holding pool.lock while activating braid-online.service whose ExecStop calls braid lock).

These rules govern start/stop of braid-online.service itself. The systemctl stop calls in run_lock_pre_steps target bound consumers and scrub units, not the lifecycle owner, so they queue no job against braid-online.service and the start-behind-stop deadlock above does not apply to them.

Rules:

Snapshot full unit state at the start of the held-resource window with systemctl show -P ActiveState <unit>. Do NOT use systemctl is-active: its exit status is non-zero for both activating and deactivating, so a caller branching on it treats those transitional states as not-active. A deactivating unit (its ExecStop is already running and waiting on the held resource) snapshotted as “not active” leads the caller to issue a start that queues behind the in-flight stop, which is the exact deadlock the snapshot was supposed to prevent.
Only emit systemctl start <unit> at the end of the window if the snapshot was inactive or failed. Skip when active, activating, or deactivating. See ADR 026 snapshot rule.
Only emit systemctl stop <unit> at the end of the window if the snapshot was active or activating. Skip when inactive, failed, or deactivating.
- Exception: plain braid lock’s post-success mark_offline runs a synchronous systemctl stop braid-online.service without a stop-side snapshot. It is safe because /run/braid-stop-coordinator.lock plus the done\n protocol guarantees the recursive ExecStop reentry exits 0 once plain braid lock has finished cmd_lock, instead of queuing behind the in-flight stop. This coordinator is the mechanism that replaces the stop-side snapshot gate for mark_offline; see ADR 026 stop coordinator. mark_offline skips the synchronous stop when the post-cleanup mountpoint -q check itself fails (e.g. OnlineError::Spawn mid-shutdown): the unit stays active and the operator retries. Treating unknown mount state as still-mounted mirrors mark_online’s start-side fail-safe.
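
The snapshot rules above can be sketched as a parser plus two gates. This is a minimal illustration with hypothetical names (ActiveState, parse_active_state, snapshot_active_state, may_start, may_stop); mapping any unlisted state to a variant that permits neither start nor stop is a conservative assumption of the sketch.

```rust
use std::process::Command;

/// ActiveState values that matter for the start/stop gates.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ActiveState {
    Active,
    Activating,
    Deactivating,
    Inactive,
    Failed,
    Other, // e.g. reloading or maintenance
}

fn parse_active_state(raw: &str) -> ActiveState {
    match raw.trim() {
        "active" => ActiveState::Active,
        "activating" => ActiveState::Activating,
        "deactivating" => ActiveState::Deactivating,
        "inactive" => ActiveState::Inactive,
        "failed" => ActiveState::Failed,
        _ => ActiveState::Other,
    }
}

/// Take the snapshot with `systemctl show -P ActiveState <unit>`.
fn snapshot_active_state(unit: &str) -> std::io::Result<ActiveState> {
    let output = Command::new("systemctl")
        .args(&["show", "-P", "ActiveState", unit])
        .output()?;
    Ok(parse_active_state(&String::from_utf8_lossy(&output.stdout)))
}

/// Start only from a settled, not-running state.
fn may_start(snapshot: ActiveState) -> bool {
    matches!(snapshot, ActiveState::Inactive | ActiveState::Failed)
}

/// Stop only a unit that is up or coming up; never pile onto an
/// in-flight stop.
fn may_stop(snapshot: ActiveState) -> bool {
    matches!(snapshot, ActiveState::Active | ActiveState::Activating)
}
```

The deactivating state is the case that matters: both gates return false for it, so the caller never queues a job behind an ExecStop that is waiting on the resource the caller still holds.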

Consumer dependency contracts

Services that depend on the pool being mounted use one of three patterns:

Frequent periodic services (monitor): ConditionPathIsMountPoint only. Neither After nor BindsTo on mnt-storage.mount – those directives force systemd to load the unit, which doesn’t exist until the CLI mounts the pool at runtime (auto-generated from /proc/mounts). The condition gate silently skips the service when unmounted. Fires every 5 minutes – missed fires are cheap, so lifecycle binding is unnecessary.

Infrequent periodic services (scrub): The timer, scrub service, and resume trigger use BindsTo + After on braid-online.service; the timer and trigger are wantedBy the online unit. The timer’s active lifecycle matches the pool’s online period. Persistent=true handles catch-up for overdue fires. Unlike the monitor timer (which fires every 5 minutes and can afford missed runs), the monthly scrub timer cannot wait until next month if it misses – lifecycle binding ensures it fires on the next unlock. The scrub service and resume trigger also get ConditionPathIsMountPoint as defense-in-depth. For manual lock, Rust dispatch stops the timer, resume trigger, and scrub service before unmount (see above).

Long-running services holding open files (samba, nfs): braid.poolBoundServices = [ "samba-smbd" "nfs-server" ]; is the canonical NixOS-module interface. It stamps the full WantedBy=braid-online.service + BindsTo=braid-online.service + After=braid-online.service triad (same shape as the scrub timer above), plus ConditionPathIsMountPoint=<pool mount>, onto services owned by other modules. BindsTo + After ensures systemd stops them before braid lock runs ExecStop, preventing unmount failures from busy filesystems; WantedBy ensures they restart automatically when braid unlock reactivates braid-online.service. The triad handles the unlock-start and lock-stop lifecycle, but these consumers carry their own boot or direct-start edges – NixOS wants samba-smbd.service from samba.target and nfs-server.service from multi-user.target. For starts not initiated by braid-online.service, ConditionPathIsMountPoint is the load-bearing gate that prevents serving an offline mount directory. Rust dispatch iterates BoundBy braid-online.service and stops these consumers before unmount, mirroring the cascade systemd performs on shutdown for user-initiated lock. See ../../guides/sharing-and-permissions.md#binding-shares-to-the-pool-lifecycle for the user-facing example.

Key design constraints

No hard boot dependencies. wants everywhere, never requires. Pool failure never blocks boot.
Rust-synchronized lifecycle. For dispatch-managed operations, Rust keeps braid-online synchronized with pool mount state: it activates the service only after mountpoint -q succeeds, and deactivates it after a successful lock. ConditionPathIsMountPoint on the unit is defense-in-depth against direct systemctl start when unmounted. Out-of-band mount or unmount bypasses dispatch and can leave braid-online stale; braid lock handles already-unmounted pools gracefully.
One passphrase prompt. braid-unlock.service is the sole interactive prompt source. The CLI opens all LUKS devices from that single passphrase.
Graceful degradation. If braid-online activation fails, the pool is still mounted and usable – only the shutdown hook is missing (warned to stderr).
One pool operation at a time. Enforced by a non-blocking flock in Rust dispatch, not wrapper logic or unit topology – concurrent attempts are rejected, not queued. See Principle 12.

See

modules/braid/storage.nix — unit definitions
modules/braid/pool-bound-services.nix – long-running consumer stamping
modules/braid/monitor.nix — monitor/alert units
modules/braid/braid-wrapper.sh — pure exec shim
026-pool-lock-rust-owned.md — Rust-owned pool lock and lifecycle synchronization
003-resilient-boot.md — why no hard dependencies
017-runtime-disk-membership.md — lifecycle model context
033-systemd-unit-hardening.md – systemd exec sandbox baseline and per-unit exceptions
tests/module/systemd-lifecycle.py — state machine test suite
tests/module/pool-bound-services.py – pool-bound consumer lifecycle coverage
tests/repro/btrfs-replace-rejected-during-scrub.py – kernel rejects a conflicting mutator during scrub; recovery hint classified
