Decision: btrfs RAID1
Principle: btrfs RAID1
Context
The NAS needs checksumming (bit rot detection), self-healing (automatic repair from redundant copy), and dynamic drive pooling (add/remove drives without reformatting). The filesystem sits on top of LUKS.
Options considered
ZFS raidz
- Checksumming + self-healing + RAID. Mature and well-tested.
- Rejected: out-of-tree kernel module. Licensing conflict means it can never be mainlined. NixOS supports it but it’s a second-class citizen — kernel updates can break the module, and the build dependency is heavy.
btrfs RAID5/6
- Same benefits as RAID1 with less overhead (parity instead of mirroring).
- Rejected: not production-ready. The write-hole bug has been a known issue for years. Data loss reports exist. The btrfs wiki explicitly warns against it.
SnapRAID + mergerfs
- Parity-based protection with independent drives. ~75% space efficiency with 3+1.
- Rejected: no auto-healing. SnapRAID syncs on a schedule (e.g., nightly). Bit rot between syncs is undetected. No checksumming on read. Drives are independent ext4 — good for recovery but no real-time protection.
btrfs RAID1
- Checksums every block on read, heals from the RAID1 copy automatically. Dynamic pool: btrfs device add/remove at any time with any size drive. In-kernel, first-class NixOS support. Simple stack: LUKS + btrfs.
- Accepted.
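The read-time verify-and-repair behaviour described for btrfs RAID1 can be illustrated with a toy model. This is a minimal sketch, not btrfs code: `MirroredBlock`, `read_and_heal`, and `checksum` are hypothetical names, and FNV-1a stands in for the real block checksum (btrfs defaults to crc32c) only to keep the example dependency-free.

```rust
/// Toy stand-in for a block checksum (64-bit FNV-1a).
/// Assumption: chosen for brevity; btrfs itself defaults to crc32c.
fn checksum(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical model of one logical block stored as two mirrored copies,
/// with the checksum recorded at write time.
struct MirroredBlock {
    copies: [Vec<u8>; 2],
    expected: u64,
}

impl MirroredBlock {
    /// Store the same data on both mirrors and remember its checksum.
    fn write(data: &[u8]) -> Self {
        MirroredBlock {
            copies: [data.to_vec(), data.to_vec()],
            expected: checksum(data),
        }
    }

    /// Verify on read: return the first copy whose checksum matches and
    /// overwrite any copy that failed verification with the good data.
    /// Returns None when both copies are corrupt (nothing to heal from).
    fn read_and_heal(&mut self) -> Option<Vec<u8>> {
        let expected = self.expected;
        let good = self
            .copies
            .iter()
            .find(|copy| checksum(copy.as_slice()) == expected)
            .cloned()?;
        for copy in self.copies.iter_mut() {
            if checksum(copy.as_slice()) != expected {
                *copy = good.clone(); // repair the bad mirror in place
            }
        }
        Some(good)
    }
}
```

Flipping a bit in one copy and calling `read_and_heal` returns the original data and restores the damaged copy. Corrupting both copies yields `None`, which mirrors the "tolerate one failure" limit of a two-copy profile.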
Decision
btrfs RAID1. The 50% space overhead is accepted as the cost of real-time auto-healing with a simple, in-kernel stack.
Tradeoffs accepted
- 50% space overhead — 3x 12TB = ~18TB usable. Parity schemes would give ~24TB.
- Fixed 2-way redundancy — btrfs RAID1 keeps exactly 2 copies of every block, regardless of pool size. A 3- or 4-drive pool tolerates one drive failure, the same as a 2-drive pool. Additional drives buy usable capacity, not extra fault tolerance. Higher-redundancy profiles (RAID1C3, RAID1C4) exist in btrfs but are not used by braid — the product’s redundancy story is “tolerate one drive failure.”
- No drive independence — drives are part of a btrfs pool, not individually mountable. Recovery requires a working btrfs toolchain.
- Rebalancing cost — adding or removing a drive triggers a balance operation that can take hours on large pools.
- Incremental growth — start with 1 drive (single profile, no redundancy), add a second to convert to RAID1. This is a feature, not a tradeoff — data is available immediately, protection comes when the second drive arrives.
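The capacity figures above can be checked with a short calculation. This is a rough sketch under stated assumptions: sizes are whole terabytes, metadata overhead is ignored, and the parity comparison assumes a single-parity layout where the largest drive's worth of space goes to parity. The function names are hypothetical and not part of braid.

```rust
/// Estimated usable capacity of a two-copy (RAID1) pool. Every block is
/// stored on two different drives, so capacity is half the raw total,
/// capped by what the remaining drives can mirror when one drive dominates.
fn raid1_usable(drive_sizes_tb: &[u64]) -> u64 {
    let total: u64 = drive_sizes_tb.iter().sum();
    let largest = drive_sizes_tb.iter().copied().max().unwrap_or(0);
    (total / 2).min(total - largest)
}

/// Estimated usable capacity of a single-parity layout, for comparison.
/// Assumption: one drive's worth of space (the largest) is spent on parity.
fn single_parity_usable(drive_sizes_tb: &[u64]) -> u64 {
    let total: u64 = drive_sizes_tb.iter().sum();
    let largest = drive_sizes_tb.iter().copied().max().unwrap_or(0);
    total - largest
}
```

For three 12TB drives, `raid1_usable(&[12, 12, 12])` gives 18 and `single_parity_usable(&[12, 12, 12])` gives 24, matching the figures in the first tradeoff. For a single drive, `raid1_usable(&[12])` gives 0, which reflects that a one-drive pool has to run the single profile until a second drive is added.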
Replacement strategy
Device replacement always uses btrfs replace start, including when the source device is missing. The command accepts a devid in place of a source device path, so it can rebuild onto the new device from the surviving RAID1 mirrors even when the old device is unavailable. This is preferred over the alternative sequence of btrfs device add, btrfs balance, and btrfs device remove because:
- No degraded balance: btrfs docs explicitly warn against balancing a degraded filesystem to lower redundancy. btrfs replace avoids this entirely.
- Devid preservation: the new device inherits the old devid, keeping the pool topology stable.
- Single operation: one btrfs replace start call vs. three separate commands with partial-failure risk.
braid remove-missing is retained for cleanup only (forgetting stale device entries), not for replacement.
When braid blocks a live replacement because the pool has missing devices, the intended next step is repairing the missing device via braid replace --old <missing-name> --new <new-name>=/dev/disk/by-id/<...> (the missing devid auto-resolves from --old), not forgetting it.
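The difference between the two replacement paths can be made concrete by building the command lines each one needs. This is an illustrative sketch only, not braid's implementation in cli/src/cmd.rs: the helper names are hypothetical, and the device path and mount point are placeholders supplied by the caller.

```rust
/// Convert a list of string slices into an owned argument vector.
fn argv(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

/// Single-step path: `btrfs replace start <devid> <targetdev> <path>`.
/// Passing the numeric devid (rather than a device path) is what allows
/// the source device to be missing.
fn replace_argv(devid: u64, target: &str, mountpoint: &str) -> Vec<String> {
    let devid = devid.to_string();
    argv(&["btrfs", "replace", "start", &devid, target, mountpoint])
}

/// Three-step alternative: add the new device, balance, then remove the
/// missing one. Each command is a separate point of partial failure.
fn add_balance_remove_argvs(target: &str, mountpoint: &str) -> Vec<Vec<String>> {
    vec![
        argv(&["btrfs", "device", "add", target, mountpoint]),
        argv(&["btrfs", "balance", "start", mountpoint]),
        argv(&["btrfs", "device", "remove", "missing", mountpoint]),
    ]
}
```

Each vector could be handed to `std::process::Command` (first element as the program, the rest as arguments). The single-step path produces one command, while the alternative produces three that must all succeed in order.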
See
- cli/src/cmd.rs: base_mount_options() and the btrfs mount invocation
- tests/storage/btrfs-heal.nix: validates auto-healing
- tests/storage/btrfs-grow1.nix, tests/storage/btrfs-shrink.nix: validates dynamic pooling