I want to build a NAS using mdadm for the RAID and btrfs for bitrot detection. I have a fairly basic setup: three 1 TB disks combined with mdadm into a RAID5, then btrfs on top of that.
I know that mdadm cannot repair bitrot. It can only tell me when there are mismatches but it doesn't know which data is correct and which is faulty. When I tell mdadm to repair my md0 after I simulate bitrot, it always rebuilds the parity. Btrfs uses checksums so it knows which data is faulty, but it cannot repair the data since it cannot see the parity.
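For context, this is how md exposes the check/repair behaviour through sysfs (device name is my setup's; run as root). A `check` only counts mismatches, and a `repair` blindly recomputes parity from the data blocks:

```shell
# Count mismatched sectors without changing anything
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt

# "repair" assumes the data blocks are correct and rewrites parity --
# exactly the wrong thing when a data chunk is the one that rotted
echo repair > /sys/block/md0/md/sync_action
```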
I can however run a btrfs scrub and read the syslog to get the offset of the data that did not match its checksum. I can then translate this offset to a disk and an offset on that disk, because I know the data start offset of md0 (2048 * 512), the chunk size (512K) and the layout (left-symmetric). The layout means that in the first stripe the parity is on the third disk, in the second stripe on the second disk, and in the third stripe on the first disk.
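The translation I do by hand can be sketched like this (a minimal sketch for my exact setup: 3 disks, 512K chunks, left-symmetric layout, 2048-sector data offset; the function name is mine):

```python
CHUNK = 512 * 1024        # mdadm chunk size (512K)
DATA_START = 2048 * 512   # data start offset on each member disk, in bytes
NDISKS = 3                # RAID5 members

def locate(md_offset):
    """Map a byte offset inside md0 to (member_disk_index, byte_offset_on_disk)
    for a left-symmetric RAID5 layout."""
    data_disks = NDISKS - 1
    chunk_no = md_offset // CHUNK        # which data chunk of the array
    stripe = chunk_no // data_disks      # which stripe it lives in
    d = chunk_no % data_disks            # data chunk index within the stripe
    # left-symmetric: parity rotates backwards, starting on the last disk
    parity_disk = (NDISKS - 1 - stripe % NDISKS) % NDISKS
    # data chunks follow the parity chunk, wrapping around
    disk = (parity_disk + 1 + d) % NDISKS
    disk_offset = DATA_START + stripe * CHUNK + md_offset % CHUNK
    return disk, disk_offset

print(locate(0))                  # first chunk: disk 0
print(locate(2 * 512 * 1024))     # third chunk: stripe 1, parity on disk 1
```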
Combining all this data and some more btrfs on disk-format knowledge, I can calculate exactly which chunk of which disk is the faulty one. However, I cannot find a way to tell mdadm to repair this specific chunk.
I already wrote a script which swaps the parity and the faulty chunk using dd, then starts a repair with mdadm and swaps them back. But this is not a good solution, and I would really want mdadm to mark this sector as bad and stop using it. Since it has started to rot, chances are high it will do so again.
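For reference, the core of my workaround looks roughly like this (an illustrative sketch only: device names and chunk numbers are placeholders, and the array must be quiesced while the chunks are swapped underneath it):

```shell
# Placeholders: the faulty data chunk lives on $DATA_DEV at block $DATA_CHUNK,
# the stripe's parity chunk on $PARITY_DEV at block $PARITY_CHUNK (bs=512K blocks).
dd if=$DATA_DEV   bs=512K count=1 skip=$DATA_CHUNK   of=/tmp/data.bin
dd if=$PARITY_DEV bs=512K count=1 skip=$PARITY_CHUNK of=/tmp/parity.bin

# Swap them, so the rotten data sits where the parity should be
dd if=/tmp/parity.bin of=$DATA_DEV   bs=512K count=1 seek=$DATA_CHUNK
dd if=/tmp/data.bin   of=$PARITY_DEV bs=512K count=1 seek=$PARITY_CHUNK

# "repair" now rebuilds the "parity" chunk -- i.e. regenerates the bad data
echo repair > /sys/block/md0/md/sync_action

# Finally swap the (now correct) chunks back into place
dd if=$PARITY_DEV bs=512K count=1 skip=$PARITY_CHUNK of=/tmp/fixed.bin
dd if=/tmp/fixed.bin  of=$DATA_DEV   bs=512K count=1 seek=$DATA_CHUNK
dd if=/tmp/parity.bin of=$PARITY_DEV bs=512K count=1 seek=$PARITY_CHUNK
```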
My question is: is there any way to tell mdadm to repair a single chunk (which is not the parity), and possibly even mark a disk sector as bad? Maybe by injecting a read I/O error?
(And I know ZFS can do all of this by itself, but I don't want to use ECC memory.)
Edit: the linked question/answer is about how btrfs RAID6 is unstable and how ZFS is much more stable/usable. That does not address my question about how to repair a single known faulty chunk with mdadm.