
If you clicked on the link to this topic: Thank you!

I have the following setup:

6x 500 GB HDDs
1x 32 GB NVMe SSD (Intel Optane)

I used bcache to set up the SSD as the caching device, with all six HDDs as backing devices. Once that was in place, I formatted the six HDDs (via their bcache devices) with btrfs in RAID5. Everything has worked as expected for the last 7 months.
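
For context, the stack was assembled roughly like this (a minimal sketch with placeholder device names, not my verbatim commands from back then; the data/metadata profiles match the "btrfs fi df" output further down):

sudo make-bcache -C /dev/nvme0n1                            # cache set on the Optane SSD
for d in /dev/sd{a..f}; do sudo make-bcache -B "$d"; done   # backing device on each HDD
# attach each backing device to the cache set via the cset UUID ('<cset-uuid>' is a placeholder):
sudo sh -c "echo '<cset-uuid>' > /sys/block/bcache0/bcache/attach"
sudo mkfs.btrfs -d raid5 -m raid1 /dev/bcache{0..5}         # btrfs: RAID5 data, RAID1 metadata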

I now have six spare 2 TB HDDs and want to replace the old 500 GB disks one by one. I started with the first one by deleting it from the btrfs filesystem, which worked fine with no issues. After that I cleanly detached the now-empty disk from bcache (still everything fine) and removed it physically. Here are the command lines for this:

sudo btrfs device delete /dev/bcacheX /media/raid       # migrate data off the device and remove it from the array
cat /sys/block/bcacheX/bcache/state                     # check the bcache state
cat /sys/block/bcacheX/bcache/dirty_data                # make sure no dirty data is left in the cache
sudo sh -c "echo 1 > /sys/block/bcacheX/bcache/detach"  # detach the backing device from the cache set
cat /sys/block/bcacheX/bcache/state                     # verify the detach completed
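
For reference, the checks came back roughly like this (illustrative values, not a verbatim capture):

$ cat /sys/block/bcacheX/bcache/state
clean
$ cat /sys/block/bcacheX/bcache/dirty_data
0.0k
$ cat /sys/block/bcacheX/bcache/state    # after the detach
no cache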

After that I installed one of the 2 TB drives, attached it to bcache and added it to the RAID. The next step was to balance the data onto the new drive. Here are the command lines:

sudo make-bcache -B /dev/sdY                                                                  # create a bcache backing device on the new disk
sudo sh -c "echo '60a63f7c-2e68-4503-9f25-71b6b00e47b2' > /sys/block/bcacheY/bcache/attach"   # attach it to the cache set (cset UUID)
sudo sh -c "echo writeback > /sys/block/bcacheY/bcache/cache_mode"                            # enable writeback caching
sudo btrfs device add /dev/bcacheY /media/raid                                                # add the new device to the filesystem
sudo btrfs fi ba start /media/raid/                                                           # start a balance to spread data onto it
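
The progress of a running balance can be followed with standard btrfs-progs commands:

sudo btrfs balance status /media/raid   # shows how many chunks have been considered/relocated
sudo btrfs fi show /media/raid          # watch "used" grow on the new device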

The balance worked fine until ~164 GB had been written to the new drive, which is about 50% of the data to be balanced. Then, suddenly, write errors appeared on the new disk. The RAID slowly became unusable (I was running 3 VMs off the RAID while balancing). I think it kept working for some time because the SSD was committing the writes. At some point the balance stalled and I was only able to kill the VMs. I checked the I/O on the disks, and the SSD was delivering a constant 1.2 GB/s of reads. My guess is that bcache kept serving data to btrfs, which rejected it and requested it again, but that is just a guess. In any case, I ended up resetting the host, physically disconnected the broken disk and put a new one in its place. I created a bcache backing device on it as before and issued the following command to replace the faulty disk:

sudo btrfs replace start -r 7 /dev/bcache5 /media/raid   # replace missing devid 7; -r reads from the source device only if no other good copy exists
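
While it runs, the progress can be polled with a standard btrfs-progs command (this is where the status quoted below comes from):

sudo btrfs replace status -1 /media/raid   # -1: print the status once instead of continuously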

The filesystem needs to be mounted read/write for the replace command to work. It is now doing its work, but very slowly, at about 3.5 MB/s. Unfortunately, the syslog reports a lot of messages like these:

...
scrub_missing_raid56_worker: 62 callbacks suppressed
BTRFS error (device bcache0): failed to rebuild valid logical 4929143865344 for dev (null)
...
BTRFS error (device bcache0): failed to rebuild valid logical 4932249866240 for dev (null)
scrub_missing_raid56_worker: 1 callbacks suppressed
BTRFS error (device bcache0): failed to rebuild valid logical 4933254250496 for dev (null)
....

If I try to read a file from the filesystem, the reading command fails with a plain I/O error and the syslog shows entries similar to this:

BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0xccccf554 expected csum 0x6340b527 mirror 2

So far, so good (or bad). It has taken about 6 hours for 4.3% of the replacement so far. No read or write errors have been reported for the replacement procedure ("btrfs replace status"). I will let it do its thing until it finishes. Before the first 2 TB disk failed, 164 GB of data had been written to it according to "btrfs filesystem show". If I check the amount of data written to the new drive, the 4.3% represents about 82 GB (according to /proc/diskstats). I don't know how to interpret this, but anyway.
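
The ~82 GB figure comes from /proc/diskstats; a one-liner along these lines reads it out (a sketch; field 10 is sectors written, 512 bytes each, and sdY is a placeholder for the new drive's kernel name):

awk '$3 == "sdY" { printf "%.1f GiB written\n", $10 * 512 / 2^30 }' /proc/diskstats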

And now, finally, my questions: If the replace command finishes successfully, what should I do next? A scrub? A balance? Another backup? ;-) Do you see anything I have done wrong in this procedure? Do the warnings and errors reported by btrfs mean that the data is lost? :-(

Here is some additional info (edited):

$ sudo btrfs fi sh
Label: none  uuid: 9f765025-5354-47e4-afcc-a601b2a52703
Total devices 7 FS bytes used 1.56TiB
devid    0 size 1.82TiB used 164.03GiB path /dev/bcache5
devid    1 size 465.76GiB used 360.03GiB path /dev/bcache4
devid    3 size 465.76GiB used 360.00GiB path /dev/bcache3
devid    4 size 465.76GiB used 359.03GiB path /dev/bcache1
devid    5 size 465.76GiB used 360.00GiB path /dev/bcache0
devid    6 size 465.76GiB used 360.03GiB path /dev/bcache2
*** Some devices missing

$ sudo btrfs dev stats /media/raid/
[/dev/bcache5].write_io_errs    0
[/dev/bcache5].read_io_errs     0
[/dev/bcache5].flush_io_errs    0
[/dev/bcache5].corruption_errs  0
[/dev/bcache5].generation_errs  0
[/dev/bcache4].write_io_errs    0
[/dev/bcache4].read_io_errs     0
[/dev/bcache4].flush_io_errs    0
[/dev/bcache4].corruption_errs  0
[/dev/bcache4].generation_errs  0
[/dev/bcache3].write_io_errs    0
[/dev/bcache3].read_io_errs     0
[/dev/bcache3].flush_io_errs    0
[/dev/bcache3].corruption_errs  0
[/dev/bcache3].generation_errs  0
[/dev/bcache1].write_io_errs    0
[/dev/bcache1].read_io_errs     0
[/dev/bcache1].flush_io_errs    0
[/dev/bcache1].corruption_errs  0
[/dev/bcache1].generation_errs  0
[/dev/bcache0].write_io_errs    0
[/dev/bcache0].read_io_errs     0
[/dev/bcache0].flush_io_errs    0
[/dev/bcache0].corruption_errs  0
[/dev/bcache0].generation_errs  0
[/dev/bcache2].write_io_errs    0
[/dev/bcache2].read_io_errs     0
[/dev/bcache2].flush_io_errs    0
[/dev/bcache2].corruption_errs  0
[/dev/bcache2].generation_errs  0
[devid:7].write_io_errs    9525186
[devid:7].read_io_errs     10136573
[devid:7].flush_io_errs    143
[devid:7].corruption_errs  0
[devid:7].generation_errs  0

$ sudo btrfs fi df /media/raid/
Data, RAID5: total=1.56TiB, used=1.55TiB
System, RAID1: total=64.00MiB, used=128.00KiB
Metadata, RAID1: total=4.00GiB, used=2.48GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

$ uname -a
Linux hostname 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ btrfs --version
btrfs-progs v4.15.1

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

Thank you again for reading and hopefully your comments/answers!

EDIT 2

The 'device replace' just finished, close to the 9% mark. I think this percentage matches the amount of data written to the drive: 164 GiB out of a total size of 1.82 TiB is roughly 8.8%, so 100% would correspond to replacing a completely full 2 TB disk. Here is some additional output:

$ btrfs replace status -1 /media/raid/
Started on 30.Oct 08:16:53, finished on 30.Oct 21:05:22, 0 write errs, 0 uncorr. read errs

$ sudo btrfs fi sh
Label: none  uuid: 9f765025-5354-47e4-afcc-a601b2a52703
Total devices 6 FS bytes used 1.56TiB
devid    1 size 465.76GiB used 360.03GiB path /dev/bcache4
devid    3 size 465.76GiB used 360.00GiB path /dev/bcache3
devid    4 size 465.76GiB used 359.03GiB path /dev/bcache1
devid    5 size 465.76GiB used 360.00GiB path /dev/bcache0
devid    6 size 465.76GiB used 360.03GiB path /dev/bcache2
devid    7 size 1.82TiB used 164.03GiB path /dev/bcache5

Reading files still aborts with an I/O error, and the syslog still shows:

BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0x98f94189 expected csum 0x6340b527 mirror 1
BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0xccccf554 expected csum 0x6340b527 mirror 2

So I think the most harmless next action is a read-only scrub, which I have just started. Errors and warnings are flooding the syslog:

$ sudo btrfs scrub start -BdrR /media/raid # -B run in foreground, -d per-device statistics, -r read-only, -R raw per-device statistics
$ tail -f /var/log/syslog
BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2848, gen 0
BTRFS warning (device bcache0): checksum error at logical 4590109331456 on dev /dev/bcache5, physical 2954104832, root 5, inode 418, offset 1030803456, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2849, gen 0
BTRFS warning (device bcache0): checksum error at logical 4590108811264 on dev /dev/bcache5, physical 2953977856, root 5, inode 1533, offset 93051236352, length 4096, links 1 (path: VMs/Virtualbox/vmrbreb/vmrbreb-fixed.vdi)
BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2850, gen 0
BTRFS warning (device bcache0): checksum error at logical 4590109335552 on dev /dev/bcache5, physical 2954108928, root 5, inode 418, offset 1030807552, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2851, gen 0
BTRFS warning (device bcache0): checksum error at logical 4590108815360 on dev /dev/bcache5, physical 2953981952, root 5, inode 621, offset 11864412160, length 4096, links 1 (path: VMs/Virtualbox/Win102016_Alter-Firefox/Win102016_Alter-Firefox-disk1.vdi)
BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2852, gen 0
BTRFS warning (device bcache0): checksum error at logical 4590109339648 on dev /dev/bcache5, physical 2954113024, root 5, inode 418, offset 1030811648, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2853, gen 0
BTRFS warning (device bcache0): checksum error at logical 4590109343744 on dev /dev/bcache5, physical 2954117120, root 5, inode 418, offset 1030815744, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging
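
Alongside the syslog, the scrub's progress and per-device error counters can be checked with:

sudo btrfs scrub status -d /media/raid   # -d: per-device statistics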

My questions still remain: What should I do next? A scrub? A balance? Did I do something completely wrong? How should I interpret the errors and warnings from the read-only scrub, and might btrfs be able to fix them?

  • Your setup hides hardware errors behind cache. That’s not a good idea. If you really do need block caching, you should place it in front of the RAID array. Btrfs currently can’t do that. To mitigate problems with the current setup, consider switching to write-through caching.
    – Daniel B
    Commented Oct 30, 2018 at 14:12
  • Would you mind mentioning which *ix OS, and version, you are using? Please click on edit above at left to edit the original post.
    – K7AAY
    Commented Oct 30, 2018 at 16:38
  • If it's BTRFS, it's either Linux or ReactOS (yes, they actually have a driver). Assuming it's Linux given the context, in which case output from uname -a and btrfs --version should be added too. Commented Oct 30, 2018 at 19:15
    Thanks for your comments. I've added all the requested info in the last block of the post. The process is closing in on the 13-hour mark, 8.7% done; 160 GiB written
    – Oliver R.
    Commented Oct 30, 2018 at 20:01
  • EDIT 2 - device replace finished - please see main post
    – Oliver R.
    Commented Oct 30, 2018 at 20:50
