
I was transferring data from one disk to a new one. However, the new one, a Seagate IronWolf 12TB, has trouble (it may be too sensitive to the power supply's output voltage). Nevertheless, the replace operation stopped with a message I did not take the time to note.

So I had to reboot to remove the Seagate disk. I performed a btrfs check on the original disk, which finished with no errors, stopped the server, removed the 12TB disk and rebooted...
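
For reference, the check can be run read-only on the unmounted device (a sketch; /dev/sdd1 is an assumption chosen to match the logs below, adjust to your own device):

[root@home disk]# btrfs check /dev/sdd1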

The result was a boot failure, as my btrfs device would not mount:

mount: wrong fs type, bad option, bad superblock on /dev/sdd1,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

So I ran (as the message suggests) dmesg | tail and got:

[ 2833.182505] BTRFS info (device sdd1): disk space caching is enabled
[ 2833.182515] BTRFS info (device sdd1): has skinny extents
[ 2833.321953] BTRFS warning (device sdd1): cannot mount because device replace operation is ongoing and
[ 2833.321962] BTRFS warning (device sdd1): tgtdev (devid 0) is missing, need to run 'btrfs dev scan'?
[ 2833.321969] BTRFS error (device sdd1): failed to init dev_replace: -5
[ 2833.339466] BTRFS: open_ctree failed

Well, I agree with the diagnosis; however, "btrfs replace cancel" requires a mount point, and the system refuses to mount... a dog chasing its own tail.

usage: btrfs replace cancel <mount_point>

I did many searches and did not find any viable solution. I searched for "replace operation is ongoing" and luckily found a page with the source code of dev-replace.c, where I found this block of code:

    /*
     * allow 'btrfs dev replace_cancel' if src/tgt device is
     * missing
     */
    if (!dev_replace->srcdev &&
        !btrfs_test_opt(dev_root, DEGRADED)) {
        ret = -EIO;
        pr_warn("btrfs: cannot mount because device replace operation is ongoing and\n"
                "srcdev (devid %llu) is missing, need to run 'btrfs dev scan'?\n",
                (unsigned long long)src_devid);
    }
    if (!dev_replace->tgtdev &&
        !btrfs_test_opt(dev_root, DEGRADED)) {
        ret = -EIO;
        pr_warn("btrfs: cannot mount because device replace operation is ongoing and\n"
                "tgtdev (devid %llu) is missing, need to run 'btrfs dev scan'?\n",
                (unsigned long long)BTRFS_DEV_REPLACE_DEVID);
    }

Now here is a small hint that the "official" way around the error is a degraded btrfs volume: the check on btrfs_test_opt(dev_root, DEGRADED) means the -EIO is only returned when the DEGRADED mount option is not set. Luckily, I was reading this page at the same time: Using Btrfs with Multiple Devices, where I read:

Replacing failed devices

Using btrfs replace

When you have a device that's in the process of failing, or has failed, in a RAID array, you should use the btrfs replace command rather than adding a new device and removing the failed one. This is a newer technique that worked for me when adding and deleting devices didn't; however, it may be helpful to consult the mailing list or IRC channel before attempting recovery.

First list the devices in the filesystem, in this example we have one missing device that we will replace with a new drive of the same size. In the following output we see that the final device number (which is missing) is device 6:

user@host:~$ sudo btrfs filesystem show
Label: none  uuid: 67b4821f-16e0-436d-b521-e4ab2c7d3ab7
     Total devices 6 FS bytes used 5.47TiB
     devid    1 size 1.81TiB used 1.71TiB path /dev/sda3
     devid    2 size 1.81TiB used 1.71TiB path /dev/sdb3
     devid    3 size 1.82TiB used 1.72TiB path /dev/sdc1
     devid    4 size 1.82TiB used 1.72TiB path /dev/sdd1
     devid    5 size 2.73TiB used 2.62TiB path /dev/sde1
     *** Some devices missing

This is not my exact situation, as I do not have "*** Some devices missing"; however, it's quite close. I read the following:

If the device is present then it's easier to determine the numeric device ID required.

Before replacing the device you will need to mount the array; if you have a missing device, you will need to use the following command:

sudo mount -o degraded /dev/sda1 /mnt

This was the way to mount a degraded btrfs in order to cancel an interrupted replace operation.
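
For completeness, the replace operation that gets interrupted in the first place is started with btrfs replace start. A sketch using the wiki's example (devid 6 for the missing device, and /dev/sdf as an assumed name for the new drive, not my actual devices):

user@host:~$ sudo btrfs replace start 6 /dev/sdf /mnt
user@host:~$ sudo btrfs replace status /mnt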

1 Answer

So here is the full solution:

[root@home disk]# mount /dev/sdd1 /store/backup_big_btrfs/
mount: wrong fs type, bad option, bad superblock on /dev/sdd1,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
[root@home disk]# dmesg | tail
[ 2833.182505] BTRFS info (device sdd1): disk space caching is enabled
[ 2833.182515] BTRFS info (device sdd1): has skinny extents
[ 2833.321953] BTRFS warning (device sdd1): cannot mount because device replace operation is ongoing and
[ 2833.321962] BTRFS warning (device sdd1): tgtdev (devid 0) is missing, need to run 'btrfs dev scan'?
[ 2833.321969] BTRFS error (device sdd1): failed to init dev_replace: -5
[ 2833.339466] BTRFS: open_ctree failed
[root@home disk]# btrfs replace cancel
btrfs replace cancel: too few arguments
usage: btrfs replace cancel <mount_point>

    Cancel a running device replace operation.

[root@home disk]# btrfs replace cancel /store/backup_big_btrfs
ERROR: not a btrfs filesystem: /store/backup_big_btrfs
[root@home disk]# mount -o degraded /dev/sdd1 /store/backup_big_btrfs/
[root@home disk]# btrfs replace cancel /store/backup_big_btrfs/
[root@home disk]# umount /store/backup_big_btrfs/
[root@home disk]# mount /dev/sdd1 /store/backup_big_btrfs/

When you can't mount btrfs due to a failed replace operation, then (the same steps as a condensed session follow the list):

  1. Check your btrfs volume
  2. Mount it with the option -o degraded
  3. Cancel the replace operation
  4. Unmount the btrfs volume
  5. Mount it without any option
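
As a condensed session (same assumed device /dev/sdd1 and mount point as above):

[root@home disk]# btrfs check /dev/sdd1
[root@home disk]# mount -o degraded /dev/sdd1 /store/backup_big_btrfs/
[root@home disk]# btrfs replace cancel /store/backup_big_btrfs/
[root@home disk]# umount /store/backup_big_btrfs/
[root@home disk]# mount /dev/sdd1 /store/backup_big_btrfs/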
