Recovering software raid

Question

I have a 3-disk software RAID 5. Two disks appear to have failed at the same time: their Events counts are identical, while the third disk's is higher. I have copied all three disks into new partitions so that I can experiment without damaging the originals further, and tried recreating the array with just the two failed disks (since they should be in the same state). But nothing I've tried gets me a usable superblock. Are there other things I can try to recover the data?

Here is the mdadm --examine for each drive:

/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : f8d0c619:9f54ad08:bd0b98c0:101144a1
  Creation Time : Sun Jul 18 01:56:33 2010
     Raid Level : raid5
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sat Sep 27 13:59:35 2014
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 2
  Spare Devices : 0
       Checksum : cbf4174b - correct
         Events : 5983

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
/dev/sdd1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : f8d0c619:9f54ad08:bd0b98c0:101144a1
  Creation Time : Sun Jul 18 01:56:33 2010
     Raid Level : raid5
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sat Sep 27 08:00:42 2014
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : cbf3c2d6 - correct
         Events : 5940

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1
/dev/sde1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : f8d0c619:9f54ad08:bd0b98c0:101144a1
  Creation Time : Sun Jul 18 01:56:33 2010
     Raid Level : raid5
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sat Sep 27 08:00:42 2014
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : cbf3c2e8 - correct
         Events : 5940

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       49        2      active sync   /dev/sdd1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1
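
To compare the members at a glance, the event counters can be pulled out of the mdadm --examine output. This is only a minimal sketch: summarize_events is a hypothetical helper name, and it assumes the version 0.90 output layout shown above (a "/dev/...:" header line followed by an "Events : N" line).

```shell
#!/bin/sh
# Hypothetical helper: read `mdadm --examine` output on stdin and print
# one "<device> <events>" line per member.
summarize_events() {
    awk '
        # Header line such as "/dev/sdc1:" starts a new member.
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # "Events : 5983" splits into the fields "Events", ":", "5983".
        $1 == "Events" { print dev, $3 }
    '
}
```

Typical use would be: mdadm --examine /dev/sdc1 /dev/sdd1 /dev/sde1 | summarize_events. Members that print the same number were last updated together.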

wurtel · Accepted Answer · 2014-10-27 12:14:50Z


mdadm --create --level 5 -n 3 --assume-clean -p ls -c 64 /dev/md0 missing /dev/sdc1 /dev/sdd1

This creates the RAID the same way it was originally created, with one device missing. Most importantly, the number of devices, the layout, the chunk size, and the order of the devices are the same. Specifying a missing device (together with --assume-clean) prevents any reinitialization, so you should be able to access the data again.

After creating the device again you can add a third disk to replace the missing one.
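
One cautious way to sequence that last step is to check the filesystem read-only before adding the replacement disk, since adding it starts a rebuild. The following is a sketch under assumptions, not a tested procedure: /dev/sdX1 is a placeholder for the replacement partition, run and verify_then_add are hypothetical helper names, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: check the recreated array read-only, and add the
# replacement device only if the check reports a clean filesystem.
verify_then_add() {
    md=$1
    new=$2
    # e2fsck -n opens the filesystem read-only and answers "no" to all prompts.
    run e2fsck -n "$md" || return 1
    run mdadm --manage "$md" --add "$new"
    # The rebuild progress is visible in /proc/mdstat.
    run cat /proc/mdstat
}
```

For example, verify_then_add /dev/md0 /dev/sdX1 prints the three commands in order. Setting DRY_RUN=0 would execute them, which should only be done once the read-only check looks right on the copies.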


I'm a bit averse to trying this on the original disks yet, but I tried it on my copies with: sudo mdadm --create --level 5 -n 3 --assume-clean -p ls -c 64 /dev/md1 missing /dev/sdb2 /dev/sdb3. This creates without complaint, but e2fsck can't find the superblock.
– Jorenko
Commented Oct 28, 2014 at 0:21
The copies need to be on a device that is exactly the same size as the original, as version 0.90 metadata is located at the end of the device. If you copied to a larger device, then the existing metadata can't be found. Using losetup with the --sizelimit option to create a loop device on your copies with the correct size could help. mdadm --examine on your copy should show the same information as you originally posted.
– wurtel
Commented Oct 28, 2014 at 8:29
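
The size constraint described in this comment can be illustrated with a little arithmetic. As far as I know, version 0.90 metadata sits in the last 64 KiB-aligned 64 KiB block of the device, so a copy that is larger than the original shifts the place where mdadm looks. The sketch below assumes that layout; sb_offset_090 and attach_trimmed are hypothetical helper names, and the original partition size would have to be read from the original disk (for instance with blockdev --getsize64).

```shell
#!/bin/sh
# Byte offset at which a 0.90 superblock is expected, given the device size
# in bytes: round the size down to a 64 KiB boundary, then step back 64 KiB.
sb_offset_090() {
    size_bytes=$1
    echo $(( size_bytes / 65536 * 65536 - 65536 ))
}

# Hypothetical helper (not invoked here): expose a copy through a loop device
# limited to the original partition size, so the metadata is found again.
#   $1 = file or partition holding the copy
#   $2 = size of the original partition in bytes
attach_trimmed() {
    copy=$1
    orig_bytes=$2
    losetup --find --show --sizelimit "$orig_bytes" "$copy"
}
```

With these definitions, sb_offset_090 gives different answers for an original and a larger copy, which is why mdadm --examine finds nothing on an oversized copy. attach_trimmed prints the loop device name, and that device is what mdadm should then be pointed at.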
Yes, I had this problem the first time I created my copies. They are the correct size now, and mdadm --examine does show the same result.
– Jorenko
Commented Oct 28, 2014 at 10:41
The only thing I can think of is that the order of the component devices might not be correct, or the --assume-clean needs to be left off with a missing component. I've used this technique on a number of occasions (usually after someone pulled the wrong hotswap disk with the intent of replacing a failed drive, i.e. getting confused about which drive had failed), and never had problems.
– wurtel
Commented Oct 28, 2014 at 12:02
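
If the device order is in doubt, the candidate orders can be listed and tried one at a time on the copies only. The sketch below just prints the commands; candidate_orders is a hypothetical helper name. It also pins --metadata=0.90, which is an assumption not present in the command above: newer mdadm releases default to version 1.2 metadata, which places the data at a different offset, so an array recreated without it may not expose the old filesystem.

```shell
#!/bin/sh
# Hypothetical helper: print one `mdadm --create` command for each way of
# arranging two surviving members plus one "missing" slot (6 orders).
# Nothing is executed; run a printed command by hand, on copies only.
candidate_orders() {
    a=$1
    b=$2
    for order in \
        "missing $a $b" "missing $b $a" \
        "$a missing $b" "$b missing $a" \
        "$a $b missing" "$b $a missing"
    do
        echo "mdadm --create /dev/md1 --metadata=0.90 --level=5 --raid-devices=3 --assume-clean --layout=left-symmetric --chunk=64 $order"
    done
}
```

For example, candidate_orders /dev/sdb2 /dev/sdb3 prints six commands. After each attempt, a read-only check such as e2fsck -n /dev/md1 shows whether the filesystem is recognizable, and mdadm --stop /dev/md1 releases the array before the next order is tried.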
