
I have a Synology expansion unit (DX213) connected to my NAS. It houses two 2 TB disks in a RAID0 configuration (awful idea, I know, and I don't need a reminder ;) ). Last weekend the array failed, and I can no longer start it.

I am starting to believe the issue originated in the backplane (the DX213) and not in the disks, because they look fine. They are definitely not dead (yet). I have them connected to a Linux machine and I can see them just fine:

$ sudo fdisk -l /dev/sdb
Disk /dev/sdb: 1.8 TiB, 2000396746752 bytes, 3907024896 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000a85dd

Device     Boot   Start        End    Sectors  Size Id Type
/dev/sdb1           256    4980735    4980480  2.4G 83 Linux
/dev/sdb2       4980736    9175039    4194304    2G 82 Linux swap / Solaris
/dev/sdb3       9437184 3907024064 3897586881  1.8T 83 Linux

$ sudo fdisk -l /dev/sdc
Disk /dev/sdc: 1.8 TiB, 2000396746752 bytes, 3907024896 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0004dd4e

Device     Boot   Start        End    Sectors  Size Id Type
/dev/sdc1           256    4980735    4980480  2.4G 83 Linux
/dev/sdc2       4980736    9175039    4194304    2G 82 Linux swap / Solaris
/dev/sdc3       9437184 3907024064 3897586881  1.8T 83 Linux

When examining the disks, mdadm still recognizes the RAID array and both disks report a clean state, but the superblocks on the two disks are clearly out of sync.

$ sudo mdadm --examine /dev/sd[bc]3 
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 1d7dd58f:dd7dd3d2:b646173b:afd51417
           Name : mist-nas:2
  Creation Time : Tue Nov 26 19:47:24 2013
     Raid Level : raid0
   Raid Devices : 2

 Avail Dev Size : 3897584833 (1858.51 GiB 1995.56 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 46933df7:36901a5b:7a1239fe:e999c419

    Update Time : Sat Aug 27 20:14:12 2016
       Checksum : 42117b5b - correct
         Events : 8

     Chunk Size : 64K

   Device Role : Active device 0
   Array State : A. ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 1d7dd58f:dd7dd3d2:b646173b:afd51417
           Name : mist-nas:2
  Creation Time : Tue Nov 26 19:47:24 2013
     Raid Level : raid0
   Raid Devices : 2

 Avail Dev Size : 3897584833 (1858.51 GiB 1995.56 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : e4b60f4c:604b2e27:359cb71b:24453937

    Update Time : Tue Nov 26 19:47:24 2013
       Checksum : 997fa41a - correct
         Events : 4

     Chunk Size : 64K

   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

The only differences are the last update timestamp, the event count, and the array state (A. versus AA). I know that no write operations were ongoing when the array went down and both disks are in a clean state, so I am quite confident I can still access my data. To recover, though, I will have to recreate the array or fiddle with the faulty superblock, and that gives me the creeps, to say the least...

I have cloned both drives with dd to new drives in order to have a backup in case I do something stupid. The new drives have 4096-byte sectors though (they are 3 TB and 4 TB disks), whereas the old drives have 512-byte sectors. The sd[bc]3 partition does not span a whole number of 4096-byte sectors, so I had to round its size up to the next 4K sector. I hope that is not a problem?
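For reference, each clone was a plain dd copy along these lines (the target device name /dev/sdd is only an illustration, not necessarily what your system will assign):

$ sudo dd if=/dev/sdb of=/dev/sdd bs=1M status=progress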

The command I am considering running is:

$ sudo mdadm --create --readonly --assume-clean --level=0 -n2 /dev/md2 /dev/sdb3 /dev/sdc3

This command will probably overwrite the current superblocks, so I want to be absolutely sure it will not destroy my chances of getting my data back. What will be the result of this command?

I would also like to validate my strategy before really acting. I created 2 4GB partitions on a USB key, created a RAID0 array with them, created an EXT4 filesystem on the array, mounted it and copied some files on it. The question is how I can manipulate the superblock of one of the partitions to recreate the situation I have with the 4TB array.
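For completeness, the test rig was set up roughly like this (the USB key device /dev/sdd, the array name /dev/md9 and the mount point are placeholders):

$ sudo mdadm --create /dev/md9 --level=0 --raid-devices=2 /dev/sdd1 /dev/sdd2
$ sudo mkfs.ext4 /dev/md9
$ sudo mkdir -p /mnt/test && sudo mount /dev/md9 /mnt/test
$ sudo cp -r ~/some-test-files /mnt/test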

I was considering using a hex editor to manipulate the superblock manually, but then I would probably also need to recalculate the checksum. How should I do this?

2 Answers


You should remove the drive from the array, remove it from the system, re-probe for disks, and then re-add the drive to the array.

Remove the failed drive from the array with

mdadm --manage /dev/mdX --set-faulty /dev/sdXY

Physically remove and re-insert the drive (or do it in software with a device delete followed by a SCSI host rescan).

Now check whether the drive is found again and whether it works correctly. Look at the dmesg output or at /proc/partitions, and run a read test over the device, for example pv < /dev/sdX > /dev/null.

Then re-add the drive to the array with mdadm --manage /dev/mdX --re-add /dev/sdXY.

Then do a final check with cat /proc/mdstat to see if you succeeded.
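Putting it together, the sequence is roughly this (the array name /dev/md2, the member /dev/sdc3, the disk /dev/sdc and the SCSI host host0 are placeholders; adapt them to your system):

$ sudo mdadm --manage /dev/md2 --set-faulty /dev/sdc3
$ sudo mdadm --manage /dev/md2 --remove /dev/sdc3
$ echo 1 | sudo tee /sys/block/sdc/device/delete           # drop the disk from the system
$ echo "- - -" | sudo tee /sys/class/scsi_host/host0/scan  # re-probe for disks
$ dmesg | tail && cat /proc/partitions                     # confirm the disk is back
$ sudo pv /dev/sdc > /dev/null                             # read-test the whole disk
$ sudo mdadm --manage /dev/md2 --re-add /dev/sdc3
$ cat /proc/mdstat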


I managed to get my data back, though not in a trivial way (spoiler alert: it involved a hex editor and some reverse engineering). I am just posting my approach for future reference.

So my RAID0 array is broken because of non-matching superblocks. Since there is no redundancy in RAID0, mdadm cannot start a RAID0 array unless all superblocks match up. My disks looked fine but the superblocks were out of sync.

Solution: make the superblocks match again.

First idea: run the mdadm --create command from the question above. It should recreate the RAID array exactly as it was before, but it will overwrite the current superblocks.

Assessment of the first idea: risky. There is no guarantee that mdadm will recreate the array exactly as before. Maybe I'd forget some parameters, maybe mdadm would write to places other than those I intend, destroying the underlying filesystem and data, or something else might go wrong.

Conclusion: bad idea.

Second idea: manipulate the superblocks myself using a hex editor.

Pros:

  • I am in control: unless I make a stupid mistake, no bytes will be touched other than the ones that need to change.
  • Only the non-matching values in the superblock will be modified, so the layout of the array will not be affected.

Challenges:

  • Where is the superblock written on disk?
  • What does it look like?
  • Can I identify the correct bytes and reconstruct the output of mdadm --examine from reading the hex values?
  • Changing attributes will invalidate the superblock checksum; how do I obtain a valid checksum?

As it turns out, these challenges are quite easy to overcome. There is a great page on the linux-raid wiki: https://raid.wiki.kernel.org/index.php/RAID_superblock_formats. It documents the v1 superblock format and where to find it on a disk. A v1.2 superblock is located 4K from the beginning of the device and occupies the following 4K (it is aligned that way because newer disks use 4K sectors, even though my disks have 512-byte sectors).
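As a quick sanity check before editing anything, you can dump that 4K region and look for the md magic right at its start; the magic a92b4efc from the --examine output is stored little-endian, so it shows up as fc 4e 2b a9 (device name as in my setup):

$ sudo dd if=/dev/sdb3 bs=4096 skip=1 count=1 2>/dev/null | xxd | head -4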

You can also refer to the source code for the v1 superblock, which isn't too hard to read: https://github.com/neilbrown/mdadm/blob/master/super1.c

After careful analysis, I settled on this plan:

  1. First, back up the first 8K of every disk. This way I can always go back to the original state.

    dd if=/dev/sdXY of=sdXY.backup bs=1 count=8K

  2. Extract the superblock of every disk. This can easily be done with dd:

    dd if=/dev/sdXY of=sdXY.superblock bs=1 count=4K skip=4K

  3. Open the superblock in a hex editor. I found the web-based http://hexed.it to be very good.

  4. Modify the necessary attributes and leave the checksum as is for now. Be careful when modifying timestamps: a standard Linux timestamp is 32 bits (4 bytes), but an mdadm timestamp takes up 64 bits (8 bytes), so do not forget the other 4 bytes. The superblock is 256 bytes plus 2 bytes for each member of the array; those last bytes are a sequence of member IDs or roles. (See the offset sketch after this list for the fields I used.)

  5. Write the superblock to disk.

    dd if=sdXY.superblock of=/dev/sdXY bs=1 count=4K seek=4K

  6. Examine the superblock with mdadm --examine /dev/sdXY. It will show you that the checksum is invalid, but will also show you the expected checksum.

  7. Modify the checksum to the expected one. The bytes are stored in reverse (little-endian) order, so `99 7F A4 1A` becomes `1A A4 7F 99` in the hex editor.

  8. Write the new superblock to disk with the same command as step 5.

  9. Repeat for every disk.
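For orientation inside the hex editor, these are the offsets I relied on; they are taken from the wiki page and super1.c linked above, so double-check them against your own dump (e.g. with xxd sdXY.superblock):

    0x00    magic, stored little-endian on disk as fc 4e 2b a9
    0xC0    update time (utime, 64 bit)
    0xC8    event count (64 bit)
    0xD8    superblock checksum (32 bit)
    0x100   start of the device roles table (2 bytes per member)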

Once both superblocks matched up, I was able to start the array again. I checked the filesystem and it appeared to be clean. I mounted it and copied everything to a RAID5 array, which I will also put behind a UPS very soon.
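For the record, the final checks were along these lines (the array name and mount point are placeholders; the read-only flags are there purely out of caution):

$ sudo mdadm --assemble --readonly /dev/md2 /dev/sdb3 /dev/sdc3
$ sudo fsck -n /dev/md2
$ sudo mount -o ro /dev/md2 /mnt/recovery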

I got very lucky and won't forget these scary moments. Throughout, I kept calm and kept thinking about how I could reassemble the array.

I would strongly advise against playing around with a broken array before thoroughly analysing the problem. Also, I wrote my plan down before starting so that I would not skip a step and risk losing data.
