
My computer is running Ubuntu 18.04, but I am new to Linux.
I am using mdadm and created a RAID 5 with 3x 4 TB drives (sda, sdb, sdc).
The OS is on a separate SSD (sde).
No, I don't have a backup. Yes, it is stupid to think that everything will be okay. But I don't have enough space to back up my 8 TB RAID.

I wanted to have more space on my RAID, so I added another drive (sdd) following these instructions, and it worked.
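
Roughly, it was this sequence (from memory, so my exact commands may have differed slightly):

sudo mdadm --add /dev/md0 /dev/sdd
sudo mdadm --grow /dev/md0 --raid-devices=4
sudo resize2fs /dev/md0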

A problem I had was that my RAID disappeared after each reboot, so I thought that reinstalling mdadm could help.
After I did that, I rebooted and created the array with:

sudo mdadm --create --assume-clean --level=5 --raid-devices=4 /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd

(because "--assemble --scan" didn't work), the filesystem wasn't recognised anymore.

Then I created the array multiple times, because I read somewhere that this can happen if the drives are in the wrong order. This didn't work.

Checking the filesystem returned this:

challenger1304@hannes:~$ sudo fsck /dev/md0 
fsck from util-linux 2.31.1
e2fsck 1.44.1 (24-Mar-2018)
ext2fs_open2: Bad magic number in super-block
fsck.ext4: Superblock invalid, trying backup blocks...
fsck.ext4: Bad magic number in super-block while trying to open /dev/md0

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>

Then I tried to replace the bad superblock like the tool told me to. To see where the superblock backups might be, I ran "mkfs.ext4 -n /dev/md0" and tried most of them too.
That didn't work either.

After some more research I found a tool called "TestDisk", installed it, and ran it. It found a lot of partitions (I only had one on the RAID) with the correct label. There were no files in these partitions, and the tool couldn't recreate them.
I did that three times after recreating the array with the disks in a different order, with the same result every time.

Comment: I suspect that creating a new array destroys any old arrays that might have been on those disks. – Oct 12, 2018 at 10:18

1 Answer


Preface

I am using mdadm and created a RAID 5 with 3x 4 TB drives (sda, sdb, sdc)

Using RAID 5 with such big drives is asking for trouble. You can find many warnings against it all over the Internet, but I'll warn you once again: your data is in danger. Once one of the disks in a RAID 5 array fails, you are left with no redundancy at all. Once you insert a new spare and start the resync process, the chance that another disk experiences a read error is high. Once that happens, your data is gone.

Another suggestion: I always build RAID arrays out of partitions, not whole disks. This is something of a personal preference that I've followed ever since I learnt about mdadm RAID. In fact, all my disks contain more than one RAID (for example, a first partition in a RAID 1 holding the OS, then another one that is part of a bigger RAID).

No, I don't have a backup. Yes, it is stupid to think that everything will be okay. But I don't have enough space to back up my 8 TB RAID.

Fair enough. It means you don't care much about losing that data, especially during such a delicate operation as growing/reshaping a RAID array. As you can see on the Linux RAID Wiki page about growing an array, the first words right after the chapter title are:

BACK UP. BACK UP !! BACK UP !!!!

Back to business

Let's try to understand what happened during the steps you described here.

A problem I had was that my RAID disappeared after each reboot

This should have made you stop and say: what's wrong? Let me fix this before I lose all my data.

You probably forgot to populate, or rather to update, your /etc/mdadm/mdadm.conf file with a line similar to:

ARRAY /dev/md/0 metadata=1.2 UUID=deadbeef:deadbeef:deadbeef:deadbeef name=myhostname:0

which you can generate using

 mdadm --detail --scan

And you'd probably need your initramfs updated. On Debian and derivatives like Ubuntu, you can do that with

update-initramfs -u
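
Putting the two together, a minimal sketch (assuming the Debian/Ubuntu paths; review mdadm.conf afterwards so you don't end up with duplicate or stale ARRAY lines):

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u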

so I thought that reinstalling mdadm could help.

No help at all.

After I did that, I rebooted and created the array with:

sudo mdadm --create --assume-clean --level=5 --raid-devices=4 /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd

Oh no. You have now overwritten the RAID superblock on all four of your disks. Your data is still there, but you have overwritten the information that tells mdadm the layout of your RAID array. Assuming that the old superblock and the new one were the same version (1.2), you could have recovered from this situation by restoring the RAID superblocks from a backup of the original ones.
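
For reference, that per-device superblock is what mdadm --examine shows; you can dump it for a member disk with something like:

mdadm --examine /dev/sda

Its output (metadata version, array UUID, chunk size, layout, device role) is exactly the information a recovery would have needed from the original superblocks.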

(because "--assemble --scan" didn't work)

This should have stopped you from proceeding and made you search for more information about the error returned. Most probably, the layout of the data on the drives was not what mdadm was expecting, or something similar. Not a big problem per se.
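
In that situation, the first step would have been to ask mdadm why the assembly fails, for example:

mdadm --assemble --scan --verbose

and to check /proc/mdstat and the kernel log for hints, instead of recreating anything.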

Then I created the array multiple times, because I read somewhere that this can happen if the drives are in the wrong order. This didn't work.

You overwrote the RAID superblock on all the disks, again and again. No change to your data (assuming the old and new RAID superblocks were the same version).

Then I tried to replace the bad superblock like the tool told me to.

Oh dear, no no no! You started corrupting your data here. There's no chance to recover all your data intact now, since you wrote that data to an unknown position (with respect to the original layout of the grown array).

After some more research I found a tool called "TestDisk", installed it, and ran it. It found a lot of partitions (I only had one on the RAID) with the correct label. There were no files in these partitions, and the tool couldn't recreate them.

That tool reads the block device (the RAID array) directly, without going through a filesystem, and is probably looking at data made up of blocks that are in the wrong order. Totally meaningless.

Now What?

I think you can say goodbye to the old data. But since you only wrote a little data to the array, there's some chance that you can recover most of it, once you find out how the original, grown array was set up. This includes finding out (see the sketch after this list):

  • The size of the array
  • The array layout (left-symmetric, etc.)
  • The stripe size
  • The disk order
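
If you still want to try, the usual approach is to probe --create permutations and check the result read-only, ideally on overlay copies of the disks (the Linux RAID wiki describes how to set up such overlays) so you don't keep rewriting the real superblocks. A very rough sketch, assuming the default 512K chunk and metadata 1.2 (both are guesses here):

# one candidate disk order; repeat with other orders / chunk sizes
mdadm --stop /dev/md0
mdadm --create /dev/md0 --assume-clean --level=5 --chunk=512 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
fsck.ext4 -n /dev/md0    # read-only check: a mostly clean result suggests the right layout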

Without a backup of the RAID superblocks, that will be really difficult. Unfortunately, you have learned the "make a backup" lesson the hard way. Good luck!
