
I recently upgraded a server from Fedora 6 to Fedora 14. In addition to the main hard drive where the OS is installed, I have three 1 TB hard drives configured as a software RAID5 array. After the upgrade, I noticed that one of the hard drives had been removed from the RAID array. I tried to add it back with mdadm --add, but it was only added as a spare. I figured I'd get back to it later.

Then, when performing a reboot, the system could not mount the raid array at all. I removed it from the fstab so I could boot the system, and now I'm trying to get the raid array back up.

I ran the following:

mdadm --create /dev/md0 --assume-clean --level=5 --chunk=64 --raid-devices=3 missing /dev/sdc1 /dev/sdd1

I know my chunk size is 64k, and "missing" is for the drive that got kicked out of the array (/dev/sdb1).

That seemed to work, and mdadm reports that the array is running "clean, degraded" with the missing drive.

However, I can't mount the raid array. When I try:

mount -t ext3 /dev/md0 /mnt/foo

I get:

mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

and /var/log/messages shows:

EXT3-fs (md0): error: can't find ext3 filesystem on dev md0.

Does anyone have any ideas of what to try next?

  • Have you tried mounting two of the three drives? I'm not sure if RAID-5 will work with just one of three drives. Commented Jun 7, 2011 at 20:14
  • Yeah, that's what I did with the mdadm --create command. You'll notice at the end I specified the three drives as "missing," "/dev/sdc1," and "/dev/sdd1." Also, mdadm --query --detail /dev/md0 reports the array as "clean, degraded," which is what I would expect for a RAID5 array missing one drive.
    – jstevej
    Commented Jun 7, 2011 at 20:25

2 Answers


You may have put "missing" in the wrong position. The array only works when the drives and the missing slot are listed in their original positions. Just run:

mdadm --examine  /dev/sdb1

This will output (among other things) which slot number in the RAID is really missing. Look for this line:

      Number   Major   Minor   RaidDevice State
this     0     253       13        0      active sync   /dev/dm-13

In this case, it is number 0 (the first device), and it shows as active because my RAID is online right now. Now you know which slot should be specified as missing.
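
If you want to pull just that slot number out of the output, something like the following might do. It is only a sketch: raid_slot is a made-up helper name, and it assumes the "this" row has the column layout shown above (older 0.90-style superblocks); other metadata versions print the information differently.

```shell
# Print the RaidDevice column of the "this" row from `mdadm --examine`
# output read on stdin. Assumes the 0.90-style table shown above:
#   this  Number  Major  Minor  RaidDevice  State ...
raid_slot() {
    awk '$1 == "this" { print $5 }'
}

# Usage (needs root and a real member device):
#   mdadm --examine /dev/sdb1 | raid_slot
```

The helper only reads text, so it cannot change anything on the disks.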

But you still have two choices: the order of the two working drives may also need to be swapped. That information is lost, however, because your re-creation attempt overwrote it.
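
To keep track of which orderings are left to try, a small dry-run helper can list them. This is a sketch only: candidate_orders is a hypothetical name, the create options are copied from the question, and the function just prints commands without running them. Keep in mind that every real --create rewrites the superblocks.

```shell
# Dry run: print one mdadm --create command for each ordering of three
# slots. Nothing is executed; copy a line and run it by hand after review.
candidate_orders() {
    a=$1 b=$2 c=$3
    for order in "$a $b $c" "$a $c $b" "$b $a $c" "$b $c $a" "$c $a $b" "$c $b $a"; do
        echo "mdadm --create /dev/md0 --assume-clean --level=5 --chunk=64 --raid-devices=3 $order"
    done
}

candidate_orders missing /dev/sdc1 /dev/sdd1

# After each manual attempt, a read-only check such as
#   fsck.ext3 -n /dev/md0
# avoids writing to the array; stop it with `mdadm --stop /dev/md0`
# before trying the next ordering.
```

Once the missing slot is known from --examine, only the two lines with "missing" in that position are relevant.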

  • I tried your suggestion; in fact, I tried all 6 possible combinations of drive order (missing sdc1 sdd1, missing sdd1 sdc1, sdc1 missing sdd1, sdd1 missing sdc1, sdc1 sdd1 missing, sdd1 sdc1 missing), but all of them gave me the same mount error.
    – jstevej
    Commented Jun 8, 2011 at 11:59

One of the things I've found is that mdadm --create /dev/md0 --assume-clean will only work correctly if you use the same (or a close) version of mdadm as the one that created the original array. That is because different versions use different offsets for data and metadata, even with the same superblock version (like 1.2).

The problem is that mdadm will always report that it recreated the array just fine, but the data contained in /dev/md0 will be wrong.

For example, using the recent mdadm 3.3.2 or even the earlier 3.2.5 didn't work for me, but falling back to mdadm 3.1.4 (which had created the array) worked just fine.
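
One way to see the offset a given mdadm version ended up using is to read it back from the superblock. The helper below is a rough sketch: data_offset is a made-up name, and it assumes --examine prints a line of the form "Data Offset : 2048 sectors", which is what 1.x superblocks show. Once an array has been recreated, the value shown is the one written by the recreating mdadm, so it is only useful for comparison against a record of the original.

```shell
# Print the data offset (in sectors) from `mdadm --examine` output on stdin.
# Assumes a line like "    Data Offset : 2048 sectors" (1.x metadata).
data_offset() {
    awk -F: '/Data Offset/ { split($2, f, " "); print f[1] }'
}

# Usage (needs root and a real member device):
#   mdadm --version
#   mdadm --examine /dev/sdc1 | data_offset
```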

Note that I also took care to specify the drives in the correct order (as reported by mdadm --examine /dev/sd?) when recreating the array, and I used overlay files for all the testing (so as not to increase the damage), following the instructions at https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
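
Roughly, the overlay idea is a device-mapper snapshot per member disk, so that all writes land in a sparse file instead of on the real drive. The sketch below only prints the commands for one device and runs nothing. overlay_cmds, the overlay-* names, the 4G overlay size, the /dev/loopN placeholder and the sector count in the example call are all assumptions for illustration; follow the wiki page for the real procedure.

```shell
# Dry run: print the commands that would put a copy-on-write overlay on
# one device. $1 = device, $2 = its size in 512-byte sectors
# (for a real device, `blockdev --getsz <device>` reports this).
overlay_cmds() {
    dev=$1 sectors=$2
    name=$(basename "$dev")
    echo "truncate -s 4G overlay-$name"
    echo "losetup -f --show overlay-$name   # prints the loop device to use as /dev/loopN"
    echo "echo '0 $sectors snapshot $dev /dev/loopN P 8' | dmsetup create overlay-$name"
}

# Hypothetical example; the sector count is made up.
overlay_cmds /dev/sdc1 1953520002
```

The recreate attempts would then go against /dev/mapper/overlay-sdc1 and its siblings rather than the raw partitions.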

In my case the problem was a 6-disk RAID5 that was being grown to 7 disks. The reshape didn't progress at all, so it was aborted, and after that the array wouldn't assemble anymore, failing with "mdadm: Failed to restore critical section for reshape, sorry." Neither --force nor --invalid-backup helped, so I had to use --create --assume-clean.
