
I have a 4 x 3 TB NAS set up as RAID 5, which has been working great for almost a year.

After a recent abrupt shutdown (I had to hit the power button), the RAID will no longer mount on boot.

I've run:

mdadm --examine /dev/sd[a-d] >> raid.status

The output is below:

/dev/sda:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 7d2a94ca:d9a42ca9:a4e6f976:8b5ca26b
Name : BruceLee:0 (local to host BruceLee)
Creation Time : Mon Feb 4 23:07:01 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 8790405888 (8383.18 GiB 9001.38 GB)
Used Dev Size : 5860270592 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 2c1e0041:21d926d6:1c69aa87:f1340a12

Update Time : Sat Dec 27 20:54:55 2014
Checksum : d94ccaf5 - correct
Events : 17012

Layout : left-symmetric
Chunk Size : 128K

Device Role : Active device 0
Array State : AAA. ('A' == active, '.' == missing)
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 7d2a94ca:d9a42ca9:a4e6f976:8b5ca26b
Name : BruceLee:0 (local to host BruceLee)
Creation Time : Mon Feb 4 23:07:01 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 8790405888 (8383.18 GiB 9001.38 GB)
Used Dev Size : 5860270592 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : a0261c8f:8a2fbb93:4093753a:74e7c5f5

Update Time : Sat Dec 27 20:54:55 2014
Checksum : 7b84067b - correct
Events : 17012

Layout : left-symmetric
Chunk Size : 128K

Device Role : Active device 1
Array State : AAA. ('A' == active, '.' == missing)
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 7d2a94ca:d9a42ca9:a4e6f976:8b5ca26b
Name : BruceLee:0 (local to host BruceLee)
Creation Time : Mon Feb 4 23:07:01 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 8790405888 (8383.18 GiB 9001.38 GB)
Used Dev Size : 5860270592 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 9dc56e9e:d6b00f7a:71da67c7:38b7436c

Update Time : Sat Dec 27 20:54:55 2014
Checksum : 749b3dba - correct
Events : 17012

Layout : left-symmetric
Chunk Size : 128K

Device Role : Active device 2
Array State : AAA. ('A' == active, '.' == missing)
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 7d2a94ca:d9a42ca9:a4e6f976:8b5ca26b
Name : BruceLee:0 (local to host BruceLee)
Creation Time : Mon Feb 4 23:07:01 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 8790405888 (8383.18 GiB 9001.38 GB)
Used Dev Size : 5860270592 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 81e5776f:2a466bee:399251a0:ab60e9a4

Update Time : Sun Nov 2 09:07:02 2014
Checksum : cb4aebaf - correct
Events : 159

Layout : left-symmetric
Chunk Size : 128K

Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing)

When checking the disks in the Ubuntu Disks utility, sda, sdb, and sdc show as OK, and sdd shows as OK with 64 bad sectors.

If I run fsck /dev/md0

It reads:

fsck.ext2: Invalid argument while trying to open /dev/md0

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem. If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
or
e2fsck -b 32768 <device>

If I run

mdadm --examine /dev/sd[a-d] | egrep 'Event|/dev/sd'

I get:

/dev/sda:
Events : 17012
/dev/sdb:
Events : 17012
/dev/sdc:
Events : 17012
/dev/sdd:
Events : 159

If I run cat /proc/mdstat I get:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb[1](S) sdc[2](S) sdd[3](S) sda[0](S)
      11720542048 blocks super 1.2

unused devices: <none>

Lastly, running file -s /dev/md0

I get:

/dev/md0: empty

Basically, I think I need to run --assemble on the RAID, but I'm afraid of losing my data, and that 4th drive also concerns me a little.

Could someone advise on the best next steps to get this up and running again?

  • the output is badly formatted (missing newlines), can you edit it? Also include /proc/mdstat and file -s /dev/md0... Commented Dec 28, 2014 at 14:04
  • Hi frostschutz, thanks for helping out. I now know Stack Exchange allows basic HTML, which is better for me. I'm itching to run --assemble, but I'm trying to bide my time to make sure anything I attempt isn't going to cost me my data. Any further help would be greatly appreciated. Thanks, Adam
    – Adamation
    Commented Dec 29, 2014 at 16:03

1 Answer


I have had the most success with the following approach:

# mdadm --stop /dev/md0
# mdadm --create /dev/md0 --metadata=1.2 --level=5 --raid-devices=4 --chunk=128 --layout=left-symmetric /dev/sda /dev/sdb /dev/sdc missing

This recreates the array with the same parameters that were originally used. The missing keyword causes it to be created in degraded mode, so no resync of the disks will occur. You can then check that the filesystem is intact (modulo the unclean shutdown); if so, you can proceed by adding /dev/sdd back to the array:

# mdadm --add /dev/md0 /dev/sdd

mdadm will then rebuild /dev/sdd from the other three members of the array.
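For the "check that the filesystem is intact" step above, a non-destructive verification could look like the sketch below; it also confirms that the recreated array kept the old parameters (in particular the 262144-sector data offset shown in your --examine output). This assumes the filesystem is ext2/3/4, as your fsck output suggests, and uses /mnt as a scratch mount point:

# mdadm --detail /dev/md0
# mdadm --examine /dev/sda | grep -i Offset
# fsck.ext4 -n /dev/md0
# mount -o ro /dev/md0 /mnt

fsck's -n option answers "no" to every prompt, so it changes nothing on disk; mounting read-only lets you spot-check a few files before committing to the rebuild.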

Of course, you may prefer to first try a --stop followed by a --assemble, but the above has worked for me in the past after the wrong disk was hot-swapped following another disk's failure.
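If you do want to try assembly first, a sketch (assuming the three disks with matching event counts are healthy) is to stop the inactive array and force-assemble only those three members, leaving the stale /dev/sdd out:

# mdadm --stop /dev/md0
# mdadm --assemble --force --run /dev/md0 /dev/sda /dev/sdb /dev/sdc
# cat /proc/mdstat

If md0 comes up active but degraded, you can run the same read-only checks as above and then --add /dev/sdd to start the rebuild.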

Note that your /dev/sdd had apparently already been offline for almost two months (its Update Time is Nov 2, while the other three read Dec 27). I recommend using a monitoring script that notifies you about md failures; on Debian that's provided automatically by the /etc/cron.daily/mdadm script, which basically does:

mdadm --monitor --scan --oneshot

It can also be done by mdadm running as a daemon:

mdadm --monitor --pid-file /run/mdadm/monitor.pid --daemonise --scan --syslog

You can provide an email address to receive alerts with --mail <address>; of course, your system needs to be able to send email in that case.
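Alternatively, the address can live in mdadm's configuration file, where both the cron job and the monitoring daemon will pick it up. A minimal sketch, using the Debian/Ubuntu path and a placeholder address:

MAILADDR admin@example.com

That single line goes in /etc/mdadm/mdadm.conf (other distributions typically use /etc/mdadm.conf); mdadm --monitor falls back to it when no --mail option is given on the command line.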
