
We have a RAID-1 setup. We changed the old 1TB disks for 3.7TB disks. The disk changes worked without a problem, the disks are in sync. We did have to change the partition type from MBR to GPT, but that didn't affect the RAID bitmaps.

We reshaped the partition of one disk, waited for it to sync, then reshaped the partition of the second one, and waited again for the sync.

Now, our disks (sda and sdc) are used in multiple RAIDs, namely md0, md1 and md2.

The new disks are in sync, no problem, but the RAID arrays still see the original 1TB of space. We followed the instructions to grow the array, but only for md2, as we didn't need to grow md0 and md1. The instructions are adapted from https://raid.wiki.kernel.org/index.php/Growing

mdadm --grow /dev/md2 --bitmap none
mdadm --grow /dev/md2 --size=max
mdadm --wait /dev/md2
mdadm --grow /dev/md2 --bitmap internal

I noticed that the "--bitmap none" removed bitmaps on md0, md1 and md2.

The first wait was very short, and we then realized that the OS had not yet seen that the disks were bigger. We ran partprobe and repeated the four commands, and this time we had a proper wait while the disks synced. But the last command still fails:

~# mdadm --grow /dev/md2 --bitmap internal
mdadm: Cannot add bitmap while array is resyncing or reshaping etc.
mdadm: failed to set internal bitmap

The mdadm --detail /dev/md2 shows that everything is fine, except that the Consistency Policy is resync instead of bitmap. Same thing with /proc/mdstat.

This is on Debian Bullseye, so using mdadm v4.1.
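
For anyone retracing the partprobe step: here is a minimal sketch of how one might check the size the kernel currently reports for a member partition before running --size=max. The helper name part_size_bytes and the partition name sda3 are made up for illustration (the actual member partitions are not listed in this post). It relies on the sysfs size file, which is expressed in 512-byte units.

```shell
#!/bin/sh
# part_size_bytes DEV [SYSFS_ROOT]
# Print the size in bytes that the kernel currently reports for a block
# device or partition. DEV is a kernel name such as "sda3" (hypothetical
# here). SYSFS_ROOT defaults to /sys; it is a parameter only so the
# function can be tried against a fake directory tree.
part_size_bytes() {
    dev=$1
    sysfs_root=${2:-/sys}
    # The sysfs "size" attribute is always counted in 512-byte units.
    sectors=$(cat "$sysfs_root/class/block/$dev/size") || return 1
    echo $((sectors * 512))
}

# Example (assumed partition name, run before and after partprobe):
#   part_size_bytes sda3
```

If the number is still around 1TB after the partition was enlarged, the kernel has not re-read the partition table yet, and the grow has nothing to expand into.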

I am aware that there is missing information here, but I'm not sure what to look for or report to diagnose and fix this issue. Any idea where to look?

1 Answer


Well, we had to do a cold reboot of the machine to run the new kernel version.

During the shutdown, we got a brief message that a RAID process was stuck and not responding, before that process was forcefully killed. We didn't have time to jot down which process it was.

When the machine rebooted we could do the mdadm --grow /dev/md2 --bitmap internal and it worked without issue.

So the issue was apparently a stuck RAID process, and the cold reboot cleared it.
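
In hindsight, one place we could have looked before rebooting is the array's sync_action file in sysfs. This is only a sketch and rests on an assumption: we did not check whether the stuck state was actually visible there. The helper names md_sync_action and md_is_idle are made up for illustration.

```shell
#!/bin/sh
# md_sync_action ARRAY [SYSFS_ROOT]
# Print what the kernel says the array is doing (for example "idle",
# "resync" or "recover"). ARRAY is a name such as "md2". SYSFS_ROOT
# defaults to /sys and is a parameter only so the function can be tried
# against a fake directory tree.
md_sync_action() {
    array=$1
    sysfs_root=${2:-/sys}
    cat "$sysfs_root/block/$array/md/sync_action"
}

# md_is_idle ARRAY [SYSFS_ROOT]
# Succeed only if the reported state is exactly "idle".
md_is_idle() {
    [ "$(md_sync_action "$1" "${2:-/sys}")" = "idle" ]
}

# Example: only try to re-add the bitmap once the array reports idle.
#   md_is_idle md2 && mdadm --grow /dev/md2 --bitmap internal
```

If this reports something other than idle while /proc/mdstat shows no progress bar, that would at least hint at a stuck sync thread rather than a real resync.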
