An adventure with mdadm
on a broken hard drive
A Supermicro server started acting up. The problem was evident on startup – a lot of “wait” time while the RAID tried to initialize without really succeeding, until eventually timing out. Quite a number of times the system dropped into emergency mode. Sometimes leaving it alone for long enough meant it kind of rebuilt. The symptoms on startup looked something like:
A start job is running for dev/md124p1.device (30s / 1min 30s).
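If you want to dig into those timeouts after the fact, something along these lines should help (a sketch; dev-md124p1.device is just the systemd unit name derived from the device path, and the exact wording of the logged errors varies):
systemctl status dev-md124p1.device
journalctl -b -p err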
Also, running smartctl on the four drives in the system was interesting – three disks were rather snappy and the fourth always lagged. One could also see with cat /proc/mdstat that the drive speed of the one mirror was excruciatingly slow.
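For reference, the comparison was roughly this kind of thing (a sketch, not the exact commands from the day; -H prints the overall health verdict and -a the full attribute dump, and the slow drive mostly gave itself away by taking ages to answer):
for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do smartctl -H "$d"; done
smartctl -a /dev/sdd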
I don’t think there is an exact formula for fixing a broken drive, but at least in this case having a mirror helped. However, detaching this drive from the mirror became a nightmare, and most of this adventure is about trying to do exactly that. In total there were four drives, Mirror A (faulty) and Mirror B (working fine), all of them Western Digital 2.7 TB drives.
Eventually the call was made to remove this drive, first from the software configuration, and then from the system. But how? How to not get drowned in terminology? It’s really hard to google. This is somewhat complicated stuff but for your reading pleasure we give you a transcript:
First of all, cat /proc/mdstat is rather useful. It showed the problem quite clearly: continuous rebuilding taking exceedingly long, hours. I could compare the healthy mirror to the broken mirror, and speed was also an obvious problem. After a long time the mirror then actually appears fine, but rebooting starts the whole process from scratch again.
Personalities : [raid1]
md124 : active raid1 sdc[1] sdd[0]
      2783756288 blocks super external:/md125/0 [2/1] [U_]
      [>....................]  recovery =  0.0% (197568/2783756288) finish=77145.4min speed=600K/sec

md125 : inactive sdd[1](S) sdc[0](S)
      10402 blocks super external:imsm

md126 : active raid1 sda[1] sdb[0]
      2783756288 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sda[1](S) sdb[0](S)
      10402 blocks super external:imsm

unused devices: <none>
But eventually this got boring, so we moved to strike the disk. The point here was that as long as that thing said recovery, one could break the mirror.
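A convenient way to keep an eye on that recovery line without retyping the command (a sketch, not part of the original transcript):
watch -n 5 cat /proc/mdstat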
To examine more closely, let’s look at the two disks which are okay:
mdadm --detail /dev/md126p2
/dev/md126p2:
         Container : /dev/md/imsm0, member 0
        Raid Level : raid1
        Array Size : 1048576 (1024.00 MiB 1073.74 MB)
     Used Dev Size : 18446744073709551615
      Raid Devices : 2
     Total Devices : 2
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0
Consistency Policy : resync
              UUID : 39900a34:f3e2b971:5d553540:b110c1e7
    Number   Major   Minor   RaidDevice State
       1       8        0        0      active sync   /dev/sda
       0       8       16        1      active sync   /dev/sdb
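If you want the per-disk view of the same metadata, mdadm can also examine the member disks directly (a sketch; for these IMSM arrays it dumps the Intel metadata rather than the summary above):
mdadm --examine /dev/sda
mdadm --examine /dev/sdb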
Moving on to the mirror with the broken disk. Through observation we deduced the faulty drive was /dev/sdd:
mdadm --detail /dev/md124p1
/dev/md124p1:
         Container : /dev/md/imsm1, member 0
        Raid Level : raid1
        Array Size : 2783755247 (2654.80 GiB 2850.57 GB)
     Used Dev Size : 2783756288 (2654.80 GiB 2850.57 GB)
      Raid Devices : 2
     Total Devices : 2
             State : clean, degraded, recovering
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 1
Consistency Policy : resync
    Rebuild Status : 0% complete
              UUID : c6956ed2:f692b781:631f768b:8404a220
    Number   Major   Minor   RaidDevice State
       1       8       32        0      active sync   /dev/sdc
       0       8       48        1      spare rebuilding   /dev/sdd
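Before physically pulling anything, it is worth double-checking which physical drive /dev/sdd actually is. Something like this should print a serial number to match against the label on the drive (a sketch):
smartctl -i /dev/sdd | grep -i serial
ls -l /dev/disk/by-id/ | grep sdd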
On the surface the mdadm output looks good – but since we had already tried this ten times and the end result was always nothing but waiting and errors, let’s move on to removing the disk:
mdadm --fail /dev/md/imsm1 --remove /dev/sdd
mdadm: /dev/sdd is still in use, cannot remove.
Whilst rebuilding, you can’t remove it. You have to wait. Even if you umount it, it will still be in use.
So let’s wait. Then eventually:
mdadm --fail /dev/md/imsm1 --remove /dev/sdd
mdadm: hot removed /dev/sdd from /dev/md/imsm1
Now that’s hot. Okay, with most of the heavy terminology out of the way, what next? It’s removed – now what?
Please note that above, imsm1 was chosen rather than md124. Who knows why? Something about parent containers – with this external IMSM metadata, the physical disks belong to the container, so that is what you remove them from.
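If you want to see that relationship spelled out, the volume’s --detail output names its container (as it did above), and the container itself can be inspected too (a sketch):
mdadm --detail /dev/md124 | grep -i container
mdadm --detail /dev/md125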
Once it’s removed it’s not really removed – you have more work to do. The advice out there says you have to wipe the old metadata off. Like this:
wipefs -a /dev/sdc1
This doesn’t give any output.
Finally, our “mirror” has only one drive. How do we tell the array about that? The following is supposed to reduce it from 2 to 1 disk, but we tried it without success:
# mdadm --grow /dev/md124 --raid-devices=1
mdadm: '1' is an unusual number of drives for an array, so it is probably a mistake. If you really mean it you will need to specify --force before setting the number of drives.
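The error message itself points at the escape hatch; presumably the forced variant would look like the line below (a sketch, and whether it behaves sensibly on an IMSM volume is another question):
# mdadm --grow /dev/md124 --raid-devices=1 --force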
For now just a reboot to observe behavior.
What if it’s still in use?
We tried all of the below to stop the rebuild and recovery on the disk, but none of it ever worked:
echo "idle" > /sys/block/md124/md/sync_action
That didn’t work, so we tried this:
echo 0 > /proc/sys/dev/raid/speed_limit_max
Much more googling, then this:
echo frozen > /sys/block/md0/md/sync_action
echo none > /sys/block/md0/md/resync_start
echo idle > /sys/block/md0/md/sync_action
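For what it is worth, the current state of those knobs can be read back the same way they are written; note that md0 in the snippet above has to be substituted with the actual array, md124 in our case (a sketch):
cat /sys/block/md124/md/sync_action
cat /sys/block/md124/md/sync_completed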
After all of this, cat /proc/mdstat still showed recovery and the disk was apparently still in use. It appears that once a broken disk is “locked” like this by the operating system, it is a real nightmare to remove. We had to wait it out.
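In hindsight, mdadm has a --wait option that simply blocks until any resync or recovery on the array finishes, which might have made the waiting less manual (untested in this case, so just a sketch):
mdadm --wait /dev/md124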
Eventually this worked again and we could remove it:
mdadm --fail /dev/md/imsm1 --remove /dev/sdd
mdadm: hot removed /dev/sdd from /dev/md/imsm1
At this point it’s not a member anymore, but rebooting makes it a member again. So you have to zero out some bits as described above and do more work.
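The “zero out some bits” step, assuming the failed drive is still /dev/sdd at that point, is roughly the wipefs call from earlier plus mdadm’s own superblock zeroing; whether --zero-superblock fully deals with IMSM external metadata we did not verify, so take this as a sketch:
wipefs -a /dev/sdd
mdadm --zero-superblock /dev/sdd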