I have a Linux software RAID5 array (md1) containing 4 x 16TB + 2 x 8TB hard drives. The 2 x 8TB drives were merged together (RAID0 array; md0) to act as a fifth 16TB device. This is just for data storage. Since the 2 x 8TB drives needed to be removed, I decided to shrink the number of devices to 4. Therefore I performed the following steps:
mdadm --grow /dev/md1 --array-size 46883175936
mdadm --grow --raid-devices=4 /dev/md1 --backup-file=/home/backup
The attentive reader will note that there is one missing step: resizing the file system before dealing with mdadm.
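For the record, this is what the correct order would have looked like (a sketch only, not something I ran; it assumes the ext4 filesystem uses 4 KiB blocks, so the target of 46883175936 KiB equals 11720793984 filesystem blocks):

umount /dev/md1                         # the filesystem must be offline to shrink it
e2fsck -f /dev/md1                      # resize2fs insists on a clean check first
resize2fs /dev/md1 11720793984          # shrink ext4 to the future 4-device size (in fs blocks)
mdadm --grow /dev/md1 --array-size 46883175936
mdadm --grow --raid-devices=4 /dev/md1 --backup-file=/home/backup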
The resulting reshaping process is now stuck at 0%:
md1 : active raid5 md0[5](S) sda1[0] sdb1[3] sde1[4] sdc1[2]
46883175936 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUUU]
[>....................] reshape = 0.0% (1/15627725312) finish=2686015287.7min speed=0K/sec
md0 : active raid0 sdd[0] sdf[1]
15627788288 blocks super 1.2 512k chunks
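To see why it is frozen, the md sysfs attributes can be inspected (a diagnostic sketch; the paths assume the array is md1):

cat /sys/block/md1/md/sync_action       # should report "reshape"
cat /sys/block/md1/md/sync_max          # a value of 0 here would explain the frozen progress
cat /sys/block/md1/md/reshape_position  # how far the reshape has actually advanced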
However, iostat does indicate some activity on these hard drives:
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1 155.89 564.43 443.38 108549018 85268978
sda 2851.68 82114.67 2400.81 15791944384 461714155
sdb 2851.88 82115.09 2401.49 15792024433 461844519
sdc 2852.37 82143.80 2401.57 15797546322 461859251
sdd 866.01 13.34 41827.19 2565819 8044026920
sde 2852.42 82143.79 2402.46 15797544281 462031635
sdf 866.50 14.33 41835.12 2755774 8045552820
And mdadm --detail /dev/md1 looks fine:
/dev/md1:
Version : 1.2
Creation Time : Tue Jun 1 17:25:18 2021
Raid Level : raid5
Array Size : 46883175936 (43.66 TiB 48.01 TB)
Used Dev Size : 15627725312 (14.55 TiB 16.00 TB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Fri Oct 15 15:42:47 2021
State : clean, reshaping
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : resync
Reshape Status : 0% complete
Delta Devices : -1, (5->4)
Name : localhost.localdomain:1 (local to host localhost.localdomain)
UUID : 5457f23e:faa7ee47:b2c62a37:f4c78526
Events : 1064286
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
2 8 33 1 active sync /dev/sdc1
4 8 65 2 active sync /dev/sde1
3 8 17 3 active sync /dev/sdb1
5 9 0 - spare /dev/md0
Nevertheless, dmesg gives a direct hint at the error:
EXT4-fs (md1): bad geometry: block count 15627725312 exceeds size of device (11720793984 blocks)
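The numbers are consistent with the missed step, assuming the default 4 KiB ext4 block size: the filesystem still has the old five-device geometry, while the block device already has the new four-device size:

echo $(( 15627725312 * 4 ))   # 62510901248 KiB: old array size (4 data disks x 15627725312 KiB)
echo $(( 46883175936 / 4 ))   # 11720793984 fs blocks: the new, smaller --array-size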
I tried to cancel the reshape process, but to no avail. I'm stuck. Since there is already important data on md1, I am very interested in restoring it. I have a backup, but it is missing some recent data.
Is there a way to cancel the reshape process? Since it is still at (1/15627725312), I would not expect any data loss.
Or is there any other suggestion for restoring md1 (with or without md0)?
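For completeness, the only cancellation route I am aware of is assembling with revert-reshape, which requires stopping the array first (a sketch only; I have not dared to run it, and I do not know whether it is safe while shrinking):

mdadm --stop /dev/md1
mdadm --assemble /dev/md1 --update=revert-reshape --backup-file=/home/backup \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1 /dev/md0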
I'm thankful for every suggestion. If you need any further information, please let me know.
EDIT: I was able to run the following command:
mdadm --create /dev/md1 --level=5 --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1 /dev/md0 --assume-clean --readonly
This recreated the array in its previous state. Nevertheless, the file system still seems to be broken. Since there was no HDD failure, I assume the data should still be there, not overwritten.
Is there any way to restore the file system?
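Before trying anything destructive, read-only checks seem like the safest next step (a sketch; -n keeps e2fsck from writing anything, and noload skips the journal replay):

e2fsck -n /dev/md1                  # report-only check, no repairs written
dumpe2fs -h /dev/md1                # inspect the superblock geometry
mount -o ro,noload /dev/md1 /mnt    # attempt a read-only mount without journal replay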
I tried testdisk, which detects the Linux filesystem data, but could not restore these files. Is there any other program that could do the trick?
Again, any suggestion is highly appreciated! Thank you in advance!