mdadm RAID: Stuck at 0% Grow (Shrink) reshape due to bad geometry

I have a Linux software RAID5 array (md1) containing 4 x 16TB + 2 x 8TB hard drives. The 2 x 8TB drives were combined into a RAID0 array (md0) that serves as the fifth 16TB device. The array is used purely for data storage. Since the 2 x 8TB drives needed to be removed, I decided to shrink the number of devices to 4. I performed the following steps:

mdadm --grow /dev/md1 --array-size 46883175936  
mdadm --grow --raid-devices=4 /dev/md1 --backup-file=/home/backup 

The attentive reader will note that one step is missing: resizing the file system before touching mdadm.
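
For reference, the usual order when shrinking is file system first, array second. Below is a minimal dry-run sketch of that order. It assumes the file system is ext4 (which the dmesg message quoted later in the post suggests), and the mount point `/mnt/storage` is hypothetical. The script only prints the commands and executes nothing.

```shell
#!/bin/sh
# Dry-run sketch: print the shrink commands in the safe order.
# Nothing here touches a device; review each line before running it by hand.
set -eu

array=/dev/md1              # array from the post
new_size_kib=46883175936    # target size in KiB, as passed to --array-size
backup=/home/backup         # backup file from the post
mountpoint=/mnt/storage     # hypothetical mount point (assumption)

plan_shrink() {
  echo "umount $mountpoint"                                  # ext4 cannot be shrunk while mounted
  echo "e2fsck -f $array"                                    # resize2fs requires a fresh forced check
  echo "resize2fs $array ${new_size_kib}K"                   # shrink the file system first
  echo "mdadm --grow $array --array-size=$new_size_kib"      # then truncate the array
  echo "mdadm --grow $array --raid-devices=4 --backup-file=$backup"
}

plan=$(plan_shrink)
echo "$plan"
```

Printing the plan instead of executing it keeps the ordering visible for review, since a mistake in any of these steps is hard to undo.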

The resulting reshaping process is now stuck at 0%:

md1 : active raid5 md0[5](S) sda1[0] sdb1[3] sde1[4] sdc1[2] 
      46883175936 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUUU] 
      [>....................]  reshape =  0.0% (1/15627725312) finish=2686015287.7min speed=0K/sec 
md0 : active raid0 sdd[0] sdf[1]
      15627788288 blocks super 1.2 512k chunks

However, iostat does show some activity on these hard drives:

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
nvme0n1         155.89       564.43       443.38  108549018   85268978
sda            2851.68     82114.67      2400.81 15791944384  461714155
sdb            2851.88     82115.09      2401.49 15792024433  461844519
sdc            2852.37     82143.80      2401.57 15797546322  461859251
sdd             866.01        13.34     41827.19    2565819 8044026920
sde            2852.42     82143.79      2402.46 15797544281  462031635
sdf             866.50        14.33     41835.12    2755774 8045552820

And mdadm --detail /dev/md1 looks fine:

    /dev/md1:
           Version : 1.2
     Creation Time : Tue Jun  1 17:25:18 2021
        Raid Level : raid5
        Array Size : 46883175936 (43.66 TiB 48.01 TB)
     Used Dev Size : 15627725312 (14.55 TiB 16.00 TB)
      Raid Devices : 4
     Total Devices : 5
       Persistence : Superblock is persistent

       Update Time : Fri Oct 15 15:42:47 2021
             State : clean, reshaping 
    Active Devices : 4
   Working Devices : 5
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

    Reshape Status : 0% complete
     Delta Devices : -1, (5->4)

              Name : localhost.localdomain:1  (local to host localhost.localdomain)
              UUID : 5457f23e:faa7ee47:b2c62a37:f4c78526
            Events : 1064286

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       2       8       33        1      active sync   /dev/sdc1
       4       8       65        2      active sync   /dev/sde1
       3       8       17        3      active sync   /dev/sdb1

       5       9        0        -      spare   /dev/md0

Nevertheless, dmesg points directly at the error:

EXT4-fs (md1): bad geometry: block count 15627725312 exceeds size of device (11720793984 blocks)
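
The numbers in this message can be reconciled with the mdadm output above. The sketch below assumes a 4 KiB ext4 block size. The post does not state it, but it is consistent with the figures: 11720793984 blocks of 4 KiB equal the 46883175936 KiB array size exactly.

```shell
#!/bin/sh
# Reconcile the sizes in the dmesg message with the mdadm --detail output.
set -eu

dev_kib=15627725312       # "Used Dev Size" per member, in KiB (mdadm --detail)
fs_blocks=15627725312     # block count the file system expects (dmesg)
dev_blocks=11720793984    # block count the device now provides (dmesg)
block_kib=4               # assumption: 4 KiB ext4 block size

fs_kib=$((fs_blocks * block_kib))       # size the file system still claims
array_kib=$((dev_blocks * block_kib))   # size of md1 after --array-size

echo "file system expects: $fs_kib KiB = $((fs_kib / dev_kib)) data members"
echo "array provides:      $array_kib KiB = $((array_kib / dev_kib)) data members"
echo "shortfall:           $((fs_kib - array_kib)) KiB"
```

Under that assumption, the file system still spans four data members (the original five-device RAID5), while the truncated array exposes only three, so the shortfall is exactly one member's worth of space.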

I tried to cancel the reshape process, but to no avail. I'm stuck. Since there is already important data on md1, I am very interested in restoring it. I have a backup, but it is missing some recent data. Is there a way to cancel the reshape? Since it is still at (1/15627725312), I would not expect any data loss. Or is there any other way to restore md1 (with or without md0)?

I'm thankful for every suggestion. If you need any further information, please let me know.

EDIT: I could run the command:

mdadm --create /dev/md1 --level=5 --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1 /dev/md0 --assume-clean --readonly

This re-created the array in its previous state. Nevertheless, the file system still seems to be broken. Since there was no HDD failure, I assume the data is still there and has not been overwritten. Is there any way to restore the file system? I tried testdisk, which detects Linux sys. data but could not restore the files. Is there any other program that could do the trick?
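
One thing that may be worth checking after a re-create is whether the device order given to `--create` matches the RaidDevice order recorded in the `mdadm --detail` output above. The sketch below does only that comparison, using text copied from the post. It is not a diagnosis: the `--detail` output was captured during the reshape, the original slot of md0 is not shown there, and other parameters (chunk size, layout, data offset, metadata version) must also match for a re-created array to line up with the old data.

```shell
#!/bin/sh
# Compare the member order recorded by mdadm --detail with the --create order.
set -eu

# Device table copied from the mdadm --detail output in the post.
detail='       0       8        1        0      active sync   /dev/sda1
       2       8       33        1      active sync   /dev/sdc1
       4       8       65        2      active sync   /dev/sde1
       3       8       17        3      active sync   /dev/sdb1'

# First four members in the order passed to --create in the EDIT.
create_order="/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1"

# Sort by the RaidDevice column (field 4) and keep the device path (last field).
role_order=$(printf '%s\n' "$detail" | sort -k4,4n | awk '{print $NF}' | tr '\n' ' ' | sed 's/ $//')

echo "recorded role order: $role_order"
echo "--create order:      $create_order"
if [ "$role_order" = "$create_order" ]; then
  echo "orders match"
else
  echo "orders differ"
fi
```

If the orders differ, inspecting the result read-only (for example with `fsck.ext4 -n`, which makes no changes) is a safer way to test a candidate order than mounting it read-write.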

Again, any suggestion is highly appreciated! Thank you in advance!

djdomi
You can shut down the machine, remove one HDD, and then mount the partitions one by one on a recovery system. This is how I recovered a NAS with RAID 1.
ke flag
Thank you for your suggestion. Could you please describe your procedure in detail?