Score:2

Rebuilding an inactive RAID5


I have a 7 x 14 TB RAID5 array in my workstation running CentOS 7. Last week one of the drives (/dev/sde) was marked as failing by SMART. I used mdadm to mark the drive as faulty and remove it from the array, and... long story short... I ended up pulling out the wrong drive!

Now CentOS boots into emergency mode (my operating system lives on a drive outside the array), but I can still run mdadm to analyze the array. It seems my /dev/md127 array is inactive, with all drives marked as spares.

cat /proc/mdstat
Personalities :
md127 : inactive sdc[6](S) sdf[9](S) sdg[10](S) sde[8](S) sdd[7](S) sdb[5](S) sdh[11](S)
95705752576 blocks super 1.2

unused devices: <none>

For some reason it shows up here as raid0:

mdadm -D /dev/md127

/dev/md127:
Version : 1.2
Raid Level : raid0
Total Devices : 7
Persistence : Superblock is persistent

State : inactive
Working Devices : 7

Name : c103950:127
UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Events : 539502
Number  Major   Minor   RaidDevice

   -      8   16        -        /dev/sdb
   -      8   32        -        /dev/sdc
   -      8   48        -        /dev/sdd
   -      8   64        -        /dev/sde
   -      8   80        -        /dev/sdf
   -      8   96        -        /dev/sdg
   -      8  112        -        /dev/sdh

And when I examine the individual drives:


mdadm -E /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : clean
Device UUID : 136b95a5:1589d83d:bdb059dd:e2e9e02f

Update Time : Thu Jul 15 12:47:37 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: 4e727166 - correct
Events : 539502

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 1
Array State : AAAA..A ('A'== active, '.' == missing, 'R' == replacing)

****** 

mdadm -E /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : clean
Device UUID : 64cac230:bc1e2bf5:65323067:5439f101

Update Time : Thu Jul 15 12:47:37 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: ecd93778 - correct
Events : 539502

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 6
Array State : AAAA..A ('A'== active, '.' == missing, 'R' == replacing)

******

mdadm -E /dev/sdd
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : clean
Device UUID : 2dd7e6d6:6c035b33:0072796b:d3685558

Update Time : Thu Jul 15 12:47:37 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: 2bda98d - correct
Events : 539502

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 0
Array State : AAAA..A ('A'== active, '.' == missing, 'R' == replacing)

******

mdadm -E /dev/sde
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : active
Device UUID : 8e6bd6de:15483efa:82c1917d:569ee387

Update Time : Thu Jul 13 10:30:54 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: c050eb4 - correct
Events : 539489

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 4
Array State : AAAAAAA ('A'== active, '.' == missing, 'R' == replacing)

******

mdadm -E /dev/sdf
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : clean
Device UUID : 93452dc8:3fba28ce:c7d33d00:7c1838fd

Update Time : Thu Jul 15 12:47:37 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: e995ceb8 - correct
Events : 539502

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 2
Array State : AAAA..A ('A'== active, '.' == missing, 'R' == replacing)

******

mdadm -E /dev/sdg
/dev/sdg:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : clean
Device UUID : 48fe7b1b:751e6993:4eb73b66:a1313185

Update Time : Thu Jul 15 12:47:37 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: f81be84f - correct
Events : 539502

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 3
Array State : AAAA..A ('A'== active, '.' == missing, 'R' == replacing)

******

mdadm -E /dev/sdh
/dev/sdh:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a6f44e2c:352b1ea0:bd25d626:cac0177c
Name : c103950:127
Creation Time : Thu Jul 26 12:21:27 2018
Raid Level : raid5
Raid Devices : 7

Avail Dev Size : 27344500736 sectors (13038.87 GiB 14000.38 GB)
Array Size : 82033502208 KiB (78233.24 GiB 84002.31 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before-264112 sectors, after-0 sectors
State : clean
Device UUID : 80448326:c8b82624:a8e31b97:18246b58

Update Time : Thu Jul 15 12:04:35 2021
Bad Block Log : 512 entries available at offset 32 sectors
Checksum: 9800dd88 - correct
Events : 539497

Layout left-symmetric
Chunk Size : 512K

Device Role : Active device 5
Array State : AAAA.AA ('A'== active, '.' == missing, 'R' == replacing)

/dev/sde is the faulty drive, while /dev/sdh is the one I pulled by mistake. Notice the difference in event counts and update times. I now want to reassemble the array and am wondering what the safest way to do so is.

Please help! Thank you for reading.

djdomi: Oh Great, you had Raid ZERO - restore the backup, well done :-)
shodanshok: Can you try with `mdadm --incremental /dev/sd[abcdfgh]`?
Mike Andrews: mdadm should prevent you from making a mistake, as long as you do NOT use `--force`; it's the use of `--force` that gets people into trouble. You want to assemble the array including the drive you pulled, but without the drive that you `--fail`ed out, and then `--re-add` the failed drive once the array is up and running. I agree with @shodanshok that you may be able to simply use incremental assembly to get back up and running.
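
A rough sketch of that sequence, using the device names from the output above (not verified on this system; whether --re-add is appropriate depends on the state of the drive that was failed out):

mdadm --stop /dev/md127                    # stop the inactive, half-assembled array first
mdadm --assemble /dev/md127 /dev/sdb /dev/sdc /dev/sdd /dev/sdf /dev/sdg /dev/sdh   # every member except the failed /dev/sde
mdadm --manage /dev/md127 --re-add /dev/sde   # only once the array is up and running again
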
lalmagor: Thanks for your answers. `--incremental` seems to be exactly what I need, but I am still worried that it will try to build this as RAID0, because that is what it shows now when I check with `mdadm -D`. Can I do `mdadm --incremental --level=5 /dev/sd[abcdfgh]`? Or should I do `mdadm --create --verbose /dev/md127 --level=5 /dev/sdb /dev/sdc /dev/sdd /dev/sdf /dev/sdg /dev/sdh`?
shodanshok: I don't think `--incremental` allows specifying the RAID level. Anyway, your HDD superblocks seem to correctly describe a RAID5 array, so I would try `--incremental` (***without*** forcing anything) to start the array.
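
A minimal sketch of that incremental approach, assuming the six surviving members listed above and that the inactive array is stopped first:

mdadm --stop /dev/md127                                        # release the inactive md127
for d in /dev/sd[bcdfgh]; do mdadm --incremental "$d"; done    # hand each surviving member back to md
cat /proc/mdstat                                               # check whether the array was started
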
lalmagor: I used `mdadm --stop /dev/md127` and was then able to run `mdadm --incremental` on each of my six good drives, but it still says "not enough to start". When I run `mdadm -D /dev/md127` the output is still the same, with everything empty and the level shown as raid0.
lalmagor: What about `mdadm --assemble /dev/md127 /dev/sdb /dev/sdc /dev/sdd /dev/sdf /dev/sdg /dev/sdh`? Do you think this would work? Can I specify the RAID level here too?
Score:0

I was able to solve this by running:

mdadm --assemble --force /dev/md127 /dev/sdb /dev/sdc /dev/sdd /dev/sdf /dev/sdg /dev/sdh

This restored my array in a degraded state with 6 of 7 drives. It did not work without the --force option. I guess I was lucky that the event-count difference between /dev/sdh and the rest was so small. Afterwards I was able to add the new disk to the array with:

mdadm --manage /dev/md127 --add /dev/sde
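
The progress of the rebuild can be followed with, for example:

cat /proc/mdstat    # shows the recovery percentage and an estimated finish time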

After 49 hours of rebuilding, my array was complete again.
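
A final sanity check that everything came back as expected:

mdadm -D /dev/md127    # should now report Raid Level : raid5, a clean state and 7 working devices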

I think my problem was similar to: https://unix.stackexchange.com/questions/163672/missing-mdadm-raid5-array-reassembles-as-raid0-after-powerout

I also used this guide: https://web.archive.org/web/20210302160944/http://www.tjansson.dk/2013/12/replacing-a-failed-disk-in-a-mdadm-raid/
