
Unable to rebuild the RAID 1 array (Intel RST RAID 1) in BIOS or in the OS (Ubuntu Server 20.04.4)


We have an HP ProLiant ML10 Gen9 server running Ubuntu 20.04.4 LTS. We have enabled a RAID 1 array on two 2 TB HDDs using the Intel RST RAID configuration (which is a fake/firmware RAID). My goal is to replace the faulty drive and rebuild the RAID 1 array.

Below is the output of cat /proc/mdstat showing the RAID status:

surya@himalaya:~$ cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sda[1] sdb[0]
      1953511424 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sda[1](S) sdb[0](S)
      6320 blocks super external:imsm

unused devices: <none>

Below is the output of lsblk showing the drive layout:

surya@himalaya:~$ lsblk
NAME                        MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
loop0                         7:0    0 61.9M  1 loop  /snap/core20/1361
loop1                         7:1    0 67.9M  1 loop  /snap/lxd/22526
loop2                         7:2    0 55.5M  1 loop  /snap/core18/2284
loop3                         7:3    0 43.6M  1 loop  /snap/snapd/14978
loop4                         7:4    0 55.4M  1 loop  /snap/core18/2128
loop5                         7:5    0 43.6M  1 loop  /snap/snapd/15177
loop6                         7:6    0 67.2M  1 loop  /snap/lxd/21835
loop7                         7:7    0 61.9M  1 loop  /snap/core20/1376
sda                           8:0    0  1.8T  0 disk
└─md126                       9:126  0  1.8T  0 raid1
  ├─md126p1                 259:0    0  1.1G  0 part  /boot/efi
  ├─md126p2                 259:1    0  1.5G  0 part  /boot
  └─md126p3                 259:2    0  1.8T  0 part
    ├─ubuntu--vg-ubuntu--lv 253:0    0  100G  0 lvm   /
    └─ubuntu--vg-lv--0      253:1    0  1.7T  0 lvm   /home
sdb                           8:16   0  1.8T  0 disk
└─md126                       9:126  0  1.8T  0 raid1
  ├─md126p1                 259:0    0  1.1G  0 part  /boot/efi
  ├─md126p2                 259:1    0  1.5G  0 part  /boot
  └─md126p3                 259:2    0  1.8T  0 part
    ├─ubuntu--vg-ubuntu--lv 253:0    0  100G  0 lvm   /
    └─ubuntu--vg-lv--0      253:1    0  1.7T  0 lvm   /home
sr0                          11:0    1 1024M  0 rom

I used the command below to mark the faulty drive sdb (shown above) as failed:

mdadm --manage /dev/md126 --fail /dev/sdb

Then I shut down the system and replaced the hard drive in the same port.

Now when I try to rebuild the array with mdadm --manage /dev/md126 --add /dev/sdb, I get the message below.

root@himalaya:~# mdadm --manage /dev/md126 --add /dev/sdb
mdadm: Cannot add disks to a 'member' array, perform this operation on the parent container

Now the output of cat /proc/mdstat is:

root@himalaya:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sda[0]
      1953511424 blocks super external:/md127/0 [2/1] [U_]

md127 : inactive sda[0](S)
      3160 blocks super external:imsm

unused devices: <none>

I also tried to enter the Intel option ROM utility in the BIOS using Ctrl + I. I have set the OROM UI normal delay to 4 seconds under the SATA configuration in the BIOS settings, but I couldn't get that screen to appear in order to rebuild the array from the BIOS. It would be a great help if someone could assist me with rebuilding and restoring the RAID 1 array.


I'm answering my own question for the benefit of everyone who has to deal with this type of fake RAID controller.

Here is what I found:

Interestingly, md126 is not the parent RAID device here; the IMSM container is md127. So all I did was add the new drive to md127 with:

mdadm --manage /dev/md127 --force --add /dev/sdb

and the RAID started to rebuild itself.
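If you're not sure which device is the actual IMSM container on your own system, mdadm can usually tell you: on my setup the member array's detail output points back to its parent container. The device names below are from my machine and may differ on yours:

mdadm --detail /dev/md126 | grep -i container   # the Container line names the parent device
mdadm --detail /dev/md127                       # shows the imsm container and its member disks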

Now the output of cat /proc/mdstat is:

root@himalaya:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sda[1] sdb[0]
      1953511424 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sdb[1](S) sda[0](S)
      6320 blocks super external:imsm

unused devices: <none>

These changes were reflected in the BIOS screen as well: the Intel RST RAID volume status was Normal.

Below is the list of commands I used to successfully restore this RAID 1 array.

To check the RAID status:

cat /proc/mdstat
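If you want more detail than /proc/mdstat shows (array state, failed devices, rebuild progress), you can also query the array directly; the device name here matches my setup and may differ on yours:

mdadm --detail /dev/md126   # detailed state of the RAID 1 member array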

Removing the failed disk: First we mark the disk as failed and then remove it from the array:

mdadm --manage /dev/md126 --fail /dev/sdb
mdadm --manage /dev/md126 --remove /dev/sdb
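Before powering down, it can help to note the serial number of the failed drive so you pull the right physical disk. This assumes the smartmontools package is installed (apt install smartmontools if it isn't):

smartctl -i /dev/sdb | grep -i 'serial number'   # note this serial before opening the case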

Then power down the system and replace the faulty drive with the new one:

shutdown -h now

Adding the new hard drive: First you must create the exact same partitioning as on /dev/sda:

sfdisk -d /dev/sda | sfdisk /dev/sdb
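Note: copying a GPT partition table with sfdisk needs a reasonably recent util-linux. If your disks use GPT and sfdisk complains, sgdisk from the gdisk package is an alternative I did not need on my system, sketched here; the second command gives the copy new unique GUIDs so the two disks don't clash:

sgdisk --replicate=/dev/sdb /dev/sda   # copy the partition table from sda onto sdb
sgdisk -G /dev/sdb                     # randomize the GUIDs on the copy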

To check that both hard drives have the same partitioning:

fdisk -l

Next, we add this drive to the RAID array (use md126 or md127 accordingly, whichever is your parent container). Below is the command I used:

mdadm --manage /dev/md127 --force --add /dev/sdb

That's it. You can now see that the RAID has started to rebuild.
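To keep an eye on the rebuild progress, you can watch /proc/mdstat or ask mdadm for the rebuild status (device name as on my system):

watch -n 5 cat /proc/mdstat                             # live view of the resync progress
mdadm --detail /dev/md126 | grep -i 'rebuild status'    # percentage complete while rebuilding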
