Score:-1

mdadm array state not fully active - faulty device


I use 4 drives as a RAID 5 on my Raspberry Pi. I shut down my Pi every night and start it again the next morning. Sometimes the array appears to be faulty. Maybe the Pi doesn't bring up the drives correctly, but dmesg does not mention any problem:

[   10.538758] scsi 0:0:0:0: Direct-Access     ACASIS                    8034 PQ: 0 ANSI: 6
[   10.541035] sd 0:0:0:0: [sda] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[   10.541282] sd 0:0:0:0: [sda] Write Protect is off
[   10.541290] sd 0:0:0:0: [sda] Mode Sense: 67 00 10 08
[   10.541658] scsi 0:0:0:1: Direct-Access     ACASIS                    8034 PQ: 0 ANSI: 6
[   10.541767] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   10.542490] sd 0:0:0:0: [sda] Optimal transfer size 33553920 bytes
[   10.544213] sd 0:0:0:1: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[   10.544465] sd 0:0:0:1: [sdb] Write Protect is off
[   10.544473] sd 0:0:0:1: [sdb] Mode Sense: 67 00 10 08
[   10.544919] sd 0:0:0:1: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   10.545643] sd 0:0:0:1: [sdb] Optimal transfer size 33553920 bytes
[   10.603258] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   10.603350] sd 0:0:0:1: Attached scsi generic sg1 type 0
[   10.631296] sd 0:0:0:0: [sda] Attached SCSI disk
[   10.633209] sd 0:0:0:1: [sdb] Attached SCSI disk
[   11.022152] usb 2-1: new SuperSpeed Gen 1 USB device number 3 using xhci_hcd
[   11.043358] usb 2-1: New USB device found, idVendor=1058, idProduct=0a10, bcdDevice=80.34
[   11.043370] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=5
[   11.043376] usb 2-1: Product: Go To Final Lap
[   11.043381] usb 2-1: SerialNumber: 1234567890123
[   11.051424] scsi host1: uas
[   11.052496] scsi 1:0:0:0: Direct-Access     ACASIS                    8034 PQ: 0 ANSI: 6
[   11.054130] sd 1:0:0:0: Attached scsi generic sg2 type 0
[   11.058494] sd 1:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[   11.058746] sd 1:0:0:0: [sdc] Write Protect is off
[   11.058754] sd 1:0:0:0: [sdc] Mode Sense: 67 00 10 08
[   11.059094] scsi 1:0:0:1: Direct-Access     ACASIS                    8034 PQ: 0 ANSI: 6
[   11.059279] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   11.060062] sd 1:0:0:0: [sdc] Optimal transfer size 33553920 bytes
[   11.061458] sd 1:0:0:1: Attached scsi generic sg3 type 0
[   11.061797] sd 1:0:0:1: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[   11.062062] sd 1:0:0:1: [sdd] Write Protect is off
[   11.062072] sd 1:0:0:1: [sdd] Mode Sense: 67 00 10 08
[   11.062546] sd 1:0:0:1: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   11.063295] sd 1:0:0:1: [sdd] Optimal transfer size 33553920 bytes
[   11.145514] sd 1:0:0:1: [sdd] Attached SCSI disk
[   11.146878] sd 1:0:0:0: [sdc] Attached SCSI disk

I notice that the array is inactive and my 4th drive seems to be in the wrong position. In my opinion /dev/sdd should be [3], not [4]:

Personalities : 
md127 : inactive sdd[4](S) sdc[2](S) sda[0](S) sdb[1](S)
      7813529952 blocks super 1.2

I stopped the array and forced a reassembly.

OK root@ncloud:~# mdadm --stop /dev/md127
mdadm: stopped /dev/md127
OK root@ncloud:~# mdadm --assemble --force /dev/md127 /dev/sd[abcd]
mdadm: forcing event count in /dev/sdc(2) from 16244 upto 16251
mdadm: forcing event count in /dev/sdd(3) from 16244 upto 16251
mdadm: clearing FAULTY flag for device 2 in /dev/md127 for /dev/sdc
mdadm: Marking array /dev/md127 as 'clean'
mdadm: /dev/md127 has been started with 4 drives.
OK root@ncloud:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] 
md127 : active (auto-read-only) raid5 sda[0] sdd[4] sdc[2] sdb[1]
      5860147200 blocks super 1.2 level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/15 pages [0KB], 65536KB chunk
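
The array now shows active (auto-read-only). As far as I understand, that state is harmless and clears on the first write, or it can be cleared manually (a sketch using the array name from above):

mdadm --readwrite /dev/md127    # switch the array from auto-read-only to read-write
cat /proc/mdstat                # should now show "active raid5" without (auto-read-only)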

The details of each disk:

/dev/sda:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 9fd0e97e:8379c390:5b0dac21:462d643c
           Name : ncloud:vo1  (local to host ncloud)
  Creation Time : Tue Jan 25 11:23:04 2022
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
     Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
  Used Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=176 sectors
          State : clean
    Device UUID : acc331ad:9f38d203:2e5be32f:8f149f1b

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Oct 17 06:27:09 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 90184840 - correct
         Events : 16251

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 0
   Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 9fd0e97e:8379c390:5b0dac21:462d643c
           Name : ncloud:vo1  (local to host ncloud)
  Creation Time : Tue Jan 25 11:23:04 2022
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
     Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
  Used Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=176 sectors
          State : clean
    Device UUID : 0aa352e2:0e6e6da8:76e7f142:a6a97fb0

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Oct 17 06:27:09 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 11c37e6f - correct
         Events : 16251

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 1
   Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 9fd0e97e:8379c390:5b0dac21:462d643c
           Name : ncloud:vo1  (local to host ncloud)
  Creation Time : Tue Jan 25 11:23:04 2022
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
     Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
  Used Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=176 sectors
          State : clean
    Device UUID : 56ce4443:3dae7622:91bb141e:9da1916d

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Oct 17 06:24:41 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 84f2b563 - correct
         Events : 16244

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 9fd0e97e:8379c390:5b0dac21:462d643c
           Name : ncloud:vo1  (local to host ncloud)
  Creation Time : Tue Jan 25 11:23:04 2022
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3906764976 (1862.89 GiB 2000.26 GB)
     Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
  Used Dev Size : 3906764800 (1862.89 GiB 2000.26 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=176 sectors
          State : clean
    Device UUID : 5ccd1f95:179ed088:5fdd6f33:502f7804

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon Oct 17 06:24:41 2022
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : e96953c6 - correct
         Events : 16244

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)

How do I repair this correctly without losing data?

As we make very clear when you sign up, serverfault.com is for sysadmins working in a professional environment. Please do not post here again until you have read our help pages and observed the workings of the site.
Score:0

The order of the drives in a RAID is not that important, because each drive carries an on-disk metadata block which, among other things, records the position in the array that this particular drive should occupy.
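
You can see this in your own --examine output: despite the [4] shown in /proc/mdstat, the superblock on /dev/sdd still records role 3, so mdadm knows where that drive belongs (the grep is just a convenience):

mdadm --examine /dev/sdd | grep 'Device Role'
   Device Role : Active device 3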

Most probably this happens because your array is set to be assembled as soon as enough components appear, not necessarily all of them. For RAID 5, any n-1 of the disks are enough, so the array assembles as soon as three of the four disks are available. In your particular case the disks appear slowly, because they are slow and they are connected over USB.
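
As a rough sketch (array and device names taken from your output, and assuming the array is listed in /etc/mdadm/mdadm.conf, which mdadm --detail --scan can confirm), you could stop the half-assembled array and reassemble it only once every member is present; --no-degraded tells mdadm not to start an array while expected members are still missing:

mdadm --stop /dev/md127                  # drop the prematurely assembled, inactive array
mdadm --assemble --scan --no-degraded    # start only arrays that have all expected members
cat /proc/mdstat                         # should report [4/4] [UUUU]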

Don't build RAID arrays over USB. Don't build RAID 5 arrays out of hard disks, especially modern consumer hard disks. Each of these alone is a reliable way to lose data, and combining them makes it even more certain. Neither is considered a reasonable business practice, which is what this site is all about.

The Raspberry Pi is not suited to building RAID arrays, so this problem is not something you can "fix" or "repair"; you went the wrong way from the very beginning. A "compute module" flavour of the Raspberry Pi, plugged into a specially designed carrier board that exposes PCIe to a SAS/SATA HBA, or to a PCIe switch and a bunch of NVMe drives, could probably do software RAID in an acceptable way. But this is impossible with "the" Raspberry Pi.

RandomGuy:
I have to correct you. This is not USB-attached. I use this SATA board: https://rockpi.dev/blog/2019/12/08/introduce-rock-pi-sata-hats/
Nikita Kipriyanov:
That is a USB SATA controller, so the disks *are* attached through USB.
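
If in doubt, the transport type is easy to check (a sketch; lsblk's TRAN column will read "usb" for each member disk that sits behind a USB bridge):

lsblk -d -o NAME,TRAN,SIZE,MODEL    # -d lists whole disks only; TRAN shows the transport (usb, sata, nvme, ...)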
