Unable to add new Disk to MDADM Raid: Failed to write metadata


When I try to add a new disk to mdadm, I am getting back an error:
sudo mdadm --add /dev/md0 /dev/sdd --verbose

mdadm: Failed to write metadata to /dev/sdd

Is this a problem with my setup, or something else?
I purchased 4 replacement disks that were reportedly new; however, I suspect they were either wiped and marked as new, or factory refurbished, given the difficulty I have had working with them and the accumulated power-on time they report. I have only tried 2 of the 4 disks so far.

My setup is an HB-1235 disk enclosure connected via an LSI 2308, normally using multipath. I have since disabled multipath and disconnected the second cable to try to identify why I'm not able to set up the disks. Running fdisk, mkfs.ext4, and other disk utilities has also failed to write to the disk. A badblocks scan returned no bad blocks, and checking hdparm, the read-only flag is not set.

One other odd thing I have found: even though I disabled multipathd and have rebooted, I still see two drives listed with their multipath aliases. Is there a second application that could be running multipath?

The disks arrived sealed, but I am aware that anyone could reseal a disk into a static bag. Do refurbished disks show factory time?
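One way to see whether something (such as a leftover device-mapper multipath map) still claims a disk is to look at its holders directory in sysfs; a disk claimed by a dm map will often reject direct writes. This is a sketch only — the device name `sdd` is an example from this post and may not exist on your machine:

```shell
# List any dm-* devices stacked on top of the disk (empty = nothing
# claims it; a dm-N entry here means a device-mapper map still holds it).
dev=sdd
if [ -d "/sys/block/$dev/holders" ]; then
    echo "holders: $(ls /sys/block/$dev/holders)"
else
    echo "no block device $dev on this machine"
fi
```

If a holder shows up even with multipathd disabled, the map itself may still exist in the kernel and need to be flushed before the disk is writable directly.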

System Details:

  • Ubuntu 22.04.1 LTS
  • mdadm - v4.2 - 2021-12-30
  • Dmsetup
    • Library version: 1.02.175 (2021-01-08)
    • Driver version: 4.45.0

sudo mdadm --query --detail /dev/md0

/dev/md0:
           Version : 1.2
     Creation Time : Sat Feb  2 22:55:00 2019
        Raid Level : raid6
        Array Size : 19533824000 (18.19 TiB 20.00 TB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 12
     Total Devices : 11
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Dec 31 04:39:07 2022
             State : clean, degraded
    Active Devices : 11
   Working Devices : 11
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : media:0  (local to host media)
              UUID : 1599e3ae:2bb24f48:a9524f60:02b6cb8c
            Events : 5824802

    Number   Major   Minor   RaidDevice State
       0       8      144        0      active sync   /dev/sdj
       1       8      176        1      active sync   /dev/sdl
       2       8      128        2      active sync   /dev/sdi
       3       8      112        3      active sync   /dev/sdh
       4       8       96        4      active sync   /dev/sdg
       5       8      192        5      active sync   /dev/sdm
       -       0        0        6      removed
       7       8       80        7      active sync   /dev/sdf
       8       8      160        8      active sync   /dev/sdk
       9     253        1        9      active sync   /dev/dm-1
      10     253        0       10      active sync   /dev/dm-0
      11       8       32       11      active sync   /dev/sdc


cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdk[8] sdj[0] sdh[3] sdi[2] sdg[4] sdc[11] sdl[1] sdm[5] sdf[7] dm-1[9] dm-0[10]
      19533824000 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [UUUUUU_UUUUU]
      bitmap: 15/15 pages [60KB], 65536KB chunk
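For reference, the `[12/11] [UUUUUU_UUUUU]` fields in the mdstat output above mean 12 member slots with 11 up; each character is one slot, `U` for up and `_` for missing (slot 6 here, matching the "removed" entry in the detail output). A quick sketch of counting the gaps:

```shell
# Count missing members from the mdstat status string ('_' = missing).
status='UUUUUU_UUUUU'
missing=$(printf '%s' "$status" | tr -cd '_' | wc -c)
echo "missing members: $missing"
```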

sudo smartctl -a /dev/sdd

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-56-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              DKS2P-H2R0SS
Revision:             4F06
Compliance:           SPC-3
User Capacity:        2,000,398,934,016 bytes [2.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50041070d0b
Serial number:        Z1P1AFWD0000S138114Q
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sat Dec 31 12:31:04 2022 PST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     23 C
Drive Trip Temperature:        68 C

Accumulated power on time, hours:minutes 54263:40
Manufactured in week 06 of year 2012
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  151
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  151
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 89937483
  Blocks received from initiator = 195186019
  Blocks read from cache and sent to initiator = 750296
  Number of read and write commands whose size <= segment size = 3072072
  Number of read and write commands whose size > segment size = 0

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 54263.67
  number of minutes until next internal SMART test = 39

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   2324326148        0         0  2324326148          0       1149.377           0
write:         0        0         0         0          0        101.683           0
verify:   459775        0         0    459775          0          0.000           0

Non-medium error count:        0


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   54215                 - [-   -    -]

Long (extended) Self-test duration: 5 seconds [0.1 minutes]
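On the question of whether these are really new drives: the SMART log above reports 54,263 accumulated power-on hours. Converted to a 24x7 basis, that is over six years of runtime, which is hard to reconcile with a factory-new disk:

```shell
# Convert SMART accumulated power-on hours to years of 24x7 runtime.
awk 'BEGIN { printf "power-on years: %.1f\n", 54263 / (24 * 365) }'
```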

Here is a historical reference from when I last created the array: Multipath Raid 6 - MDADM can't find superblocks - 18.04

EDIT: After reviewing some information online, I suspect my disks were at one point formatted with 520-byte sectors. However, the current review does not show this:

REF: https://www.youtube.com/watch?v=DAaTfv96V9w&ab_channel=ArtofServer https://www.reddit.com/r/homelab/comments/9bu8tf/is_this_drive_actually_bad_or_did_i_screw/

I ran sdparm --all /dev/sdd >> sdd_parm.txt against a few different disks. The existing disks differ only by disk ID; the new disk, however, shows multiple differences.

colordiff sdc_parm.txt sdd_parm.txt -B --ignore-matching-lines=RE -W 200

1c1
<     /dev/sdc: SEAGATE   ST2000NM0001      XRBA
---
>     /dev/sdd: SEAGATE   DKS2P-H2R0SS      4F06
10c10
<   EER           0  [cha: y, def:  0, sav:  0]  Enable early recovery (obsolete)
---
>   EER           1  [cha: y, def:  0, sav:  1]  Enable early recovery (obsolete)
18c18
<   RRC           20  [cha: y, def: 20, sav: 20]  Read retry count
---
>   RRC           10  [cha: y, def: 20, sav: 10]  Read retry count
28c28
<   RTL           8000  [cha: y, def: -1, sav:8000]  Recovery time limit (ms)
---
>   RTL           2000  [cha: y, def: -1, sav:2000]  Recovery time limit (ms)
41c41
<   MBS           314  [cha: y, def:314, sav:314]  Maximum burst size (512 bytes)
---
>   MBS           1040  [cha: y, def:314, sav:1040]  Maximum burst size (512 bytes)
54c54
<   DBPPS         512  [cha: n, def:512, sav:512]  Data bytes per physical sector
---
>   DBPPS         512  [cha: n, def:520, sav:512]  Data bytes per physical sector
75c75
<   V_DTE         0  [cha: y, def:  0, sav:  0]  Data terminate on error
---
>   V_DTE         1  [cha: y, def:  0, sav:  1]  Data terminate on error
77c77
<   V_RC          20  [cha: y, def: 20, sav: 20]  Verify retry count
---
>   V_RC          5  [cha: y, def: 20, sav:  5]  Verify retry count
79c79
<   V_RTL         8000  [cha: y, def: -1, sav:8000]  Verify recovery time limit (ms)
---
>   V_RTL         1000  [cha: y, def: -1, sav:1000]  Verify recovery time limit (ms)
92c92
<   WCE           0  [cha: y, def:  0, sav:  0]  Write cache enable
---
>   WCE           0  [cha: y, def:  1, sav:  0]  Write cache enable
118c118
<   NCS           32  [cha: n, def: 32, sav: 32]  Number of cache segments
---
>   NCS           3  [cha: y, def: 32, sav:  3]  Number of cache segments
126,127c126,127
<   D_SENSE       1  [cha: y, def:  0, sav:  1]  Descriptor format sense data
<   GLTSD         0  [cha: y, def:  1, sav:  0]  Global logging target save disable
---
>   D_SENSE       0  [cha: y, def:  0, sav:  0]  Descriptor format sense data
>   GLTSD         1  [cha: y, def:  1, sav:  1]  Global logging target save disable
155c155
<   ESTCT         18500  [cha: n, def:18500, sav:18500]  Extended self test completion time (sec)
---
>   ESTCT         5  [cha: y, def: 14, sav:  5]  Extended self test completion time (sec)
172,174c172,174
<   IDLE_C        0  [cha: n, def:  0, sav:  0]  Idle_c timer enable
<   IDLE_B        1  [cha: y, def:  0, sav:  1]  Idle_b timer enable
<   IDLE          1  [cha: y, def:  0, sav:  1]  Idle_a timer enable
---
>   IDLE_C        0  [cha: y, def:  0, sav:  0]  Idle_c timer enable
>   IDLE_B        0  [cha: y, def:  0, sav:  0]  Idle_b timer enable
>   IDLE          0  [cha: y, def:  0, sav:  0]  Idle_a timer enable
183c183
<   ICCT          0  [cha: n, def:  0, sav:  0]  Idle_c condition timer (100 ms)
---
>   ICCT          18000  [cha: y, def:18000, sav:18000]  Idle_c condition timer (100 ms)
195c195
<   PERF          0  [cha: y, def:  0, sav:  0]  Performance (impact of ie operations)
---
>   PERF          1  [cha: y, def:  0, sav:  1]  Performance (impact of ie operations)
202,203c202,203
<   LOGERR        1  [cha: y, def:  0, sav:  1]  Log informational exception errors
<   MRIE          4  [cha: y, def:  0, sav:  4]  Method of reporting informational exceptions
---
>   LOGERR        0  [cha: y, def:  0, sav:  0]  Log informational exception errors
>   MRIE          0  [cha: y, def:  0, sav:  0]  Method of reporting informational exceptions
207,217c207,208
<   INTT          600  [cha: y, def:  0, sav:600]  Interval timer (100 ms)
<   REPC          0  [cha: y, def:  1, sav:  0]  Report count (or Test flag number [SSC-3])
< Background control (SBC) [bc] mode page [PS=1]:
<   S_L_FULL      0  [cha: n, def:  0, sav:  0]  Suspend on log full
<   LOWIR         0  [cha: n, def:  0, sav:  0]  Log only when intervention required
<   EN_BMS        1  [cha: n, def:  1, sav:  1]  Enable background medium scan
<   EN_PS         0  [cha: y, def:  0, sav:  0]  Enable pre-scan
<   BMS_I         72  [cha: y, def: 72, sav: 72]  Background medium scan interval time (hour)
<   BPS_TL        24  [cha: y, def: 24, sav: 24]  Background pre-scan time limit (hour)
<   MIN_IDLE      250  [cha: y, def:500, sav:250]  Minumum idle time before background scan (ms)
<   MAX_SUSP      0  [cha: y, def:  0, sav:  0]  Maximum time to suspend background scan (ms)
---
>   INTT          0  [cha: y, def:  0, sav:  0]  Interval timer (100 ms)
>   REPC          1  [cha: y, def:  1, sav:  1]  Report count (or Test flag number [SSC-3])

(Full sdparm diff attached as screenshots: Drive Diff pages 1/5 through 5/5.)

Answer:

Found a solution here: https://www.reddit.com/r/homelab/comments/9bu8tf/is_this_drive_actually_bad_or_did_i_screw/

  1. The final solution was to download firmware for the drive model from HP.
  2. Flash it to the drive using OpenSeaChest (note: sectors and bytes will show 0; this is OK).
  3. Reboot; I'm not sure it's necessary, but I did it anyway.
  4. Format the disk with the 512-byte sector parameter. I used OpenSeaChest_Format, but I'm sure sg_format would work just the same.
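For anyone following step 4 with sg_format instead of OpenSeaChest_Format, this is the general shape of the command. It is echoed rather than executed here, because it destroys all data on the disk and takes hours on a 2 TB drive; the device name is an example:

```shell
# Build (but do not run) the low-level reformat command for 512-byte
# logical sectors. Remove the echo only once you are sure of the device.
dev=/dev/sdd
echo "would run: sudo sg_format --format --size=512 --verbose $dev"
```

Afterwards, `sg_readcap --long` on the device should confirm the new logical block size.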

NOTE: I suspect the seller, or the vendor who supplies the drives to the seller, previously wiped the drive and reformatted it to 512-byte sectors but never flashed the matching firmware, so the firmware expected 520-byte sectors while the disk was formatted for 512.
