
Recreation of mdadm RAID backing store for drbd yields *The peer's disk size is too small!*


Some years ago, I set up a storage cluster on stock Debian Linux, with mdadm RAID as the backing store for the Debian-provided drbd 0.8. Currently there are two RAID6 volumes, made up of a mix of 2 TB and 14 TB disks.

Meanwhile, disks have been added to both nodes to provide more space, defective disks have been replaced, and upgrades to newer Debian releases have been applied.

Some background story for context: lately I experienced unusually high load and lots of I/O timeouts on the active node 01. Watching the output of iostat -x 2 on 01, I observed one disk whose average queue size went up for brief periods, much more than any of the others. For fault isolation I manually made the secondary node 02 active and disconnected the drbd network connection volume by volume. With this I finally traced the fault to one of the two RAID volumes on the former primary node 01.
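
Roughly, the failover and per-volume isolation went like this; the resource name r0 is just a placeholder for my actual drbd resources (drbd 8 style commands):

# on 01 (current primary): demote the resource
drbdadm secondary r0
# on 02: promote it
drbdadm primary r0
# then, one resource at a time, cut the replication link and watch the load
drbdadm disconnect r0
# re-establish replication afterwards
drbdadm connect r0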

I used smartctl -t long to run extended self-tests on the disks and checked the output with smartctl -a. There were no obvious values hinting at imminent failure.
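
For reference, the checks were along these lines, run against every member disk (/dev/sda stands in for each device):

smartctl -t long /dev/sda    # start the extended self-test
smartctl -a /dev/sda         # once the test has finished: attributes and self-test log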

To rule out that particular disk, I recreated the volume as RAID5, leaving the suspicious disk out of the new array. After creating new drbd metadata with default values (drbdadm create-md resname) and forcing the active node 02 to be the sync source, I get the following message in 02's log:

The peer's disk size is too small! (39066455672 < 39066476152 sectors)

The new volume is 20,480 sectors too small!
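
For reference, the recreation went roughly like this; the device list, device count and resource name are placeholders rather than my exact configuration:

# on 01: recreate the backing device as RAID5, leaving the suspect disk out
mdadm --create /dev/md4 --level=5 --raid-devices=11 /dev/sda /dev/sdb ... /dev/sdn
# fresh drbd metadata with default values, then bring the resource up
drbdadm create-md r1
drbdadm up r1
# on 02 (still primary): make it the source of the full resync
drbdadm invalidate-remote r1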

Comparing values from mdadm --detail:

 Host | Array Size (blocks) | Used Dev Size (blocks)
------+---------------------+------------------------
 01   |      19,533,824,000 |          1,953,382,400
 02   |      19,533,834,240 |          1,953,383,424
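
To pull out just those two lines on each node, something like this works:

mdadm --detail /dev/md4 | grep -E 'Array Size|Used Dev Size'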

Next, I recreated the array again as RAID6, like it was before. Still, there's a size difference.

Inspecting the arrays on both machines 01 (new array) and 02 (old array) with mdadm --detail and shoving them through diff shows:

--- 01  2023-01-19 13:37:48.552858896 +0100
+++ 02  2023-01-19 13:34:58.098143189 +0100
@@ -1,17 +1,17 @@
 /dev/md4:
            Version : 1.2
-     Creation Time : Thu Jan 19 13:37:11 2023
+     Creation Time : Fri Nov 26 11:23:33 2021
         Raid Level : raid6
-        Array Size : 19533824000 (18628.91 GiB 20002.64 GB)
-     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
+        Array Size : 19533834240 (18628.92 GiB 20002.65 GB)
+     Used Dev Size : 1953383424 (1862.89 GiB 2000.26 GB)
       Raid Devices : 12
      Total Devices : 12
        Persistence : Superblock is persistent
 
      Intent Bitmap : Internal
 
-       Update Time : Thu Jan 19 13:37:24 2023
-             State : clean, resyncing 
+       Update Time : Thu Jan 19 13:34:11 2023
+             State : active 
     Active Devices : 12
    Working Devices : 12
     Failed Devices : 0
@@ -22,22 +22,20 @@
 
 Consistency Policy : bitmap
 
-     Resync Status : 0% complete
-
               Name : 4
-              UUID : 4a730173:b97ac886:8194cbed:f30861d2
-            Events : 3
+              UUID : d6e60c19:5b08166e:a6b78c2b:a2676f7d
+            Events : 4580
 
     Number   Major   Minor   RaidDevice State
        0       8        0        0      active sync   /dev/sda
        1       8       16        1      active sync   /dev/sdb
-       2       8       64        2      active sync   /dev/sde
+       2       8       48        2      active sync   /dev/sdd
        3       8       32        3      active sync   /dev/sdc
-       4       8       48        4      active sync   /dev/sdd
-       5       8       80        5      active sync   /dev/sdf
-       6       8      144        6      active sync   /dev/sdj
-       7       8      240        7      active sync   /dev/sdp
-       8       8      128        8      active sync   /dev/sdi
-       9      65        0        9      active sync   /dev/sdq
-      10       8      176       10      active sync   /dev/sdl
+       4       8       80        4      active sync   /dev/sdf
+       5       8      128        5      active sync   /dev/sdi
+       6       8      112        6      active sync   /dev/sdh
+       7       8      144        7      active sync   /dev/sdj
+       8       8      160        8      active sync   /dev/sdk
+       9       8      176        9      active sync   /dev/sdl
+      10       8      192       10      active sync   /dev/sdm
       11       8      208       11      active sync   /dev/sdn
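
The second attempt and the comparison were done roughly like this (device list again illustrative; 01 and 02 are the saved mdadm --detail outputs of the respective nodes):

# on 01: recreate as RAID6 with all 12 disks, as before
mdadm --create /dev/md4 --level=6 --raid-devices=12 /dev/sda /dev/sdb ... /dev/sdn
mdadm --detail /dev/md4 > 01
# 02 was captured on the other node and copied over
diff -u 01 02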

Forcing the Used Dev Size value from 02 yields:

mdadm: /dev/sda is smaller than given size. 1953382400K < 1953383424K + metadata

This happens for multiple disk drives, not only for /dev/sda. How could this array have been functioning before, when mdadm now claims there is not enough space per device?
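
The failing attempt was essentially a create with --size (-z) pinned to 02's value; mdadm takes --size in KiB per device, which matches the 1953383424K in the error message (device list illustrative):

mdadm --create /dev/md4 --level=6 --raid-devices=12 --size=1953383424 /dev/sda /dev/sdb ... /dev/sdn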

Questions:

  • Why does the new array use 1024 blocks less space per device? (See the arithmetic after this list.)
  • I have never used the mdadm option -z max. Testing shows that even with -z max the Used Dev Size stays at 1953382400 blocks.
  • Is there a way to force-create the array, with whatever mdadm compatibility option it takes, so that it provides enough space for an equally sized backing store for drbd again?
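
For what it's worth, the per-device difference lines up exactly with the size gap drbd reports, assuming the 10 data disks of the 12-disk RAID6:

1953383424 KiB - 1953382400 KiB = 1024 KiB missing per device
1024 KiB x 10 data devices      = 10240 KiB  (the Array Size difference)
10240 KiB x 1024 / 512          = 20480 sectors  (the exact gap in drbd's log)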