Some years ago, I set up a storage cluster on stock Debian Linux, using mdadm RAID as the backing store for the Debian-provided DRBD 8. Currently there are two RAID6 volumes, made up of 2 TB and 14 TB disks.
Meanwhile, disks have been added to both nodes to provide more space, defective disks have been replaced, and upgrades to newer Debian releases have been applied.
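For reference, each RAID array backs its own DRBD resource; a minimal sketch of such a resource definition (resource name, host names and addresses here are made up, DRBD 8 syntax assumed) looks roughly like this:

    resource r4 {
        device    /dev/drbd4;
        disk      /dev/md4;        # mdadm array as backing store
        meta-disk internal;        # DRBD metadata lives at the end of the backing device

        on node01 {
            address 192.168.10.1:7794;
        }
        on node02 {
            address 192.168.10.2:7794;
        }
    }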
Some background for context: lately I experienced unusually high load and lots of I/O timeouts on the active node 01. Watching the output of iostat -x 2 on 01, I observed one disk whose average queue size shot up for brief periods, much more than on any of the others. For fault isolation I manually made the secondary node 02 active and disconnected the DRBD network connection volume by volume. With that I finally narrowed the fault down to one of the two RAID volumes on the former primary node 01.
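The isolation steps looked roughly like this (a sketch; the resource name r4 is made up and the exact drbdadm invocations are from memory):

    # on 01: watch per-device utilization and average queue sizes
    iostat -x 2

    # fail over: demote 01, promote 02
    drbdadm secondary r4      # on 01
    drbdadm primary r4        # on 02

    # then take down one replication link at a time and watch whether the load moves
    drbdadm disconnect r4
    # ... observe iostat ...
    drbdadm connect r4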
I used smartctl -t long to run extended self-tests on the disks and checked the output with smartctl -a. There were no obvious values hinting at imminent failure.
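For each member disk, roughly (device names are placeholders):

    # kick off the extended (long) self-test on every member disk
    for d in /dev/sd[a-n]; do smartctl -t long "$d"; done

    # once the tests have finished, inspect results and SMART attributes per disk
    smartctl -a /dev/sda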
To rule out that particular disk, I recreated the volume as RAID5, leaving the suspicious disk out of the new array. After creating new DRBD metadata with default values (drbdadm create-md resname) and forcing the active node 02 to be the sync source, I got the following message in 02's log:
The peer's disk size is too small! (39066455672 < 39066476152 sectors)
The new volume is 20,480 sectors too small!
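The metadata recreation and resync attempt were roughly as follows (a sketch; the resource name r4 is made up and the exact force-sync-source invocation may differ between DRBD 8.x versions):

    # on 01, after rebuilding the md array: recreate DRBD metadata and bring the resource up
    drbdadm create-md r4
    drbdadm up r4

    # on 02: declare the local data good, making 02 the sync source
    drbdadm invalidate-remote r4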
Comparing values from mdadm --detail:
Host | Array Size (blocks) | Used Dev Size (blocks)
-----+---------------------+-----------------------
01   |      19,533,824,000 |          1,953,382,400
02   |      19,533,834,240 |          1,953,383,424
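The values were taken straight from mdadm --detail on each node, along the lines of:

    # run on each node for the array in question
    mdadm --detail /dev/md4 | grep -E 'Array Size|Used Dev Size'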
Next, I recreated the array as RAID6, like it was before. Still, there is a size difference. Inspecting the arrays on both machines, 01 (new array) and 02 (old array), with mdadm --detail and running the outputs through diff shows:
--- 01 2023-01-19 13:37:48.552858896 +0100
+++ 02 2023-01-19 13:34:58.098143189 +0100
@@ -1,17 +1,17 @@
/dev/md4:
Version : 1.2
- Creation Time : Thu Jan 19 13:37:11 2023
+ Creation Time : Fri Nov 26 11:23:33 2021
Raid Level : raid6
- Array Size : 19533824000 (18628.91 GiB 20002.64 GB)
- Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
+ Array Size : 19533834240 (18628.92 GiB 20002.65 GB)
+ Used Dev Size : 1953383424 (1862.89 GiB 2000.26 GB)
Raid Devices : 12
Total Devices : 12
Persistence : Superblock is persistent
Intent Bitmap : Internal
- Update Time : Thu Jan 19 13:37:24 2023
- State : clean, resyncing
+ Update Time : Thu Jan 19 13:34:11 2023
+ State : active
Active Devices : 12
Working Devices : 12
Failed Devices : 0
@@ -22,22 +22,20 @@
Consistency Policy : bitmap
- Resync Status : 0% complete
-
Name : 4
- UUID : 4a730173:b97ac886:8194cbed:f30861d2
- Events : 3
+ UUID : d6e60c19:5b08166e:a6b78c2b:a2676f7d
+ Events : 4580
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 8 16 1 active sync /dev/sdb
- 2 8 64 2 active sync /dev/sde
+ 2 8 48 2 active sync /dev/sdd
3 8 32 3 active sync /dev/sdc
- 4 8 48 4 active sync /dev/sdd
- 5 8 80 5 active sync /dev/sdf
- 6 8 144 6 active sync /dev/sdj
- 7 8 240 7 active sync /dev/sdp
- 8 8 128 8 active sync /dev/sdi
- 9 65 0 9 active sync /dev/sdq
- 10 8 176 10 active sync /dev/sdl
+ 4 8 80 4 active sync /dev/sdf
+ 5 8 128 5 active sync /dev/sdi
+ 6 8 112 6 active sync /dev/sdh
+ 7 8 144 7 active sync /dev/sdj
+ 8 8 160 8 active sync /dev/sdk
+ 9 8 176 9 active sync /dev/sdl
+ 10 8 192 10 active sync /dev/sdm
11 8 208 11 active sync /dev/sdn
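The listings above were captured into files named 01 and 02 and compared with diff -u, roughly like this (host names are placeholders):

    ssh node01 'mdadm --detail /dev/md4' > 01
    ssh node02 'mdadm --detail /dev/md4' > 02
    diff -u 01 02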
Forcing the Used Dev Size value from 02 when creating the array yields:
mdadm: /dev/sda is smaller than given size. 1953382400K < 1953383424K + metadata
This happens for multiple disk drives, not only for /dev/sda. How could this array ever have been functioning if mdadm now claims a lack of space per device?
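The forced creation attempt was along these lines (a sketch; the member list is the one shown for the new array above, and --size is given in KiB blocks):

    # try to recreate the RAID6 with the per-device size the old array reports
    mdadm --create /dev/md4 --level=6 --raid-devices=12 --size=1953383424 \
        /dev/sda /dev/sdb /dev/sde /dev/sdc /dev/sdd /dev/sdf \
        /dev/sdj /dev/sdp /dev/sdi /dev/sdq /dev/sdl /dev/sdn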
Questions:
- Why is the new array using 1024 blocks less space per device?
- I have never used the mdadm option -z max on any occasion. Testing shows that even with -z max the Used Dev Size stays at 1953382400 blocks (see the sketch after this list).
- Is there a way to force-create the array, with whatever mdadm compatibility option, so that it again provides an equally sized backing store for DRBD?
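The -z max test mentioned above was roughly (a sketch, reusing the member list from the forced-size attempt further up):

    # same 12 member disks as in the forced-size attempt above
    DISKS="/dev/sda /dev/sdb /dev/sde /dev/sdc /dev/sdd /dev/sdf /dev/sdj /dev/sdp /dev/sdi /dev/sdq /dev/sdl /dev/sdn"

    # ask mdadm for the maximum usable per-device size instead of a fixed one
    mdadm --create /dev/md4 --level=6 --raid-devices=12 --size=max $DISKS

    # Used Dev Size still comes out as 1953382400 blocks
    mdadm --detail /dev/md4 | grep 'Used Dev Size'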