
Why is Ceph not detecting the SSD devices on a new node?


I have installed a Ceph cluster (Quincy) that already has 2 nodes and 4 OSDs. Now I have added a 3rd host running Debian (Bullseye) to the cluster. The new host is detected correctly and runs a mon.

The problem is that no OSDs are listed on the new host, even though 2 disks should be available. When I run this command on one of my nodes:

$ sudo ceph orch device ls

I can only see the devices from the other nodes. The new node is not listed.

But lsblk shows the two available disks on the new host:

$ lsblk
sda      8:0    1 476.9G  0 disk 
sdb      8:16   1 476.9G  0 disk 
├─sdb1   8:17   1    16G  0 part [SWAP]
├─sdb2   8:18   1     1G  0 part /boot
└─sdb3   8:19   1 459.9G  0 part /
sdc      8:32   1 476.9G  0 disk 

I also tried the ceph-volume command on the new host, but it did not find any disks either:

$ sudo cephadm ceph-volume inventory
Inferring fsid e79............
Device Path               Size         Device nodes    rotates available Model name

I have already removed the new host and reinstalled it with a fresh OS, but I can't figure out why Ceph does not find any disks.

Is it possible that Ceph does not allow mixing nodes with SSD/SATA and SSD/NVMe?

The cephadm.log output during the ceph-volume inventory call does not seem to provide any additional information:

2022-12-08 00:15:15,432 7fdca25ac740 DEBUG ---------------------------------------------
cephadm ['ceph-volume', 'inventory']
2022-12-08 00:15:15,432 7fdca25ac740 DEBUG Using default config /etc/ceph/ceph.conf
2022-12-08 00:15:16,131 7fee4d4c8740 DEBUG ---------------------------------------------
cephadm ['check-host']
2022-12-08 00:15:16,131 7fee4d4c8740 INFO docker (/usr/bin/docker) is present
2022-12-08 00:15:16,131 7fee4d4c8740 INFO systemctl is present
2022-12-08 00:15:16,131 7fee4d4c8740 INFO lvcreate is present
2022-12-08 00:15:16,176 7fee4d4c8740 INFO Unit ntp.service is enabled and running
2022-12-08 00:15:16,176 7fee4d4c8740 INFO Host looks OK
2022-12-08 00:15:16,444 7f370bfbf740 DEBUG ---------------------------------------------
cephadm ['--image', '', 'ls']
2022-12-08 00:15:20,100 7fdca25ac740 INFO Inferring fsid 0f3cd66c-74e5-11ed-813b-901b0e95a162
2022-12-08 00:15:20,121 7fdca25ac740 DEBUG /usr/bin/docker: stdout|cc65afd6173a|v17|2022-10-18 01:41:41 +0200 CEST
2022-12-08 00:15:22,253 7f6f2e30a740 DEBUG ---------------------------------------------
cephadm ['gather-facts']
2022-12-08 00:15:22,482 7f82221ce740 DEBUG ---------------------------------------------
cephadm ['--image', '', 'list-networks']
2022-12-08 00:15:24,261 7fdca25ac740 DEBUG Using container info for daemon 'mon'
2022-12-08 00:15:24,261 7fdca25ac740 INFO Using ceph image with id 'cc65afd6173a' and tag 'v17' created on 2022-10-18 01:41:41 +0200 CEST

ceph-volume.log output:

[2022-12-07 23:24:00,496][ceph_volume.main][INFO  ] Running command: ceph-volume  inventory
[2022-12-07 23:24:00,499][ceph_volume.util.system][INFO  ] Executable lvs found on the host, will use /sbin/lvs
[2022-12-07 23:24:00,499][ceph_volume.process][INFO  ] Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S  -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size
[2022-12-07 23:24:00,569][ceph_volume.process][INFO  ] stdout NAME="sda" KNAME="sda" PKNAME="" MAJ:MIN="8:0" FSTYPE="" MOUNTPOINT="" LABEL="" UUID="" RO="0" RM="1" MODEL="Crucial_CT500MX2" SIZE="465.8G" STATE="running" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="0" SCHED="mq-deadline" TYPE="disk" DISC-ALN="0" DISC-GRAN="4K" DISC-MAX="2G" DISC-ZERO="0" PKNAME="" PARTLABEL=""
[2022-12-07 23:24:00,570][ceph_volume.process][INFO  ] stdout NAME="sda1" KNAME="sda1" PKNAME="sda" MAJ:MIN="8:1" FSTYPE="swap" MOUNTPOINT="[SWAP]" LABEL="" UUID="51f95805-2d5f-4cba-a885-775a0c19ad53" RO="0" RM="1" MODEL="" SIZE="32G" STATE="" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="0" SCHED="mq-deadline" TYPE="part" DISC-ALN="0" DISC-GRAN="4K" DISC-MAX="2G" DISC-ZERO="0" PKNAME="sda" PARTLABEL=""
[2022-12-07 23:24:00,570][ceph_volume.process][INFO  ] stdout NAME="sda2" KNAME="sda2" PKNAME="sda" MAJ:MIN="8:2" FSTYPE="ext3" MOUNTPOINT="/rootfs/boot" LABEL="" UUID="676438b6-3214-4c05-bc6b-94bd7a88c26f" RO="0" RM="1" MODEL="" SIZE="1G" STATE="" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="0" SCHED="mq-deadline" TYPE="part" DISC-ALN="0" DISC-GRAN="4K" DISC-MAX="2G" DISC-ZERO="0" PKNAME="sda" PARTLABEL=""
[2022-12-07 23:24:00,570][ceph_volume.process][INFO  ] stdout NAME="sda3" KNAME="sda3" PKNAME="sda" MAJ:MIN="8:3" FSTYPE="ext4" MOUNTPOINT="/rootfs" LABEL="" UUID="a251c9b0-a91c-4768-bd42-5730e032ce58" RO="0" RM="1" MODEL="" SIZE="432.8G" STATE="" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="0" SCHED="mq-deadline" TYPE="part" DISC-ALN="0" DISC-GRAN="4K" DISC-MAX="2G" DISC-ZERO="0" PKNAME="sda" PARTLABEL=""
[2022-12-07 23:24:00,570][ceph_volume.process][INFO  ] stdout NAME="sdb" KNAME="sdb" PKNAME="" MAJ:MIN="8:16" FSTYPE="" MOUNTPOINT="" LABEL="" UUID="" RO="0" RM="1" MODEL="Crucial_CT500MX2" SIZE="465.8G" STATE="running" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="0" SCHED="mq-deadline" TYPE="disk" DISC-ALN="0" DISC-GRAN="4K" DISC-MAX="2G" DISC-ZERO="0" PKNAME="" PARTLABEL=""
[2022-12-07 23:24:00,573][ceph_volume.util.system][INFO  ] Executable pvs found on the host, will use /sbin/pvs
[2022-12-07 23:24:00,573][ceph_volume.process][INFO  ] Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/pvs --noheadings --readonly --units=b --nosuffix --separator=";" -o pv_name,vg_name,pv_count,lv_count,vg_attr,vg_extent_count,vg_free_count,vg_extent_size
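One detail in the log above may be worth a closer look: every disk is reported with RM="1", meaning the kernel flags it as removable (the third column of the lsblk output shows the same). Some ceph-volume releases are reported to skip removable devices during inventory; that is an assumption here, not something confirmed in this thread. A minimal sketch for listing such disks, where `removable_disks` is a hypothetical helper name:

```shell
# Print the names of whole disks that are flagged as removable (RM="1").
# Reads lsblk key="value" pair lines on stdin; a leading ceph-volume log
# prefix in front of the pairs is tolerated.
removable_disks() (
    awk '{
        name = ""; rm = ""; type = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^NAME=/) name = $i
            else if ($i ~ /^RM=/) rm = $i
            else if ($i ~ /^TYPE=/) type = $i
        }
        # Only report whole disks, not partitions.
        if (rm == "RM=\"1\"" && type == "TYPE=\"disk\"") {
            gsub(/^NAME="|"$/, "", name)
            print name
        }
    }'
)
```

On a host it could be fed with `lsblk -P -o NAME,RM,TYPE | removable_disks`, or with the stdout lines copied from ceph-volume.log.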
Although I don't have a Quincy cluster yet, I don't think a mix of disks could be the problem here. What does cephadm.log tell you? It should contain some debug information about why the disks were filtered.
I have added the log output to my question. It seems the log does not provide additional information. It is strange that ceph-volume did not show any device. I have now tested it on different servers, and the problem seems to occur only on some hosts, so it appears to depend on the hardware. I have also opened a bug report for this.
I have also added the ceph-volume.log output, which shows more information: it does list the devices.
Interesting, I haven't seen that yet. Maybe it really is a bug; thanks for reporting it. One last thought, though: is it possible that those hosts have a different lvm.conf filter? I don't know if that makes sense, but it may be worth checking.
I checked the config with `cat /etc/lvm/lvm.conf`; it is identical on both test machines.
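Because the stock lvm.conf consists mostly of comments, comparing only the active filter settings can be easier than reading the whole file. A minimal sketch, assuming the default path /etc/lvm/lvm.conf; `lvm_filters` is a hypothetical helper name:

```shell
# Print the uncommented filter / global_filter lines of an lvm.conf file.
# The file path defaults to /etc/lvm/lvm.conf when no argument is given.
lvm_filters() (
    conf="${1:-/etc/lvm/lvm.conf}"
    # Commented-out examples start with '#', so they do not match.
    grep -E '^[[:space:]]*(global_)?filter[[:space:]]*=' "$conf" || true
)
```

Running `lvm_filters` on each host and comparing the output shows whether an LVM filter is excluding the disks; empty output means no filter is set in that file.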

After searching for quite some time without being able to detect the SAS devices in my node, I managed to bring my HDDs up as OSDs by adding them manually with the following commands:

cephadm shell
ceph orch daemon add osd --method raw host1:/dev/sda
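
When several disks on one host need the same treatment, the command above can be generated per device and reviewed before anything is run. A minimal sketch; `osd_add_commands` is a hypothetical helper, and the host and device names are placeholders:

```shell
# Print one "ceph orch daemon add osd --method raw" command per device.
# Nothing is executed; the output is meant to be reviewed and then run
# inside "cephadm shell".
osd_add_commands() (
    host="$1"
    shift
    for dev in "$@"; do
        printf 'ceph orch daemon add osd --method raw %s:%s\n' "$host" "$dev"
    done
)
```

For example, `osd_add_commands host1 /dev/sda /dev/sdb` prints two commands. After running them, `ceph osd tree` should list the new OSDs under the host.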