Bug 2181121
| Summary: | [cee/sd][cephadm] The dedicated DB device is not created for newly deployed OSDs in the non-collocated scenario. | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Geo Jose <gjose> |
| Component: | Cephadm | Assignee: | Adam King <adking> |
| Status: | CLOSED ERRATA | QA Contact: | Manisha Saini <msaini> |
| Severity: | high | Docs Contact: | Akash Raj <akraj> |
| Priority: | unspecified | ||
| Version: | 5.3 | CC: | adking, akraj, atoborek, bhull, bkunal, cephqe-warriors, linuxkidd, lithomas, milverma, msaini, seamurph, shtiwari, sostapov, tserlin, vereddy |
| Target Milestone: | --- | ||
| Target Release: | 5.3z2 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: |
.Replacing non-collocated OSDs with shared DB device works as expected
Previously, in `cephadm`, devices used as DB devices by OSDs were marked as unavailable and filtered out when deploying subsequent OSDs. Due to this, replacing an individual non-collocated OSD that used a shared DB device did not work, and the OSD was deployed as a collocated OSD instead.
With this fix, devices used as DB devices by OSDs are properly marked as Ceph devices and are no longer filtered out. Replacing non-collocated OSDs that use a shared DB device now works as expected.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-04-11 20:07:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2185621 | ||
|
Description
Geo Jose
2023-03-23 06:58:23 UTC
#### Additional info:
The dedicated DB device is not created for newly deployed OSDs in the non-collocated scenario.
=======================================================
Environment:
- Tested ceph version: 16.2.10-138.el8cp
- OSD configuration details:
~~~
[root@02-91-05-node1 ~]# ceph orch ps --service_name osd.osd_fast_big
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
osd.1 02-91-05-node1 running (23h) 4m ago 23h 26.0M 4096M 16.2.10-138.el8cp 8400da5f0ec0 3bbe1c4d989e <--Try to replace this OSD
osd.2 02-91-05-node2 running (23h) 4m ago 23h 29.0M 4096M 16.2.10-138.el8cp 8400da5f0ec0 6044ed0e865e
osd.4 02-91-05-node3 running (23h) 4m ago 23h 30.3M 4096M 16.2.10-138.el8cp 8400da5f0ec0 18c01c0e748b
osd.5 02-91-05-node1 running (23h) 4m ago 23h 25.1M 4096M 16.2.10-138.el8cp 8400da5f0ec0 c05075b024d4
osd.6 02-91-05-node3 running (23h) 4m ago 23h 30.5M 4096M 16.2.10-138.el8cp 8400da5f0ec0 8ba4125dc732
osd.7 02-91-05-node2 running (23h) 4m ago 23h 27.4M 4096M 16.2.10-138.el8cp 8400da5f0ec0 02bfb75e0549
[root@02-91-05-node1 ~]# ceph orch ls --service_name osd.osd_fast_big --export
service_type: osd
service_id: osd_fast_big
service_name: osd.osd_fast_big
placement:
label: osd
spec:
block_db_size: 4000000000
data_devices:
limit: 2
size: 18GB:21GB
db_devices:
size: 14GB:16GB
filter_logic: AND
objectstore: bluestore
[root@02-91-05-node1 ~]#
~~~
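For reference, a spec like this is typically managed declaratively: export it, edit if needed, and re-apply it with `ceph orch apply`. A minimal sketch under that assumption (the file name `osd_fast_big.yaml` is illustrative):
~~~
# Export the current spec to a file (file name is illustrative)
ceph orch ls --service_name osd.osd_fast_big --export > osd_fast_big.yaml
# Preview what the orchestrator would do, then apply the (possibly edited) spec
ceph orch apply -i osd_fast_big.yaml --dry-run
ceph orch apply -i osd_fast_big.yaml
~~~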
- Disk details from node1:
~~~
[root@02-91-05-node1 ~]# lsscsi
[0:0:0:2] disk QEMU QEMU HARDDISK 2.5+ /dev/sde
[0:0:0:3] disk QEMU QEMU HARDDISK 2.5+ /dev/sdd
[0:0:0:4] disk QEMU QEMU HARDDISK 2.5+ /dev/sdc
[0:0:0:5] disk QEMU QEMU HARDDISK 2.5+ /dev/sda
[0:0:0:6] disk QEMU QEMU HARDDISK 2.5+ /dev/sdb
[1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.5+ /dev/sr0
[2:0:0:0] disk ATA QEMU HARDDISK 2.5+ /dev/sdf <<---Free disk
[N:0:0:1] disk QEMU NVMe Ctrl__1 /dev/nvme0n1
[N:1:0:1] disk QEMU NVMe Ctrl__1 /dev/nvme1n1
[root@02-91-05-node1 ~]#
[ceph: root@02-91-05-node1 /]# ceph-volume lvm list
[...]
====== osd.1 =======
[block] /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
block device /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
block uuid vee5uv-7ATj-ecTC-zjFe-SFNI-gCjm-XC47nn
cephx lockbox secret
cluster fsid 2a07d5d0-a714-11ed-916a-525400af8347
cluster name ceph
crush device class
db device /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f
db uuid c4CDco-E1C6-uHOZ-0b3w-9WUf-ufr1-SK3qm4
encrypted 0
osd fsid b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
osd id 1
osdspec affinity osd_fast_big
type block
vdo 0
devices /dev/sda
[db] /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f
block device /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
block uuid vee5uv-7ATj-ecTC-zjFe-SFNI-gCjm-XC47nn
cephx lockbox secret
cluster fsid 2a07d5d0-a714-11ed-916a-525400af8347
cluster name ceph
crush device class
db device /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f
db uuid c4CDco-E1C6-uHOZ-0b3w-9WUf-ufr1-SK3qm4
encrypted 0
osd fsid b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
osd id 1
osdspec affinity osd_fast_big
type db
vdo 0
devices /dev/nvme1n1
[...]
[ceph: root@02-91-05-node1 /]#
~~~
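Since the root cause turned out to be how cephadm classifies the shared DB device, it can also help to look at the orchestrator's inventory view of the disks at this point. A hedged check, not part of the original report (the interesting detail is whether `nvme1n1` is reported as available or as a Ceph device):
~~~
# Orchestrator's inventory view of the disks on node1
ceph orch device ls 02-91-05-node1 --wide
# Refresh the inventory if the output looks stale
ceph orch device ls 02-91-05-node1 --refresh
~~~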
=======================================================
- For testing purposes, we will remove `osd.1`, which is running on node1. These are its device details:
~~~
====== osd.1 =======
[block] /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
devices /dev/sda
[db] /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f
devices /dev/nvme1n1
~~~
- Simulate a hardware issue by removing the disk; the OSD will eventually fail (to speed up the failure, restart the OSD daemon). An optional cluster-side check follows the output:
~~~
[root@02-91-05-node1 ~]# lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 20G 0 disk
└─ceph--cbd631ba--85a6--4ae3--90fd--18128111f5fa-osd--block--b687c9e6--d85d--4c4c--ae6e--cadc8c0d562c 253:3 0 20G 0 lvm
[root@02-91-05-node1 ~]# echo 1 > /sys/block/sda/device/delete
[root@02-91-05-node1 ~]# lsblk /dev/sda
lsblk: /dev/sda: not a block device
[root@02-91-05-node1 ~]#
[root@02-91-05-node1 ~]# ceph orch ps --service_name osd.osd_fast_big --daemon_id 1
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID
osd.1 02-91-05-node1 error 2m ago 0h - 4096M <unknown> <unknown>
[root@02-91-05-node1 ~]#
~~~
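Optionally, the failure can also be confirmed from the cluster side; a minimal check, not part of the original report:
~~~
# Confirm that osd.1 is reported down by the cluster
ceph osd tree down
ceph health detail
~~~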
- Remove the faulty OSD (`--zap` will also clear the DB device). This may take some time; a note on the `--replace` variant follows the output:
~~~
[root@02-91-05-node1 ~]# ceph orch osd rm 1 --force --zap
Scheduled OSD(s) for removal
[root@02-91-05-node1 ~]# ceph orch osd rm status
OSD HOST STATE PGS REPLACE FORCE ZAP DRAIN STARTED AT
1 02-91-05-node1 done, waiting for purge 0 False True True
[root@02-91-05-node1 ~]#
[root@02-91-05-node1 ~]# ceph orch osd rm status
No OSD remove/replace operations reported
[root@02-91-05-node1 ~]#
~~~
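As an aside, since this scenario is effectively an OSD replacement, the orchestrator also supports marking the removed OSD as destroyed so that its ID is reused by the replacement disk. A sketch of that variant (not what was run above):
~~~
# Alternative: keep the OSD ID reserved for the replacement disk
ceph orch osd rm 1 --replace --zap
ceph orch osd rm status
~~~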
- Since a free disk (`sdf`) is already available and the DB device was cleared in the previous step (with the `--zap` option), the spec should be applied automatically.
~~~
[root@02-91-05-node1 ~]# ceph orch ps --service_name osd.osd_fast_big
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
osd.1 02-91-05-node1 running (29s) 23s ago 29s 61.4M 4096M 16.2.10-138.el8cp 8400da5f0ec0 920ac5e8445a <<---New osd
osd.2 02-91-05-node2 running (0h) 5m ago 0h 29.0M 4096M 16.2.10-138.el8cp 8400da5f0ec0 6044ed0e865e
osd.4 02-91-05-node3 running (0h) 5m ago 0h 30.1M 4096M 16.2.10-138.el8cp 8400da5f0ec0 18c01c0e748b
osd.5 02-91-05-node1 running (0h) 23s ago 0h 25.8M 4096M 16.2.10-138.el8cp 8400da5f0ec0 c05075b024d4
osd.6 02-91-05-node3 running (0h) 5m ago 0h 30.4M 4096M 16.2.10-138.el8cp 8400da5f0ec0 8ba4125dc732
osd.7 02-91-05-node2 running (0h) 5m ago 0h 27.7M 4096M 16.2.10-138.el8cp 8400da5f0ec0 02bfb75e0549
[root@02-91-05-node1 ~]#
[ceph: root@02-91-05-node1 /]# ceph-volume lvm list
====== osd.1 =======
[block] /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472
block device /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472
block uuid pVXJ9P-wF7X-lL9o-exvo-t7NH-wsK1-df8bvT
cephx lockbox secret
cluster fsid 2a07d5d0-a714-11ed-916a-525400af8347
cluster name ceph
crush device class
encrypted 0
osd fsid 6d5d7241-c2cf-4d0e-9770-ab5e8608a472
osd id 1
osdspec affinity osd_fast_big
type block
vdo 0
devices /dev/sdf
[ceph: root@02-91-05-node1 /]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 20G 0 disk
`-ceph--d870aa44--88cc--4cc2--b36f--c09ba6cdaaa1-osd--block--51a5d283--0ce6--4f9a--b56a--d453d5550384 253:2 0 20G 0 lvm
sdc 8:32 0 20G 0 disk
`-ceph--3f8c787d--74cf--4030--aeda--55b314b1d0e9-osd--block--27280198--ad6c--44b1--8864--362b4fb526c3 253:5 0 20G 0 lvm
sdd 8:48 0 20G 0 disk
`-ceph--6654d961--44af--4003--ba53--fa85309bd045-osd--block--5485c5d5--5fba--462d--90c7--28019ddba01f 253:7 0 20G 0 lvm
sde 8:64 0 20G 0 disk
`-ceph--617096a3--b3dc--42bd--8747--b80fd85da496-osd--block--38aff90c--1b71--469a--b47f--872542130f4e 253:9 0 20G 0 lvm
sdf 8:80 0 20G 0 disk
`-ceph--84167a64--b20b--46ba--8ce2--84909462c5ca-osd--block--6d5d7241--c2cf--4d0e--9770--ab5e8608a472 253:4 0 20G 0 lvm <<----Newly created OSD(collocated).
sr0 11:0 1 1024M 0 rom
vda 252:0 0 20G 0 disk
|-vda1 252:1 0 1G 0 part /rootfs/boot
`-vda2 252:2 0 19G 0 part
|-rhel9-root 253:0 0 17G 0 lvm /rootfs
`-rhel9-swap 253:1 0 2G 0 lvm [SWAP]
nvme0n1 259:0 0 10G 0 disk
|-ceph--506b6221--0d85--404f--a8ef--c14604e74cbe-osd--db--90bf7133--70a8--4e46--9784--0b08dd6ccab0 253:8 0 3.7G 0 lvm
`-ceph--506b6221--0d85--404f--a8ef--c14604e74cbe-osd--db--e0225565--d939--4e1f--974f--3f66f0de4203 253:10 0 3.7G 0 lvm
nvme1n1 259:1 0 15G 0 disk
`-ceph--db62cd30--09b9--454c--b893--9fb6bfbb63bf-osd--db--c49c720c--8315--4057--941a--38cccedee3be 253:6 0 3.7G 0 lvm <<----NOT created DB even though space is available.
[ceph: root@02-91-05-node1 /]#
~~~
From the above data, I can see that the DB device was not created (the OSD was created as collocated instead of non-collocated).
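The same conclusion can be cross-checked from the OSD metadata; a hedged sketch, not part of the original report (`bluefs_dedicated_db` would be `1` for a non-collocated OSD):
~~~
# "bluefs_dedicated_db": "1" would indicate a separate DB device; here it reports "0"
ceph osd metadata 1 | grep -E 'bluefs_dedicated_db|devices'
~~~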
- These are the cephadm logs I can see (note the `lvm batch` invocation, discussed after the log):
~~~
2023-03-23 10:53:45,918 7f6859fe0b80 DEBUG --------------------------------------------------------------------------------
cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=osd_fast_big', '--image', 'registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:8aed15890a6b27a02856e66bf13611a15e6dba71c781a0ae09b3ecc8616ab8fa', 'ceph-volume', '--fsid', '2a07d5d0-a714-11ed-916a-525400af8347', '--config-json', '-', '--', 'lvm', 'batch', '--no-auto', '/dev/sdf', '--block-db-size', '4000000000', '--yes', '--no-systemd']
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: --> passed data devices: 1 physical, 0 LVM
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: --> relative data size: 1.0
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-authtool --gen-print-key
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6d5d7241-c2cf-4d0e-9770-ab5e8608a472
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-84167a64-b20b-46ba-8ce2-84909462c5ca /dev/sdf
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: stdout: Physical volume "/dev/sdf" successfully created.
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: stdout: Volume group "ceph-84167a64-b20b-46ba-8ce2-84909462c5ca" successfully created
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 5119 -n osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 ceph-84167a64-b20b-46ba-8ce2-84909462c5ca
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: stdout: Logical volume "osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472" created.
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-authtool --gen-print-key
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-4
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ln -s /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 /var/lib/ceph/osd/ceph-1/block
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-1/activate.monmap
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: stderr: got monmap epoch 3
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> Creating keyring file for osd.1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/keyring
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 1 --monmap /var/lib/ceph/osd/ceph-1/activate.monmap --keyfile - --osdspec-affinity osd_fast_big --osd-data /var/lib/ceph/osd/ceph-1/ --osd-uuid 6d5d7241-c2cf-4d0e-9770-ab5e8608a472 --setuser ceph --setgroup ceph
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: stderr: 2023-03-23T05:23:48.963+0000 7fb2dd194200 -1 bluestore(/var/lib/ceph/osd/ceph-1/) _read_fsid unparsable uuid
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> ceph-volume lvm prepare successful for: /dev/sdf
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 --path /var/lib/ceph/osd/ceph-1 --no-mon-config
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ln -snf /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 /var/lib/ceph/osd/ceph-1/block
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-4
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> ceph-volume lvm activate successful for osd ID: 1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> ceph-volume lvm create successful for: /dev/sdf
~~~
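Note the `ceph-volume lvm batch` call in the first log line: it was passed only the data device `/dev/sdf` and no `--db-devices` argument, which is why the OSD came up collocated. For a non-collocated deployment, one would expect the orchestrator to generate something along these lines (illustrative, not taken from the logs):
~~~
# Expected shape of the batch call for a non-collocated OSD (illustrative)
ceph-volume lvm batch --no-auto /dev/sdf \
    --db-devices /dev/nvme1n1 \
    --block-db-size 4000000000 \
    --yes --no-systemd
~~~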
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 5.3 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:1732

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.