Bug 2181121 - [cee/sd][cephadm] The dedicated DB device is not created for newly deployed OSDs in the non-collocated scenario.
Summary: [cee/sd][cephadm]The Dedicated db device is not creating for the newly deploy...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.3z2
Assignee: Adam King
QA Contact: Manisha Saini
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2185621
 
Reported: 2023-03-23 06:58 UTC by Geo Jose
Modified: 2024-04-24 18:57 UTC
CC List: 15 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.Replacing non-collocated OSDs with a shared DB device works as expected Previously, in `cephadm`, devices used as DB devices by OSDs were marked as unavailable and filtered out when deploying subsequent OSDs. Due to this, replacement of an individual non-collocated OSD that was using a shared DB device would not work and deployed the OSD as a collocated OSD. With this fix, devices used as DB devices by OSDs are properly marked as Ceph devices and are no longer filtered out. Replacing non-collocated OSDs that use a shared DB device now works as expected.
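For context, a quick way to check how cephadm classifies a device that is already in use as a shared DB device is sketched below; the hostname is the one from this report, and the exact output columns depend on the release:
~~~
# List the devices cephadm sees on the node; after the fix, the NVMe device
# backing the shared DB should be reported as a Ceph-owned device rather than
# being filtered out when the next OSD is deployed.
ceph orch device ls 02-91-05-node1 --wide --refresh

# Cross-check which OSDs currently use each device/LV.
cephadm shell -- ceph-volume lvm list
~~~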
Clone Of:
Environment:
Last Closed: 2023-04-11 20:07:59 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-6294 0 None None None 2023-03-23 06:59:30 UTC
Red Hat Knowledge Base (Solution) 6982812 0 None None None 2023-03-23 08:15:46 UTC
Red Hat Product Errata RHBA-2023:1732 0 None None None 2023-04-11 20:08:18 UTC

Description Geo Jose 2023-03-23 06:58:23 UTC
Description of problem:
 - The dedicated DB device is not created for newly deployed OSDs in the non-collocated scenario.
 - The DB devices are ignored, and the OSDs are created as collocated instead of non-collocated.
 - This happens when the OSD service specification contains filters.

Version-Release number of selected component (if applicable):
 - RHCS 5.3z1 / 16.2.10-138.el8cp

How reproducible:
Deploy non-collocated OSDs with an advanced service specification and filters, then redeploy any one of the OSDs.

Steps to Reproduce:
1. Deploy OSDs with an advanced service specification and filters (non-collocated scenario).
2. Replace one disk or redeploy one OSD.
3. Check `ceph-volume lvm list` for the newly deployed OSD (a command sketch of these steps follows below).
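A minimal command sketch of these steps; the spec file name is illustrative, and the OSD ID is the one replaced later in this report:
~~~
# 1. Apply the non-collocated OSD spec with filters (file name is illustrative).
ceph orch apply -i osd_fast_big.yaml

# 2. Remove one OSD and zap its devices so cephadm redeploys it from the spec.
ceph orch osd rm 1 --force --zap
ceph orch osd rm status

# 3. Check whether the redeployed OSD received a dedicated [db] volume.
cephadm shell -- ceph-volume lvm list
~~~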

Actual results:
 - The DB devices are ignored, and the OSDs are created as collocated instead of non-collocated.

Expected results:
 - The OSDs should be deployed as per the given specification.

Comment 1 Geo Jose 2023-03-23 07:02:36 UTC
Additional info:

The dedicated DB device is not created for newly deployed OSDs in the non-collocated scenario.

=======================================================
Environment: 

- Tested ceph version: 16.2.10-138.el8cp

- OSD configuration details:
~~~
[root@02-91-05-node1 ~]# ceph orch ps --service_name osd.osd_fast_big
NAME   HOST            PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION            IMAGE ID      CONTAINER ID  
osd.1  02-91-05-node1         running (23h)     4m ago  23h    26.0M    4096M  16.2.10-138.el8cp  8400da5f0ec0  3bbe1c4d989e    <--Try to replace this OSD
osd.2  02-91-05-node2         running (23h)     4m ago  23h    29.0M    4096M  16.2.10-138.el8cp  8400da5f0ec0  6044ed0e865e  
osd.4  02-91-05-node3         running (23h)     4m ago  23h    30.3M    4096M  16.2.10-138.el8cp  8400da5f0ec0  18c01c0e748b  
osd.5  02-91-05-node1         running (23h)     4m ago  23h    25.1M    4096M  16.2.10-138.el8cp  8400da5f0ec0  c05075b024d4  
osd.6  02-91-05-node3         running (23h)     4m ago  23h    30.5M    4096M  16.2.10-138.el8cp  8400da5f0ec0  8ba4125dc732  
osd.7  02-91-05-node2         running (23h)     4m ago  23h    27.4M    4096M  16.2.10-138.el8cp  8400da5f0ec0  02bfb75e0549  
[root@02-91-05-node1 ~]# ceph orch ls --service_name osd.osd_fast_big --export
service_type: osd
service_id: osd_fast_big
service_name: osd.osd_fast_big
placement:
  label: osd
spec:
  block_db_size: 4000000000
  data_devices:
    limit: 2
    size: 18GB:21GB
  db_devices:
    size: 14GB:16GB
  filter_logic: AND
  objectstore: bluestore
[root@02-91-05-node1 ~]# 
~~~
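For reference, a spec like the one above can be exported and previewed before reapplying it; the file name below is only illustrative:
~~~
# Export the current OSD spec and preview what cephadm would do with it.
ceph orch ls --service_name osd.osd_fast_big --export > osd_fast_big.yaml
ceph orch apply -i osd_fast_big.yaml --dry-run
~~~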

- Disk details from node1:
~~~
[root@02-91-05-node1 ~]# lsscsi 
[0:0:0:2]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sde 
[0:0:0:3]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdd 
[0:0:0:4]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdc 
[0:0:0:5]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sda 
[0:0:0:6]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdb 
[1:0:0:0]    cd/dvd  QEMU     QEMU DVD-ROM     2.5+  /dev/sr0 
[2:0:0:0]    disk    ATA      QEMU HARDDISK    2.5+  /dev/sdf                         <<---Free disk
[N:0:0:1]    disk    QEMU NVMe Ctrl__1                          /dev/nvme0n1
[N:1:0:1]    disk    QEMU NVMe Ctrl__1                          /dev/nvme1n1
[root@02-91-05-node1 ~]#
[ceph: root@02-91-05-node1 /]# ceph-volume lvm list

[...]

====== osd.1 =======

  [block]       /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c

      block device              /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
      block uuid                vee5uv-7ATj-ecTC-zjFe-SFNI-gCjm-XC47nn
      cephx lockbox secret      
      cluster fsid              2a07d5d0-a714-11ed-916a-525400af8347
      cluster name              ceph
      crush device class        
      db device                 /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f
      db uuid                   c4CDco-E1C6-uHOZ-0b3w-9WUf-ufr1-SK3qm4
      encrypted                 0
      osd fsid                  b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
      osd id                    1
      osdspec affinity          osd_fast_big
      type                      block
      vdo                       0
      devices                   /dev/sda

  [db]          /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f

      block device              /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
      block uuid                vee5uv-7ATj-ecTC-zjFe-SFNI-gCjm-XC47nn
      cephx lockbox secret      
      cluster fsid              2a07d5d0-a714-11ed-916a-525400af8347
      cluster name              ceph
      crush device class        
      db device                 /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f
      db uuid                   c4CDco-E1C6-uHOZ-0b3w-9WUf-ufr1-SK3qm4
      encrypted                 0
      osd fsid                  b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
      osd id                    1
      osdspec affinity          osd_fast_big
      type                      db
      vdo                       0
      devices                   /dev/nvme1n1

[...]

[ceph: root@02-91-05-node1 /]# 
~~~

=======================================================

- For testing purposes, I will try to remove `osd.1`, which is running on node1. These are the device details:

~~~
====== osd.1 =======
  [block]       /dev/ceph-cbd631ba-85a6-4ae3-90fd-18128111f5fa/osd-block-b687c9e6-d85d-4c4c-ae6e-cadc8c0d562c
    devices                   /dev/sda
  [db]          /dev/ceph-db62cd30-09b9-454c-b893-9fb6bfbb63bf/osd-db-80a47374-0ecf-46bd-b088-a3568fe8698f      
    devices                   /dev/nvme1n1
~~~

- Simulate the hardware issue by removing the disk; the OSD will eventually fail (to speed up the failure, restart the OSD daemon):
~~~
[root@02-91-05-node1 ~]# lsblk /dev/sda 
NAME                                                                                                  MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda                                                                                                     8:0    0  20G  0 disk 
└─ceph--cbd631ba--85a6--4ae3--90fd--18128111f5fa-osd--block--b687c9e6--d85d--4c4c--ae6e--cadc8c0d562c 253:3    0  20G  0 lvm  
[root@02-91-05-node1 ~]# echo 1 > /sys/block/sda/device/delete 
[root@02-91-05-node1 ~]# lsblk /dev/sda 
lsblk: /dev/sda: not a block device
[root@02-91-05-node1 ~]# 
[root@02-91-05-node1 ~]# ceph orch ps --service_name osd.osd_fast_big --daemon_id 1
NAME   HOST            PORTS  STATUS  REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID   
osd.1  02-91-05-node1         error      2m ago   0h        -    4096M  <unknown>  <unknown>  
[root@02-91-05-node1 ~]# 
~~~

- Remove the faulty OSD (`--zap` will also clear the DB device). This may take some time:
~~~
[root@02-91-05-node1 ~]# ceph orch osd rm 1 --force --zap
Scheduled OSD(s) for removal
[root@02-91-05-node1 ~]# ceph orch osd rm status
OSD  HOST            STATE                    PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT  
1    02-91-05-node1  done, waiting for purge    0  False    True   True                    
[root@02-91-05-node1 ~]#
[root@02-91-05-node1 ~]# ceph orch osd rm status
No OSD remove/replace operations reported
[root@02-91-05-node1 ~]# 
~~~
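Note that `ceph orch osd rm` also supports `--replace`, which keeps the OSD ID reserved (the OSD is marked `destroyed`) instead of purging it; an alternative to the command above, shown here only as a sketch:
~~~
# Keep osd.1's ID reserved for the replacement device instead of purging it.
ceph orch osd rm 1 --replace --zap
ceph orch osd rm status
~~~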

- Since a free disk (sdf) is already available and the DB was cleared in the step above (with the `--zap` option), the spec should be applied automatically:
~~~
[root@02-91-05-node1 ~]#  ceph orch ps --service_name osd.osd_fast_big 
NAME   HOST            PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION            IMAGE ID      CONTAINER ID  
osd.1  02-91-05-node1         running (29s)    23s ago  29s    61.4M    4096M  16.2.10-138.el8cp  8400da5f0ec0  920ac5e8445a    <<---New osd
osd.2  02-91-05-node2         running (0h)      5m ago   0h    29.0M    4096M  16.2.10-138.el8cp  8400da5f0ec0  6044ed0e865e  
osd.4  02-91-05-node3         running (0h)      5m ago   0h    30.1M    4096M  16.2.10-138.el8cp  8400da5f0ec0  18c01c0e748b  
osd.5  02-91-05-node1         running (0h)     23s ago   0h    25.8M    4096M  16.2.10-138.el8cp  8400da5f0ec0  c05075b024d4  
osd.6  02-91-05-node3         running (0h)      5m ago   0h    30.4M    4096M  16.2.10-138.el8cp  8400da5f0ec0  8ba4125dc732  
osd.7  02-91-05-node2         running (0h)      5m ago   0h    27.7M    4096M  16.2.10-138.el8cp  8400da5f0ec0  02bfb75e0549  
[root@02-91-05-node1 ~]# 


[ceph: root@02-91-05-node1 /]# ceph-volume lvm list

====== osd.1 =======

  [block]       /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472

      block device              /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472
      block uuid                pVXJ9P-wF7X-lL9o-exvo-t7NH-wsK1-df8bvT
      cephx lockbox secret      
      cluster fsid              2a07d5d0-a714-11ed-916a-525400af8347
      cluster name              ceph
      crush device class        
      encrypted                 0
      osd fsid                  6d5d7241-c2cf-4d0e-9770-ab5e8608a472
      osd id                    1
      osdspec affinity          osd_fast_big
      type                      block
      vdo                       0
      devices                   /dev/sdf



[ceph: root@02-91-05-node1 /]# lsblk 
NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb                                                                                                     8:16   0   20G  0 disk 
`-ceph--d870aa44--88cc--4cc2--b36f--c09ba6cdaaa1-osd--block--51a5d283--0ce6--4f9a--b56a--d453d5550384 253:2    0   20G  0 lvm  
sdc                                                                                                     8:32   0   20G  0 disk 
`-ceph--3f8c787d--74cf--4030--aeda--55b314b1d0e9-osd--block--27280198--ad6c--44b1--8864--362b4fb526c3 253:5    0   20G  0 lvm  
sdd                                                                                                     8:48   0   20G  0 disk 
`-ceph--6654d961--44af--4003--ba53--fa85309bd045-osd--block--5485c5d5--5fba--462d--90c7--28019ddba01f 253:7    0   20G  0 lvm  
sde                                                                                                     8:64   0   20G  0 disk 
`-ceph--617096a3--b3dc--42bd--8747--b80fd85da496-osd--block--38aff90c--1b71--469a--b47f--872542130f4e 253:9    0   20G  0 lvm  
sdf                                                                                                     8:80   0   20G  0 disk 
`-ceph--84167a64--b20b--46ba--8ce2--84909462c5ca-osd--block--6d5d7241--c2cf--4d0e--9770--ab5e8608a472 253:4    0   20G  0 lvm                     <<----Newly created OSD(collocated).
sr0                                                                                                    11:0    1 1024M  0 rom  
vda                                                                                                   252:0    0   20G  0 disk 
|-vda1                                                                                                252:1    0    1G  0 part /rootfs/boot
`-vda2                                                                                                252:2    0   19G  0 part 
  |-rhel9-root                                                                                        253:0    0   17G  0 lvm  /rootfs
  `-rhel9-swap                                                                                        253:1    0    2G  0 lvm  [SWAP]
nvme0n1                                                                                               259:0    0   10G  0 disk 
|-ceph--506b6221--0d85--404f--a8ef--c14604e74cbe-osd--db--90bf7133--70a8--4e46--9784--0b08dd6ccab0    253:8    0  3.7G  0 lvm  
`-ceph--506b6221--0d85--404f--a8ef--c14604e74cbe-osd--db--e0225565--d939--4e1f--974f--3f66f0de4203    253:10   0  3.7G  0 lvm  
nvme1n1                                                                                               259:1    0   15G  0 disk 
`-ceph--db62cd30--09b9--454c--b893--9fb6bfbb63bf-osd--db--c49c720c--8315--4057--941a--38cccedee3be    253:6    0  3.7G  0 lvm                    <<----NOT created DB even though space is available.
[ceph: root@02-91-05-node1 /]# 
~~~
From the above data, I can see that the DB device was not created (the OSD was created as collocated instead of non-collocated).
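After the fix, the redeployed OSD should again show a dedicated `[db]` volume on the shared NVMe device; a hedged way to verify is:
~~~
# The redeployed OSD should list both a [block] and a [db] LV.
cephadm shell -- ceph-volume lvm list

# The new db LV should be carved out of the shared DB device (nvme1n1 here).
lsblk /dev/nvme1n1
~~~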



- These are the relevant cephadm logs:
~~~
2023-03-23 10:53:45,918 7f6859fe0b80 DEBUG --------------------------------------------------------------------------------
cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=osd_fast_big', '--image', 'registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:8aed15890a6b27a02856e66bf13611a15e6dba71c781a0ae09b3ecc8616ab8fa', 'ceph-volume', '--fsid', '2a07d5d0-a714-11ed-916a-525400af8347', '--config-json', '-', '--', 'lvm', 'batch', '--no-auto', '/dev/sdf', '--block-db-size', '4000000000', '--yes', '--no-systemd']


2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: --> passed data devices: 1 physical, 0 LVM
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: --> relative data size: 1.0
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-authtool --gen-print-key
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6d5d7241-c2cf-4d0e-9770-ab5e8608a472
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-84167a64-b20b-46ba-8ce2-84909462c5ca /dev/sdf
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman:  stdout: Physical volume "/dev/sdf" successfully created.
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman:  stdout: Volume group "ceph-84167a64-b20b-46ba-8ce2-84909462c5ca" successfully created
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 5119 -n osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 ceph-84167a64-b20b-46ba-8ce2-84909462c5ca
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman:  stdout: Logical volume "osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472" created.
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-authtool --gen-print-key
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
2023-03-23 10:53:51,170 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-4
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ln -s /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 /var/lib/ceph/osd/ceph-1/block
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-1/activate.monmap
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman:  stderr: got monmap epoch 3
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> Creating keyring file for osd.1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/keyring
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 1 --monmap /var/lib/ceph/osd/ceph-1/activate.monmap --keyfile - --osdspec-affinity osd_fast_big --osd-data /var/lib/ceph/osd/ceph-1/ --osd-uuid 6d5d7241-c2cf-4d0e-9770-ab5e8608a472 --setuser ceph --setgroup ceph
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman:  stderr: 2023-03-23T05:23:48.963+0000 7fb2dd194200 -1 bluestore(/var/lib/ceph/osd/ceph-1/) _read_fsid unparsable uuid
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> ceph-volume lvm prepare successful for: /dev/sdf
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 --path /var/lib/ceph/osd/ceph-1 --no-mon-config
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/ln -snf /dev/ceph-84167a64-b20b-46ba-8ce2-84909462c5ca/osd-block-6d5d7241-c2cf-4d0e-9770-ab5e8608a472 /var/lib/ceph/osd/ceph-1/block
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-4
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> ceph-volume lvm activate successful for osd ID: 1
2023-03-23 10:53:51,171 7f6859fe0b80 DEBUG /bin/podman: --> ceph-volume lvm create successful for: /dev/sdf
~~~
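Note that the `ceph-volume lvm batch` call in the log above only receives the data device (`/dev/sdf`); for a non-collocated deployment it should also be passed the shared DB device. A sketch of what such an invocation looks like, using the device paths from this node (`--report` only previews the layout):
~~~
# Non-collocated layout: data on /dev/sdf, DB carved out of /dev/nvme1n1.
# --report prints the planned layout without creating anything.
ceph-volume lvm batch --no-auto /dev/sdf \
    --db-devices /dev/nvme1n1 \
    --block-db-size 4000000000 \
    --report
~~~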

Comment 52 errata-xmlrpc 2023-04-11 20:07:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.3 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:1732

Comment 55 Red Hat Bugzilla 2023-10-13 04:25:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

