
Bug 2067952

Summary: [CEE/SD][ceph-volume] ceph-volume fails to remove LVM tags when removing an OSD.
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: Ceph-Volume
Version: 5.0
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Target Milestone: ---
Target Release: 5.1z3
Reporter: Geo Jose <gjose>
Assignee: Guillaume Abrioux <gabrioux>
QA Contact: Ameena Suhani S H <amsyedha>
Docs Contact:
CC: ceph-eng-bugs, gabrioux, mmuench, msaini
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 2099281 (view as bug list)
Environment:
Last Closed: 2022-06-20 12:48:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2099281

Description Geo Jose 2022-03-24 06:15:30 UTC
Description of problem:
 - ceph-volume fails to remove the LVM tags when an OSD is removed.
 - After an OSD is removed from RHCS, `ceph-volume lvm list` reports incorrect results because the tags are not cleaned up.

Version-Release number of selected component (if applicable):
 - RHCS 5

How reproducible:
100%

Steps to Reproduce:
1. Install an RHCS 5 cluster and add OSDs.
2. Remove any of the OSDs.
3. Check the LVM tags.

Actual results:
 - The OSDs are removed from the Ceph cluster, but the LVM tags (which were added during OSD deployment) are not removed.

Expected results:
 - The tags added by ceph-volume during OSD deployment should be removed when the OSD is removed from the cluster.

Additional info:
 - Because of the stale tags, `ceph-volume lvm list` reports incorrect results.
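
As a possible manual workaround (a sketch, not part of this report: the LV name, the sample tag string, and the use of `lvchange --deltag` are assumptions), the leftover `ceph.*` tags could be cleared one by one, since `lvchange --deltag` takes exact tag values. The loop below only prints the commands it would run; on a real node you would feed it the output of `lvs --noheadings -o lv_tags <lv>` and drop the `echo`:

```shell
#!/usr/bin/env bash
# Hypothetical cleanup sketch: turn a comma-separated LV tag string
# (as shown by `lvs -o lv_tags`) into one `lvchange --deltag` command
# per tag. Commands are printed, not executed.
set -u

lv="vg_hdd/lv_hdd"   # assumed LV name, taken from this report
# Sample tag string in the format shown in this report (truncated):
tags="ceph.osd_id=0,ceph.cluster_name=ceph,ceph.type=block"

cmds=""
IFS=',' read -ra tag_list <<< "$tags"
for tag in "${tag_list[@]}"; do
  cmd="lvchange --deltag $tag $lv"
  echo "$cmd"
  cmds+="$cmd"$'\n'
done
```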

Comment 1 Geo Jose 2022-03-24 06:20:22 UTC
#### Test Results

1. Environment details:
~~~
[ceph: root@machine1 /]# lsblk -o NAME,SIZE,ROTA
NAME               SIZE ROTA
sda                 15G    1
`-vg_hdd-lv_hdd     15G    1
sr0               1024M    1
vda                 20G    1
|-vda1               1G    1
`-vda2              19G    1
  |-rhel-root       17G    1
  `-rhel-swap        2G    1
nvme0n1             10G    0
`-vg_nvme-lv_nvme   10G    0
[ceph: root@machine1 /]#
[ceph: root@machine1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.07315  root default
-5         0.02438      host machine1
 0    hdd  0.01459          osd.0          up   1.00000  1.00000
 1    ssd  0.00980          osd.1          up   1.00000  1.00000
-3         0.02438      host machine2
 4    hdd  0.01459          osd.4          up   1.00000  1.00000
 6    ssd  0.00980          osd.6          up   1.00000  1.00000
-7         0.02438      host machine3
 3    hdd  0.01459          osd.3          up   1.00000  1.00000
 8    ssd  0.00980          osd.8          up   1.00000  1.00000
[ceph: root@machine1 /]#

[ceph: root@machine1 /]# lvs -ao+devices,tags vg_hdd vg_nvme
  LV      VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices         LV Tags
  lv_hdd  vg_hdd  -wi-ao---- <15.00g                                                     /dev/sda(0)     ceph.block_device=/dev/vg_hdd/lv_hdd,ceph.block_uuid=lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98,ceph.osd_id=0,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
  lv_nvme vg_nvme -wi-ao---- <10.00g                                                     /dev/nvme0n1(0) ceph.block_device=/dev/vg_nvme/lv_nvme,ceph.block_uuid=0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=030bb560-834e-42d5-a34c-6de19eed7de6,ceph.osd_id=1,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
[ceph: root@machine1 /]# ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/vg_hdd/lv_hdd

      block device              /dev/vg_hdd/lv_hdd
      block uuid                lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98
      osd id                    0
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/sda

====== osd.1 =======

  [block]       /dev/vg_nvme/lv_nvme

      block device              /dev/vg_nvme/lv_nvme
      block uuid                0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  030bb560-834e-42d5-a34c-6de19eed7de6
      osd id                    1
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/nvme0n1
[ceph: root@machine1 /]#
~~~



2. Remove the OSDs:
~~~
[ceph: root@machine1 /]# ceph orch osd rm 0 1
Scheduled OSD(s) for removal
[ceph: root@machine1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.04877  root default
-5               0      host machine1
-3         0.02438      host machine2
 4    hdd  0.01459          osd.4          up   1.00000  1.00000
 6    ssd  0.00980          osd.6          up   1.00000  1.00000
-7         0.02438      host machine3
 3    hdd  0.01459          osd.3          up   1.00000  1.00000
 8    ssd  0.00980          osd.8          up   1.00000  1.00000
[ceph: root@machine1 /]#
~~~


3. Check LVM tags:
~~~
[ceph: root@machine1 /]# lvs -ao+devices,tags vg_hdd vg_nvme
  LV      VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices         LV Tags
  lv_hdd  vg_hdd  -wi-a----- <15.00g                                                     /dev/sda(0)     ceph.block_device=/dev/vg_hdd/lv_hdd,ceph.block_uuid=lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98,ceph.osd_id=0,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
  lv_nvme vg_nvme -wi-a----- <10.00g                                                     /dev/nvme0n1(0) ceph.block_device=/dev/vg_nvme/lv_nvme,ceph.block_uuid=0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=030bb560-834e-42d5-a34c-6de19eed7de6,ceph.osd_id=1,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
[ceph: root@machine1 /]#
~~~


4. Even after the OSDs are removed, `ceph-volume lvm list` still reports them:
~~~
[ceph: root@machine1 /]#  ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/vg_hdd/lv_hdd

      block device              /dev/vg_hdd/lv_hdd
      block uuid                lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98
      osd id                    0
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/sda

====== osd.1 =======

  [block]       /dev/vg_nvme/lv_nvme

      block device              /dev/vg_nvme/lv_nvme
      block uuid                0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  030bb560-834e-42d5-a34c-6de19eed7de6
      osd id                    1
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/nvme0n1
[ceph: root@machine1 /]#
~~~

Comment 2 Geo Jose 2022-03-24 06:25:21 UTC
#### Other impact caused by the stale tags

5. Add a new disk and deploy it as an OSD:
~~~
[ceph: root@machine1 /]# lsblk -o NAME,SIZE,ROTA
NAME               SIZE ROTA
sda                 15G    1
`-vg_hdd-lv_hdd     15G    1
sdb                 15G    1
sr0               1024M    1
vda                 20G    1
|-vda1               1G    1
`-vda2              19G    1
  |-rhel-root       17G    1
  `-rhel-swap        2G    1
nvme0n1             10G    0
`-vg_nvme-lv_nvme   10G    0
[ceph: root@machine1 /]# ceph orch daemon add osd machine1:/dev/sdb  <<==
Created osd(s) 0 on host 'machine1'
[ceph: root@machine1 /]#
[ceph: root@machine1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.06335  root default
-5         0.01459      host machine1
 0    hdd  0.01459          osd.0          up   1.00000  1.00000      <<==
-3         0.02438      host machine2
 4    hdd  0.01459          osd.4          up   1.00000  1.00000
 6    ssd  0.00980          osd.6          up   1.00000  1.00000
-7         0.02438      host machine3
 3    hdd  0.01459          osd.3          up   1.00000  1.00000
 8    ssd  0.00980          osd.8          up   1.00000  1.00000
~~~


6. "ceph-volume lvm list" provides wrong information: the stale /dev/vg_hdd/lv_hdd entry is still listed under the new osd.0:
~~~
[ceph: root@machine1 /]# ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/ceph-0cf65f13-6e64-40e5-94c7-8eff67f436fb/osd-block-0c83ea78-6c47-4706-a714-bc8b4712c038

      block device              /dev/ceph-0cf65f13-6e64-40e5-94c7-8eff67f436fb/osd-block-0c83ea78-6c47-4706-a714-bc8b4712c038
      block uuid                x9jFVg-1UXr-TXXE-dDOy-gC6i-w7oX-evZrOH
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  0c83ea78-6c47-4706-a714-bc8b4712c038
      osd id                    0
      osdspec affinity          None
      type                      block
      vdo                       0
      devices                   /dev/sdb

  [block]       /dev/vg_hdd/lv_hdd                                          <<==Wrong

      block device              /dev/vg_hdd/lv_hdd
      block uuid                lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98
      osd id                    0
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/sda

====== osd.1 =======

  [block]       /dev/vg_nvme/lv_nvme

      block device              /dev/vg_nvme/lv_nvme
      block uuid                0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  030bb560-834e-42d5-a34c-6de19eed7de6
      osd id                    1
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/nvme0n1
[ceph: root@machine1 /]#
~~~
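
One plausible way to clear the stale entries (an assumption, not confirmed in this report) is to zap the old LVs with `ceph-volume lvm zap`, which wipes the device metadata; adding `--destroy` would also remove the LV and VG. The sketch below only prints the commands, using the LV paths from this report; drop the `echo` to run them on a real node:

```shell
#!/usr/bin/env bash
# Sketch (LV paths assumed from this report): print the zap commands
# that would clear the leftover ceph.* metadata. Remove the `echo` to
# execute them; append --destroy to also remove the LV and VG.
set -u

cmds=""
for lv in /dev/vg_hdd/lv_hdd /dev/vg_nvme/lv_nvme; do
  cmd="ceph-volume lvm zap $lv"
  echo "$cmd"
  cmds+="$cmd"$'\n'
done
```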