Bug 2067952 - [CEE/SD][ceph-volume] ceph-volume fails to remove the lvm tags while removing the OSD.
Summary: [CEE/SD][ceph-volume] ceph-volume fails to remove the lvm tags while removing the OSD.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Volume
Version: 5.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 5.1z3
Assignee: Guillaume Abrioux
QA Contact: Ameena Suhani S H
URL:
Whiteboard:
Depends On:
Blocks: 2099281
 
Reported: 2022-03-24 06:15 UTC by Geo Jose
Modified: 2022-06-23 21:03 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2099281 (view as bug list)
Environment:
Last Closed: 2022-06-20 12:48:03 UTC
Embargoed:




Links:
Red Hat Issue Tracker RHCEPH-3846 (last updated 2022-03-24 06:23:44 UTC)

Description Geo Jose 2022-03-24 06:15:30 UTC
Description of problem:
 - ceph-volume fails to remove the LVM tags while removing an OSD.
 - After an OSD is removed from RHCS, ceph-volume lvm list reports incorrect results because the tags are not removed.

Version-Release number of selected component (if applicable):
 - RHCS 5

How reproducible:
100%

Steps to Reproduce:
1. Install an RHCS 5 cluster and add OSDs.
2. Remove any one of the OSDs.
3. Check the LVM tags on the logical volumes that backed the removed OSD (see the command sketch below).
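
A minimal sketch of the reproducer on a cephadm-managed cluster (the OSD id 0 and the vg_hdd/lv_hdd names are taken from the test environment in comment 1 and are only illustrative):
~~~
# Remove one OSD through the orchestrator (osd.0 as an example)
ceph orch osd rm 0

# Wait until the removal has completed
ceph orch osd rm status

# Inspect the LVM tags on the LV that backed the OSD; the ceph.* tags
# are expected to be gone, but they are still present
lvs -o lv_name,vg_name,lv_tags vg_hdd
~~~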

Actual results:
 - The OSDs are removed from the Ceph cluster, but the LVM tags (which were added during OSD deployment) are not removed.

Expected results:
 - The tags added by ceph-volume during OSD deployment should be removed when the OSD is cleaned up, that is, when it is removed from the cluster.

Additional info:
 - Because of this, ceph-volume lvm list provides incorrect results.
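
A possible workaround (untested here, so only a sketch): either let ceph-volume wipe the leftover LV, or strip the ceph.* tags by hand with lvchange. The vg_hdd/lv_hdd name below comes from the test environment in comment 1.
~~~
# Option 1: have ceph-volume zap the LV it created for the removed OSD
# (this also wipes any data left on the LV)
ceph-volume lvm zap --destroy /dev/vg_hdd/lv_hdd

# Option 2: delete only the leftover tags, keeping the LV itself
lvs --noheadings -o lv_tags vg_hdd/lv_hdd | tr ',' '\n' | while read -r tag; do
    [ -n "$tag" ] && lvchange --deltag "$tag" vg_hdd/lv_hdd
done
~~~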

Comment 1 Geo Jose 2022-03-24 06:20:22 UTC
#### Test Results

1. Environment details:
~~~
[ceph: root@machine1 /]# lsblk -o NAME,SIZE,ROTA
NAME               SIZE ROTA
sda                 15G    1
`-vg_hdd-lv_hdd     15G    1
sr0               1024M    1
vda                 20G    1
|-vda1               1G    1
`-vda2              19G    1
  |-rhel-root       17G    1
  `-rhel-swap        2G    1
nvme0n1             10G    0
`-vg_nvme-lv_nvme   10G    0
[ceph: root@machine1 /]#
[ceph: root@machine1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.07315  root default
-5         0.02438      host machine1
 0    hdd  0.01459          osd.0          up   1.00000  1.00000
 1    ssd  0.00980          osd.1          up   1.00000  1.00000
-3         0.02438      host machine2
 4    hdd  0.01459          osd.4          up   1.00000  1.00000
 6    ssd  0.00980          osd.6          up   1.00000  1.00000
-7         0.02438      host machine3
 3    hdd  0.01459          osd.3          up   1.00000  1.00000
 8    ssd  0.00980          osd.8          up   1.00000  1.00000
[ceph: root@machine1 /]#

[ceph: root@machine1 /]# lvs -ao+devices,tags vg_hdd vg_nvme
  LV      VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices         LV Tags
  lv_hdd  vg_hdd  -wi-ao---- <15.00g                                                     /dev/sda(0)     ceph.block_device=/dev/vg_hdd/lv_hdd,ceph.block_uuid=lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98,ceph.osd_id=0,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
  lv_nvme vg_nvme -wi-ao---- <10.00g                                                     /dev/nvme0n1(0) ceph.block_device=/dev/vg_nvme/lv_nvme,ceph.block_uuid=0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=030bb560-834e-42d5-a34c-6de19eed7de6,ceph.osd_id=1,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
[ceph: root@machine1 /]# ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/vg_hdd/lv_hdd

      block device              /dev/vg_hdd/lv_hdd
      block uuid                lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98
      osd id                    0
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/sda

====== osd.1 =======

  [block]       /dev/vg_nvme/lv_nvme

      block device              /dev/vg_nvme/lv_nvme
      block uuid                0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  030bb560-834e-42d5-a34c-6de19eed7de6
      osd id                    1
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/nvme0n1
[ceph: root@machine1 /]#
~~~



2. Delete the OSDs:
~~~
[ceph: root@machine1 /]# ceph orch osd rm 0 1
Scheduled OSD(s) for removal
[ceph: root@machine1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.04877  root default
-5               0      host machine1
-3         0.02438      host machine2
 4    hdd  0.01459          osd.4          up   1.00000  1.00000
 6    ssd  0.00980          osd.6          up   1.00000  1.00000
-7         0.02438      host machine3
 3    hdd  0.01459          osd.3          up   1.00000  1.00000
 8    ssd  0.00980          osd.8          up   1.00000  1.00000
[ceph: root@machine1 /]#
~~~
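
Before checking the tags it helps to confirm that the orchestrator has actually finished draining and removing the OSDs; a quick check (sketch, exact flags may vary by release):
~~~
# OSDs still queued for or undergoing removal; an empty result means
# the removal has completed
ceph orch osd rm status

# Confirm no osd daemons remain on the host
ceph orch ps machine1 --daemon-type osd
~~~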


3. Check LVM tags:
~~~
[ceph: root@machine1 /]# lvs -ao+devices,tags vg_hdd vg_nvme
  LV      VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices         LV Tags
  lv_hdd  vg_hdd  -wi-a----- <15.00g                                                     /dev/sda(0)     ceph.block_device=/dev/vg_hdd/lv_hdd,ceph.block_uuid=lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98,ceph.osd_id=0,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
  lv_nvme vg_nvme -wi-a----- <10.00g                                                     /dev/nvme0n1(0) ceph.block_device=/dev/vg_nvme/lv_nvme,ceph.block_uuid=0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=401ab5f4-9f90-11ec-98dc-5254002a8865,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=030bb560-834e-42d5-a34c-6de19eed7de6,ceph.osd_id=1,ceph.osdspec_affinity=ssd.machine1,ceph.type=block,ceph.vdo=0
[ceph: root@machine1 /]#
~~~


4. Even after the OSDs are removed, ceph-volume lvm list still reports them:
~~~
[ceph: root@machine1 /]#  ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/vg_hdd/lv_hdd

      block device              /dev/vg_hdd/lv_hdd
      block uuid                lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98
      osd id                    0
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/sda

====== osd.1 =======

  [block]       /dev/vg_nvme/lv_nvme

      block device              /dev/vg_nvme/lv_nvme
      block uuid                0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  030bb560-834e-42d5-a34c-6de19eed7de6
      osd id                    1
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/nvme0n1
[ceph: root@machine1 /]#
~~~
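
One way to confirm that these entries are stale, rather than belonging to live OSDs, is to cross-check the osd fsid from the tags against the cluster's OSD map (sketch; the fsid is copied from the output above):
~~~
# The fsid recorded in the leftover tags is no longer in the osdmap
ceph osd dump | grep -i ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98 \
    || echo "fsid not in osdmap: tags are stale"
~~~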

Comment 2 Geo Jose 2022-03-24 06:25:21 UTC
#### Other impact caused by the stale tags

5. Add a new disk and deploy it as an OSD:
~~~
[ceph: root@machine1 /]# lsblk -o NAME,SIZE,ROTA
NAME               SIZE ROTA
sda                 15G    1
`-vg_hdd-lv_hdd     15G    1
sdb                 15G    1
sr0               1024M    1
vda                 20G    1
|-vda1               1G    1
`-vda2              19G    1
  |-rhel-root       17G    1
  `-rhel-swap        2G    1
nvme0n1             10G    0
`-vg_nvme-lv_nvme   10G    0
[ceph: root@machine1 /]# ceph orch daemon add osd machine1:/dev/sdb  <<==
Created osd(s) 0 on host 'machine1'
[ceph: root@machine1 /]#
[ceph: root@machine1 /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.06335  root default
-5         0.01459      host machine1
 0    hdd  0.01459          osd.0          up   1.00000  1.00000      <<==
-3         0.02438      host machine2
 4    hdd  0.01459          osd.4          up   1.00000  1.00000
 6    ssd  0.00980          osd.6          up   1.00000  1.00000
-7         0.02438      host machine3
 3    hdd  0.01459          osd.3          up   1.00000  1.00000
 8    ssd  0.00980          osd.8          up   1.00000  1.00000
~~~
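
For completeness, a sketch of how the new device's availability can be confirmed before creating the OSD (output format may vary by release):
~~~
# Storage devices known to the orchestrator on machine1; /dev/sdb should
# be listed as available, /dev/sda as in use (LVM)
ceph orch device ls machine1
~~~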


6. "ceph-volume lvm list" provides wrong information:
~~~
[ceph: root@machine1 /]# ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/ceph-0cf65f13-6e64-40e5-94c7-8eff67f436fb/osd-block-0c83ea78-6c47-4706-a714-bc8b4712c038

      block device              /dev/ceph-0cf65f13-6e64-40e5-94c7-8eff67f436fb/osd-block-0c83ea78-6c47-4706-a714-bc8b4712c038
      block uuid                x9jFVg-1UXr-TXXE-dDOy-gC6i-w7oX-evZrOH
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  0c83ea78-6c47-4706-a714-bc8b4712c038
      osd id                    0
      osdspec affinity          None
      type                      block
      vdo                       0
      devices                   /dev/sdb

  [block]       /dev/vg_hdd/lv_hdd                                          <<==Wrong

      block device              /dev/vg_hdd/lv_hdd
      block uuid                lcCArZ-bWET-15dz-1sIc-o0US-v5bN-l0JJMg
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  ff1a32ea-53d7-4ae1-8329-b2f4d08bdc98
      osd id                    0
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/sda

====== osd.1 =======

  [block]       /dev/vg_nvme/lv_nvme

      block device              /dev/vg_nvme/lv_nvme
      block uuid                0MawfQ-bs04-XGZn-VtDc-2kh9-mFKc-UamAdQ
      cephx lockbox secret
      cluster fsid              401ab5f4-9f90-11ec-98dc-5254002a8865
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  030bb560-834e-42d5-a34c-6de19eed7de6
      osd id                    1
      osdspec affinity          ssd.machine1
      type                      block
      vdo                       0
      devices                   /dev/nvme0n1
[ceph: root@machine1 /]#
~~~
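
To tell which of the two osd.0 entries above belongs to the running daemon, the device recorded by the OSD itself can be compared with the lvm list output; a sketch using ceph osd metadata (its "devices" field names the physical disk backing the OSD):
~~~
# The running osd.0 reports sdb, so the /dev/vg_hdd/lv_hdd entry
# (backed by /dev/sda) is the leftover from the removed OSD
ceph osd metadata 0 | grep -E '"devices"|"osd_objectstore"'
~~~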

