Bug 1569413 - Add support to shrink-osd.yml to shrink OSDs deployed with ceph-volume [NEEDINFO]

Status: ASSIGNED
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.0
Hardware/OS: Unspecified / Unspecified
Priority/Severity: medium / unspecified
Target Milestone: z1
Target Release: 3.2
Assigned To: Noah Watkins
QA Contact: Vasishta
Docs Contact: Bara Ancincova
Duplicates: 1564444, 1643468, 1643927
Depends On:
Blocks: 1557269 1629656 1584264
Reported: 2018-04-19 04:59 EDT by leseb
Modified: 2018-10-31 10:45 EDT
CC: 13 users
Fixed In Version:
Doc Type: Known Issue
Doc Text:
.The `shrink-osd.yml` playbook currently has no support for removing OSDs created by `ceph-volume`
The `shrink-osd.yml` playbook assumes all OSDs are created by the `ceph-disk` utility. Consequently, OSDs deployed by using the `ceph-volume` utility cannot be shrunk. To work around this issue, remove OSDs deployed by using `ceph-volume` manually (a sketch of the manual steps follows the metadata below).
Type: Bug
Flags: nwatkins: needinfo? (adeza)


Attachments: ceph-volume.log (68.82 KB, text/plain), attached 2018-10-31 10:33 EDT by Noah Watkins
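For illustration, a minimal sketch of the manual removal mentioned in the Doc Text above, using osd.3 and the test_group/data-lv2 volume from the comments below as example names; `ceph osd purge` assumes a Luminous-based (RHCS 3.x) cluster, and the exact LV name will differ per deployment.

    # On a monitor node: mark the OSD out and let rebalancing finish.
    ceph osd out 3
    # On the OSD host: stop the daemon, then wipe its backing LV.
    systemctl stop ceph-osd@3
    ceph-volume lvm zap --destroy test_group/data-lv2
    # Back on a monitor node: remove the OSD from CRUSH, auth and the OSD map.
    ceph osd purge 3 --yes-i-really-mean-it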
Description leseb 2018-04-19 04:59:42 EDT
Description of problem:

The shrink-osd.yml playbook currently has no support for removing OSDs created by ceph-volume; it assumes all OSDs were created using ceph-disk.
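As an aside, one informal way to check how the OSDs on a host were deployed (not something the playbook does today; just a sketch):

    # LVM-based OSDs deployed with ceph-volume are reported here:
    sudo ceph-volume lvm list
    # ceph-disk-based OSDs live on plain GPT partitions and show up with:
    sudo ceph-disk list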

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 3 leseb 2018-09-25 11:40:20 EDT
*** Bug 1564444 has been marked as a duplicate of this bug. ***
Comment 4 leseb 2018-10-26 08:51:59 EDT
*** Bug 1643468 has been marked as a duplicate of this bug. ***
Comment 5 leseb 2018-10-29 09:37:43 EDT
*** Bug 1643927 has been marked as a duplicate of this bug. ***
Comment 6 Ramakrishnan Periyasamy 2018-10-29 09:44:54 EDT
Adding this info from bug 1643927 to make sure this part is not missed for ceph-volume OSD removal support.

After purging the cluster using purge-docker-cluster.yml, the cluster was purged without any issues, but the OSD (ceph-volume) LVM entries were not cleared properly from the bare-metal disks. The same issue should be seen with shrink-osd.yml too (a manual cleanup sketch follows the lsblk output below).

lsblk command output:

[ubuntu@host083 ~]$ lsblk
NAME                                              MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                 8:0    0 931.5G  0 disk 
└─sda1                                              8:1    0 931.5G  0 part /
sdb                                                 8:16   0 931.5G  0 disk 
└─sdb1                                              8:17   0 931.5G  0 part 
  └─ceph--ee79538f--30c1--4dbb--915c--f7c31a283fdc-osd--data--dd3ebb52--41ff--4dd9--82b9--820b706aa8ca
                                                  253:1    0 931.5G  0 lvm  
sdc                                                 8:32   0 931.5G  0 disk 
└─sdc1                                              8:33   0 931.5G  0 part 
  └─ceph--2a28630c--dbd9--4532--b85f--19022326f5ac-osd--data--5e1d755b--37d2--4674--aab2--f2eee8ffc7d8
                                                  253:0    0 931.5G  0 lvm
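A manual cleanup of the leftover LVM metadata shown above might look like this; a destructive sketch only, where the VG name is decoded from the device-mapper name in the lsblk output (doubled dashes collapse to single ones):

    # One option: let ceph-volume wipe everything on the disk, VG/LV included
    # (the whole-device zap also discussed in comment 13 below).
    sudo ceph-volume lvm zap --destroy /dev/sdb
    # Plain-LVM alternative: remove the VG, then clear the PV label.
    sudo vgremove -f ceph-ee79538f-30c1-4dbb-915c-f7c31a283fdc
    sudo pvremove /dev/sdb1
    # Repeat the same steps for /dev/sdc / /dev/sdc1.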
Comment 7 Noah Watkins 2018-10-30 19:15:34 EDT
Alfredo,

The case Ramakrishnan mentions looks like one that isn't handled by `ceph-volume zap` -- correct? For instance, say I want to purge osd.3:

NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                      8:0    0   50G  0 disk 
sdb                      8:16   0   50G  0 disk 
├─test_group-data--lv1 252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2 252:2    0 12.5G  0 lvm  /var/lib/ceph/osd/ceph-3

Then zap the storage for osd.3:

[vagrant@osd2 ~]$ sudo ceph-volume lvm zap --destroy test_group/data-lv2


But the logical volume is still present:


[vagrant@osd2 ~]$ lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                      8:0    0   50G  0 disk
sdb                      8:16   0   50G  0 disk
├─test_group-data--lv1 252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2 252:2    0 12.5G  0 lvm


It would seem consistent to remove the logical volume here, as happens for raw devices with --destroy, but I suspect this has been thought through before. Should we add an option for this? Parsing the JSON output of ceph-volume and deciding what to do with Ansible loops is fairly tedious.
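To make that tedium concrete, the lookup would be roughly the following; a sketch only, the JSON field names are from memory and worth double-checking, and jq is assumed to be available on the host:

    # Find the data LV backing osd.3 from the ceph-volume report, then remove it.
    OSD_ID=3
    LV_PATH=$(sudo ceph-volume lvm list --format json |
              jq -r --arg id "$OSD_ID" '.[$id][] | select(.type == "data") | .lv_path')
    sudo lvremove -f "$LV_PATH"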
Comment 8 Alfredo Deza 2018-10-30 19:27:57 EDT
`zap --destroy` should destroy the LVs, but the OSD should be stopped first. You didn't share the output/logs, but I am guessing it refused to destroy the LVs because they were in use.
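For reference, a couple of quick pre-checks before zapping (a sketch, using the osd.3 example from above) to confirm the LV is really no longer in use:

    systemctl is-active ceph-osd@3          # should report "inactive"
    findmnt /var/lib/ceph/osd/ceph-3        # should print nothing once unmounted
    sudo lvs -o lv_name,lv_attr test_group  # an 'o' in the attr bits means the LV is still open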
Comment 9 Noah Watkins 2018-10-30 19:33:55 EDT
Here are the logs. I stopped the OSDs first. Not destroying the LVs seems to align with what I read in the ceph-volume docs, but maybe I am not invoking it correctly? BTW, I hope I am using the BZ `needinfo` feature correctly here to grab your feedback!

[vagrant@osd2 ~]$ sudo ceph-volume lvm zap --destroy test_group/data-lv2   
                                                                                                                                        
--> Unmounting /var/lib/ceph/osd/ceph-3                                                                                                                                                                            
Running command: umount -v /var/lib/ceph/osd/ceph-3                                                                                                                                                                
 stderr: umount: /var/lib/ceph/osd/ceph-3 (/dev/mapper/test_group-data--lv2) unmounted                                                                                                                             
--> Zapping: /dev/test_group/data-lv2                                                                                                                                                                              
Running command: wipefs --all /dev/test_group/data-lv2
 stdout: /dev/test_group/data-lv2: 4 bytes were erased at offset 0x00000000 (xfs): 58 46 53 42
Running command: dd if=/dev/zero of=/dev/test_group/data-lv2 bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.00296541 s, 3.5 GB/s
Running command: lvchange --deltag ceph.type=data /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.journal_uuid=NWOSlT-Yvwq-VBn8-tuOr-ZuLW-euhL-3RORhe /dev/test_group/data-lv2                                                                                              
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.osd_id=3 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cluster_fsid=81c2cf74-7df3-4a46-a2ff-a0ddef4caf3f /dev/test_group/data-lv2                                                                                                
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cluster_name=ceph /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.osd_fsid=062f766b-5d69-4c27-be48-e5e26eb5c6cc /dev/test_group/data-lv2                                                                                                    
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.encrypted=0 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.data_uuid=rFCgPl-3qAc-JoUi-2B1W-iLkB-0qIS-z6Ggvw /dev/test_group/data-lv2                                                                                                 
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cephx_lockbox_secret= /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.crush_device_class=None /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.data_device=/dev/test_group/data-lv2 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.vdo=0 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.journal_device=/dev/journals/journal1 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
--> Zapping successful for: test_group/data-lv2

[vagrant@osd2 ~]$ lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                      8:0    0   50G  0 disk
sdb                      8:16   0   50G  0 disk
├─test_group-data--lv1 252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2 252:2    0 12.5G  0 lvm
Comment 10 Ramakrishnan Periyasamy 2018-10-31 04:38:54 EDT
From the QE side, we would request that this bz be fixed in 3.2 itself. After every purge we spend at least 45 minutes to 1 hour on re-imaging. For example, for the ceph-volume and ceph-ansible tests we need to create and destroy the cluster many times, and it is painful to spend an hour re-imaging every time.
Comment 11 Alfredo Deza 2018-10-31 09:31:35 EDT
Can you include /var/log/ceph/ceph-volume.log? The terminal output doesn't tell us the whole story.
Comment 12 Noah Watkins 2018-10-31 10:33 EDT
Created attachment 1499429 [details]
ceph-volume.log
Comment 13 Alfredo Deza 2018-10-31 10:34:48 EDT
@noah, the right thing to do here is to zap the whole device to get rid of the VGs and LVs.

In this case you want:

    ceph-volume lvm zap --destroy /dev/sdb
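For what it's worth, after a whole-device zap the VG and all LVs on it should be gone, which can be confirmed with (sketch):

    lsblk /dev/sdb                # no more test_group-* children expected
    sudo vgs | grep test_group    # no output expected
    sudo lvs | grep test_group    # no output expected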
Comment 14 Noah Watkins 2018-10-31 10:43:25 EDT
@alfredo

Indeed, that does work. But that doesn't seem to handle a case like this:

[vagrant@osd2 ~]$ lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                      8:0    0   50G  0 disk
sdb                      8:16   0   50G  0 disk
├─test_group-data--lv1 252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2 252:2    0 12.5G  0 lvm

where the OSD that had been using test_group/data-lv2 is removed but osd.2 remains -- that is, removal of a single OSD rather than a full purge. The lsblk output I posted here is what ceph-ansible creates for functional testing.

So it seems like either (1) we should not support that layout, (2) ceph-volume should have an option to run `lvremove` in this case, or (3) ceph-ansible runs lvremove itself (roughly the sequence sketched below). If I'm not missing something fundamental here, solving this with (2) seems vastly simpler than doing it in ceph-ansible.
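Expressed as plain commands, option (3) would amount to something like this per-OSD sequence; a sketch only, since in ceph-ansible it would be tasks in shrink-osd.yml rather than ad-hoc shell:

    # Zap just the one LV, then remove it; the VG is left alone because
    # test_group/data-lv1 (osd.2) still lives in it.
    sudo ceph-volume lvm zap --destroy test_group/data-lv2
    sudo lvremove -f test_group/data-lv2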
