Red Hat Bugzilla – Bug 1569413
Add support to shrink-osd.yml to shrink OSDs deployed with ceph-volume
Last modified: 2018-10-31 10:45:56 EDT
Description of problem: The shrink-osd.yml playbook currently has no support for removing OSDs created by ceph-volume; it assumes all OSDs were created with ceph-disk.
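A rough sketch of what the missing support could look like as a task in shrink-osd.yml (untested; `osd_to_kill` follows the playbook's existing variable, and this assumes a ceph-volume release whose `lvm zap` accepts `--osd-id` -- otherwise the backing LV has to be resolved first):

- name: zap and destroy devices of OSDs prepared with ceph-volume
  command: ceph-volume lvm zap --destroy --osd-id {{ item }}
  with_items: "{{ osd_to_kill.split(',') }}"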
*** Bug 1564444 has been marked as a duplicate of this bug. ***
*** Bug 1643468 has been marked as a duplicate of this bug. ***
*** Bug 1643927 has been marked as a duplicate of this bug. ***
Adding this info from bug 1643927 to make sure this part is not missed for ceph-volume OSD removal support. After purging the cluster with purge-docker-cluster.yml, the cluster was purged without any issues, but the ceph-volume OSD entries are not cleared properly from the bare-metal disks. The same issue should be observed with shrink-osd.yml.

lsblk command output:

[ubuntu@host083 ~]$ lsblk
NAME                                                                                                   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                      8:0    0 931.5G  0 disk
└─sda1                                                                                                   8:1    0 931.5G  0 part /
sdb                                                                                                      8:16   0 931.5G  0 disk
└─sdb1                                                                                                   8:17   0 931.5G  0 part
  └─ceph--ee79538f--30c1--4dbb--915c--f7c31a283fdc-osd--data--dd3ebb52--41ff--4dd9--82b9--820b706aa8ca 253:1    0 931.5G  0 lvm
sdc                                                                                                      8:32   0 931.5G  0 disk
└─sdc1                                                                                                   8:33   0 931.5G  0 part
  └─ceph--2a28630c--dbd9--4532--b85f--19022326f5ac-osd--data--5e1d755b--37d2--4674--aab2--f2eee8ffc7d8 253:0    0 931.5G  0 lvm
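For what it's worth, these leftovers can be cleared by hand; a minimal sketch as an Ansible task (the VG names below are decoded from the device-mapper names in the lsblk output above, with single dashes in place of the doubled ones; forcing vgremove is my assumption of what a purge should do, not the shipped behaviour):

- name: remove leftover ceph-volume volume groups
  command: vgremove -f {{ item }}
  with_items:
    - ceph-ee79538f-30c1-4dbb-915c-f7c31a283fdc
    - ceph-2a28630c-dbd9-4532-b85f-19022326f5ac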
Alfredo, this case mentioned by Ramakrishnan looks like a case that isn't handled by `ceph-volume zap` -- correct?

For instance, say I want to purge osd.3:

NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm  /var/lib/ceph/osd/ceph-3

Then zap the storage for osd.3:

[vagrant@osd2 ~]$ sudo ceph-volume lvm zap --destroy test_group/data-lv2

But the logical volume is still present:

[vagrant@osd2 ~]$ lsblk
NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm

It would seem consistent to remove the logical volume, as happens for raw devices with --destroy, but I suspect this has been thought through before. Should we add an option for this? Parsing through the JSON output of ceph-volume and deciding what to do with Ansible loops is fairly tedious.
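To make the tedium concrete, here is roughly what that would look like from Ansible (a sketch only; it assumes `ceph-volume lvm list --format json`, whose output is keyed by OSD id with an `lv_path` field per device, and a plain lvremove as the cleanup step):

- name: collect ceph-volume metadata
  command: ceph-volume lvm list --format json
  register: cv_list

- name: remove the LVs backing osd.3
  command: lvremove -f {{ item.lv_path }}
  with_items: "{{ (cv_list.stdout | from_json)['3'] }}"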
`zap --destroy` should destroy the LVs, but the OSD must be stopped first. You didn't share the output/logs, but I am guessing it refused to destroy the LVs because they were in use.
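In task form, that ordering would be (a sketch; the unit name assumes the standard ceph-osd@<id> systemd template):

- name: stop the OSD so nothing holds the LV open
  service:
    name: ceph-osd@3
    state: stopped

- name: zap and destroy the OSD's storage
  command: ceph-volume lvm zap --destroy test_group/data-lv2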
Here are the logs. I stopped the OSDs first. Not destroying the LVs seems to align with what I read in the ceph-volume docs, but maybe I am not invoking it correctly? BTW, I hope I am using the BZ `needsinfo` feature correctly here to grab your feedback!

[vagrant@osd2 ~]$ sudo ceph-volume lvm zap --destroy test_group/data-lv2
--> Unmounting /var/lib/ceph/osd/ceph-3
Running command: umount -v /var/lib/ceph/osd/ceph-3
 stderr: umount: /var/lib/ceph/osd/ceph-3 (/dev/mapper/test_group-data--lv2) unmounted
--> Zapping: /dev/test_group/data-lv2
Running command: wipefs --all /dev/test_group/data-lv2
 stdout: /dev/test_group/data-lv2: 4 bytes were erased at offset 0x00000000 (xfs): 58 46 53 42
Running command: dd if=/dev/zero of=/dev/test_group/data-lv2 bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.00296541 s, 3.5 GB/s
Running command: lvchange --deltag ceph.type=data /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.journal_uuid=NWOSlT-Yvwq-VBn8-tuOr-ZuLW-euhL-3RORhe /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.osd_id=3 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cluster_fsid=81c2cf74-7df3-4a46-a2ff-a0ddef4caf3f /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cluster_name=ceph /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.osd_fsid=062f766b-5d69-4c27-be48-e5e26eb5c6cc /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.encrypted=0 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.data_uuid=rFCgPl-3qAc-JoUi-2B1W-iLkB-0qIS-z6Ggvw /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cephx_lockbox_secret= /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.crush_device_class=None /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.data_device=/dev/test_group/data-lv2 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.vdo=0 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.journal_device=/dev/journals/journal1 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
--> Zapping successful for: test_group/data-lv2

[vagrant@osd2 ~]$ lsblk
NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm
As QE, we would request that this bug be fixed in 3.2 itself. After every purge we are spending at least 45 minutes to 1 hour on re-imaging. For example, for the ceph-volume and ceph-ansible tests we need to create and destroy the cluster many times, and it is a pain to spend an hour on re-imaging every time.
Can you include /var/log/ceph/ceph-volume.log? The terminal output doesn't tell us the whole story.
Created attachment 1499429 [details] ceph-volume.log
@noah the right thing to do here is to zap the device to get rid of the VGs and LVs. In this case you want:

ceph-volume lvm zap --destroy /dev/sdb
@alfredo Indeed, that does work. But that doesn't seem to handle a case like this:

[vagrant@osd2 ~]$ lsblk
NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm

where the OSD that had been using test_group/data-lv2 is removed, but osd.2 remains -- that is, a removal of a single OSD rather than a full purge. The lsblk output I posted here is what ceph-ansible creates for functional testing. So it seems like either (1) we should not support that layout, (2) ceph-volume should have an option to run `lvremove` in this case, or (3) ceph-ansible runs lvremove itself, as sketched below. If I'm not missing something fundamental here, solving this via (2) seems vastly simpler than doing it in ceph-ansible.
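For comparison, option (3) would mean ceph-ansible carrying a task like this for every such LV it discovers (a sketch; it assumes a plain lvremove is sufficient once the zap has already wiped the data):

- name: remove the zapped logical volume ourselves
  command: lvremove -f test_group/data-lv2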