Description of problem: The shrink-osd.yml playbook currently has no support for removing OSDs created by ceph-volume; it assumes all OSDs were created using ceph-disk.
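Additional info: as a rough illustration of what ceph-volume support would need, one way for the playbook to tell the two deployment types apart is to ask ceph-volume whether it knows about a given OSD id. This is only a hedged sketch: it assumes the JSON from `ceph-volume lvm list` is keyed by OSD id as a string (exact keys may vary by release) and that jq is available on the OSD node.

ID=3
# A non-empty entry for this id suggests a ceph-volume OSD; absence suggests ceph-disk.
if ceph-volume lvm list --format json | jq -e --arg id "$ID" 'has($id)' > /dev/null; then
    echo "osd.$ID was deployed by ceph-volume"
else
    echo "osd.$ID was probably deployed by ceph-disk"
fi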
*** Bug 1564444 has been marked as a duplicate of this bug. ***
*** Bug 1643468 has been marked as a duplicate of this bug. ***
*** Bug 1643927 has been marked as a duplicate of this bug. ***
Adding this info from bz 1643927 to make sure this part is not missed for ceph-volume OSD removal support. After purging the cluster using purge-docker-cluster.yml, the cluster got purged without any issues, but the OSD (ceph-volume) entries are not cleared properly from the baremetal disks. This issue should be observed in shrink-osd.yml too.

lsblk command output:

[ubuntu@host083 ~]$ lsblk
NAME                                                                                                   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                      8:0    0 931.5G  0 disk
└─sda1                                                                                                   8:1    0 931.5G  0 part /
sdb                                                                                                      8:16   0 931.5G  0 disk
└─sdb1                                                                                                   8:17   0 931.5G  0 part
  └─ceph--ee79538f--30c1--4dbb--915c--f7c31a283fdc-osd--data--dd3ebb52--41ff--4dd9--82b9--820b706aa8ca 253:1    0 931.5G  0 lvm
sdc                                                                                                      8:32   0 931.5G  0 disk
└─sdc1                                                                                                   8:33   0 931.5G  0 part
  └─ceph--2a28630c--dbd9--4532--b85f--19022326f5ac-osd--data--5e1d755b--37d2--4674--aab2--f2eee8ffc7d8 253:0    0 931.5G  0 lvm
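For anyone hitting this, a quick way to spot leftover ceph-volume LVs is to look for the ceph.* LVM tags that ceph-volume sets on the volumes it creates. A hedged sketch, assuming the tags are still present on the leftover volumes (they would not be if the LVs had already been zapped):

# List LV paths together with their tags and keep only ceph-tagged ones.
sudo lvs -o lv_path,lv_tags --noheadings | grep 'ceph\.'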
Alfredo, this case mentioned by Ramakrishnan looks like a case that isn't handled by `ceph-volume zap` -- correct? For instance, say I want to purge osd.3:

NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm  /var/lib/ceph/osd/ceph-3

Then zap the storage for osd.3:

[vagrant@osd2 ~]$ sudo ceph-volume lvm zap --destroy test_group/data-lv2

But the logical volume is still present:

[vagrant@osd2 ~]$ lsblk
NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm

It would seem consistent to remove the volume, as happens for raw devices with --destroy, but I suspect this has been thought through before. Should we add an option for this? Parsing through the JSON output of ceph-volume and deciding what to do with ansible loops is fairly tedious (see the sketch below).
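To illustrate the tedium: extracting just the data LV for osd.3 already needs something like the following. This is a hedged sketch, assuming the `ceph-volume lvm list` JSON is keyed by OSD id and each entry carries `type` and `lv_path` fields (exact keys may vary by release):

# Print the data LV path for osd.3, e.g. /dev/test_group/data-lv2.
ceph-volume lvm list --format json | jq -r '."3"[] | select(.type == "data") | .lv_path'

And ceph-ansible would still have to loop over every OSD id and device type (data, journal, db, wal) before it could run lvremove.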
`zap --destroy` should destroy the LVs, but the OSD should be stopped first. You didn't share the output/logs, but I am guessing it refused to destroy the LVs because they were in use.
Here are the logs. I stopped the OSDs first. Not destroying the LVs seems to align with what I read in the ceph-volume docs. But maybe I am not invoking it correctly? BTW, I hope I am using the BZ `needsinfo` feature correctly here to grab your feedback!

[vagrant@osd2 ~]$ sudo ceph-volume lvm zap --destroy test_group/data-lv2
--> Unmounting /var/lib/ceph/osd/ceph-3
Running command: umount -v /var/lib/ceph/osd/ceph-3
 stderr: umount: /var/lib/ceph/osd/ceph-3 (/dev/mapper/test_group-data--lv2) unmounted
--> Zapping: /dev/test_group/data-lv2
Running command: wipefs --all /dev/test_group/data-lv2
 stdout: /dev/test_group/data-lv2: 4 bytes were erased at offset 0x00000000 (xfs): 58 46 53 42
Running command: dd if=/dev/zero of=/dev/test_group/data-lv2 bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.00296541 s, 3.5 GB/s
Running command: lvchange --deltag ceph.type=data /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.journal_uuid=NWOSlT-Yvwq-VBn8-tuOr-ZuLW-euhL-3RORhe /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.osd_id=3 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cluster_fsid=81c2cf74-7df3-4a46-a2ff-a0ddef4caf3f /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cluster_name=ceph /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.osd_fsid=062f766b-5d69-4c27-be48-e5e26eb5c6cc /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.encrypted=0 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.data_uuid=rFCgPl-3qAc-JoUi-2B1W-iLkB-0qIS-z6Ggvw /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.cephx_lockbox_secret= /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.crush_device_class=None /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.data_device=/dev/test_group/data-lv2 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.vdo=0 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
Running command: lvchange --deltag ceph.journal_device=/dev/journals/journal1 /dev/test_group/data-lv2
 stdout: Logical volume test_group/data-lv2 changed.
--> Zapping successful for: test_group/data-lv2

[vagrant@osd2 ~]$ lsblk
NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm
As QE, we request that this bz be fixed in 3.2 itself. After every purge we spend at least 45 minutes to 1 hour on re-imaging. For example, for ceph-volume and ceph-ansible tests we need to create and destroy the cluster many times, and it is painful to spend an hour on re-imaging every time.
Can you include /var/log/ceph/ceph-volume.log? The terminal output doesn't tell us the whole story.
Created attachment 1499429 [details] ceph-volume.log
@noah the right thing to do here is to zap the device to get rid of the VGs and LVs. In this case you want:

ceph-volume lvm zap --destroy /dev/sdb
@alfredo Indeed, that does work. But that doesn't seem to handle a case like this:

[vagrant@osd2 ~]$ lsblk
NAME                     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                        8:0    0   50G  0 disk
sdb                        8:16   0   50G  0 disk
├─test_group-data--lv1   252:1    0   25G  0 lvm  /var/lib/ceph/osd/ceph-2
└─test_group-data--lv2   252:2    0 12.5G  0 lvm

where the OSD that had been using test_group/data-lv2 is removed, but osd.2 remains. That is, a removal of a single OSD rather than a full purge (zapping all of /dev/sdb here would also take out osd.2's LV). The lsblk output I posted here is what ceph-ansible is creating for functional testing. So it seems like either (1) we should not support that layout, (2) ceph-volume should have an option to run `lvremove` in this case, or (3) ceph-ansible runs lvremove itself (sketched below). If I'm not missing something fundamental here, solving this in (2) seems vastly simpler than in ceph-ansible.
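For the record, option (3) would amount to something like this on the OSD node -- a hedged sketch of what ceph-ansible could run, not what it currently does:

sudo systemctl stop ceph-osd@3                 # make sure the LV is no longer in use
sudo ceph-volume lvm zap test_group/data-lv2   # wipe the OSD data and strip the ceph.* tags
sudo lvremove -f test_group/data-lv2           # remove only this LV; test_group and data-lv1 survive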
This is now supported in https://github.com/ceph/ceph-ansible/pull/3280, but the issue of removing the LV/VG is pending changes in ceph-volume.
Merged upstream.
Hi Harish, what information do you need? I'm on PTO, but the work for this bug is upstream in ceph-ansible. Only the ability to do a full removal of logical volumes and partitions is missing, and that is handled by work being completed in ceph-volume.
(In reply to Noah Watkins from comment #19)
> Hi Harish, what information do you need? I'm on PTO, but the work for this
> bug is upstream in ceph-ansible. Only the ability to do a full removal of
> logical volumes and partitions is missing, and that is handled by work
> being completed in ceph-volume.

Sorry to bother you. We want to know by when this BZ will be in ON_QA state. Should I check that with Alfredo?
The work to destroy/remove LVs is already done by BZ 1644828 and committed to ceph-3.2-rhel-7 in RHEL dist-git:

http://pkgs.devel.redhat.com/cgit/rpms/ceph/commit/?id=371594a23d4fa8cadb4b462351d5ef201d7d2ab9

There isn't any ceph-volume work pending here for this to work. I'm unsure what needs to happen on the ceph-ansible side. Sebastien might know since Noah is out.
The work done by BZ 1644828 was a blocker for this, but it looks like that is done now according to comment #22, so I removed the blocker status here. I don't think there is anything else to do on this ticket. The ceph-volume work was for making sure the vg/lv is removed, and that should now happen transparently.
Why did we retarget this for rc? This was implemented way after the official dev freeze.
Ken: The backport for this is https://github.com/ceph/ceph-ansible/pull/3530, and I just updated it to fix the previous conflicts, so it is awaiting review. Note that the backport of https://github.com/ceph/ceph-ansible/pull/3280 into 3.2 renamed shrink-osd.yml (the ceph-disk based version) to shrink-osd-ceph-disk.yml, and shrink-osd.yml is now the ceph-volume version.
There is an issue in ceph-volume preventing this bug from being fixed. In zap.py, ceph-volume has incorrectly defined 'block' when it should be 'db', which causes the collection of related devices to always skip block.db. In addition, further enhancement of partition zapping is needed to prevent the following:

Running command: /usr/sbin/wipefs --all /dev/sdz2
 stderr: wipefs: error: /dev/sdz2: probing initialization failed: No such file or directory
-->  RuntimeError: command returned non-zero exit status: 1
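One plausible guard for the second problem -- hedged, since the actual fix in ceph-volume may well be different -- is to check that the partition's device node still exists before calling wipefs:

# Only wipe the partition if its block device node is still present.
if [ -b /dev/sdz2 ]; then
    wipefs --all /dev/sdz2
fi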
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3173