Bug 1644847

Summary: [RFE] ceph-volume zap enhancements based on the OSD ID instead of a device
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Sébastien Han <shan>
Component: Ceph-VolumeAssignee: Andrew Schoen <aschoen>
Status: CLOSED ERRATA QA Contact: Vasishta <vashastr>
Severity: high Docs Contact: Bara Ancincova <bancinco>
Priority: high    
Version: 3.2CC: adeza, aschoen, ceph-eng-bugs, ceph-qe-bugs, gmeno, kdreyer, nwatkins, pasik, seb, tchandra, tserlin, vashastr
Target Milestone: rcKeywords: FutureFeature
Target Release: 3.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.12-33.el7cp Ubuntu: ceph_12.2.12-30redhat1 Doc Type: Enhancement
Doc Text:
.New `ceph-volume lvm zap` options: `--osd-id` and `--osd-fsid`
The `ceph-volume lvm zap` command now supports the `--osd-id` and `--osd-fsid` options. Use these options to remove any devices for an OSD by providing its ID or FSID, respectively. This is especially useful if you do not know the actual device names or logical volumes in use by that OSD.
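
For example (a minimal sketch; the ID and FSID values below are illustrative):

  # zap all devices backing OSD 0, by ID
  ceph-volume lvm zap --destroy --osd-id 0
  # or by the OSD's FSID
  ceph-volume lvm zap --destroy --osd-fsid 6cc43680-4f6e-4feb-92ff-9c7ba204120e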
Last Closed: 2019-08-21 15:10:24 UTC
Type: Bug
Bug Blocks: 1569413, 1572933, 1726135, 1728710

Description Sébastien Han 2018-10-31 17:17:54 UTC
Description of problem:

Add a new call that zaps based on an OSD ID, for the case where we want to zap an OSD rather than a specific block device.

The idea would be to run "ceph-volume lvm zap 0", where 0 is the OSD ID.
ceph-volume will then:

* scan the OSD
* find all the LVs associated with that OSD
* remove all of those LVs

As a bonus, if there are no other LVs present on the VG/PV the OSD was removed from, we also purge the VG and PV.
If any LVs remain, we don't touch anything.
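
A rough manual equivalent of what this would automate (a sketch only; the VG/LV names and device path are hypothetical):

  # find the LVs backing the OSD (look for the osd.0 entry)
  ceph-volume lvm list
  # remove the LV, then the VG and PV if nothing else lives on them
  lvremove -f ceph-block-0/block-0
  vgremove ceph-block-0
  pvremove /dev/sdb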

Comment 3 seb 2018-12-11 16:47:41 UTC
After discussing this further with Alfredo, we agreed to drop support for --osd-id and keep only --osd-fsid.
Indeed, using --osd-fsid is much safer in the context of containers.
If someone deploys multiple clusters on the same machine, we could accidentally remove all the OSDs that share the same ID.
If ceph-volume detects multiple OSDs with the same ID, it will remove all of them, when we only want to remove a single one (the one from the cluster we targeted).

So using --osd-fsid allows us to precisely match the right OSD in all cases.
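
For illustration, the fsid-based flow could look like this (a sketch; the FSID is made up, and "osd fsid" is the field ceph-volume reports in its list output):

  # find the "osd fsid" of the OSD to remove
  ceph-volume lvm list
  # zap exactly one OSD, even if several clusters reuse the same OSD ID
  ceph-volume lvm zap --destroy --osd-fsid 6cc43680-4f6e-4feb-92ff-9c7ba204120e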

Comment 4 Alfredo Deza 2019-03-08 13:46:58 UTC
@Sebastien, could you please confirm that we wouldn't need this BZ to track the feature request to drop --osd-id in favor of --osd-fsid, since it isn't possible to run multiple clusters
on the same machine? (http://lists.ceph.com/pipermail/ceph-ansible-ceph.com/2019-January/000249.html)

Comment 5 Sébastien Han 2019-03-08 13:54:45 UTC
Alfredo, correct, we don't need this BZ to track the feature request to drop --osd-id in favor of --osd-fsid.
Even though ceph-ansible cannot run multiple clusters on the same machine, Rook can, so there is still a desire to drop --osd-id in favor of --osd-fsid.

Thanks.

Comment 16 Vasishta 2019-08-07 03:57:17 UTC
Thanks a lot, Alfredo.

Based on the inputs in Comment 14 and Comment 15, I'm moving this BZ to the VERIFIED state.

I've opened a new BZ, 1738379, to address the issue mentioned in Comment 13.

Taking off the needinfo flags; feel free to update if there are any concerns.


Regards,
Vasishta Shastry
QE, Ceph

Comment 21 errata-xmlrpc 2019-08-21 15:10:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538