Bug 2119513 - [4.8.z clone] OSD Removal template needs to expose option to force remove the OSD
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: OCS 4.8.15
Assignee: Subham Rai
QA Contact: Itzhak
URL:
Whiteboard:
Depends On: 2106025
Blocks: 2119522
Reported: 2022-08-18 17:25 UTC by Travis Nielsen
Modified: 2022-09-27 15:38 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
With this update, you can set a flag to `true` or `false` to indicate whether the OSD can be forcefully removed. By default, the flag is set to `false`, which ensures that no data is lost through accidental removal of an OSD. If the OSD removal fails and you are sure that the OSD needs to be removed, set the flag to `true` and run the job again.
Clone Of:
Environment:
Last Closed: 2022-09-27 15:37:51 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | red-hat-storage ocs-operator pull 1779 | 0 | None | open | Bug 2119513: [release 4.8] expose option to force remove the OSD in osd… | 2022-08-23 04:13:10 UTC
Red Hat Product Errata | RHBA-2022:6723 | 0 | None | None | None | 2022-09-27 15:38:00 UTC

Description Travis Nielsen 2022-08-18 17:25:25 UTC
This bug was initially created as a copy of Bug #2027826

I am copying this bug because: 

Requested for backport

This bug was initially created as a copy of Bug #2026007

I am copying this bug because: 

An OCS operator update is needed to expose an option to force removal of an OSD if Ceph indicates the OSD is not safe-to-destroy.

If https://bugzilla.redhat.com/show_bug.cgi?id=2027396 is approved for 4.9.z, we will also need this for 4.9.z.
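
The requested force option can be pictured as a small decision rule: remove the OSD when Ceph reports it as safe-to-destroy, and otherwise remove it only when the operator has explicitly set the force flag. The sketch below is illustrative only and is not the ocs-operator or removal-job code; the function name `should_purge`, the `force` parameter, and the injected `is_safe_to_destroy` callable are hypothetical names chosen for this example.

```python
from typing import Callable


def should_purge(
    osd_id: int,
    is_safe_to_destroy: Callable[[int], bool],  # hypothetical wrapper around Ceph's check
    force: bool = False,                        # hypothetical name for the force flag
) -> bool:
    """Decide whether the removal job may purge the OSD.

    With force left at False, the OSD is purged only if Ceph reports it as
    safe-to-destroy, which guards against accidental data loss.
    """
    if is_safe_to_destroy(osd_id):
        return True
    # Not safe to destroy: proceed only if the operator explicitly asked for it.
    return force


# Default behaviour: an unsafe OSD is left alone.
print(should_purge(3, lambda _id: False))              # False
# Second run of the job with the force flag set.
print(should_purge(3, lambda _id: False, force=True))  # True
```

Keeping the default at `False` means a failed removal has to be retried deliberately with the flag set.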


Description of problem (please be detailed as possible and provide log snippets):
Use the Ceph 'osd safe-to-destroy' and 'osd ok-to-stop' commands in the OSD purge job.

[1] mgr: implement 'osd safe-to-destroy' and 'osd ok-to-stop' commands
    https://github.com/ceph/ceph/pull/16976

An OSD is safe to destroy if:
- we have osd_stat for it
- osd_stat indicates no pgs stored
- all pgs are known
- no pgs map to it
i.e., overall data durability will not be affected.

An OSD is ok to stop if:
- we have the pg stats we need
- no PGs will drop below min_size
i.e., availability won't be immediately compromised.
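
The two sets of conditions above can be sketched as predicates over a simplified cluster model. This is a rough illustration under stated assumptions, not Ceph's mgr implementation: the `OsdStat` and `PlacementGroup` types, their fields, and the use of a single `acting` list per PG are all hypothetical simplifications made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OsdStat:
    num_pgs: int  # PGs the OSD reports as stored (hypothetical field)


@dataclass
class PlacementGroup:
    pgid: str
    acting: List[int]   # ids of the OSDs currently serving this PG
    min_size: int       # minimum replicas required for the PG to serve I/O
    known: bool = True  # False if no stats have been reported for this PG


def safe_to_destroy(osd_id: int, osd_stat: Optional[OsdStat],
                    pgs: List[PlacementGroup]) -> bool:
    """True only if all four 'safe to destroy' conditions hold."""
    if osd_stat is None:                  # we have osd_stat for it
        return False
    if osd_stat.num_pgs > 0:              # osd_stat indicates no pgs stored
        return False
    if not all(pg.known for pg in pgs):   # all pgs are known
        return False
    return all(osd_id not in pg.acting for pg in pgs)  # no pgs map to it


def ok_to_stop(osd_id: int, pgs: List[PlacementGroup]) -> bool:
    """True if stopping the OSD leaves every PG at or above min_size."""
    if not all(pg.known for pg in pgs):   # we have the pg stats we need
        return False
    for pg in pgs:
        if osd_id in pg.acting:
            remaining = len(pg.acting) - 1
            if remaining < pg.min_size:   # PG would drop below min_size
                return False
    return True


pgs = [PlacementGroup("1.0", [0, 1, 2], min_size=2),
       PlacementGroup("1.1", [1, 2, 3], min_size=2)]
print(safe_to_destroy(0, OsdStat(num_pgs=0), pgs))  # False: PG 1.0 still maps to osd.0
print(ok_to_stop(0, pgs))                           # True: PG 1.0 keeps 2 replicas
```

The example shows why the two checks differ: osd.0 can be stopped without hurting availability, yet it is not safe to destroy while a PG still maps to it.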

Comment 12 Subham Rai 2022-09-26 11:29:54 UTC
Changed the doc text a little. CC @agantony

Comment 18 errata-xmlrpc 2022-09-27 15:37:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.8.15 Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6723

