Bug 2119525

Summary: [4.9.z docs clone] OSD Removal template needs to expose option to force remove the OSD
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Travis Nielsen <tnielsen>
Component: documentation
Assignee: Agil Antony <agantony>
Status: CLOSED CURRENTRELEASE
QA Contact: Rachael <rgeorge>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.8
CC: agantony, kramdoss, ocs-bugs, odf-bz-bot, rcyriac, rgeorge, sheggodu
Target Milestone: ---
Target Release: ODF 4.9.11
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-03-09 12:47:11 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2119510, 2120260
Bug Blocks:

Description Travis Nielsen 2022-08-18 17:39:41 UTC
This bug was initially created as a copy of Bug #2027826

I am copying this bug because: 

The docs must be updated if the dependent fixes are made. 
As seen in https://github.com/red-hat-storage/ocs-operator/pull/1518/files and documented in 4.10 already, we need the 4.9 docs to also reflect the change.


This bug was initially created as a copy of Bug #2026007

I am copying this bug because: 

An OCS operator update is needed to expose an option to force removal of an OSD if Ceph indicates the OSD is not safe-to-destroy.

If https://bugzilla.redhat.com/show_bug.cgi?id=2027396 is approved for 4.9.z, we will also need this for 4.9.z.
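
For illustration only, the requested behavior can be sketched as a small decision function. This is a minimal sketch, not the operator's or the purge job's actual code: the function name may_purge_osd and its parameters are hypothetical, and it assumes the safe-to-destroy result has already been obtained from Ceph.

```python
def may_purge_osd(safe_to_destroy: bool, force: bool = False) -> bool:
    """Decide whether an OSD purge may proceed (hypothetical helper).

    safe_to_destroy: what Ceph reported for the OSD.
    force: the operator-supplied override this bug asks to expose.
    """
    # Proceed when Ceph says the OSD is safe to destroy, or when the
    # user has explicitly asked to force the removal anyway.
    return safe_to_destroy or force
```

Without the force option, an OSD that Ceph never reports as safe-to-destroy could not be removed through the job at all, which is the gap the option is meant to close.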


Description of problem (please be detailed as possible and provide log snippets):
Use the Ceph 'osd safe-to-destroy' and 'osd ok-to-stop' features in the OSD purge job.

[1] mgr: implement 'osd safe-to-destroy' and 'osd ok-to-stop' commands
    https://github.com/ceph/ceph/pull/16976

    An OSD is safe to destroy if:
      - we have osd_stat for it
      - osd_stat indicates no PGs stored
      - all PGs are known
      - no PGs map to it
    i.e., overall data durability will not be affected.

    An OSD is ok to stop if:
      - we have the PG stats we need
      - no PGs will drop below min_size
    i.e., availability won't be immediately compromised.
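
The two sets of conditions above can be sketched as predicates over a toy cluster model. This is a simplified illustration, not the Ceph mgr implementation: the PG and Cluster types, their fields, and the use of a single acting set per PG are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PG:
    acting: List[int]    # OSD ids this PG currently maps to (assumed model)
    min_size: int        # minimum replicas needed to keep serving I/O
    known: bool = True   # whether stats for this PG have been reported


@dataclass
class Cluster:
    # osd id -> number of PGs stored according to osd_stat;
    # a missing key means no osd_stat is available for that OSD.
    osd_num_pgs: Dict[int, int]
    pgs: List[PG]


def safe_to_destroy(cluster: Cluster, osd_id: int) -> bool:
    """True if destroying the OSD would not affect data durability."""
    if osd_id not in cluster.osd_num_pgs:        # we have osd_stat for it
        return False
    if cluster.osd_num_pgs[osd_id] != 0:         # osd_stat shows no PGs stored
        return False
    if not all(pg.known for pg in cluster.pgs):  # all PGs are known
        return False
    # no PGs map to it
    return not any(osd_id in pg.acting for pg in cluster.pgs)


def ok_to_stop(cluster: Cluster, osd_id: int) -> bool:
    """True if stopping the OSD would not immediately hurt availability."""
    if not all(pg.known for pg in cluster.pgs):  # we have the PG stats we need
        return False
    for pg in cluster.pgs:
        if osd_id in pg.acting:
            remaining = len(pg.acting) - pg.acting.count(osd_id)
            if remaining < pg.min_size:          # PG would drop below min_size
                return False
    return True
```

In this model an OSD can be ok to stop without being safe to destroy: stopping only requires that every PG keeps at least min_size replicas, while destroying requires that the OSD holds no data at all.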