Bug 2157593

Summary: [RFE] UX : orch host drain : handle/report active mgr drain in better way : currently reports stray daemon for a while
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Cephadm
Assignee: Adam King <adking>
Status: NEW ---
QA Contact:
Severity: low
Docs Contact:
Priority: unspecified
Version: 5.3
CC: cephqe-warriors, saraut
Target Milestone: ---
Keywords: FutureFeature
Target Release: 6.2
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vasishta 2023-01-02 09:07:54 UTC
Description of problem:
Ran ceph orch host drain on a node hosting the active mgr. The active mgr was eventually moved to a different host and a new standby mgr was configured, but in the meantime the mgr daemon on the drained host was reported as stray for a while.

>>    health: HEALTH_WARN
>>            Failed to apply 2 service(s): mds.cephfs,mon
>>            1 stray daemon(s) not managed by cephadm

Detail
>> [WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
>>    stray daemon mgr.e22-h18-b01-fc640.rdu2.scalelab.redhat.com.quesrw on host e22-h18-b01-fc640.rdu2.scalelab.redhat.com not managed by cephadm

This BZ requests better handling and/or reporting so that cluster health does not flag daemons as stray during a host drain, since the drain is a planned maintenance activity.
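
Until cephadm handles this itself, one possible stop-gap is to match the stray-daemon entries in the health detail output above against the host being drained. This is only a sketch: the function name stray_on_host is hypothetical (it is not part of ceph or cephadm), and it assumes the detail lines keep the "stray daemon <name> on host <host> not managed by cephadm" form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ceph or cephadm): read 'ceph health detail'
# output on stdin and print the names of stray daemons reported on host $1.
stray_on_host() {
    awk -v host="$1" '
        { sub(/^>>[ \t]*/, "") }   # tolerate the ">>" quoting used in this report
        $1 == "stray" && $2 == "daemon" && $5 == "host" && $6 == host { print $3 }
    '
}
```

On a live cluster this could be used as: ceph health detail | stray_on_host <drained-host>. Any name it prints belongs to a daemon on the host under drain, so the warning is likely a side effect of the drain rather than a genuinely unmanaged daemon.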
 
Version-Release number of selected component (if applicable):
16.2.10-87.el8cp

How reproducible:
Tried once.

Steps to Reproduce:
1. Configure a cluster and drain the node hosting the active mgr.

Actual results:
(Explained above)

Expected results:
Report the activities being performed in a clearer way, since the daemon going stray is the result of a planned maintenance activity.

Additional info:
# ceph orch host drain e22-h18-b01-fc640.xxx.xxx.xxx.com
Scheduled to remove the following daemons from host 'e22-h18-b01-fc640.xxx.xxx.xxx.com'
type                 id             
-------------------- ---------------
mon                  e22-h18-b01-fc640.rdu2.xxx.xxx.xxx.com
mgr                  e22-h18-b01-fc640.xxx.xxx.xxx.com.quesrw
alertmanager         e22-h18-b01-fc640
crash                e22-h18-b01-fc640
grafana              e22-h18-b01-fc640
node-exporter        e22-h18-b01-fc640
prometheus           e22-h18-b01-fc640