Bug 1964570

Summary: Auto deletion of mon canary pod stuck in Terminating state and also not reconciled by the operator, leading to stuck deployment
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Neha Berry <nberry>
Component: rook Assignee: Travis Nielsen <tnielsen>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Elad <ebenahar>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.8 CC: madam, muagarwa, ocs-bugs, odf-bz-bot
Target Milestone: --- Keywords: AutomationBackLog
Target Release: --- Flags: tnielsen: needinfo? (nberry)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-02 15:33:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 3 Travis Nielsen 2021-05-25 19:15:49 UTC
The Rook operator force deletes the canary pods, with the intention that they will stop as soon as possible and allow the real mon pods to start. This behavior has been the same for several releases, so I'm not sure what would have changed. If we cannot repro this, I would suggest closing it as won't fix.

If we do see it again, Rook could potentially force delete the mon canaries again if it sees they are stuck terminating for a long time.
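
A minimal sketch of the check suggested above, assuming the operator can list the canary pods together with their deletion timestamps. The `PodInfo` type, the `stuck_terminating` helper, the five-minute threshold, and the pod names are hypothetical illustrations, not Rook's actual code; in a real cluster the timestamp would come from each pod's `metadata.deletionTimestamp`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class PodInfo:
    """Hypothetical, simplified view of a pod (not a real Kubernetes client type)."""
    name: str
    # Time at which deletion was requested; None means the pod is not being deleted.
    deletion_timestamp: Optional[datetime]


def stuck_terminating(
    pods: List[PodInfo],
    now: datetime,
    threshold: timedelta = timedelta(minutes=5),  # assumed value, not taken from Rook
) -> List[str]:
    """Return the names of pods whose deletion was requested more than `threshold` ago."""
    return [
        pod.name
        for pod in pods
        if pod.deletion_timestamp is not None
        and now - pod.deletion_timestamp > threshold
    ]


if __name__ == "__main__":
    now = datetime(2021, 5, 25, 19, 15, tzinfo=timezone.utc)
    pods = [
        # Deletion requested 15 minutes ago: considered stuck.
        PodInfo("rook-ceph-mon-a-canary", now - timedelta(minutes=15)),
        # Deletion requested 2 minutes ago: still within the threshold.
        PodInfo("rook-ceph-mon-b-canary", now - timedelta(minutes=2)),
        # Not being deleted at all.
        PodInfo("rook-ceph-mon-c-canary", None),
    ]
    print(stuck_terminating(pods, now))  # prints ['rook-ceph-mon-a-canary']
```

Pods returned by such a check would be candidates for another forced delete, for example with `kubectl delete pod <name> --grace-period=0 --force`.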

Comment 4 Mudit Agarwal 2021-06-03 13:01:14 UTC
Neha, is this reproducible? If not, can we close this?
https://bugzilla.redhat.com/show_bug.cgi?id=1964570#c3

Comment 6 Mudit Agarwal 2021-06-10 10:06:03 UTC
Moving out after talking to Neha.

Comment 7 Travis Nielsen 2021-07-26 19:33:15 UTC
Neha, shall we close this, or could you get a repro?

Comment 8 Travis Nielsen 2021-08-02 15:33:36 UTC
Please reopen if you can repro.