Description of problem (please be as detailed as possible and provide log snippets):
[DR] Volumes get stuck in split-brain after Failover Action is initiated

Version of all relevant components (if applicable):
ODF: odr-cluster-operator.v4.9.0-164.ci
OCP: 4.9.0-0.nightly-2021-10-01-202059

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
yes

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy DR over 2 OCP clusters
2. Deploy an app
3. Perform the failover action

Actual results:
rbd images are in a split-brain state after failover

Expected results:
No rbd image should be in a split-brain state after failover

Additional info:
rbd image status
{
  "lastChecked": "2021-10-06T06:21:49Z",
  "summary": {
    "daemon_health": "OK",
    "health": "ERROR",
    "image_health": "ERROR",
    "states": {
      "error": 6
    }
  }
}

bash-4.4$ rbd mirror image status ocs-storagecluster-cephblockpool/csi-vol-e21a2369-25e1-11ec-94bc-0a580a8301c5
csi-vol-e21a2369-25e1-11ec-94bc-0a580a8301c5:
  global_id:   b8bb1ba4-7d03-4a88-aa57-2424112aa2b0
  state:       up+error
  description: split-brain
  service:     a on vmware-dccp-one-f84rh-worker-hkg99
  last_update: 2021-10-06 06:22:04
  peer_sites:
    name: afc6aaac-199c-472e-bf35-390eb2799b3e
    state: up+stopped
    description: local image is primary
    last_update: 2021-10-06 06:21:44
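For anyone trying to reproduce this, a small helper can flag the condition from captured output. The following is only a sketch under stated assumptions: the function names are hypothetical, the JSON is assumed to have the same shape as the summary pasted above, and the image status is assumed to be the plain-text output of `rbd mirror image status`, where the first `description:` field belongs to the local image.

```python
import json


def pool_has_image_errors(status_json: str) -> bool:
    """Return True if the pasted mirroring summary reports unhealthy images.

    Assumes the JSON shape shown in this report: a top-level "summary"
    object with "image_health" and an optional "states" counter map.
    """
    summary = json.loads(status_json)["summary"]
    error_count = summary.get("states", {}).get("error", 0)
    return summary.get("image_health") != "OK" or error_count > 0


def is_split_brain(image_status_text: str) -> bool:
    """Return True if the local image description is 'split-brain'.

    Only the first 'description:' line is inspected, because later ones
    (under peer_sites) describe the peer, not the local image.
    """
    for line in image_status_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "description":
            return value.strip() == "split-brain"
    return False
```

Feeding the two outputs from this report into these helpers would mark both the pool summary and the image as affected; the peer-side description ("local image is primary") is deliberately ignored.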
Not a blocker as it is not reproducible.
Although not a test blocker, we should try fixing this for 4.9.
OK, since this has not been reproduced, removing the blocker flag. Will restore it if the issue is seen again.
Not a blocker and not reproducible. Please reopen if this is seen again.