Description of problem (please be as detailed as possible and provide log snippets):

In my deployment I see this:

pbalogh@MacBook-Pro ocs45 $ oc get pvc -n openshift-storage
NAME                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
db-noobaa-db-0                 Bound    pvc-6fbacf26-a00d-462d-94d1-8d99fa6516d8   50Gi       RWO            ocs-storagecluster-ceph-rbd   15h
ocs-deviceset-0-data-0-q8k4g   Bound    pvc-d8c2eb68-0419-4f02-88a1-d9516eeacb8f   512Gi      RWO            thin                          15h
ocs-deviceset-1-data-0-hswzn   Bound    pvc-3c26fdbe-2914-4a41-8257-2a54a4f406cd   512Gi      RWO            thin                          15h
ocs-deviceset-2-data-0-s6fhd   Bound    pvc-40a28840-3816-4bf1-a584-1734a58ed64f   512Gi      RWO            thin                          15h
rook-ceph-mon-a                Bound    pvc-94fffde3-3056-4693-9852-537a010c3b48   10Gi       RWO            thin                          15h
rook-ceph-mon-b                Bound    pvc-800b8e57-39a7-4f95-9421-977419599da0   10Gi       RWO            thin                          15h
rook-ceph-mon-c                Bound    pvc-61dcaaa7-7ad0-4fc5-9f66-8e58cd082f52   10Gi       RWO            thin                          15h
rook-ceph-mon-d                Bound    pvc-35740e57-bbe1-4462-9c60-ed1becc87fe2   10Gi       RWO            thin                          15h
rook-ceph-mon-e                Bound    pvc-ca6d7e44-fbb2-4799-a3f9-fbd2e7175e32   10Gi       RWO            thin                          15h
rook-ceph-mon-f                Bound    pvc-3860c571-2d85-4d2a-9018-2565fdb37d13   10Gi       RWO            thin                          15h

Pods:

$ oc get pod -n openshift-storage
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-b5622                                            3/3     Running     0          15h
csi-cephfsplugin-provisioner-67655ccbc4-4jlg7                     5/5     Running     0          15h
csi-cephfsplugin-provisioner-67655ccbc4-xm656                     5/5     Running     0          15h
csi-cephfsplugin-rp2xl                                            3/3     Running     0          15h
csi-cephfsplugin-w8xx6                                            3/3     Running     0          15h
csi-rbdplugin-dzq52                                               3/3     Running     0          15h
csi-rbdplugin-jr947                                               3/3     Running     0          15h
csi-rbdplugin-provisioner-75965d8977-jlv7b                        5/5     Running     0          15h
csi-rbdplugin-provisioner-75965d8977-jvh86                        5/5     Running     0          15h
csi-rbdplugin-st495                                               3/3     Running     0          15h
lib-bucket-provisioner-5b9cb4f848-rrstf                           1/1     Running     0          15h
noobaa-core-0                                                     1/1     Running     0          15h
noobaa-db-0                                                       1/1     Running     0          15h
noobaa-endpoint-54698c585-5n97s                                   1/1     Running     0          15h
noobaa-operator-57df96c6c5-bnk5x                                  1/1     Running     0          15h
ocs-operator-549c4598ff-cv2gm                                     1/1     Running     0          15h
rook-ceph-crashcollector-rhel1-0.jnk-vu1rs33-t1.qe.rh-ocs.kdcwx   1/1     Running     0          15h
rook-ceph-crashcollector-rhel1-1.jnk-vu1rs33-t1.qe.rh-ocs.q26cd   1/1     Running     0          15h
rook-ceph-crashcollector-rhel1-2.jnk-vu1rs33-t1.qe.rh-ocs.97jj2   1/1     Running     0          15h
rook-ceph-drain-canary-rhel1-0.jnk-vu1rs33-t1.qe.rh-ocs.cog5jnh   1/1     Running     0          15h
rook-ceph-drain-canary-rhel1-1.jnk-vu1rs33-t1.qe.rh-ocs.co4qlqn   1/1     Running     0          15h
rook-ceph-drain-canary-rhel1-2.jnk-vu1rs33-t1.qe.rh-ocs.co26mjp   1/1     Running     0          15h
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-747f6887m9zg8   1/1     Running     0          15h
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-b9d87b86x7kfx   1/1     Running     0          15h
rook-ceph-mgr-a-86cfcfdd58-92g9c                                  1/1     Running     0          15h
rook-ceph-mon-a-56f67f5968-6j4px                                  1/1     Running     0          15h
rook-ceph-mon-d-849d8759b4-x2xw2                                  1/1     Running     0          15h
rook-ceph-mon-f-678c96bbf6-grh6p                                  1/1     Running     0          15h
rook-ceph-operator-54fd57594-9gqnm                                1/1     Running     0          15h
rook-ceph-osd-0-788dccc6ff-rlc9t                                  1/1     Running     0          15h
rook-ceph-osd-1-6cb468b895-lxmh8                                  1/1     Running     0          15h
rook-ceph-osd-2-76f8f48f68-rh6f8                                  1/1     Running     0          15h
rook-ceph-osd-prepare-ocs-deviceset-0-data-0-q8k4g-kj9bb          0/1     Completed   0          15h
rook-ceph-osd-prepare-ocs-deviceset-1-data-0-hswzn-hqx96          0/1     Completed   0          15h
rook-ceph-osd-prepare-ocs-deviceset-2-data-0-s6fhd-4vct7          0/1     Completed   0          15h
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-65f674cdj967   1/1     Running     0          15h

So there are 6 PVCs for mons, but we actually have only 3 mons.

Version of all relevant components (if applicable):
OCP 4.4

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Not sure, we saw it in the past I think.

Can this issue be reproduced from the UI?
Not sure

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Install OCS over VMware
2. Not sure it's always reproducible

Actual results:
6 PVCs but only 3 mons running.

Expected results:
There should be PVCs only for the existing mon pods, IMO.

Additional info:
Jenkins job: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/8017/console
Logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jnk-vu1rs33-t1/jnk-vu1rs33-t1_20200525T161512/logs/failed_testcase_ocs_logs_1590423591/deployment_ocs_logs/
Do you see it in other setups? Post deployment? After some tests? After negative tests?
As this is not critical and basically does not break anything, we do not have any test for it, AFAIK. I noticed it in some of our deployments and remembered this was not the first time, hence I opened the BZ. Since we currently do not check that len(mon pods) == len(mon PVCs), we will never catch this issue unless it is decided to put that check in place, in which case the deployment would fail whenever we hit the issue again. I think Neha knows more details about this. Neha, can you please add more info if you've seen this in other cases as well? When I opened this BZ I thought we might get some input from Travis on whether this can be handled at the operator level.
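For illustration, the check mentioned above could look roughly like the sketch below. It works only on lists of names (as printed by `oc get pvc` and `oc get pod`), and the two name patterns are assumptions derived from the listings in this report, not from the Rook or ocs-ci code. The function name `orphaned_mon_pvcs` is hypothetical.

```python
import re

# Assumed name patterns, inferred from the listings in this report:
#   mon PVC: rook-ceph-mon-<id>
#   mon pod: rook-ceph-mon-<id>-<replicaset hash>-<pod suffix>
MON_PVC_RE = re.compile(r"^rook-ceph-mon-([a-z]+)$")
MON_POD_RE = re.compile(r"^rook-ceph-mon-([a-z]+)-[0-9a-z]+-[0-9a-z]+$")


def orphaned_mon_pvcs(pvc_names, pod_names):
    """Return mon PVC names that have no matching mon pod (hypothetical helper)."""
    running_ids = set()
    for name in pod_names:
        match = MON_POD_RE.match(name)
        if match:
            running_ids.add(match.group(1))

    orphans = []
    for name in pvc_names:
        match = MON_PVC_RE.match(name)
        if match and match.group(1) not in running_ids:
            orphans.append(name)
    return sorted(orphans)


# Names copied from the description of this bug.
pvcs = ["db-noobaa-db-0"] + ["rook-ceph-mon-" + i for i in "abcdef"]
pods = [
    "rook-ceph-mgr-a-86cfcfdd58-92g9c",
    "rook-ceph-mon-a-56f67f5968-6j4px",
    "rook-ceph-mon-d-849d8759b4-x2xw2",
    "rook-ceph-mon-f-678c96bbf6-grh6p",
]
print(orphaned_mon_pvcs(pvcs, pods))
# prints ['rook-ceph-mon-b', 'rook-ceph-mon-c', 'rook-ceph-mon-e']
```

A deployment check could fail when this list is non-empty, which is a slightly more informative variant of comparing the two counts.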
Moving to rook, as creation of mon-PVCs is rook's responsibility.
(In reply to Petr Balogh from comment #3)
> As this is not critical and basically do not break anything

As this is not critical, why is it proposed for 4.4.z? Moving to 4.5 for now.

@Travis can you confirm what is the intended behavior in rook?
Rook should be deleting the PVC that is no longer needed for the mon that was failed over. This is not currently implemented; we definitely need to fix it.
Upstream fix: https://github.com/rook/rook/pull/5698
@Neha The code path for starting mons the first time when there is a long delay and it creates mons a,d,e is much different from the case where quorum is first established with a,b,c and then failover occurs later. This has long been a bug in Rook where b,c are skipped if the first reconcile takes too long. How about if we consider this issue verified for 4.5 and we open a new BZ for the remaining work? Fixing that scenario will require more than just deleting the unused PVCs. Since it is a bigger change I'd suggest opening it for 4.6.
From my point of view, I am OK with moving it to 4.6, as I don't think this is urgent. But I would also like to hear others' opinions, as Neha was asked.
@Neha Anytime the operator finds that a mon is unhealthy for more than the timeout (10 min), it will attempt to start a new mon and complete the failover.

1. Deleting a mon deployment: The operator might failover to a new mon, or it might first try to re-create the same mon deployment that is missing. Starting in 4.6 you should see the same mon deployment re-created again, but in 4.5 you likely will see the failover scenario.
2. Node replacement: The mon will be force deleted by the operator if the node is not responding, so the mon likely will already move to another node before the failover will be triggered after 10 minutes. Again, it just depends on timing.
3. The most reliable way to test the failover scenario is likely to set the replicas=0 on the mon deployment. As long as the operator doesn't restart, it won't notice that the deployment doesn't match the desired state of replica=1.
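As a rough model of the timing rule described above (a simplified sketch, not Rook's actual implementation), the decision to fail over can be expressed as a comparison between how long each mon has been unhealthy and the timeout. The function name, the input dictionary, and the sample timestamps are all hypothetical; only the 10 minute value comes from the comment above.

```python
from datetime import datetime, timedelta

# Timeout value quoted in the comment above.
MON_OUT_TIMEOUT = timedelta(minutes=10)


def mons_to_fail_over(first_seen_unhealthy, now, timeout=MON_OUT_TIMEOUT):
    """Return ids of mons that have been unhealthy for longer than the timeout.

    first_seen_unhealthy maps a mon id to the time it was first observed
    unhealthy (a hypothetical bookkeeping structure for this sketch).
    """
    return sorted(
        mon_id
        for mon_id, since in first_seen_unhealthy.items()
        if now - since > timeout
    )


now = datetime(2020, 8, 15, 12, 0)
unhealthy = {
    "b": now - timedelta(minutes=11),  # past the timeout
    "c": now - timedelta(minutes=3),   # still within the timeout
}
print(mons_to_fail_over(unhealthy, now))
# prints ['b']
```

This also shows why the outcome of scenarios 1 and 2 depends on timing: a mon that recovers, or is rescheduled, before the timeout elapses never reaches the failover branch.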
*** Bug 1859960 has been marked as a duplicate of this bug. ***
Tested on VMware with builds:
ocs: ocs-operator.v4.5.0-54.ci
ocp: 4.5.0-0.nightly-2020-08-15-052753

Tested deleting the deployment, deleting the pod, and setting replica=0. Looks good, moving to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754