Description of problem (please be as detailed as possible and provide log snippets):
--------------------------------------------------------------
This BZ is spawned from the comment at Bug 1840084#c16. With the fix for Bug 1840084, the Rook operator deletes unused MON PVCs in scenarios where quorum is first established with mons a, b, and c and a failover occurs later. However, mons b and c are skipped if the first reconcile takes too long, and their PVCs are then never deleted by the Rook operator. This BZ is raised to track that particular use case.

More details (see Bug 1840084#c0 and Bug 1840084#c14)
-------------------
In one of my normal installations on VMware (with slightly slower hardware, hence clock skews are also detected), mon-a, mon-d, and mon-e were ultimately added to quorum instead of mon-a, b, c. But the mon-b and mon-c PVCs are still left unused and are not deleted by the operator.

Logs copied here - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1840084-after-deployment/
Zipped folder - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1840084-after-deployment.zip

P.S.: The steps to reproduce match Comment#0 exactly.
1. Deploy OCS on vSphere.
2.
Check the pods and PVCs.

>> Pods
rook-ceph-mon-a-5dc69f5f85-6mdcm   1/1   Running   0   78m   10.128.2.19   compute-0   <none>   <none>
rook-ceph-mon-d-85dbb6c849-b6zmg   1/1   Running   0   75m   10.129.2.20   compute-1   <none>   <none>
rook-ceph-mon-e-766fc9bd4f-499dt   1/1   Running   0   75m   10.131.0.16   compute-2   <none>   <none>

>> PVCs
openshift-storage   rook-ceph-mon-a   Bound   pvc-9272fac1-1b03-4285-9d25-d3eea4bdff81   10Gi   RWO   thin   80m
openshift-storage   rook-ceph-mon-b   Bound   pvc-49a560d0-bd94-4b65-a989-1317f8a0b668   10Gi   RWO   thin   79m
openshift-storage   rook-ceph-mon-c   Bound   pvc-c4248d3c-7724-4362-ada0-18405a1274b9   10Gi   RWO   thin   79m
openshift-storage   rook-ceph-mon-d   Bound   pvc-52de08b7-e6e2-4439-a5ea-ef37b041fc49   10Gi   RWO   thin   77m
openshift-storage   rook-ceph-mon-e   Bound   pvc-595a988d-06fc-41c8-9c69-519816b85069   10Gi   RWO   thin   76m
__________________

Version of all relevant components (if applicable):
--------------------------------------------------------
OCS  = 4.5.0-493.ci
OCP  = 4.5.0-0.nightly-2020-07-17-032241
Ceph = RHCS 4.1 - 14.2.8-59.el8cp

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
--------------------------------------------
No

Is there any workaround available to the best of your knowledge?
--------------------------------------
Manually delete the unused PVCs.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
-----------------------------------------------
3, and seen intermittently.

Is this issue reproducible?
------------------------------
Intermittently.

Can this issue be reproduced from the UI?
--------------------------------
Yes

If this is a regression, please provide more details to justify this:
--------------------------------------------
No. The behavior has existed since OCS 4.2.

Steps to Reproduce:
-------------------
1. Install OCS on VMware.
2. Check the OCS pods and the MON and OSD PVCs.
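The pod and PVC listings above can be gathered with standard `oc` commands (a sketch, assuming the default openshift-storage namespace and the `app=rook-ceph-mon` label that Rook applies to mon resources):

```shell
# List the mon pods with their node placement (wide output).
oc get pods -n openshift-storage -l app=rook-ceph-mon -o wide

# List the mon PVCs; in the failure case, the unused
# rook-ceph-mon-b and rook-ceph-mon-c PVCs still appear here.
oc get pvc -n openshift-storage -l app=rook-ceph-mon
```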
Actual results:
---------------------
If the first reconcile takes too long, we sometimes get mon-a, mon-d, and mon-e instead of the usual mon-a, b, c, but the unused PVCs of mon-b and mon-c are still left in the cluster.

Expected results:
-----------------------
Rook should delete the unused PVCs from the first reconcile too (post fresh deployment).

Additional info:
Moving to 4.6 per comments above
I believe the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1873118 will cover this scenario as well.
Hi Travis, since the initial slow deployment resulting in mon-a, d, and e is not easily reproducible (it depends on the state of the hardware at times), can you help with the steps to verify this BZ?
I don't know an easy way to repro this specific scenario either. But the linked fix will delete any orphaned mon PVC that has the label "app=rook-ceph-mon" and does not have a corresponding mon deployment. The orphaned mon PVCs will only be removed after there is a failover of some mon. To verify the general fix, you could manually create a PVC that matches the label, stop a mon, wait 10 minutes for the mon failover, and then verify that the orphaned PVCs are removed after the failover.
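The steps above could be sketched roughly as follows. This is illustrative, not a verified procedure from this BZ: the PVC name rook-ceph-mon-orphan is made up, the storage class and size should match the cluster's real mon PVCs, and stopping the mon by scaling its deployment to 0 is one possible way to trigger the ~10-minute failover:

```shell
# 1. Create a decoy PVC carrying the mon label so the orphan
#    reaper considers it (name and sizing are illustrative).
cat <<EOF | oc apply -n openshift-storage -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-ceph-mon-orphan   # hypothetical name, not created by Rook
  labels:
    app: rook-ceph-mon
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: thin
  resources:
    requests:
      storage: 10Gi
EOF

# 2. Stop one mon so the operator eventually fails it over.
oc scale deployment rook-ceph-mon-a -n openshift-storage --replicas=0

# 3. After waiting ~10 minutes for the failover, confirm the
#    labeled PVCs with no matching mon deployment were removed.
oc get pvc -n openshift-storage -l app=rook-ceph-mon
```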
@travis Also, if you confirm that reusing a manually created PVC (without all the appropriate labels) is not an issue, then IIUC the fix is working fine and this BZ can be moved to the verified state.

Please note: I also created a PVC with a "rook-ceph-mon-xx" name but without the label (app=rook-ceph-mon), and as expected, the PVC was not cleaned up during reconciliation.
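For reference, the unlabeled variant described above could look like this (a sketch; the "rook-ceph-mon-xx" name comes from the comment, while the storage class and size are assumptions that should match the cluster):

```shell
# A mon-style PVC name but no app=rook-ceph-mon label, so the
# orphan PVC reaper ignores it during reconciliation.
cat <<EOF | oc apply -n openshift-storage -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-ceph-mon-xx
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: thin
  resources:
    requests:
      storage: 10Gi
EOF
```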
@Neha The validation steps are very thorough, thanks for all the details, including verifying that we didn't remove any PVCs without the label. I do feel confident that we can move the BZ to the verified state.

I'm not worried about reusing a manually created PVC without all the labels. The labels are mostly a convenience for users to query the resources if needed; we only query certain labels like "app=rook-ceph-mon". But we should consider it an unsupported scenario if resources like PVCs are created manually. We do assume that resources exist with certain names. If they already exist, we use them and usually update them, although PVCs aren't updated. For example, if we found an existing deployment, we would update it with the full desired pod spec if anything differed from what was expected.
(In reply to Travis Nielsen from comment #12)
> @Neha The validation steps are very thorough, thanks for all the details,
> including that we didn't remove any PVCs without the label. I do feel
> confident that we can move the BZ to verified state.

Thanks Travis. I will move this BZ to the verified state.

> I'm not worried about using the manually created PVC without all the labels.
> The labels are mostly a convenience for the users to query them if needed,
> we just query certain labels like the "app=rook-ceph-mon". But we should
> consider it an unsupported scenario if resources like PVCs are created
> manually.

So how shall we handle it and not allow users to use a manually created PVC? How do we mark it as unsupported - via docs?

> We do assume that resources exist with certain names. If they
> already exist, we use them and usually update them, although PVCs aren't
> updated. For example, if we found an existing deployment, we would update it
> with all desired pod spec if anything is different than expected.

I forgot to test one thing: if I had created a manual PVC - rook-ceph-mon-d - but did not add the label (app=rook-ceph-mon), would Rook have used this PVC, or created a new one with the appropriate labels (app=rook-ceph-mon, plus all other labels)?

If Rook would have used it, then wouldn't it be better, when a manual PVC with the mon-d name is already present, to create a new PVC instead, e.g. mon-e, and create the rook-ceph-mon-e deployment using the new Rook-created PVC? That way, the manual PVC would stay unused. Just a query.
(In reply to Neha Berry from comment #13)
> > I'm not worried about using the manually created PVC without all the labels.
> > The labels are mostly a convenience for the users to query them if needed,
> > we just query certain labels like the "app=rook-ceph-mon". But we should
> > consider it an unsupported scenario if resources like PVCs are created
> > manually.
>
> So how shall we handle it and not allow users to use a manually created PVC
> ? How do we mark it as unsupported? via docs ?

This is one of those corner cases where the admin could do many unsupported things if they chose to pre-create resources, delete resources, or otherwise do malicious things to Rook. If somebody is intentionally manipulating Rook resources, we just aren't going to support it, and I don't think we need to document that fact. I hope it's common sense.

> > We do assume that resources exist with certain names. If they
> > already exist, we use them and usually update them, although PVCs aren't
> > updated. For example, if we found an existing deployment, we would update it
> > with all desired pod spec if anything is different than expected.
>
> I forgot to test one thing - in case I created a manual PVC -
> rook-ceph-mon-d but did not add the label (app=rook-ceph-mon) , would rook
> have used this PVC or created a new one with appropriate label
> (app=rook-ceph-mon, + all other labels) ?
>
> If yes to above query, then wouldn't it be better , to say if manual PVC
> with mon-d is already present, create a new PVC, e.g mon-e and create
> rook-ceph-mon-e deployment using the new rook PVC ?
>
> In that way, the maunal PVC would stay unused. Just a query.

Rook would still use the PVC even if it didn't have that label, but I believe it would just not be deleted by the orphan PVC reaper that was added for this BZ. Again, I would say we don't need to worry about this scenario since it's such a corner case. Nobody has a reason to create PVCs that Rook expects.
Ack. Thanks @travis for the confirmation. Based on Comment #8 through Comment #14, moving the BZ to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5605