Bug 1859229
Summary: | Rook should delete extra MON PVCs in case first reconcile takes too long and rook skips "b" and "c" (spawned from Bug 1840084#c14) | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Neha Berry <nberry> |
Component: | rook | Assignee: | Travis Nielsen <tnielsen> |
Status: | CLOSED ERRATA | QA Contact: | Neha Berry <nberry> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.5 | CC: | madam, muagarwa, ocs-bugs, tnielsen |
Target Milestone: | --- | ||
Target Release: | OCS 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | 4.6.0-98.ci | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-12-17 06:23:00 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Neha Berry
2020-07-21 13:53:54 UTC
Moving to 4.6 per comments above.

I believe the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1873118 will cover this scenario as well.

Hi Travis, since the initial slow deployment that results in mon-a, d and e is not easily reproducible (it depends on the state of the hardware at times), can you help with the steps to verify this BZ?

I don't know an easy way to repro this specific scenario either. But the linked fix will delete any orphaned mon PVC that has the label "app=rook-ceph-mon" and does not have a corresponding mon deployment. The orphaned mon PVCs will only be removed after there is a failover of some mon. To verify the general fix, you could manually create a PVC that matches the label, then stop a mon, wait 10 minutes for the mon failover, then verify that the orphaned PVCs are removed after the mon failover.

@travis Also, if you confirm that reusing a manually created PVC (one that does NOT have all the appropriate labels) is not an issue, then IIUC the fix is working fine and this BZ can be moved to verified state.

Please note: I also created a PVC with a "rook-ceph-mon-xx" name but with no label (app=rook-ceph-mon), and as expected, the PVC was not cleaned up during reconciliation.

@Neha The validation steps are very thorough, thanks for all the details, including that we didn't remove any PVCs without the label. I do feel confident that we can move the BZ to verified state.

I'm not worried about using the manually created PVC without all the labels. The labels are mostly a convenience for users to query them if needed; we just query certain labels like "app=rook-ceph-mon". But we should consider it an unsupported scenario if resources like PVCs are created manually. We do assume that resources exist with certain names. If they already exist, we use them and usually update them, although PVCs aren't updated. For example, if we found an existing deployment, we would update it with the full desired pod spec if anything is different than expected.
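The selection rule described above (label "app=rook-ceph-mon" present, no corresponding mon deployment) can be sketched roughly as follows. This is only an illustration in Python, not Rook's actual Go implementation: the `PVC` class and `find_orphaned_mon_pvcs` function are hypothetical names, and the sketch assumes a mon PVC and its deployment share the same name (for example "rook-ceph-mon-a").

```python
from dataclasses import dataclass, field

# Label named in this thread; the reaper only considers PVCs that carry it.
MON_LABEL_KEY = "app"
MON_LABEL_VALUE = "rook-ceph-mon"


@dataclass
class PVC:
    """Hypothetical stand-in for a Kubernetes PVC: just a name and labels."""
    name: str
    labels: dict = field(default_factory=dict)


def find_orphaned_mon_pvcs(pvcs, mon_deployment_names):
    """Return PVCs that have the mon label but no matching mon deployment.

    Assumption: a mon PVC and its deployment have the same name.
    """
    deployments = set(mon_deployment_names)
    return [
        pvc for pvc in pvcs
        if pvc.labels.get(MON_LABEL_KEY) == MON_LABEL_VALUE
        and pvc.name not in deployments
    ]


# Scenario from this bug: "b" was skipped, so its PVC has no deployment.
pvcs = [
    PVC("rook-ceph-mon-a", {"app": "rook-ceph-mon"}),
    PVC("rook-ceph-mon-b", {"app": "rook-ceph-mon"}),  # orphaned
    PVC("rook-ceph-mon-d", {"app": "rook-ceph-mon"}),
    PVC("rook-ceph-mon-xx"),  # manually created, no label: never selected
]
deployments = ["rook-ceph-mon-a", "rook-ceph-mon-d", "rook-ceph-mon-e"]

orphans = [pvc.name for pvc in find_orphaned_mon_pvcs(pvcs, deployments)]
print(orphans)
```

In this sketch the unlabeled "rook-ceph-mon-xx" PVC is left alone, which matches the verification result reported above. The timing aspect (cleanup only running after a mon failover) is not modeled here.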
(In reply to Travis Nielsen from comment #12)
> @Neha The validation steps are very thorough, thanks for all the details,
> including that we didn't remove any PVCs without the label. I do feel
> confident that we can move the BZ to verified state.

Thanks Travis. Will move this BZ to verified state.

> I'm not worried about using the manually created PVC without all the labels.
> The labels are mostly a convenience for the users to query them if needed,
> we just query certain labels like the "app=rook-ceph-mon". But we should
> consider it an unsupported scenario if resources like PVCs are created
> manually.

So how shall we handle it and not allow users to use a manually created PVC? How do we mark it as unsupported? Via docs?

> We do assume that resources exist with certain names. If they
> already exist, we use them and usually update them, although PVCs aren't
> updated. For example, if we found an existing deployment, we would update it
> with all desired pod spec if anything is different than expected.

I forgot to test one thing: if I had manually created a PVC named rook-ceph-mon-d but did not add the label (app=rook-ceph-mon), would Rook have used this PVC, or created a new one with the appropriate labels (app=rook-ceph-mon plus all the other labels)?

If Rook would have used the manual PVC, wouldn't it be better, when a manual PVC named mon-d is already present, to create a new PVC (e.g. mon-e) and create the rook-ceph-mon-e deployment using the new Rook PVC?

In that way, the manual PVC would stay unused. Just a query.

(In reply to Neha Berry from comment #13)
> > I'm not worried about using the manually created PVC without all the labels.
> > The labels are mostly a convenience for the users to query them if needed,
> > we just query certain labels like the "app=rook-ceph-mon". But we should
> > consider it an unsupported scenario if resources like PVCs are created
> > manually.
>
> So how shall we handle it and not allow users to use a manually created PVC?
> How do we mark it as unsupported? via docs ?

This is one of those corner cases where there are a lot of unsupported things the admin could do if they chose to pre-create resources, delete resources, or otherwise do malicious things to Rook. If somebody is intentionally manipulating Rook resources, we just aren't going to support it, and I don't think we need to document that fact either. I hope it's common sense.

> > We do assume that resources exist with certain names. If they
> > already exist, we use them and usually update them, although PVCs aren't
> > updated. For example, if we found an existing deployment, we would update it
> > with all desired pod spec if anything is different than expected.
>
> I forgot to test one thing - in case I created a manual PVC -
> rook-ceph-mon-d but did not add the label (app=rook-ceph-mon) , would rook
> have used this PVC or created a new one with appropriate label
> (app=rook-ceph-mon, + all other labels) ?
>
> If yes to above query, then wouldn't it be better , to say if manual PVC
> with mon-d is already present, create a new PVC, e.g mon-e and create
> rook-ceph-mon-e deployment using the new rook PVC ?
>
> In that way, the manual PVC would stay unused. Just a query.

Rook would still use the PVC even if it didn't have that label, but I believe it would just not be deleted by the orphan PVC reaper that was just added for this BZ. Again, I would say we don't need to worry about this scenario since it's such a corner case. Nobody has a reason to create PVCs that Rook expects.

Ack. Thanks @travis for the confirmation. Based on Comment #8 through Comment #14, moving the BZ to verified state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605