Bug 1915737
| Summary: | Improve ocs-operator logging during uninstall to be more verbose, to understand reasons for failures - e.g. for Bug 1915445 | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Neha Berry <nberry> |
| Component: | ocs-operator | Assignee: | Nitin Goyal <nigoyal> |
| Status: | CLOSED ERRATA | QA Contact: | Neha Berry <nberry> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.7 | CC: | jarrpa, madam, muagarwa, nigoyal, ocs-bugs, rtalur, sostapov |
| Target Milestone: | --- | ||
| Target Release: | OCS 4.7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.7.0-731.ci | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-05-19 09:18:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Backported PR https://github.com/openshift/ocs-operator/pull/1057

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

Description of problem (please be as detailed as possible and provide log snippets):
===================================================================
Recently we hit a failed deployment with KMS and had to uninstall the cluster to clean up the setup. On deleting the storagecluster, the deletion was stuck indefinitely for hours, and the only message in the ocs-operator logs was:

Snip from ocs-operator logs
-----------------------------
{"level":"info","ts":1610467637.3470006,"logger":"controllers.StorageCluster","msg":"Uninstall in progress","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","Status":"Uninstall: Waiting on NooBaa system to be deleted"}

The logs should be clearer and state exactly which resource the uninstall is stuck on.

Talked to Nitin and he already has a PR for the fix ready.

Version of all relevant components (if applicable):
===================================================================
OCP = 4.7.0-0.nightly-2021-01-07-034013
OCS = ocs-operator.v4.7.0-230.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
===============================================================

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Install OCP 4.7 on VMware
2. Install the OCS 4.7 operator and then click Create Storage Cluster
3. In the configure section, enable cluster-wide encryption and add the KMS details from the external Vault server.
4. Click Create on the Review and Create page
5. If you hit Bug 1915202, edit the configmap below to add [VAULT_SKIP_VERIFY: "true"]
6. Check whether the install succeeds; OSD creation is seen to still fail due to KMS-related permission denied issues
7. The noobaa-db-pg-0 PVC stays in Pending state
8. Try to uninstall OCS by deleting the StorageCluster from the UI or CLI. Make sure no extra OBCs or PVCs exist apart from the OSD/MON/NooBaa DB PVCs.
9. If the deletion is stuck, the messages in the logs are not clear enough to confirm the real cause of the issue.

Actual results:
=====================
Logging keeps repeating the same message, without naming the resource on which the deletion is stuck

Expected results:
====================
Logging needs to be improved so that it identifies the resource blocking the deletion

Additional info:
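
For triage, the repetition described under "Actual results" can be confirmed by tallying the "Status" field of the "Uninstall in progress" entries in a saved copy of the ocs-operator log. The following is only a rough sketch: the function name `uninstall_statuses` is hypothetical, and it assumes every entry of interest is a single-line JSON object shaped like the snippet quoted in the description.

```python
import json
from collections import Counter

# Log line copied from the description above; used here only as sample input.
SAMPLE = (
    '{"level":"info","ts":1610467637.3470006,'
    '"logger":"controllers.StorageCluster","msg":"Uninstall in progress",'
    '"Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster",'
    '"Status":"Uninstall: Waiting on NooBaa system to be deleted"}'
)


def uninstall_statuses(lines):
    """Count each distinct Status value on 'Uninstall in progress' entries.

    Hypothetical helper: assumes one JSON object per log line.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON object, e.g. a stack trace or plain text
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed entry
        if entry.get("msg") == "Uninstall in progress":
            counts[entry.get("Status", "<no status>")] += 1
    return counts


if __name__ == "__main__":
    # In practice the lines would come from a saved log file.
    for status, n in uninstall_statuses([SAMPLE, SAMPLE, "not json"]).items():
        print(f"{n}x {status}")
```

A single Status value with a steadily growing count matches the behaviour reported here: the operator keeps waiting on the same step without saying which resource is blocking it.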