Bug 1915737 - Improve ocs-operator logging during uninstall to be more verbose, to understand reasons for failures - e.g. for Bug 1915445
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 4.7.0
Assignee: Nitin Goyal
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-13 11:31 UTC by Neha Berry
Modified: 2021-05-19 09:18 UTC

Fixed In Version: 4.7.0-731.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 09:18:01 UTC
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ocs-operator pull 1002 | 0 | None | closed | Bug 1915737: [release-4.7] storagecluster: Enhance Noobaa ensureDeleted logging | 2021-02-15 13:45:03 UTC
Github | openshift ocs-operator pull 1057 | 0 | None | closed | Bug 1915737: [release-4.7] storagecluster: Fix noobaa uninstall logs | 2021-02-17 12:47:14 UTC
Red Hat Product Errata | RHSA-2021:2041 | 0 | None | None | None | 2021-05-19 09:18:40 UTC

Description Neha Berry 2021-01-13 11:31:04 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
===================================================================
Recently we faced an issue of failed deployment with KMS and had to uninstall the cluster to clean the setup.

On deleting the storagecluster, the deletion was stuck for hours, and the only message in the ocs-operator logs was:

Snip from ocs-operator logs
-----------------------------

{"level":"info","ts":1610467637.3470006,"logger":"controllers.StorageCluster","msg":"Uninstall in progress","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","Status":"Uninstall: Waiting on NooBaa system to be deleted"}


The logs should be clearer and state exactly which resource the uninstall is stuck on.
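
As a rough illustration of the kind of message requested here, the Go sketch below builds a status string that names each blocking object rather than only the owning component. It is not the ocs-operator code and not the content of the fix PR: the pendingResource type, the uninstallStatus function, and the idea of reporting the noobaa-db-pg-0 PVC as the blocker are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// pendingResource describes one object that is still blocking the uninstall.
// Hypothetical type, for illustration only; not part of ocs-operator.
type pendingResource struct {
	Kind      string
	Namespace string
	Name      string
}

// uninstallStatus builds a status message that names every blocking
// resource, so the log shows what the deletion is actually waiting on.
// Hypothetical helper, for illustration only.
func uninstallStatus(component string, pending []pendingResource) string {
	if len(pending) == 0 {
		return fmt.Sprintf("Uninstall: %s is deleted", component)
	}
	parts := make([]string, 0, len(pending))
	for _, r := range pending {
		parts = append(parts, fmt.Sprintf("%s %s/%s", r.Kind, r.Namespace, r.Name))
	}
	return fmt.Sprintf("Uninstall: Waiting on %s to be deleted; blocked by %s",
		component, strings.Join(parts, "; "))
}

func main() {
	// Example input only: the PVC is assumed to be the blocker here.
	pending := []pendingResource{
		{Kind: "PersistentVolumeClaim", Namespace: "openshift-storage", Name: "noobaa-db-pg-0"},
	}
	fmt.Println(uninstallStatus("NooBaa system", pending))
}
```

With a message of this shape, the "Status" field of the log line above would point directly at the object to inspect, instead of repeating the same generic text.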

Talked to Nitin and he already has a PR for the fix ready.


Version of all relevant components (if applicable):
===================================================================
OCP = 4.7.0-0.nightly-2021-01-07-034013
OCS = ocs-operator.v4.7.0-230.ci



Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
===============================================================

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCP 4.7 on VMware
2. Install the OCS 4.7 operator and then click on Create Storage Cluster
3. In the configure section, enable cluster-wide encryption and add the KMS details from the external Vault server.
4. Click Create on the Review and Create page
5. If you hit Bug 1915202, edit the configmap below to add [VAULT_SKIP_VERIFY: "true"]
6. Check whether the install succeeds; OSD creation is seen to still fail due to KMS-related permission denied issues
7. The noobaa-db-pg-0 PVC stays in Pending state
8. Try to uninstall OCS by deleting the StorageCluster from the UI or CLI. Make sure no extra OBCs or PVCs exist apart from the OSD/MON/NooBaa DB PVCs.

9. If the deletion is stuck, the messages in the logs are not clear enough to confirm the real cause of the issue.


Actual results:
=====================
Logging keeps repeating the same message, without detailing the real resource on which the deletion is stuck

Expected results:
====================
Logging needs to be improved so that it identifies the exact resource on which the deletion is stuck


Additional info:

Comment 10 Nitin Goyal 2021-02-11 14:31:35 UTC
Backported PR https://github.com/openshift/ocs-operator/pull/1057

Comment 14 errata-xmlrpc 2021-05-19 09:18:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

