Bug 1915758 - improve noobaa logging in case of uninstall - logs do not specify clearly the resource on which deletion is stuck
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 4.7.0
Assignee: Romy Ayalon
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-13 11:55 UTC by Neha Berry
Modified: 2021-05-19 09:18 UTC (History)

Fixed In Version: v4.7.0-263.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 09:18:01 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | noobaa noobaa-operator pull 512 | 0 | None | closed | KMS gaps fixes | 2021-02-15 07:34:22 UTC
Github | noobaa noobaa-operator pull 522 | 0 | None | closed | Backport to 5.7 | 2021-02-15 07:34:22 UTC
Github | noobaa noobaa-operator pull 542 | 0 | None | open | Add more delete failed messages on external root key deletion | 2021-02-14 12:43:02 UTC
Github | noobaa noobaa-operator pull 546 | 0 | None | open | Backport to 5.7 | 2021-02-15 07:34:47 UTC
Red Hat Product Errata | RHSA-2021:2041 | 0 | None | None | None | 2021-05-19 09:18:40 UTC

Description Neha Berry 2021-01-13 11:55:42 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
=========================================================================
We recently hit a failed deployment with KMS enabled and had to uninstall the cluster to clean up the setup.

On deleting the storagecluster, the deletion was stuck for hours, and the only message in the ocs-operator logs was:

Snip from ocs-operator logs
-----------------------------

{"level":"info","ts":1610467637.3470006,"logger":"controllers.StorageCluster","msg":"Uninstall in progress","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","Status":"Uninstall: Waiting on NooBaa system to be deleted"}
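
To make the gap concrete, the entry above can be parsed and its fields listed. This is only a minimal Python sketch that uses the log line exactly as quoted; it is not part of the operator. None of the fields it prints names a NooBaa resource.

```python
import json

# Log line copied verbatim from the ocs-operator snippet above.
LOG_LINE = '{"level":"info","ts":1610467637.3470006,"logger":"controllers.StorageCluster","msg":"Uninstall in progress","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","Status":"Uninstall: Waiting on NooBaa system to be deleted"}'

# Each ocs-operator log entry is a single JSON object.
entry = json.loads(LOG_LINE)

# List every field so it is clear what the entry actually reports.
for key, value in entry.items():
    print(f"{key}: {value}")
```

The Status field only says that the NooBaa system is still being deleted. It does not say which object is holding the deletion up.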


The logs should be clearer and state exactly which resource the uninstall is stuck on.

A separate bug was raised to improve logging in ocs-operator: Bug 1915737

This bug is raised to cover NooBaa-side logging as well, since engineering could not determine why the uninstall was stuck or which NooBaa resource was blocking it.


Version of all relevant components (if applicable):
=====================================================
OCP = 4.7.0-0.nightly-2021-01-07-034013
OCS = ocs-operator.v4.7.0-230.ci


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCP 4.7 on VMware.
2. Install the OCS 4.7 operator, then click Create Storage Cluster.
3. In the Configure section, enable cluster-wide encryption and add the KMS details of the external Vault server.
4. Click Create on the Review and Create page.
5. If you hit Bug 1915202, edit the configmap below to add [VAULT_SKIP_VERIFY: "true"].
6. Check whether the install succeeds. In our case, OSD creation still failed due to KMS-related permission-denied errors.
7. The noobaa-db-pg-0 PVC stays in Pending state.
8. Try to uninstall OCS by deleting the StorageCluster from the UI or CLI. Make sure no OBCs or PVCs exist other than the OSD/MON/NooBaa DB PVCs.
9. If the deletion is stuck, the messages in the logs are not clear enough to confirm the real cause of the issue.


Actual results:
=====================
The log keeps repeating the same message without naming the resource on which the deletion is stuck.

Expected results:
====================
Logging should be improved so that it clearly names the resource on which the deletion is stuck.
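
As an illustration only, a more specific status message could be built as in the Python sketch below. It assumes the uninstall is blocked by objects that are marked for deletion (a `deletionTimestamp` is set) but still carry finalizers, which is the usual Kubernetes reason for a stuck delete; this report does not confirm that this was the cause here. The function names, the sample objects and the finalizer `example.io/hypothetical-finalizer` are all hypothetical and are not taken from noobaa-operator or ocs-operator.

```python
from typing import Any, Dict, List


def find_blocked(resources: List[Dict[str, Any]]) -> List[str]:
    """Describe objects that are marked for deletion but still have finalizers."""
    blocked = []
    for res in resources:
        meta = res.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        # An object with a deletionTimestamp and finalizers is still terminating.
        if meta.get("deletionTimestamp") and finalizers:
            kind = res.get("kind", "Unknown")
            name = meta.get("name", "?")
            blocked.append(f"{kind}/{name} (finalizers: {', '.join(finalizers)})")
    return blocked


def uninstall_status(resources: List[Dict[str, Any]]) -> str:
    """Build a status message that names every blocking object (hypothetical format)."""
    blocked = find_blocked(resources)
    if not blocked:
        return "Uninstall: no resources pending deletion"
    return "Uninstall: waiting on " + "; ".join(blocked)


# Hypothetical sample objects, shaped like Kubernetes resources.
SAMPLE = [
    {
        "kind": "NooBaa",
        "metadata": {
            "name": "noobaa",
            "deletionTimestamp": "2021-01-12T16:00:00Z",
            "finalizers": ["example.io/hypothetical-finalizer"],
        },
    },
    {
        # Not marked for deletion, so it is not reported as blocking.
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "example-pvc", "finalizers": ["kubernetes.io/pvc-protection"]},
    },
]

print(uninstall_status(SAMPLE))
```

A message of this shape names the kind, the object and the remaining finalizers, so whoever reads the log knows where to look next.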

Comment 12 errata-xmlrpc 2021-05-19 09:18:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

