Bug 1893619 - OCS must-gather: Inspect errors for cephobjectstoreuser and a few ceph commands when storage cluster does not exist
Summary: OCS must-gather: Inspect errors for cephobjectstoreuser and a few ceph commands w...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: must-gather
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: OCS 4.7.0
Assignee: RAJAT SINGH
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-11-02 07:51 UTC by Neha Berry
Modified: 2021-05-19 09:16 UTC (History)
5 users

Fixed In Version: 4.7.0-696.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 09:16:13 UTC
Embargoed:


Attachments
must-gather-no-storagecluster (96.79 KB, text/plain)
2020-11-02 07:51 UTC, Neha Berry
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github openshift ocs-operator pull 968 0 None closed must-gather: gather ceph logs only when storageclusters is present 2021-02-08 06:02:26 UTC
Red Hat Product Errata RHSA-2021:2041 0 None None None 2021-05-19 09:16:51 UTC

Description Neha Berry 2020-11-02 07:51:47 UTC
Created attachment 1725692 [details]
must-gather-no-storagecluster

Description of problem (please be as detailed as possible and provide log
snippets):
------------------------------------------------------------------

When the storagecluster does not exist, must-gather should skip or handle log collection accordingly, rather than throwing inspect errors on the terminal:


E.g., must-gather throws the following inspect error messages when the storagecluster does not exist (not yet created, or deleted).


a) error for cephobjectstoreusers
[must-gather-g7fx7] POD collecting dump cephobjectstoreusers
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end. Printing more information for debugging the template:
[must-gather-g7fx7] POD         template was:
[must-gather-g7fx7] POD                 {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD         object given to jsonpath engine was:
[must-gather-g7fx7] POD                 map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD

b) error for some snapshot related ceph outputs (see attached file)

[must-gather-g7fx7] POD collecting snapshot info for ceph rbd volumes
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end. Printing more information for debugging the template:
[must-gather-g7fx7] POD         template was:
[must-gather-g7fx7] POD                 {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD         object given to jsonpath engine was:
[must-gather-g7fx7] POD                 map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD collecting snapshot info for ceph subvolumes
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end. Printing more information for debugging the template:
[must-gather-g7fx7] POD         template was:
[must-gather-g7fx7] POD                 {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD         object given to jsonpath engine was:
[must-gather-g7fx7] POD                 map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD
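The "not in range, nothing to end" errors above come from running a `{range .items[*]}...{end}` jsonpath template against a List that contains zero items. A minimal sketch of a guard (the helper name is illustrative, not the actual must-gather script): use `-o name`, which simply prints nothing for an empty list, instead of the `{range}` template.

```shell
#!/bin/sh
# Illustrative guard for dumping resource names without the jsonpath
# "not in range, nothing to end" error on an empty List.

dump_resource_names() {
    # "$@" is forwarded to `oc get`, e.g.: cephobjectstoreusers -n openshift-storage
    names=$(oc get "$@" -o name 2>/dev/null)
    # Only print when non-empty, so callers iterating over the output see
    # neither an error nor a blank line for an empty list.
    [ -n "$names" ] && printf '%s\n' "$names"
    return 0
}
```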


P.S.: In the absence of a storagecluster, must-gather should skip collecting these outputs and not attempt to create the helper pod, followed by an attempt to collect ceph outputs (which then throws inspect errors).
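A minimal sketch of the behavior the linked fix describes ("gather ceph logs only when storageclusters is present"); the function and message names are illustrative, not the actual must-gather script:

```shell
#!/bin/sh
# Sketch: collect Ceph data only when a StorageCluster exists, instead of
# unconditionally launching the helper pod and hitting inspect errors.

storagecluster_present() {
    # Non-empty `oc get storagecluster` output means at least one exists.
    [ -n "$(oc get storagecluster -n openshift-storage --no-headers 2>/dev/null)" ]
}

gather_ceph_logs() {
    if ! storagecluster_present; then
        # Skip quietly instead of creating the helper pod and running
        # jsonpath templates against non-existent Ceph resources.
        echo "no StorageCluster found; skipping Ceph collection"
        return 0
    fi
    echo "StorageCluster found; launching helper pod and collecting Ceph logs"
    # ... create helper pod, dump cephobjectstoreusers, run ceph commands ...
}
```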


Similar messages were seen for the uninstall + must-gather BZs https://bugzilla.redhat.com/show_bug.cgi?id=1893611#c3 and https://bugzilla.redhat.com/show_bug.cgi?id=1893613

Version of all relevant components (if applicable):
--------------------------------------------------
OCS 4.6 = 4.6.0-147.ci 


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
------------------------------------------------
No. But there are a few error messages on the terminal, which can be misleading. 

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
----------------------------
Yes, always.



Can this issue reproduce from the UI?
----------------------------------------
NA

If this is a regression, please provide more details to justify this:
---------------------------------------------------------------
Not sure

Steps to Reproduce:
------------------------
Two scenarios (internal and external mode) were tested to reproduce the issue:

>> Scenario 1) Installed OCS operator and initiated must-gather
a) Operator Hub -> Install OCS operator
b) When the operator pods are up and the CSV is in Succeeded state, initiate must-gather

oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6

Note: Do not install storage cluster

>> Scenario 2) Triggered uninstall of OCS by deleting the storage cluster; then initiated must-gather

a) Delete storagecluster in a running OCS cluster (follow Uninstall docs)

$ oc delete storagecluster --all -n openshift-storage --wait=true --timeout=5m

b) Once storagecluster and dependent ceph cluster are deleted, initiate must-gather

$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6



Actual results:
-----------------------
1. inspect error for POD collecting dump cephobjectstoreusers
[must-gather-g7fx7] POD collecting dump cephobjectstoreusers
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end. Printing more information for debugging the template:
[must-gather-g7fx7] POD         template was:
[must-gather-g7fx7] POD                 {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD         object given to jsonpath engine was:
[must-gather-g7fx7] POD                 map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD Error from server (NotFound): pods "must-gather-g7fx7-helper" not found

2. Retries creating the helper pod, even though the storage cluster does not exist, so it is not needed.
[must-gather-g7fx7] POD waiting for helper pod to come up in openshift-storage namespace. Retrying 1




Expected results:
--------------------
a) There should not be "error executing jsonpath" error messages for cephobjectstoreusers when the storagecluster is not yet created or does not exist

b) No need to attempt creation of the helper pod, as the storagecluster/cephcluster is not present


Additional info:
==========================

$ oc get csv; oc get pods
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-147.ci   OpenShift Container Storage   4.6.0-147.ci              Succeeded
NAME                                    READY   STATUS    RESTARTS   AGE
noobaa-operator-6c9b6c8694-d7zsf        1/1     Running   0          51s
ocs-metrics-exporter-6f954ff57c-cjhk6   1/1     Running   0          50s
ocs-operator-6fb8bbd874-jz78t           1/1     Running   0          51s
rook-ceph-operator-7c66c45775-6nrqh     1/1     Running   0          51s
[nberry@localhost nov2]$ 

----------------------------------------------------

Comment 3 Neha Berry 2020-11-02 07:55:54 UTC
Proposing as a blocker until inspected once from the engineering side.

Comment 13 errata-xmlrpc 2021-05-19 09:16:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

