Created attachment 1725692 [details]
must-gather-no-storagecluster

Description of problem (please be detailed as possible and provide log snippets):
------------------------------------------------------------------
When a storagecluster does not exist, must-gather should skip or handle log collection accordingly, rather than printing inspect errors on the terminal.

E.g. must-gather prints the following inspect error messages when the storagecluster does not exist (not created, or deleted):

a) error for cephobjectstoreusers

[must-gather-g7fx7] POD collecting dump cephobjectstoreusers
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end. Printing more information for debugging the template:
[must-gather-g7fx7] POD template was:
[must-gather-g7fx7] POD {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD object given to jsonpath engine was:
[must-gather-g7fx7] POD map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD

b) error for some snapshot-related ceph outputs (see attached file)

[must-gather-g7fx7] POD collecting snapshot info for ceph rbd volumes
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end.
Printing more information for debugging the template:
[must-gather-g7fx7] POD template was:
[must-gather-g7fx7] POD {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD object given to jsonpath engine was:
[must-gather-g7fx7] POD map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD collecting snapshot info for ceph subvolumes
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end. Printing more information for debugging the template:
[must-gather-g7fx7] POD template was:
[must-gather-g7fx7] POD {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD object given to jsonpath engine was:
[must-gather-g7fx7] POD map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD

P.S.: In the absence of a storagecluster, must-gather should skip collecting these outputs and should not attempt to create the helper pod and then collect ceph outputs (which is what produces the inspect errors).

Similar messages are seen for the uninstall + must-gather BZs https://bugzilla.redhat.com/show_bug.cgi?id=1893611#c3 and https://bugzilla.redhat.com/show_bug.cgi?id=1893613

Version of all relevant components (if applicable):
--------------------------------------------------
OCS 4.6 = 4.6.0-147.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
------------------------------------------------
No. But there are a few error messages on the terminal, which can be misleading.

Is there any workaround available to the best of your knowledge?
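The "not in range, nothing to end" failure above comes from running a `{range .items[*]}` jsonpath template against an empty List (`"items":[]`). A minimal sketch of the kind of guard the collection script could use; the function and message names here are hypothetical, not the actual ocs-must-gather code:

```shell
#!/bin/sh
# Hypothetical guard: skip per-resource collection when the resource
# list is empty, instead of feeding an empty List into a jsonpath
# range template (which fails with "not in range, nothing to end").
collect_names() {
    # $1: newline-separated resource names, e.g. the output of
    # `oc get cephobjectstoreusers -o name` (empty when none exist)
    names="$1"
    if [ -z "$names" ]; then
        echo "skipping collection: no resources found"
        return 0
    fi
    for n in $names; do
        echo "collecting dump $n"
    done
}
```

Listing resources with `-o name` rather than a jsonpath range template also avoids the error entirely, since it simply prints nothing for an empty list.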
Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue be reproduced?
----------------------------
yes, always

Can this issue reproduce from the UI?
----------------------------------------
NA

If this is a regression, please provide more details to justify this:
---------------------------------------------------------------
Not sure

Steps to Reproduce:
------------------------
2 scenarios were tested (both internal and external) to reproduce the issue.

>> Scenario 1) Installed the OCS operator and initiated must-gather
a) Operator Hub -> Install OCS operator
b) When the operator pods are up and the CSV is in Succeeded state, initiate must-gather:
   oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6
Note: Do not install the storage cluster.

>> Scenario 2) Triggered uninstall of OCS by deleting the storage cluster; initiated must-gather
a) Delete the storagecluster in a running OCS cluster (follow the uninstall docs):
   $ oc delete storagecluster --all -n openshift-storage --wait=true --timeout=5m
b) Once the storagecluster and the dependent ceph cluster are deleted, initiate must-gather:
   $ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6

Actual results:
-----------------------
1. inspect error for POD collecting dump cephobjectstoreusers

[must-gather-g7fx7] POD collecting dump cephobjectstoreusers
[must-gather-g7fx7] POD error: error executing jsonpath "{range .items[*]}{@.metadata.name}{'\\n'}{end}": Error executing template: not in range, nothing to end.
Printing more information for debugging the template:
[must-gather-g7fx7] POD template was:
[must-gather-g7fx7] POD {range .items[*]}{@.metadata.name}{'\n'}{end}
[must-gather-g7fx7] POD object given to jsonpath engine was:
[must-gather-g7fx7] POD map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":"", "selfLink":""}}
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD
[must-gather-g7fx7] POD Error from server (NotFound): pods "must-gather-g7fx7-helper" not found

2. Retries creating the helper pod, even though the storage cluster does not exist, so the pod is not needed.

[must-gather-g7fx7] POD waiting for helper pod to come up in openshift-storage namespace. Retrying 1

Expected results:
--------------------
a) There should not be "error executing jsonpath" error messages for cephobjectstoreusers when the storagecluster is not yet created or does not exist.
b) No need to attempt creation of the helper pod, as the storagecluster/cephcluster is not present.

Additional info:
==========================
$ oc get csv; oc get pods
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-147.ci   OpenShift Container Storage   4.6.0-147.ci              Succeeded

NAME                                    READY   STATUS    RESTARTS   AGE
noobaa-operator-6c9b6c8694-d7zsf        1/1     Running   0          51s
ocs-metrics-exporter-6f954ff57c-cjhk6   1/1     Running   0          50s
ocs-operator-6fb8bbd874-jz78t           1/1     Running   0          51s
rook-ceph-operator-7c66c45775-6nrqh     1/1     Running   0          51s
----------------------------------------------------
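Expected result (b) amounts to a single existence check before any helper-pod creation or ceph collection. A sketch under the assumption that "exists" simply means `oc get storagecluster` returns at least one row; the function name and messages are illustrative, not the actual gather script:

```shell
#!/bin/sh
# Hypothetical pre-check: decide up front whether ceph collection
# (and hence the helper pod) is needed at all.
storagecluster_exists() {
    # $1: output of `oc get storagecluster -n openshift-storage
    #     --no-headers --ignore-not-found` (empty when absent)
    [ -n "$1" ]
}

if storagecluster_exists "$(oc get storagecluster -n openshift-storage --no-headers --ignore-not-found 2>/dev/null)"; then
    echo "storagecluster present: creating helper pod and collecting ceph output"
else
    echo "no storagecluster: skipping helper pod and ceph collection"
fi
```

`--ignore-not-found` keeps `oc get` quiet when the resource is absent, so the check degrades cleanly to the skip branch instead of retrying pod creation.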
Logs - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bug-1893619/

Similar messages are seen for the uninstall + must-gather BZs https://bugzilla.redhat.com/show_bug.cgi?id=1893611#c3 and https://bugzilla.redhat.com/show_bug.cgi?id=1893613
Proposing as a blocker until it has been inspected once from the engineering side.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041