Bug 2169367 - ODF4.11.5 post ODF/OCP upgrade, ODF Must gather Failed because "timed out waiting for the condition" [NEEDINFO]
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: must-gather
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: yati padia
QA Contact: Coady LaCroix
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-02-13 13:05 UTC by Oded
Modified: 2023-08-09 16:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-12 10:07:10 UTC
Embargoed:
ypadia: needinfo? (oviner)



Description Oded 2023-02-13 13:05:15 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The must-gather (MG) helper pod is stuck in Running state and the collection fails with "timed out waiting for the condition".

Version of all relevant components (if applicable):

ODF Version: 4.11.5-9
OCP Version: 4.11.0-0.nightly-2023-01-31-120242
Platform: AWS

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
Test Process A:
1. Upgrade ODF 4.11.4 -> ODF 4.11.5-9
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/6780/parameters/
2. Run must-gather (MG): oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.11
3. The MG helper pod is stuck in Running state
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-237ai3c33-ua/j-237ai3c33-ua_20230201T080833/logs/ocs-ci-logs-1675271598/tests/manage/z_cluster/test_must_gather.py/TestMustGather/test_must_gather-CEPH/logs



Test Process B:
1. Upgrade OCP 4.11 -> OCP 4.12
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/6778/parameters/
2. Run must-gather (MG): oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.11
3. The MG helper pod is stuck in Running state
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-172ai3c33-uo/j-172ai3c33-uo_20230201T080752/logs/ocs-ci-logs-1675275562/tests/manage/z_cluster/test_must_gather.py/TestMustGather/test_must_gather-CEPH/logs
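For anyone retrying step 2 outside the CI job, the following is a minimal sketch of invoking the same command from Python with a client-side wall-clock limit, so that a hung helper pod does not block the caller indefinitely. The function names and the 1800-second limit are assumptions made for illustration; this is not the ocs-ci implementation. Only the --image and --dest-dir flags that appear in this report are used.

```python
import subprocess

# Image taken from the reproduction steps in this report.
MUST_GATHER_IMAGE = "quay.io/rhceph-dev/ocs-must-gather:latest-4.11"


def build_must_gather_cmd(image, dest_dir):
    """Return the argv list for the must-gather invocation (hypothetical helper)."""
    return [
        "oc", "adm", "must-gather",
        f"--image={image}",
        f"--dest-dir={dest_dir}",
    ]


def run_must_gather(image, dest_dir, wall_clock_limit=1800):
    """Run must-gather and return (returncode, combined output).

    wall_clock_limit is an assumed client-side limit in seconds, enforced by
    subprocess, independent of any timeout inside `oc` itself.
    Returns (None, message) if the limit is exceeded.
    """
    cmd = build_must_gather_cmd(image, dest_dir)
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=wall_clock_limit
        )
    except subprocess.TimeoutExpired:
        return None, "client-side wall-clock limit exceeded"
    return proc.returncode, proc.stdout + proc.stderr
```

Calling run_must_gather(MUST_GATHER_IMAGE, "/tmp/ocs_must_gather") requires a logged-in `oc` client; the command builder can be checked without a cluster.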

LOGS:
2023-02-02 02:47:01,705 - MainThread - ERROR - ocs_ci.ocs.utils.run_must_gather.921 - Failed during must gather logs! Error: Error during execution of command: oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.11 --dest-dir=/tmp/tmpdvzvm59d_ocs_logs/ocs_must_gather.
Error is 

Error running must-gather collection:
    gather never finished for pod must-gather-rsrkz: timed out waiting for the condition

Falling back to `oc adm inspect clusteroperators.v1.config.openshift.io` to collect basic cluster information.
W0202 02:45:09.272789  263869 util.go:119] the server doesn't have a resource type egressfirewalls, skipping the inspection
W0202 02:45:09.281071  263869 util.go:119] the server doesn't have a resource type egressqoses, skipping the inspection
...
error running backup collection: errors occurred while gathering data:
    [skipping gathering secrets/support due to error: secrets "support" not found, skipping gathering endpoints/host-etcd-2 due to error: endpoints "host-etcd-2" not found, skipping gathering sharedconfigmaps.sharedresource.openshift.io due to error: the server doesn't have a resource type "sharedconfigmaps", skipping gathering sharedsecrets.sharedresource.openshift.io due to error: the server doesn't have a resource type "sharedsecrets"]
error: gather never finished for pod must-gather-rsrkz: timed out waiting for the condition
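To triage output like the above across several runs, a small sketch such as the following could pull the name of the helper pod that timed out from the captured text. The regular expression is derived only from the two error lines in this log, and the function name is hypothetical; other oc versions may word the message differently.

```python
import re

# Pattern based on the error lines shown in this report's log excerpt.
GATHER_TIMEOUT_RE = re.compile(
    r"gather never finished for pod (?P<pod>[^\s:]+): "
    r"timed out waiting for the condition"
)


def find_timed_out_pods(log_text):
    """Return unique must-gather pod names that timed out, in first-seen order."""
    pods = []
    for match in GATHER_TIMEOUT_RE.finditer(log_text):
        pod = match.group("pod")
        if pod not in pods:
            pods.append(pod)
    return pods


# Two lines copied from the log above; the pod is reported twice.
SAMPLE_LOG = """Error running must-gather collection:
    gather never finished for pod must-gather-rsrkz: timed out waiting for the condition
error: gather never finished for pod must-gather-rsrkz: timed out waiting for the condition
"""

if __name__ == "__main__":
    print(find_timed_out_pods(SAMPLE_LOG))
```

The pod name can then be used to inspect the helper pod before the must-gather namespace is cleaned up.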

Actual results:


Expected results:


Additional info:

Comment 2 Oded 2023-02-14 14:03:04 UTC
The bug was not reproduced.
Steps:
OCP 4.10 + ODF 4.10 -> upgrade OCP to 4.11 [check mg]
OCP 4.11 + ODF 4.10 -> upgrade OCP to 4.11 [check mg]
OCP 4.11 + ODF 4.11 -> upgrade OCP to 4.12 [check mg]

