Bug 2127150
| Summary: | Must-Gather: after OCP upgrade [OCP4.6->OCP4.7], mg-helper pod stuck in Running state for more than 22m | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Oded <oviner> |
| Component: | must-gather | Assignee: | yati padia <ypadia> |
| Status: | CLOSED DUPLICATE | QA Contact: | Prasad Desala <tdesala> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.6 | CC: | ocs-bugs, odf-bz-bot |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-10-13 05:51:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
I see this bug is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2128217. We can close this bug. Please reopen if this needs additional attention.

*** This bug has been marked as a duplicate of bug 2128217 ***
Description of problem (please be detailed as possible and provide log snippets):

Must-Gather: after OCP upgrade [OCP4.6 -> OCP4.7], the mg-helper pod is stuck in Running state for more than 22 minutes.

Version of all relevant components (if applicable):

- OCS Version: 4.6.15-209.ci
- OCP Version: 4.7.0-0.nightly-2022-09-12-140133
- Provider: AWS-IPI-3AZ-RHCOS-3M-3W

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue be reproduced?

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce (test process):

1. Upgrade OCP 4.6 -> OCP 4.7.
2. Collect must-gather:

```
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6
```

Helper pod describe output:

```
$ cat logs | grep "****helper pod must-gather-ss56n-helper describe****" -A 100
****helper pod must-gather-ss56n-helper describe****
Name:         must-gather-ss56n-helper
Namespace:    openshift-storage
Priority:     0
Node:         ip-10-0-136-154.us-east-2.compute.internal/10.0.136.154
Start Time:   Wed, 14 Sep 2022 13:43:27 +0000
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{ "name": "", "interface": "eth0", "ips": [ "10.131.3.2" ], "default": true, "dns": {} }]
              k8s.v1.cni.cncf.io/networks-status:
                [{ "name": "", "interface": "eth0", "ips": [ "10.131.3.2" ], "default": true, "dns": {} }]
              openshift.io/scc: rook-ceph
Status:       Running
IP:           10.131.3.2
IPs:
  IP:  10.131.3.2
Containers:
  must-gather-helper:
    Container ID:  cri-o://21d692b0afbb33825e7f2ee72b2a265aee8b738e78ccc3c8beb9a9bf699473c2
    Image:         quay.io/rhceph-dev/rook-ceph@sha256:0ea697891033dd8875c2912a5f37587eee460e9536bf32e1ec20833641e30479
    Image ID:      quay.io/rhceph-dev/rook-ceph@sha256:0ea697891033dd8875c2912a5f37587eee460e9536bf32e1ec20833641e30479
    Port:          <none>
    Host Port:     <none>
    Command:
      /tini
    Args:
      -g
      --
      /usr/local/bin/toolbox.sh
    State:          Running
      Started:      Wed, 14 Sep 2022 13:43:29 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      ROOK_CEPH_USERNAME:  <set to the key 'ceph-username' in secret 'rook-ceph-mon'>  Optional: false
      ROOK_CEPH_SECRET:    <set to the key 'ceph-secret' in secret 'rook-ceph-mon'>    Optional: false
    Mounts:
      /dev from dev (rw)
      /etc/rook from mon-endpoint-volume (rw)
      /lib/modules from libmodules (rw)
      /sys/bus from sysbus (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9vqvr (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:
  sysbus:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/bus
    HostPathType:
  libmodules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  mon-endpoint-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rook-ceph-mon-endpoints
    Optional:  false
  default-token-9vqvr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9vqvr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       22m   default-scheduler  Successfully assigned openshift-storage/must-gather-ss56n-helper to ip-10-0-136-154.us-east-2.compute.internal
  Normal  AddedInterface  22m   multus             Add eth0 [10.131.3.2/23]
  Normal  Pulled          22m   kubelet            Container image "quay.io/rhceph-dev/rook-ceph@sha256:0ea697891033dd8875c2912a5f37587eee460e9536bf32e1ec20833641e30479" already present on machine
  Normal  Created         22m   kubelet            Created container must-gather-helper
  Normal  Started         22m   kubelet            Started container must-gather-helper
```

Helper pod logs (the pod keeps rewriting the mon endpoints until the collection times out):

```
$ cat logs | grep "****helper pod must-gather-ss56n-helper logs****" -A 40
****helper pod must-gather-ss56n-helper logs****
Wed Sep 14 13:43:29 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:43:39 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:44:59 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:46:20 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:47:40 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:49:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:50:10 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:51:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:52:50 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:54:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:55:10 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:56:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:57:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:59:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:00:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:01:50 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:03:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:04:20 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:05:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
2022-09-14 14:06:16,183 - MainThread - ERROR - ocs_ci.ocs.utils.run_must_gather.913 - Timeout 1500s for must-gather reached, command exited with error: Command '['oc', 'adm', 'must-gather', '--image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6', '--dest-dir=/tmp/tmpevg5vhad_ocs_logs/ocs_must_gather']' timed out after 1500 seconds
```

Must-Gather Output:

Actual results:

Expected results:

Additional info:
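The "timed out after 1500 seconds" error at the end of the log comes from the ocs-ci test harness enforcing a 1500s limit on the `oc adm must-gather` command. For reference, below is a minimal sketch of the same safeguard from a plain shell session, assuming coreutils `timeout` is available; the `--dest-dir` value is illustrative and this is not the ocs-ci implementation.

```bash
# Minimal sketch, not the ocs-ci implementation: bound the must-gather run to
# 1500 seconds, the same limit reported in the error above. The --dest-dir
# path is illustrative.
timeout 1500s oc adm must-gather \
    --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6 \
    --dest-dir=/tmp/ocs_must_gather \
  || echo "must-gather did not complete within 1500s (helper pod may be stuck)"
```

With the helper pod stuck in Running state, the collection never finishes on its own, so the run only ends when this external timeout kills it and the command exits non-zero.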
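The describe output shows the helper container running the Rook toolbox entrypoint (`/tini -g -- /usr/local/bin/toolbox.sh`), which matches the repeated "writing mon endpoints" lines: the script periodically rewrites `/etc/ceph/ceph.conf` from the mounted `rook-ceph-mon-endpoints` ConfigMap and does not exit on its own, so the pod stays Running until it is deleted. Below is a simplified sketch of that refresh loop, assuming the endpoints are exposed at `/etc/rook/mon-endpoints`; it only illustrates the logged behaviour and is not the actual toolbox.sh.

```bash
#!/bin/bash
# Simplified illustration of the refresh loop behind the repeated
# "writing mon endpoints" log lines; not the actual /usr/local/bin/toolbox.sh.
# Assumes the rook-ceph-mon-endpoints ConfigMap is mounted so that
# /etc/rook/mon-endpoints contains "a=172.30.41.6:6789,d=...,c=...".

CEPH_CONFIG="/etc/ceph/ceph.conf"
MON_CONFIG="/etc/rook/mon-endpoints"

write_endpoints() {
  local endpoints mon_host
  endpoints=$(cat "${MON_CONFIG}")
  # Drop the "name=" prefixes, leaving a comma-separated ip:port list.
  mon_host=$(echo "${endpoints}" | sed 's/[a-z0-9]\+=//g')
  echo "$(date) writing mon endpoints to ${CEPH_CONFIG}: ${endpoints}"
  cat > "${CEPH_CONFIG}" <<EOF
[global]
mon_host = ${mon_host}
EOF
}

# Write once at startup, then refresh forever. Because the loop never exits,
# the helper pod keeps reporting Running, which is the symptom in this bug.
write_endpoints
while true; do
  sleep 60
  write_endpoints
done
```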
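While the helper pod is stuck, the same describe and log output can also be taken directly from the cluster instead of grepping the ocs-ci log as shown above; the pod name below comes from this particular run and changes with every must-gather invocation.

```bash
# Inspect the leftover must-gather helper pod directly. The pod name is taken
# from the describe output above and is unique to this run.
oc -n openshift-storage get pod must-gather-ss56n-helper
oc -n openshift-storage describe pod must-gather-ss56n-helper
oc -n openshift-storage logs must-gather-ss56n-helper
```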