Bug 2127150

Summary: Must-Gather: after OCP upgrade [OCP4.6 -> OCP4.7], mg-helper pod stuck in Running state for more than 22m
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Oded <oviner>
Component: must-gather
Assignee: yati padia <ypadia>
Status: CLOSED DUPLICATE
QA Contact: Prasad Desala <tdesala>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.6
CC: ocs-bugs, odf-bz-bot
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-10-13 05:51:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Oded 2022-09-15 13:27:49 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
Must-Gather: after an OCP upgrade [OCP4.6 -> OCP4.7], the mg-helper pod is stuck in the Running state for more than 22 minutes.

Version of all relevant components (if applicable):
OCS Version: 4.6.15-209.ci
OCP Version: 4.7.0-0.nightly-2022-09-12-140133
Provider: AWS-IPI-3AZ-RHCOS-3M-3W

Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
Test Process:
1. Upgrade OCP4.6 -> OCP4.7
2. Collect MG (must-gather):
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6
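
While the collection runs, the helper pod can be inspected from a second shell. A minimal sketch, assuming the openshift-storage namespace and the must-gather-<id>-helper naming seen in this run:

# Find and inspect the must-gather helper pod (the name changes per invocation)
$ oc -n openshift-storage get pods | grep -- '-helper'
$ oc -n openshift-storage describe pod must-gather-ss56n-helper
$ oc -n openshift-storage logs must-gather-ss56n-helper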

$ cat logs | grep "****helper pod must-gather-ss56n-helper describe****" -A 100
****helper pod must-gather-ss56n-helper describe****
Name:         must-gather-ss56n-helper
Namespace:    openshift-storage
Priority:     0
Node:         ip-10-0-136-154.us-east-2.compute.internal/10.0.136.154
Start Time:   Wed, 14 Sep 2022 13:43:27 +0000
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.131.3.2"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.131.3.2"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: rook-ceph
Status:       Running
IP:           10.131.3.2
IPs:
  IP:  10.131.3.2
Containers:
  must-gather-helper:
    Container ID:  cri-o://21d692b0afbb33825e7f2ee72b2a265aee8b738e78ccc3c8beb9a9bf699473c2
    Image:         quay.io/rhceph-dev/rook-ceph@sha256:0ea697891033dd8875c2912a5f37587eee460e9536bf32e1ec20833641e30479
    Image ID:      quay.io/rhceph-dev/rook-ceph@sha256:0ea697891033dd8875c2912a5f37587eee460e9536bf32e1ec20833641e30479
    Port:          <none>
    Host Port:     <none>
    Command:
      /tini
    Args:
      -g
      --
      /usr/local/bin/toolbox.sh
    State:          Running
      Started:      Wed, 14 Sep 2022 13:43:29 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      ROOK_CEPH_USERNAME:  <set to the key 'ceph-username' in secret 'rook-ceph-mon'>  Optional: false
      ROOK_CEPH_SECRET:    <set to the key 'ceph-secret' in secret 'rook-ceph-mon'>    Optional: false
    Mounts:
      /dev from dev (rw)
      /etc/rook from mon-endpoint-volume (rw)
      /lib/modules from libmodules (rw)
      /sys/bus from sysbus (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9vqvr (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:  
  sysbus:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/bus
    HostPathType:  
  libmodules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
  mon-endpoint-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rook-ceph-mon-endpoints
    Optional:  false
  default-token-9vqvr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9vqvr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       22m   default-scheduler  Successfully assigned openshift-storage/must-gather-ss56n-helper to ip-10-0-136-154.us-east-2.compute.internal
  Normal  AddedInterface  22m   multus             Add eth0 [10.131.3.2/23]
  Normal  Pulled          22m   kubelet            Container image "quay.io/rhceph-dev/rook-ceph@sha256:0ea697891033dd8875c2912a5f37587eee460e9536bf32e1ec20833641e30479" already present on machine
  Normal  Created         22m   kubelet            Created container must-gather-helper
  Normal  Started         22m   kubelet            Started container must-gather-helper
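
Note: per the describe output above, the helper container's entrypoint is /tini -g -- /usr/local/bin/toolbox.sh, so the pod is expected to stay in the Running state until it is deleted. A rough, illustrative sketch of the endpoint-watch loop suggested by the repeating "writing mon endpoints" lines in the logs below (not the actual script; the file path and sleep interval are assumptions):

#!/usr/bin/env bash
CEPH_CONF=/etc/ceph/ceph.conf
MON_ENDPOINTS_FILE=/etc/rook/mon-endpoints   # path under the /etc/rook mount (rook-ceph-mon-endpoints ConfigMap); exact filename assumed
while true; do
  endpoints=$(cat "${MON_ENDPOINTS_FILE}")
  echo "$(date) writing mon endpoints to ${CEPH_CONF}: ${endpoints}"
  # (the real script would rewrite the mon_host entries in ceph.conf here)
  sleep 70   # interval assumed from the ~70-90s gaps between the log timestamps
done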

$ cat logs | grep "****helper pod must-gather-ss56n-helper logs****" -A 40
****helper pod must-gather-ss56n-helper logs****
Wed Sep 14 13:43:29 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:43:39 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:44:59 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:46:20 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:47:40 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:49:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:50:10 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:51:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:52:50 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:54:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:55:10 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:56:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:57:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 13:59:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:00:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:01:50 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:03:00 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:04:20 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789
Wed Sep 14 14:05:30 UTC 2022 writing mon endpoints to /etc/ceph/ceph.conf: a=172.30.41.6:6789,d=172.30.78.28:6789,c=172.30.74.44:6789

2022-09-14 14:06:16,183 - MainThread - ERROR - ocs_ci.ocs.utils.run_must_gather.913 - Timeout 1500s for must-gather reached, command exited with error: Command '['oc', 'adm', 'must-gather', '--image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6', '--dest-dir=/tmp/tmpevg5vhad_ocs_logs/ocs_must_gather']' timed out after 1500 seconds
Must-Gather Output:
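
For context, the 1500 s limit comes from the ocs-ci test harness rather than from oc adm must-gather itself. A rough equivalent of what the harness does, sketched with the standard timeout utility (the --dest-dir path is illustrative):

# Cap the must-gather run at 1500 seconds, as the harness does
$ timeout 1500 oc adm must-gather \
    --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6 \
    --dest-dir=/tmp/ocs_must_gather
$ echo "exit status: $?"   # 124 means the time limit was hit, matching the failure above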


Actual results:
The must-gather-ss56n-helper pod stays in the Running state for more than 22 minutes, and the must-gather command times out after 1500 seconds.

Expected results:
The helper pod completes its collection and is cleaned up, and must-gather finishes within the timeout.

Additional info:

Comment 2 yati padia 2022-10-13 05:51:40 UTC
I see this bug is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2128217. We can close this bug. Please reopen if this needs additional attention.

*** This bug has been marked as a duplicate of bug 2128217 ***