Created attachment 1755077 [details]
terminal output

Description of problem (please be detailed as possible and provide log snippets):
==============
This bug is a continuation of the fixes requested in Bug 1893619#c9:

b) No need to attempt creation of the helper pod, as the storagecluster/cephcluster is not present.

Currently, even though the storagecluster is deleted, must-gather still attempts to create a helper pod for 300s. The pod then stays stuck, since rook-ceph-mon-endpoints has already been deleted:
================
pod/must-gather-7564k-helper   0/1   ContainerCreating   0   5m7s

pod describe
==================
$ oc describe pod/must-gather-7564k-helper -n openshift-storage
Name:         must-gather-7564k-helper
Namespace:    openshift-storage
  Warning  FailedMount  55s (x11 over 7m7s)  kubelet  MountVolume.SetUp failed for volume "mon-endpoint-volume" : configmap "rook-ceph-mon-endpoints" not found

From terminal logs
======================
[must-gather-7564k] POD waiting for helper pod and debug pod for 0 seconds
[must-gather-7564k] POD waiting for the ip-10-0-132-174us-east-2computeinternal-debug pod to be in ready state
[must-gather-7564k] POD waiting for helper pod and debug pod for 3 seconds
[must-gather-7564k] POD collecting dump of noobaa-operator-77464f9777-7nf9f pod from openshift-storage
[must-gather-7564k] POD Skipping ceph collection as Storage Cluster is not present
[must-gather-7564k] POD pod "must-gather-7564k-helper" deleted

Discussion = https://chat.google.com/room/AAAAREGEba8/nh97ybwG23o

Version of all relevant components (if applicable):
=====================================================
OCS = 4.7.0-250.ci
OCP = 4.7.0-0.nightly-2021-02-04-031352

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
=========================================================================
No

Is there any workaround available to the best of your knowledge?
===================================================
Not sure

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
=================================================
3

Is this issue reproducible?
================================
Yes

Can this issue be reproduced from the UI?
=======================================
NA

If this is a regression, please provide more details to justify this:
=============================================================
No

Steps to Reproduce:
=======================
Two scenarios can lead to this issue.

>> Scenario 1) Install the OCS operator and initiate must-gather
a) Operator Hub -> Install OCS operator
b) When the operator pods are up and the CSV is in Succeeded state, initiate must-gather:
   oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.7
Note: Do not install a storage cluster.

>> Scenario 2) Trigger uninstall of OCS by deleting the storage cluster, then initiate must-gather
a) Delete the storagecluster in a running OCS cluster (follow the Uninstall docs):
   $ oc delete storagecluster --all -n openshift-storage --wait=true --timeout=5m
b) Once the storagecluster and the dependent ceph cluster are deleted, initiate must-gather:
   $ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.7

Actual results:
=====================
The helper pod still tries to come up, but fails because the storagecluster is already deleted.

Expected results:
=====================
With the storagecluster deleted, must-gather should not attempt to bring up the helper pod.
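The expected behavior could be implemented with a guard before the helper pod is created. A minimal sketch, assuming the fix lands in the must-gather collection script; the function name `storagecluster_present` is hypothetical, not the actual must-gather code:

```shell
# Hypothetical guard (illustration only): check for a StorageCluster CR
# before attempting to create the helper pod.
storagecluster_present() {
    # `oc get -o name` prints one line per matching resource and nothing
    # when no StorageCluster exists (stderr suppressed in case the CRD
    # itself is absent).
    [ -n "$(oc get storagecluster -n openshift-storage -o name 2>/dev/null)" ]
}

if storagecluster_present; then
    echo "creating must-gather helper pod"
    # ... create helper pod, wait for it to be Ready, collect ceph data ...
else
    echo "Skipping ceph collection as Storage Cluster is not present"
fi
```

With this check in place, both reproduce scenarios above (operator installed without a storage cluster, and storage cluster deleted during uninstall) would skip helper pod creation entirely instead of waiting 300s on a volume mount that can never succeed.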
Additional info:
=======================
$ oc describe pod/must-gather-7564k-helper -n openshift-storage
Name:         must-gather-7564k-helper
Namespace:    openshift-storage
Priority:     0
Node:         ip-10-0-132-174.us-east-2.compute.internal/10.0.132.174
Start Time:   Thu, 04 Feb 2021 17:06:05 +0530
Labels:       must-gather-helper-pod=
Annotations:  openshift.io/scc: rook-ceph
Status:       Pending
IP:
IPs:          <none>
Containers:
  must-gather-helper:
    Container ID:
    Image:         quay.io/rhceph-dev/rook-ceph@sha256:9ad045ea253aa7e0938307a182196759bd3f3dc29edd2e8cee8073ba3f7c2040
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /tini
    Args:
      -g
      --
      /usr/local/bin/toolbox.sh
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:
      ROOK_CEPH_USERNAME:  <set to the key 'ceph-username' in secret 'rook-ceph-mon'>  Optional: false
      ROOK_CEPH_SECRET:    <set to the key 'ceph-secret' in secret 'rook-ceph-mon'>    Optional: false
    Mounts:
      /dev from dev (rw)
      /etc/rook from mon-endpoint-volume (rw)
      /lib/modules from libmodules (rw)
      /sys/bus from sysbus (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5nqfv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:
  sysbus:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/bus
    HostPathType:
  libmodules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  mon-endpoint-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rook-ceph-mon-endpoints
    Optional:  false
  default-token-5nqfv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5nqfv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    7m7s                 default-scheduler  Successfully assigned openshift-storage/must-gather-7564k-helper to ip-10-0-132-174.us-east-2.compute.internal
  Warning  FailedMount  5m4s                 kubelet            Unable to attach or mount volumes: unmounted volumes=[mon-endpoint-volume], unattached volumes=[mon-endpoint-volume default-token-5nqfv dev sysbus libmodules]: timed out waiting for the condition
  Warning  FailedMount  58s (x2 over 3m1s)   kubelet            Unable to attach or mount volumes: unmounted volumes=[mon-endpoint-volume], unattached volumes=[dev sysbus libmodules mon-endpoint-volume default-token-5nqfv]: timed out waiting for the condition
  Warning  FailedMount  55s (x11 over 7m7s)  kubelet            MountVolume.SetUp failed for volume "mon-endpoint-volume" : configmap "rook-ceph-mon-endpoints" not found
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041