Bug 1870338
| Summary: | OCS 4.6 must-gather : ocs-must-gather-xxx-helper pod in ContainerCreationError (couldn't find key admin-secret) | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Neha Berry <nberry> |
| Component: | must-gather | Assignee: | Pulkit Kundra <pkundra> |
| Status: | CLOSED ERRATA | QA Contact: | Shay Rozen <srozen> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.6 | CC: | ebenahar, muagarwa, ocs-bugs, sabose |
| Target Milestone: | --- | Keywords: | AutomationBackLog |
| Target Release: | OCS 4.6.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.6.0-98.ci | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-12-17 06:23:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Verified on version 4.6.0-149.ci

openshift-storage   must-gather-6h9l7-helper   1/1   Running   0   31s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605
Description of problem (please be detailed as possible and provide log snippets):
----------------------------------------------------------------------
The OCS 4.6 must-gather-xxx-helper pod is unable to reach the Running state due to the following error:

Events:
  Type     Reason          Age               From    Message
  ----     ------          ----              ----    -------
  Normal   Scheduled       <unknown>                 Successfully assigned openshift-storage/must-gather-l69j4-helper to ip-10-0-218-163.us-east-2.compute.internal
  Normal   AddedInterface  21s               multus  Add eth0 [10.129.2.94/23]
  Normal   Pulled          7s (x4 over 21s)  kubelet, ip-10-0-218-163.us-east-2.compute.internal  Container image "quay.io/rhceph-dev/rook-ceph@sha256:12f0c3a9b0bf8d9b48b1b7f8a73e5826e43d87fb1d67b16d58a01efb47ea0b8b" already present on machine
  Warning  Failed          7s (x4 over 21s)  kubelet, ip-10-0-218-163.us-east-2.compute.internal  Error: couldn't find key admin-secret in Secret openshift-storage/rook-ceph-mon

$ oc logs pod must-gather-l69j4-helper -n openshift-storage
Error from server (NotFound): pods "pod" not found

$ oc logs must-gather-l69j4-helper -n openshift-storage
Error from server (BadRequest): container "must-gather-helper" in pod "must-gather-l69j4-helper" is waiting to start: CreateContainerConfigError

Moreover, must-gather keeps waiting for the pod for 50 retries, which wastes extra time.

waiting for helper pod to come up in openshift-storage namespace. Retrying 1
waiting for helper pod to come up in openshift-storage namespace. Retrying 2

Version of all relevant components (if applicable):
----------------------------------------------------------------------
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-18-165040   True        False         4h5m    Cluster version is 4.6.0-0.nightly-2020-08-18-165040

$ oc get csv -n openshift-storage
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-533.ci   OpenShift Container Storage   4.6.0-533.ci              Succeeded

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
----------------------------------------------------------------------
Yes, the Clone PVC is in Pending state and endless retries are happening

Is there any workaround available to the best of your knowledge?
----------------------------------------------------------------------
Not sure

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
----------------------------------------------------------------------
2

Is this issue reproducible?
----------------------------------------------------------------------
Yes

Can this issue reproduce from the UI?
----------------------------------------------------------------------
Yes

If this is a regression, please provide more details to justify this:
----------------------------------------------------------------------
No

Steps to Reproduce:
----------------------------------------------------------------------
1. Create an OCS + OCP 4.6 cluster on AWS/VMware
2. Start OCS must-gather and confirm that the must-gather-helper pod stays in ContainerCreationError due to "Error: couldn't find key admin-secret in Secret openshift-storage/rook-ceph-mon"

Actual results:
----------------------------------------------------------------------
Must-gather collects logs, but the must-gather-helper pod stays in ContainerCreationError state

Error: couldn't find key admin-secret in Secret openshift-storage/rook-ceph-mon

Expected results:
----------------------------------------------------------------------
No error should be seen

Additional info:
----------------------------------------------------------------------
$ oc get pods -A -w|grep must-gather
openshift-must-gather-5fqhk   must-gather-vh4jw                               1/1   Running                      0   79m
openshift-must-gather-dch9x   must-gather-l69j4                               0/1   Init:0/1                     0   39s
openshift-must-gather-jcm49   must-gather-pk42m                               1/1   Running                      0   21m
openshift-storage             must-gather-l69j4-helper                        0/1   CreateContainerConfigError   0   30s
openshift-must-gather-dch9x   ip-10-0-151-182us-east-2computeinternal-debug   0/1   Pending                      0   0s
openshift-must-gather-dch9x   ip-10-0-151-182us-east-2computeinternal-debug   0/1   ContainerCreating            0   1s
openshift-must-gather-dch9x   ip-10-0-151-182us-east-2computeinternal-debug   0/1   Error                        0   2s
openshift-must-gather-dch9x   ip-10-0-151-182us-east-2computeinternal-debug   0/1   Terminating                  0   2s
openshift-must-gather-dch9x   ip-10-0-151-182us-east-2computeinternal-debug   0/1   Terminating                  0   2s
openshift-must-gather-dch9x   ip-10-0-165-190us-east-2computeinternal-debug   0/1   Pending                      0   0s
openshift-must-gather-dch9x   ip-10-0-165-190us-east-2computeinternal-debug   0/1   ContainerCreating            0   0s
openshift-must-gather-dch9x   ip-10-0-165-190us-east-2computeinternal-debug   0/1   Error                        0   1s
openshift-must-gather-dch9x   ip-10-0-165-190us-east-2computeinternal-debug   0/1   Terminating                  0   1s
openshift-must-gather-dch9x   ip-10-0-165-190us-east-2computeinternal-debug   0/1   Terminating                  0   1s
openshift-must-gather-dch9x   ip-10-0-218-163us-east-2computeinternal-debug   0/1   Pending                      0   0s
openshift-must-gather-dch9x   ip-10-0-218-163us-east-2computeinternal-debug   0/1   ContainerCreating            0   0s
openshift-must-gather-dch9x   ip-10-0-218-163us-east-2computeinternal-debug   0/1   Error                        0   1s
openshift-must-gather-dch9x   ip-10-0-218-163us-east-2computeinternal-debug   0/1   Terminating                  0   2s
openshift-must-gather-dch9x   ip-10-0-218-163us-east-2computeinternal-debug   0/1   Terminating                  0   2s
openshift-storage             must-gather-l69j4-helper                        0/1   Terminating                  0   4m46s
openshift-storage             must-gather-l69j4-helper                        0/1   Terminating                  0   4m53s
openshift-storage             must-gather-l69j4-helper                        0/1   Terminating                  0   4m53s
openshift-must-gather-dch9x   must-gather-l69j4                               0/1   PodInitializing              0   5m2s
openshift-must-gather-dch9x   must-gather-l69j4                               1/1   Running                      0   5m3s
openshift-must-gather-dch9x   must-gather-l69j4                               1/1   Terminating                  0   6m31s
openshift-must-gather-dch9x   must-gather-l69j4                               1/1   Terminating                  0   6m31s

$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.6
[must-gather ] OUT Using must-gather plugin-in image: quay.io/rhceph-dev/ocs-must-gather:latest-4.6
[must-gather ] OUT namespace/openshift-must-gather-dch9x created
[must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-fp728 created
[must-gather ] OUT pod for plug-in image quay.io/rhceph-dev/ocs-must-gather:latest-4.6 created
...
[must-gather-l69j4] POD collecting dump of clusterresourceversion
[must-gather-l69j4] POD waiting for helper pod to come up in openshift-storage namespace. Retrying 1
[must-gather-l69j4] POD waiting for helper pod to come up in openshift-storage namespace. Retrying 2
...
[must-gather-l69j4] POD waiting for helper pod to come up in openshift-storage namespace. Retrying 50
[must-gather-l69j4] POD collecting command output for: ceph auth list
...
[must-gather-l69j4] POD error: unable to upgrade connection: container not found ("must-gather-helper")
[must-gather-l69j4] POD collecting command output for: ceph-volume lvm list
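
The log above shows must-gather retrying 50 times even though the helper pod is stuck in CreateContainerConfigError, a state that may not clear on its own while the referenced secret key is missing. The following is a minimal sketch of a wait loop that gives up as soon as that reason is reported. It is not the ocs-must-gather implementation or the fix shipped for this bug: the function names `pod_state` and `wait_for_helper`, the variables `MAX_RETRIES` and `SLEEP_SECONDS`, and the return codes are hypothetical, and it assumes the helper pod has a single container.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the ocs-must-gather code: wait for a helper pod,
# but stop early when its container cannot be configured.

NAMESPACE="${NAMESPACE:-openshift-storage}"
MAX_RETRIES="${MAX_RETRIES:-50}"      # assumed to mirror the 50 retries seen in the log
SLEEP_SECONDS="${SLEEP_SECONDS:-5}"   # assumed polling interval

# Print "<phase> <waiting-reason>" for a pod. The reason is empty when the
# first container is not in a waiting state. Assumes a single-container pod.
pod_state() {
    local pod="$1"
    oc get pod "$pod" -n "$NAMESPACE" \
        -o jsonpath='{.status.phase} {.status.containerStatuses[0].state.waiting.reason}' \
        2>/dev/null
}

# Return 0 when the pod is Running, 1 when it reports
# CreateContainerConfigError, 2 when the retries are exhausted.
wait_for_helper() {
    local pod="$1" phase reason attempt
    for ((attempt = 1; attempt <= MAX_RETRIES; attempt++)); do
        # An empty or failed query leaves both fields empty and simply retries.
        read -r phase reason <<<"$(pod_state "$pod")"
        if [[ "$phase" == "Running" ]]; then
            return 0
        fi
        if [[ "$reason" == "CreateContainerConfigError" ]]; then
            echo "helper pod $pod cannot start ($reason); giving up on attempt $attempt" >&2
            return 1
        fi
        echo "waiting for helper pod to come up in $NAMESPACE namespace. Retrying $attempt"
        sleep "$SLEEP_SECONDS"
    done
    return 2
}
```

A caller could run `wait_for_helper must-gather-l69j4-helper` and skip the Ceph command collection when the return code is non-zero, instead of attempting commands against a container that never started.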