Adding UpcomingSprint in case the 4.5 backport PR does not get merged by end of week.
I was able to reproduce the issue (or a variation of it) on 4.5.0-0.nightly-2020-09-14-124053, which should have the fix. In a 50-worker AWS cluster, 1 minute after running oc adm must-gather, the following alert fired:

  KubePodNotReady
  Pod openshift-must-gather-gbggn/must-gather-sd5n7 has been in a non-ready state for longer than 15 minutes.

It had not been 15 minutes since I started the must-gather. See the pod info below, captured at the time the alert fired.

  # oc get pods --all-namespaces | grep must-gather
  openshift-must-gather-gbggn must-gather-sd5n7 0/1 Init:0/1 0 86s
> openshift-must-gather-gbggn must-gather-sd5n7 0/1 Init:0/1 0 86s

With the fix applied, the must-gather pod no longer has an init container. Can you run `oc get -o yaml` on the must-gather pod?
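A quick way to spot a pod that is still waiting on init containers is to filter the pod listing on the STATUS column, which reads `Init:N/M` in that state. The following is a minimal sketch, assuming the default column order of `oc get pods --all-namespaces` (NAMESPACE NAME READY STATUS RESTARTS AGE); the function name `pods_in_init` is hypothetical and not part of oc.

```shell
# Hypothetical helper: read `oc get pods --all-namespaces` output on stdin and
# print "namespace/pod status" for every pod whose STATUS starts with "Init:".
# Assumes the column order NAMESPACE NAME READY STATUS RESTARTS AGE.
pods_in_init() {
  awk '$4 ~ /^Init:/ { print $1 "/" $2 " " $4 }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get pods --all-namespaces | grep must-gather | pods_in_init
#
# To inspect the spec directly instead of the status column, the standard
# jsonpath output can list init container names (empty output means none):
#   oc get pod <pod> -n <namespace> -o jsonpath='{.spec.initContainers[*].name}'
```

The status filter only shows pods that are currently initializing; the jsonpath query on the pod spec is the more direct check for whether an init container is defined at all.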
Moving back to ON_QA to try a later build.
4.5.0-0.nightly-2020-09-14-124053 is the latest 4.5 nightly and it does not have the fix. Not sure why automation moved it to ON_QA. Moving it back to MODIFIED while awaiting a nightly with the PR
@Jan In 4.5.12 I still see the Init container. The diff for 4.5.12 (https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4-stable/release/4.5.12?from=4.5.9) shows that the merged PR is included:

  CLI, CLI-ARTIFACTS, DEPLOYER, TOOLS
    Bug 1876238: fix typo in oc adm upgrade help #557
    Bug 1875551: must-gather: move gather init container under containers #554
    Full changelog

Does the presence of the Init container mean the fix is not correct? Ref: your comment 6.

From oc get pods --all-namespaces -w:

  openshift-must-gather-bdmnq must-gather-9972t 0/1 Pending 0 0s
  openshift-must-gather-bdmnq must-gather-9972t 0/1 Pending 0 0s
  openshift-must-gather-bdmnq must-gather-9972t 0/1 Init:0/1 0 0s
  openshift-must-gather-bdmnq must-gather-9972t 0/1 Init:0/1 0 2s
  openshift-must-gather-bdmnq must-gather-9972t 0/1 Init:0/1 0 4s

The alert was raised:

  KubePodNotReady
  Pod openshift-must-gather-bdmnq/must-gather-9972t has been in a non-ready state for longer than 15 minutes.

Since the release diff indicates the fix should be in 4.5.12, I am moving this back to ASSIGNED for investigation.
That is strange. How do you download the oc binary for 4.5.12?
> How do you download the oc binary for 4.5.12?

https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.5.12/ links to openshift-client-linux-4.5.12.tar.gz, etc. and a signed checksum file.
Checking with https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.5.12/openshift-client-linux-4.5.12.tar.gz, I don't see any init container in the pod's spec. What command do you use to run the must-gather? Can you run `oc version` as well?
In my case:

```
$ ./oc version
Client Version: 4.5.12
Server Version: 4.6.0-0.nightly-2020-08-10-233406
Kubernetes Version: v1.19.0-rc.2+5241b27-dirty
```

```
$ ./oc adm must-gather
[must-gather ] OUT Using must-gather plugin-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8a84ccbdd140bb7151a774df04b6ba8a310ab4bbf025405a9af18b5e63847912
[must-gather ] OUT namespace/openshift-must-gather-bjbvl created
[must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-72xwb created
[must-gather ] OUT pod for plug-in image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8a84ccbdd140bb7151a774df04b6ba8a310ab4bbf025405a9af18b5e63847912 created
[must-gather-mrp5v] POD Wrote inspect data to must-gather.
...
```

```
./oc get pods --all-namespaces -w
...
openshift-must-gather-bjbvl must-gather-mrp5v 0/2 Pending 0 0s
openshift-must-gather-bjbvl must-gather-mrp5v 0/2 Pending 0 0s
openshift-must-gather-bjbvl must-gather-mrp5v 0/2 ContainerCreating 0 0s
openshift-must-gather-bjbvl must-gather-mrp5v 0/2 ContainerCreating 0 2s
openshift-must-gather-bjbvl must-gather-mrp5v 2/2 Running 0 4s
```
Client issue on my side. Re-testing.
Verified with:

  $ oc version
  Client Version: 4.5.14
  Server Version: 4.5.14
  Kubernetes Version: v1.18.3+5302882

No init container, no alerts. The problem was an old oc client on my system.
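Because the root cause here was a stale client, a guard that checks the client version before running must-gather may save a retest. The sketch below assumes 4.5.12 as the threshold (the first release whose diff in this thread lists the must-gather fix) and relies on GNU `sort -V` for version ordering; the function names `client_version` and `version_at_least` are hypothetical.

```shell
# Hypothetical helpers; the names are illustrative and not part of oc.

# Extract the client version from `oc version` output read on stdin.
client_version() {
  sed -n 's/^Client Version: *//p' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort): the smaller version sorts first.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage against a real client (4.5.12 is an assumed threshold):
#   v="$(oc version | client_version)"
#   version_at_least "$v" 4.5.12 || echo "oc client $v may predate the fix" >&2
```

The comparison is only meant for plain release versions such as 4.5.12; nightly build strings are not handled specially.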
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5.15 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4228