release-openshift-origin-installer-e2e-aws-upgrade-4.6-to-4.7-to-4.8-to-4.9-ci is failing frequently in CI, see: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#release-openshift-origin-installer-e2e-aws-upgrade-4.6-to-4.7-to-4.8-to-4.9-ci
Job URI: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.6-to-4.7-to-4.8-to-4.9-ci/1463537431096594432
Log:
  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.6-to-4.7-to-4.8-to-4.9-ci/1463537431096594432/artifacts/e2e-aws-upgrade/clusterversion.json | jq -r '.items[].status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")'
  2021-11-24T16:06:34Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
  2021-11-24T16:30:08Z Available=True -: Done applying 4.8.0-0.nightly-2021-11-24-020113
  2021-11-24T19:32:46Z Failing=False -: -
  2021-11-24T19:03:45Z Progressing=True ClusterOperatorUpdating: Working towards 4.9.0-0.ci-2021-11-24-092816: 205 of 738 done (27% complete), waiting on openshift-apiserver
  2021-11-24T19:04:01Z Upgradeable=False AdminAckRequired: Kubernetes 1.22 and therefore OpenShift 4.9 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6329921 for details and instructions.
I closed this CURRENTRELEASE, but then realized that we should see improvements in the 4.7 -> ... -> 4.10 job [1] now that the master/4.10 PR has landed with this bug. I'm agnostic about whether we stay in CURRENTRELEASE or move back to MODIFIED/ON_QA as we wait for that to get some new runs. I've also opened the 4.9 backport bug 2027929, and I'm agnostic about how long we cook in 4.10 before moving ahead and landing that backport.
[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#release-openshift-origin-installer-e2e-aws-upgrade-4.7-to-4.8-to-4.9-to-4.10-ci
*** Bug 2028761 has been marked as a duplicate of this bug. ***
This e2e is failing on a direct install of 4.8.23 as well as on the upgrade job from 4.8.22 to 4.8.23 for ppc64le: "[sig-cluster-lifecycle] TestAdminAck should succeed [Suite:openshift/conformance/parallel]"
S390x upgrade jobs for 4.9 from 4.8 are continuously failing with the test "disruption_tests: [bz-Cluster Version Operator] Verify presence of admin ack gate blocks upgrade until acknowledged" https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.9-upgrade-from-nightly-4.8-ocp-remote-libvirt-s390x/1467690345897660416
Checking a 4.7 -> ... -> 4.10 run, now that the third PR is in [1,2]. It has Jack's map-nil fix:
  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.7-to-4.8-to-4.9-to-4.10-ci/1467585900341891072/artifacts/release/artifacts/release-images-latest | jq -r '.spec.tags[] | select(.name == "tests").annotations["io.openshift.build.commit.id"]'
  bc643fa990bef62359eeaf8c54e1aa475f642193
  $ git --no-pager log --oneline -1 bc643fa990bef62359eeaf8c54e1aa475f642193
  bc643fa990 Merge pull request #26668 from jottofar/check-for-nil-map
But there's still a failure due to a DNS/networking hiccup:
  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.7-to-4.8-to-4.9-to-4.10-ci/1467585900341891072/build-log.txt | grep FAIL:
  Dec 6 00:00:10.440: FAIL: Error accessing configmap openshift-config-managed/admin-gates: Get "https://api.ci-op-g2m38jp7-eafe9.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1/namespaces/openshift-config-managed/configmaps/admin-gates": dial tcp: lookup api.ci-op-g2m38jp7-eafe9.origin-ci-int-aws.dev.rhcloud.com on 172.30.0.10:53: no such host
We may want to relax the test case a bit so that it comes back and tries again when that sort of thing happens, instead of calling framework.Fail.
[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#release-openshift-origin-installer-e2e-aws-upgrade-4.7-to-4.8-to-4.9-to-4.10-ci
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.7-to-4.8-to-4.9-to-4.10-ci/1467585900341891072
The failures mentioned in comment 6 are from adminack.go line 193, and that's the nil-map thing Jack's fixed in master in origin#26668:
  $ git cat-file -p origin/release-4.9:test/extended/util/adminack.go | grep -n . | grep -5 ^193:
  188:func setAdminGate(ctx context.Context, gateName string, gateValue string, oc *CLI) string {
  189:    ackCm, errMsg := getAdminAcksConfigMap(ctx, oc)
  190:    if len(errMsg) != 0 {
  191:        framework.Failf(errMsg)
  192:    }
  193:    ackCm.Data[gateName] = gateValue
  194:    _, err := oc.AdminKubeClient().CoreV1().ConfigMaps("openshift-config").Update(ctx, ackCm, metav1.UpdateOptions{})
  195:    if err != nil {
  196:        return fmt.Sprintf("Unable to update configmap openshift-config/admin-acks, err=%v.", err)
  197:    }
  198:    return ""
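For anyone unfamiliar with why line 193 can blow up: in Go, assigning into a nil map panics, and a ConfigMap with no entries can come back with a nil Data field. The sketch below shows the general guard pattern only. It is not the actual change in origin#26668, and configMap and setGate are hypothetical stand-ins (configMap mimics just the Data field of the real ConfigMap type).

```go
package main

// configMap is a hypothetical stand-in for the Kubernetes ConfigMap type,
// keeping only the Data field that matters for this illustration.
type configMap struct {
	Data map[string]string
}

// setGate (hypothetical helper) records gateName=gateValue. It initializes
// Data first, because writing to a nil map panics with
// "assignment to entry in nil map".
func setGate(cm *configMap, gateName, gateValue string) {
	if cm.Data == nil {
		cm.Data = make(map[string]string)
	}
	cm.Data[gateName] = gateValue
}
```

Calling setGate on a freshly constructed &configMap{} works, whereas the unguarded assignment would panic in that case.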
I'm punting the "no such host" hiccup from comment 8 to follow-up work, and marking this CLOSED CURRENTRELEASE based on the other changes made under this bug, which fixed issues we had been seeing in CI.