test:
[sig-cluster-lifecycle] Pods cannot access the /config/master API endpoint [Suite:openshift/conformance/parallel]

is failing frequently in CI, see search results:

$ w3m -dump -cols 200 'https://search.svc.ci.openshift.org/?search=Pods+cannot+access+the+%2Fconfig%2Fmaster+API+endpoint&maxAge=48h&type=junit&name=release-openshift-' | grep 'failures match'
release-openshift-ocp-installer-e2e-aws-upi-4.5 - 24 runs, 50% failed, 8% of failures match
release-openshift-ocp-installer-e2e-aws-4.5 - 50 runs, 62% failed, 3% of failures match
release-openshift-ocp-installer-e2e-metal-4.5 - 26 runs, 50% failed, 8% of failures match
release-openshift-ocp-installer-e2e-vsphere-upi-4.5 - 24 runs, 79% failed, 5% of failures match
release-openshift-ocp-e2e-aws-scaleup-rhel7-4.5 - 11 runs, 64% failed, 14% of failures match
release-openshift-ocp-e2e-aws-scaleup-rhel7-4.6 - 11 runs, 45% failed, 20% of failures match
release-openshift-ocp-installer-e2e-aws-4.6 - 71 runs, 73% failed, 4% of failures match
release-openshift-origin-installer-e2e-gcp-4.6 - 32 runs, 41% failed, 15% of failures match
release-openshift-ocp-installer-e2e-gcp-4.6 - 5 runs, 40% failed, 50% of failures match
release-openshift-ocp-installer-e2e-metal-4.6 - 5 runs, 40% failed, 50% of failures match
release-openshift-ocp-installer-e2e-metal-compact-4.6 - 5 runs, 20% failed, 100% of failures match
release-openshift-origin-installer-e2e-azure-shared-vpc-4.5 - 2 runs, 50% failed, 100% of failures match
release-openshift-origin-installer-e2e-aws-upgrade-4.3-to-4.4-to-4.5-to-4.6-ci - 2 runs, 100% failed, 50% of failures match
release-openshift-origin-installer-e2e-aws-shared-vpc-4.5 - 2 runs, 100% failed, 50% of failures match
release-openshift-ocp-installer-e2e-aws-ovn-4.5 - 23 runs, 83% failed, 11% of failures match
release-openshift-ocp-installer-e2e-ovirt-4.6 - 11 runs, 82% failed, 11% of failures match
release-openshift-origin-installer-e2e-aws-calico-4.5 - 2 runs, 100% failed, 50% of failures match
release-openshift-ocp-installer-e2e-openstack-4.6 - 12 runs, 100% failed, 17% of failures match

Picking [1] as an example job, the error message was:

Run #0: Failed 3m20s
fail [github.com/openshift/origin/test/extended/csrapprover/csrapprover.go:49]: Unexpected error:
    <*errors.errorString | 0xc0001d8970>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred

[1]: https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-4.5/1399
This error comes from the code at:

https://github.com/openshift/origin/blob/55f403906e9b5e66fab9b4afb19f40b2212f74b5/test/extended/csrapprover/csrapprover.go#L48
https://github.com/openshift/origin/blob/55f403906e9b5e66fab9b4afb19f40b2212f74b5/test/extended/util/framework.go#L1595

Looking at "Monitor cluster while tests execute":

"registry.fedoraproject.org/fedora:30": rpc error: code = Unknown desc = Error reading manifest 30 in registry.fedoraproject.org/fedora: received unexpected HTTP status: 503 Service Temporarily Unavailable (2 times)
Jun 10 07:17:32.994 W ns/e2e-test-cluster-client-cert-qz5fk pod/get-bootstrap-creds node/ip-10-0-135-201.us-west-2.compute.internal reason/GracefulDelete in 30s

It seems the Fedora registry is actually not responding:

Albertos-MacBook-Pro:enhancements@albertogarla $ docker pull registry.fedoraproject.org/fedora:30
Error response from daemon: received unexpected HTTP status: 503 Service Temporarily Unavailable

Targeting 4.6 so as not to block 4.5, as this is orthogonal.
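
For readers wondering why a registry 503 shows up in the test output only as "timed out waiting for the condition": a polling helper of this kind reports that the condition never became true, not why. The following is a minimal Python sketch of that pattern, not the actual origin test code; `poll`, `PollTimeout`, and `pod_is_running` are hypothetical names introduced for illustration only.

```python
import time
from typing import Callable


class PollTimeout(Exception):
    """Raised when the condition never became true within the timeout."""


def poll(condition: Callable[[], bool], interval: float, timeout: float) -> None:
    """Call condition() every `interval` seconds until it returns True.

    Raises PollTimeout once `timeout` seconds have elapsed. The exception
    carries no detail about why the condition stayed false.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise PollTimeout("timed out waiting for the condition")
        time.sleep(interval)


def pod_is_running() -> bool:
    # Hypothetical stand-in: the pod never starts because its image
    # cannot be pulled (the registry answers 503).
    return False


if __name__ == "__main__":
    try:
        poll(pod_is_running, interval=0.1, timeout=0.5)
    except PollTimeout as err:
        # Only the generic message is available here; the image pull
        # failure has to be found in the cluster event log.
        print(err)
```

Under this assumption, the root cause is visible only in the monitor/event output quoted above, which is why the failure has to be diagnosed from there rather than from the test error itself.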
*** Bug 1845295 has been marked as a duplicate of this bug. ***
I think we will probably need to revert Alberto's change, given that the new target does not support multi-arch manifests.

I did some investigation into the previous image spec, and it is working for me /with/ multi-arch:

```
$ podman pull registry.fedoraproject.org/fedora:32 --override-arch arm64
Trying to pull registry.fedoraproject.org/fedora:32...
Getting image source signatures
Copying blob 1bfcc9281f78 done
Copying config ef79e50227 done
Writing manifest to image destination
Storing signatures
ef79e5022740c1df693fafa7c666791adb6dabae9004ef5e46e21e8e75f33b1c
```

I'm not sure how to test these changes, but I will propose a PR to use the fedora:32 target from registry.fedoraproject.org if that will support our multi-arch builds. Any recommendations or advice on how to test?
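
One possible way to check multi-arch support without pulling each architecture is to inspect the raw manifest and see whether it is a manifest list with one entry per platform. The sketch below assumes the raw JSON has already been fetched by some means (for example with `skopeo inspect --raw docker://registry.fedoraproject.org/fedora:32`, if skopeo is available); `manifest_architectures` is a hypothetical helper, and the sample document is made-up illustrative data, not the real fedora:32 manifest.

```python
import json
from typing import List


def manifest_architectures(raw_manifest: str) -> List[str]:
    """Return the sorted architectures named in a manifest list / image index.

    Returns an empty list when the document has no "manifests" array,
    i.e. when it is a single-architecture image manifest.
    """
    doc = json.loads(raw_manifest)
    entries = doc.get("manifests") or []
    archs = {
        entry.get("platform", {}).get("architecture", "unknown")
        for entry in entries
    }
    return sorted(archs)


# Made-up sample in the shape of a Docker v2 manifest list (digests elided).
SAMPLE_LIST = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:aaaa", "platform": {"architecture": "arm64", "os": "linux"}},
        {"digest": "sha256:bbbb", "platform": {"architecture": "amd64", "os": "linux"}},
    ],
})

# Made-up sample in the shape of a single-architecture image manifest.
SAMPLE_SINGLE = json.dumps({"schemaVersion": 2, "config": {}, "layers": []})

if __name__ == "__main__":
    print(manifest_architectures(SAMPLE_LIST))
    print(manifest_architectures(SAMPLE_SINGLE))
```

An empty result for a candidate image would suggest it is single-arch and would not serve the multi-arch builds; a list that includes every required architecture would suggest the target is usable.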
Bug 1816812 is about decoupling the test suite from external registries; maybe just close as a dup of that?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196