Description of problem:
Seeing the following error in release-openshift-ocp-installer-e2e-aws-4.1 tests:

level=fatal msg="failed to initialize the cluster: Cluster operator machine-config is reporting a failure: Failed to resync 4.1.0-0.nightly-2019-10-30-153934 because: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: configuration status for pool master is empty, retrying: timed out waiting for the condition"

Failing CI run is: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.1/477#1:build-log.txt%3A48

This is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1726370

Version-Release number of selected component (if applicable):
4.1.z

How reproducible:
Not very. It has occurred once in the last 24 hours in e2e, as this search shows: https://ci-search-ci-search-next.svc.ci.openshift.org/?search=configuration+status+for+pool+master+is+empty&maxAge=336h&context=2&type=all
There are similar failures for 4.2 AWS proxy e2e, though.
Seeing pivot errors. Was this a bad image?
```
I1030 17:00:28.755509 66671 run.go:16] Running: podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592
error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: Error reading manifest sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown
W1030 17:00:29.132239 66671 run.go:40] podman failed: exit status 125; retrying...
I1030 17:01:49.132470 66671 run.go:16] Running: podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592
E1030 17:01:49.479058 2900 writer.go:132] Marking Degraded due to: during bootstrap: failed to run pivot: failed to start pivot.service: exit status 1
I1030 17:01:49.525280 2900 update.go:737] logger doesn't support --jounald, grepping the journal
I1030 17:01:49.565653 2900 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I1030 17:01:49.568287 2900 daemon.go:667] In bootstrap mode
I1030 17:01:49.568309 2900 daemon.go:695] Current+desired config: rendered-worker-d27b6c7c52762f83fea9ce3683379f5d
I1030 17:01:49.573460 2900 daemon.go:865] Bootstrap pivot required to: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592
I1030 17:01:49.573529 2900 update.go:715] Updating OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592
error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: Error reading manifest sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown
W1030 17:00:29.132239 66671 run.go:40] podman failed: exit status 125; retrying...
I1030 17:01:49.132470 66671 run.go:16] Running: podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592
error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: Error reading manifest sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown
W1030 17:01:49.475241 66671 run.go:40] podman failed: exit status 125; retrying...
F1030 17:01:49.475272 66671 run.go:48] podman: timed out waiting for the condition
pivot.service: Main process exited, code=exited, status=255/n/a
pivot.service: Failed with result 'exit-code'.
Failed to start Pivot Tool.
pivot.service: Consumed 940ms CPU time
```
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.1/477/artifacts/e2e-aws/pods/openshift-machine-config-operator_machine-config-daemon-244df_machine-config-daemon.log
```
I1030 17:00:28.755509 66671 run.go:16] Running: podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592
error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592: Error reading manifest sha256:0ed6c90b9738103a56fe9c06794e6cfab0ef1258b415286aca57aaedd454d592 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown
```
This looks like the release payload was GC'ed on Quay. Does this reproduce reliably with other 4.1 nightly payloads?

Recent runs of the same job look green: https://prow.svc.ci.openshift.org/job-history/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.1
I'm unable to provide more info on this, since it was something I found in CI while on build-cop duty.
I think this was a flake due to a GC'ed release payload, since jobs after the reported failed job were passing just fine.
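For anyone triaging a similar failure, here is a rough sketch of how the affected pull specs could be extracted from a machine-config-daemon log and turned into registry manifest URLs to check by hand. The helper names (`manifest_unknown_pullspecs`, `manifest_url`) are made up for illustration. The URL layout assumes the standard registry v2 `/v2/<name>/manifests/<reference>` endpoint, and the repository may require pull-secret credentials, so this only builds the URL and does not query Quay.

```python
import re

# Digest-pinned pull spec: <registry>/<repository>@sha256:<64 hex chars>.
PULLSPEC_RE = re.compile(
    r"(?P<registry>[\w.-]+(?::\d+)?)/(?P<repo>[\w./-]+)@(?P<digest>sha256:[0-9a-f]{64})"
)


def manifest_unknown_pullspecs(log_text):
    """Return distinct pull specs found on log lines containing 'manifest unknown'."""
    found = []
    for line in log_text.splitlines():
        if "manifest unknown" not in line:
            continue
        for match in PULLSPEC_RE.finditer(line):
            spec = match.group(0)
            if spec not in found:
                found.append(spec)
    return found


def manifest_url(pullspec):
    """Build the registry v2 manifest URL for a digest-pinned pull spec (assumed layout)."""
    match = PULLSPEC_RE.fullmatch(pullspec)
    if match is None:
        raise ValueError("not a digest-pinned pull spec: %r" % pullspec)
    return "https://{registry}/v2/{repo}/manifests/{digest}".format(**match.groupdict())
```

If a request to the resulting URL (with suitable credentials) returns a 404 for the manifest, that would be consistent with the payload image having been GC'ed rather than with a problem on the node.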