Description of problem:

On oVirt CI we see jobs that fail in the bootstrap phase because the kube-apiserver pods are crash looping; see examples [1].

When we look at the nodes [2] of one of the failing jobs, we see that they all carry "NoSchedule" taints that the operators do not tolerate, so the operators are not installed. When we look at the pods [3], we see that kube-apiserver is in "CrashLoopBackOff".

[1] Examples:
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-ovirt-4.7/1330781822644129792
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-ovirt-4.7/1330615017845821440

[2] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-ovirt-4.7/1330781822644129792/artifacts/e2e-ovirt/nodes.json

[3] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-ovirt-4.7/1330781822644129792/artifacts/e2e-ovirt/pods.json

In the operator logs I see:

  W1123 08:27:04.284811 1 staticpod.go:37] revision 6 is unexpectedly already the latest available revision. This is a possible race!
  E1123 08:27:04.293782 1 base_controller.go:250] "RevisionController" controller failed to sync "key", err: conflicting latestAvailableRevision 6

In the pods I see a lot of "connection refused" errors.

Additional info:

On previous jobs I saw the same issue, but only 2 nodes (the masters) joined the cluster. Examples:
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-ovirt-4.7/1330112463709933568
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-ovirt-4.7/1329706492684668928
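For anyone triaging similar jobs: the NoSchedule taints can be pulled straight out of the collected nodes.json. A minimal Go sketch, where the struct fields follow the Kubernetes NodeList API shape but the helper name and sample data are illustrative, not from any tooling in this report:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeList models only the slice of the Kubernetes NodeList schema
// we need: node names and their spec.taints entries.
type nodeList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Spec struct {
			Taints []struct {
				Key    string `json:"key"`
				Effect string `json:"effect"`
			} `json:"taints"`
		} `json:"spec"`
	} `json:"items"`
}

// noScheduleTaints returns, per node, the keys of taints whose
// effect is NoSchedule (the kind the operators here do not tolerate).
func noScheduleTaints(data []byte) (map[string][]string, error) {
	var nl nodeList
	if err := json.Unmarshal(data, &nl); err != nil {
		return nil, err
	}
	out := map[string][]string{}
	for _, n := range nl.Items {
		for _, t := range n.Spec.Taints {
			if t.Effect == "NoSchedule" {
				out[n.Metadata.Name] = append(out[n.Metadata.Name], t.Key)
			}
		}
	}
	return out, nil
}

func main() {
	// Made-up sample resembling one entry of the failing job's nodes.json.
	sample := []byte(`{"items":[{"metadata":{"name":"master-0"},
		"spec":{"taints":[{"key":"node.kubernetes.io/not-ready","effect":"NoSchedule"}]}}]}`)
	taints, err := noScheduleTaints(sample)
	fmt.Println(taints, err)
}
```

Running this against the real nodes.json artifact (instead of the inline sample) would list which taints keep the operator pods unschedulable.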
What is in the logs of the crashlooping kube-apiserver?
Kube-apiserver reports this:

  Error: error reading public key file /etc/kubernetes/static-pod-resources/configmaps/bound-sa-token-signing-certs/service-account-001.pub: data does not contain any valid RSA or ECDSA public keys

This is very probably due to https://github.com/openshift/cluster-kube-apiserver-operator/pull/1006, to be reverted in https://github.com/openshift/cluster-kube-apiserver-operator/pull/1011.
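The failure mode itself is easy to see in isolation. A minimal Go sketch, mirroring only in spirit what the server does when it loads the signing certs (the helper is hypothetical, not kube-apiserver's actual code, which uses k8s.io utility packages): an empty or truncated public-key file fails PEM/PKIX parsing, which is why the pod crash-loops at startup.

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// parsePublicKey is a hypothetical stand-in for the key-loading step:
// decode a PEM block, then parse it as a PKIX (RSA/ECDSA) public key.
func parsePublicKey(data []byte) error {
	block, _ := pem.Decode(data)
	if block == nil {
		return errors.New("data does not contain any valid PEM block")
	}
	_, err := x509.ParsePKIXPublicKey(block.Bytes)
	return err
}

func main() {
	// An empty configmap entry, like the broken service-account-001.pub,
	// fails to parse; a server that requires the key exits and crash-loops.
	fmt.Println(parsePublicKey([]byte("")))
}
```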
(In reply to Stefan Schimanski from comment #2)
> Kube-apiserver report this:
>
> Error: error reading public key file
> /etc/kubernetes/static-pod-resources/configmaps/bound-sa-token-signing-certs/
> service-account-001.pub: data does not contain any valid RSA or ECDSA public
> keys
>
> This is very probably due to
> https://github.com/openshift/cluster-kube-apiserver-operator/pull/1006, to
> be reverted in
> https://github.com/openshift/cluster-kube-apiserver-operator/pull/1011.

Thanks for the fastest reply ever! I see that #1011 is about to get merged, so I will track the upcoming jobs and update this bug in case the issue is resolved.
*** This bug has been marked as a duplicate of bug 1900446 ***