CMO reported a hard error (failing=true) on its cluster operator. This is an error CMO should handle itself by ignoring or retrying it:
Mar 18 14:54:32.602 E clusteroperator/monitoring changed Failing to True: Failed to rollout the stack. Error: running task Updating Prometheus-k8s failed: reconciling Prometheus ClusterRoleBinding failed: updating ClusterRoleBinding object failed: an error on the server ("apiserver is shutting down.") has prevented the request from succeeding (put clusterrolebindings.rbac.authorization.k8s.io prometheus-k8s)
https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/5804
Depends on https://github.com/kubernetes/kubernetes/pull/75368 which should make this automatic.
For 4.1 we want the server to return a structured error and have client stacks gracefully retry on that error, to minimize the churn caused by API restarts.
Blocks GA
Related to https://bugzilla.redhat.com/show_bug.cgi?id=1684547 Need to ensure all components are protected.
*** Bug 1690167 has been marked as a duplicate of this bug. ***
To mitigate: https://github.com/openshift/origin/pull/22355 Stefan believes we have a bug in the shutdown order, so we still need to look at that. The pick above should make the error less disruptive.
Errors can still be seen in: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6519/artifacts/e2e-aws/junit/junit_e2e_20190406-222605.xml
To match with upstream: https://github.com/openshift/origin/pull/22511
No 'apiserver is shutting down' error, but there is a related error: ClusterOperatorNotAvailable: Cluster operator openshift-apiserver has not yet reported success. Not sure whether it is the same issue.
That is a different error, and it has been fixed today.
Checked the latest e2e test logs and found no 'apiserver is shutting down' error; will verify this.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758