Description of problem:
When upgrading from 4.0.0-0.nightly-2019-03-19-004004 to 4.0.0-0.nightly-2019-03-20-153904, cluster operator kube-controller-manager reports a failure:
NodeInstallerFailing: 0 nodes are failing on revision 8

Version-Release number of selected component (if applicable):
4.0.0-0.nightly-2019-03-20-153904

How reproducible:
Always

Steps to Reproduce:
1. Install a 4.0 cluster with 4.0.0-0.nightly-2019-03-19-004004
2. Upgrade the cluster to 4.0.0-0.nightly-2019-03-20-153904
   # oc adm upgrade --to 4.0.0-0.nightly-2019-03-20-153904
3. Check the clusterversion and clusteroperator resources
   # oc get clusterversion
   # oc get clusteroperator

Actual results:
The upgrade fails. The cluster-version-operator log shows:

I0321 09:20:59.722478 1 cvo.go:320] Desired version from spec is v1.Update{Version:"4.0.0-0.nightly-2019-03-20-153904", Image:"registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-03-20-153904"}
I0321 09:20:59.722644 1 cvo.go:297] Finished syncing cluster version "openshift-cluster-version/version" (338.432µs)
E0321 09:21:12.990874 1 task.go:58] error running apply for clusteroperator "kube-controller-manager" (76 of 308): Cluster operator kube-controller-manager is reporting a failure: NodeInstallerFailing: 0 nodes are failing on revision 8:
NodeInstallerFailing: installer: manager
NodeInstallerFailing: I0321 08:34:53.549721 1 cmd.go:308] Writing pod manifest "/etc/kubernetes/static-pod-resources/kube-controller-manager-pod-8/kube-controller-manager-pod.yaml" ...
NodeInstallerFailing: I0321 08:34:53.550244 1 cmd.go:314] Creating directory for static pod manifest "/etc/kubernetes/manifests" ...
NodeInstallerFailing: I0321 08:34:53.550335 1 cmd.go:328] Writing static pod manifest "/etc/kubernetes/manifests/kube-controller-manager-pod.yaml" ...
NodeInstallerFailing: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"kube-controller-manager","namespace":"openshift-kube-controller-manager","creationTimestamp":null,"labels":{"app":"kube-controller-manager","kube-controller-manager":"true","revision":"8"}},"spec":{"volumes":[{"name":"resource-dir","hostPath":{"path":"/etc/kubernetes/static-pod-resources/kube-controller-manager-pod-8"}}],"containers":[{"name":"kube-controller-manager-8","image":"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2a4f752e5f0f174d0581f330a4a3211a7f68c34a1d593176db051fc90a5f6a3d","command":["hyperkube","kube-controller-manager"],"args":["--openshift-config=/etc/kubernetes/static-pod-resources/configmaps/config/config.yaml","--kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig","-v=2"],"ports":[{"containerPort":10257}],"resources":{"requests":{"cpu":"100m","memory":"200Mi"}},"volumeMounts":[{"name":"resource-dir","mountPath":"/etc/kubernetes/static-pod-resources"}],"livenessProbe":{"httpGet":{"path":"healthz","port":10257,"scheme":"HTTPS"},"initialDelaySeconds":45,"timeoutSeconds":10},"readinessProbe":{"httpGet":{"path":"healthz","port":10257,"scheme":"HTTPS"},"initialDelaySeconds":10,"timeoutSeconds":10},"terminationMessagePolicy":"FallbackToLogsOnError","imagePullPolicy":"IfNotPresent"}],"hostNetwork":true,"tolerations":[{"operator":"Exists"}],"priorityClassName":"system-node-critical"},"status":{}}
NodeInstallerFailing: I0321 08:34:53.617509 1 request.go:530] Throttling request took 66.691145ms, request: POST:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-controller-manager/events
NodeInstallerFailing: I0321 09:21:17.522465 1 leaderelection.go:227] successfully renewed lease openshift-cluster-version/version
I0321 09:21:47.530861 1 leaderelection.go:227] successfully renewed lease openshift-cluster-version/version

[root@localhost lyman]# oc get clusteroperator
NAME                      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
...
kube-apiserver            4.0.0-0.nightly-2019-03-20-153904   True        False         False     42m
kube-controller-manager   4.0.0-0.nightly-2019-03-20-153904   True        True          True      46m
kube-scheduler            4.0.0-0.nightly-2019-03-20-153904   True        False         False     44m
...

Expected results:
All cluster operators upgrade successfully.

Additional info:
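For anyone triaging the same symptom, a minimal sketch of how one might drill into the failing revision, assuming cluster-admin access (the installer pod name below is illustrative; list the pods first to get the real name):

# oc -n openshift-kube-controller-manager get pods | grep installer
# oc -n openshift-kube-controller-manager logs installer-8-<node-name>
# oc describe clusteroperator kube-controller-manager

The installer pod logs show whether the revision-8 static pod manifest was actually written on each master, which is what the NodeInstallerFailing condition is summarizing.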
(In reply to MinLi from comment #0)
> How reproducible:
> always

Thanks for reporting it. I did not hit it in my own upgrade testing. FYI, per the comments below, this bug and the following ones are intermittent:
https://bugzilla.redhat.com/show_bug.cgi?id=1690088#c5 (same symptom as this bug)
https://bugzilla.redhat.com/show_bug.cgi?id=1690153#c3 (same symptom as this bug)
The error

E0321 09:21:12.990874 1 task.go:58] error running apply for clusteroperator "kube-controller-manager" (76 of 308): Cluster operator kube-controller-manager is reporting a failure: NodeInstallerFailing: 0 nodes are failing on revision 8:

should be fixed by https://github.com/openshift/cluster-kube-controller-manager-operator/pull/198
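For context, the NodeInstallerFailing condition is computed from the per-node status the operator records on its static pod operator resource. A sketch of how to inspect it, assuming the cluster-scoped kubecontrollermanager resource exposes nodeStatuses as in current 4.0 builds:

# oc get kubecontrollermanager cluster -o yaml

Under status.nodeStatuses, compare currentRevision against targetRevision for each master, and check for lastFailedRevision / lastFailedRevisionErrors; "0 nodes are failing" with a Failing operator suggests the condition was being set even though no node actually reported a failed revision.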
The fix has been merged; moving to QA status to check whether it is fixed in the latest build.
Verified: the upgrade succeeded.
Version: upgraded from 4.0.0-0.nightly-2019-03-23-222829 to 4.0.0-0.nightly-2019-03-25-141538
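For reference, the verification flow was along these lines (a sketch of the steps from comment #0, not quoted from the test run):

# oc adm upgrade --to 4.0.0-0.nightly-2019-03-25-141538
# oc get clusterversion
# oc get clusteroperator | grep kube-controller-manager

Once the upgrade completes, kube-controller-manager should report AVAILABLE=True, PROGRESSING=False, FAILING=False.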
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758