Bug 1822441 - openshift-controller-manager stuck with one old operand pod, but flaps the Progressing message
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Ryan Phillips
QA Contact: Weinan Liu
URL:
Whiteboard: devex
Depends On:
Blocks:
Reported: 2020-04-09 02:44 UTC by W. Trevor King
Modified: 2021-06-30 14:21 UTC
CC List: 8 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-08 22:06:17 UTC
Target Upstream Version:
Embargoed:



Description W. Trevor King 2020-04-09 02:44:00 UTC
A pull-request CI job [1] failed waiting for openshift-controller-manager:

[sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial] [Suite:openshift]	1h19m17s
fail [github.com/openshift/origin/test/e2e/upgrade/upgrade.go:135]: during upgrade to registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5
Unexpected error:
    <*errors.errorString | 0xc002aab020>: {
        s: "Cluster did not complete upgrade: timed out waiting for the condition: Some cluster operators are still updating: insights, openshift-controller-manager",
    }
    Cluster did not complete upgrade: timed out waiting for the condition: Some cluster operators are still updating: insights, openshift-controller-manager
occurred

Update timing:

$ yaml2json <cluster-scoped-resources/config.openshift.io/clusterversions/version.yaml | jq -r '.status.history[] | .startedTime + " " + .state + " " + .version'
2020-04-08T16:20:53Z Partial 0.0.1-2020-04-08-154310
2020-04-08T15:48:14Z Completed 0.0.1-2020-04-08-154108

The CVO has been waiting on that operator for a while:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1630/pull-ci-openshift-machine-config-operator-master-e2e-gcp-upgrade/1627/artifacts/e2e-gcp-upgrade/pods/openshift-cluster-version_cluster-version-operator-758799946f-xvnsh_cluster-version-operator.log | grep 'Running sync.*in state\|Result of work'
I0408 16:35:13.134301       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 0
I0408 16:40:58.212051       1 task_graph.go:596] Result of work: [deployment openshift-machine-api/machine-api-operator is progressing ReplicaSetUpdated: ReplicaSet "machine-api-operator-6c5db496d" is progressing. Cluster operator openshift-apiserver is still updating Cluster operator config-operator is still updating]
I0408 16:41:22.707011       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 1
I0408 16:47:07.759017       1 task_graph.go:596] Result of work: [Cluster operator kube-storage-version-migrator is still updating]
I0408 16:47:51.400745       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 2
I0408 16:53:36.454757       1 task_graph.go:596] Result of work: [Cluster operator insights is still updating Cluster operator openshift-controller-manager is still updating Cluster operator cluster-autoscaler is still updating]
I0408 16:55:08.959993       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 3
I0408 17:00:54.012934       1 task_graph.go:596] Result of work: [Cluster operator insights is still updating Cluster operator openshift-controller-manager is still updating]
I0408 17:04:18.726927       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 4
I0408 17:10:03.780088       1 task_graph.go:596] Result of work: [Cluster operator insights is still updating Cluster operator openshift-controller-manager is still updating]
I0408 17:13:08.312313       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 5
I0408 17:18:53.365331       1 task_graph.go:596] Result of work: [Cluster operator insights is still updating Cluster operator openshift-controller-manager is still updating]
I0408 17:22:08.131304       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 6
I0408 17:27:53.184166       1 task_graph.go:596] Result of work: [Cluster operator insights is still updating Cluster operator openshift-controller-manager is still updating]
I0408 17:30:59.868386       1 sync_worker.go:471] Running sync registry.svc.ci.openshift.org/ci-op-0jz0k2zj/release@sha256:ed9d8645f9bc7736ef6da0f485bb4eb82c5a1694411f4b2e9e8c43d60a5eb5a5 (force=true) on generation 2 in state Updating at attempt 7
I0408 17:36:44.921347       1 task_graph.go:596] Result of work: [Cluster operator insights is still updating Cluster operator openshift-controller-manager is still updating]
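
Roughly the same sync-loop view can be pulled from a live cluster instead of the CI artifacts; a minimal sketch, assuming oc access and the stock CVO deployment name:

$ oc -n openshift-cluster-version logs deployment/cluster-version-operator \
    | grep 'Running sync.*in state\|Result of work' | tail -n20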

The operand DaemonSet has one old pod:

$ cat namespaces/openshift-controller-manager/apps/daemonsets.yaml  | grep -A8 '^  status:'
  status:
    currentNumberScheduled: 3
    desiredNumberScheduled: 3
    numberAvailable: 3
    numberMisscheduled: 0
    numberReady: 3
    observedGeneration: 11
    updatedNumberScheduled: 2
kind: DaemonSetList
$ grep ' phase:\| image:' namespaces/openshift-controller-manager/pods/controller-manager-*/*.yaml | sort | uniq
namespaces/openshift-controller-manager/pods/controller-manager-djw6l/controller-manager-djw6l.yaml:    image: registry.svc.ci.openshift.org/ci-op-0jz0k2zj/stable-initial@sha256:fb906cc9ff9e369a8546d24ee4a0d57bc597657b6fe8fa91c8ce7b67067142cb
namespaces/openshift-controller-manager/pods/controller-manager-djw6l/controller-manager-djw6l.yaml:  phase: Running
namespaces/openshift-controller-manager/pods/controller-manager-f4c9q/controller-manager-f4c9q.yaml:    image: registry.svc.ci.openshift.org/ci-op-0jz0k2zj/stable@sha256:fb906cc9ff9e369a8546d24ee4a0d57bc597657b6fe8fa91c8ce7b67067142cb
namespaces/openshift-controller-manager/pods/controller-manager-f4c9q/controller-manager-f4c9q.yaml:  phase: Running
namespaces/openshift-controller-manager/pods/controller-manager-q88lj/controller-manager-q88lj.yaml:    image: registry.svc.ci.openshift.org/ci-op-0jz0k2zj/stable@sha256:fb906cc9ff9e369a8546d24ee4a0d57bc597657b6fe8fa91c8ce7b67067142cb
namespaces/openshift-controller-manager/pods/controller-manager-q88lj/controller-manager-q88lj.yaml:  phase: Running
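
On a live cluster, a quick way to spot which operand pod is still on the old image is to print pod names and images side by side; a sketch, assuming oc access:

$ oc -n openshift-controller-manager get pods \
    -o custom-columns=NAME:.metadata.name,IMAGE:.spec.containers[0].image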

The operator sets a message explaining the issue:

$ yaml2json <cluster-scoped-resources/config.openshift.io/clusteroperators/openshift-controller-manager.yaml | jq -r '.status.conditions[] | select(.type == "Progressing") | .lastTransitionTime + " " + .message'
2020-04-08T16:05:30Z Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3

But that log message appears to flap:

$ grep 'Progressing message changed' namespaces/openshift-controller-manager-operator/pods/openshift-controller-manager-operator-fc98f748d-sqczl/operator/operator/logs/current.log | tail -n4
2020-04-08T17:41:47.392979279Z I0408 17:41:47.392903       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"d6f62bea-7451-4fb1-b267-34f86c2bd7e1", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3"
2020-04-08T17:42:06.938238741Z I0408 17:42:06.938180       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"d6f62bea-7451-4fb1-b267-34f86c2bd7e1", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3" to ""
2020-04-08T17:42:27.400499446Z I0408 17:42:27.400439       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"d6f62bea-7451-4fb1-b267-34f86c2bd7e1", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3"
2020-04-08T17:42:46.942315058Z I0408 17:42:46.942249       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"d6f62bea-7451-4fb1-b267-34f86c2bd7e1", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3" to ""
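
To get a feel for how often the message toggles, one rough sketch is to count the transitions per minute in the same operator log:

$ grep 'Progressing message changed' namespaces/openshift-controller-manager-operator/pods/openshift-controller-manager-operator-fc98f748d-sqczl/operator/operator/logs/current.log \
    | cut -c1-16 | sort | uniq -c   # bucket the RFC3339 timestamps by minute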

Meanwhile, there has not been recent activity in the operand space:

$ yaml2json <namespaces/openshift-controller-manager/core/events.yaml | jq -r '[.items[] | .timePrefix = if .firstTimestamp == null or .firstTimestamp == "null" then .eventTime else .firstTimestamp + " - " + .lastTimestamp + " (" + (.count | tostring) + ")" end] | sort_by(.timePrefix)[] | .timePrefix + " " + .metadata.namespace + " " + .message' | tail -n2
2020-04-08T17:03:26Z - 2020-04-08T17:03:26Z (1) openshift-controller-manager Created container controller-manager
2020-04-08T17:03:26Z - 2020-04-08T17:03:26Z (1) openshift-controller-manager Started container controller-manager

The flapping means it's hit or miss whether a must-gather exposes the scheduling issue on the ClusterOperator or not.
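
Given that, one way to catch the condition in the act is to sample it in a loop rather than rely on a single must-gather snapshot; a minimal sketch, assuming oc access to the live cluster:

$ while sleep 10; do
    date -u +%FT%TZ
    oc get clusteroperator openshift-controller-manager \
      -o jsonpath='{range .status.conditions[?(@.type=="Progressing")]}{.status} {.message}{"\n"}{end}'
  done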

[1]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1630/pull-ci-openshift-machine-config-operator-master-e2e-gcp-upgrade/1627

Comment 1 W. Trevor King 2020-04-09 02:46:47 UTC
This would also impact other ClusterOperator consumers, like the web console UX and uploaded Insights bundles.

Comment 3 Gabe Montero 2020-05-11 16:12:18 UTC
Results from my triage today of the run Trevor noted:

1) Examining the events from the openshift-controller-manager-operator namespace, the initial install (prior to the upgrade attempt) seemed to include some of the flapping previously noted.

The OperatorVersionChanged event for the baseline release

"lastTimestamp": "2020-04-08T16:04:30Z",
"message": "clusteroperator/openshift-controller-manager version \"operator\" changed from \"\" to \"0.0.1-2020-04-08-154108\"",


Some of the OperatorStatusChanged events around that time show OCM-O going Progressing == False and Available == True, at least for a moment, but then soon after we lost a replica, and it stayed that way until the upgrade was initiated:

"lastTimestamp": "2020-04-08T16:04:30Z",
"message": "Status for clusteroperator/openshift-controller-manager changed: Progressing changed from True to False (\"\"),Available changed from False to True (\"\")",

"lastTimestamp": "2020-04-08T16:05:30Z",
"message": "Status for clusteroperator/openshift-controller-manager changed: Progressing changed from False to True (\"Progressing: daemonset/controller-manager: observed generation is 8, desired generation is 9.\\nProgressing: openshiftcontrollermanagers.operator.openshift.io/cluster: observed generation is 2, desired generation is 3.\")",

"lastTimestamp": "2020-04-08T16:06:51Z",
"message": "Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from \"\" to \"Progressing: daemonset/controller-manager: updated number scheduled is 0, desired number scheduled is 3\"",

 "lastTimestamp": "2020-04-08T16:44:50Z",
"message": "Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from \"\" to \"Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3\"",

2) Note that last timestamp of 16:44 ... I believe that was a few minutes before the upgrade was started.  In particular, I see these ScalingReplicaSet events at 16:48

"lastTimestamp": "2020-04-08T16:48:19Z",
"message": "Scaled up replica set openshift-controller-manager-operator-fc98f748d to 1",

"lastTimestamp": "2020-04-08T16:48:26Z",
"message": "Scaled down replica set openshift-controller-manager-operator-dbd4884d9 to 0",

"lastTimestamp": "2020-04-08T16:48:27Z",
"message": "Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from \"Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3\" to \"\"",


3) The current.log for the OCM-O starts at 16:48 with:

2020-04-08T16:48:26.041967181Z I0408 16:48:26.041811       1 cmd.go:195] Using service-serving-cert provided certificates

4) And the status_controller.go logs are ultimately complaining about only 2 out of 3 replicas for the OCM

5) Now, analyzing the 3 OCM replicas:

2 of them are stuck in leader election, waiting for the third replica:

2020-04-08T17:03:26.979230031Z I0408 17:03:26.979195       1 leaderelection.go:242] attempting to acquire leader lease  openshift-controller-manager/openshift-master-controllers...

and 

2020-04-08T16:50:03.954507766Z I0408 16:50:03.954440       1 leaderelection.go:242] attempting to acquire leader lease  openshift-controller-manager/openshift-master-controllers...


and for the problematic third replica, we get this ominous message in the pod log:

unable to retrieve container logs for cri-o://d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c

6) So I downloaded the master journal, and grepped for d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c

And way back in the 16:05 time frame ... basically right after we had reached a good state prior to the upgrade, and around when we saw those OCM-O events, we see:

masters-journal:Apr 08 16:05:18.373284 ci-op-kf7v5-m-0.c.openshift-gce-devel-ci.internal hyperkube[1488]: I0408 16:05:18.373265    1488 kuberuntime_container.go:600] Killing container "cri-o://d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c" with 30 second grace period

I am not knowledgeable enough to know *why* our container was shut down (if it was something like node pressure, I cannot decipher it from the journal logs).

But I suspect that shutdown never fully completed, because I see references to that container well after the upgrade and through the end of the run. Things like:

masters-journal:Apr 08 16:55:49.050888 ci-op-kf7v5-m-0.c.openshift-gce-devel-ci.internal hyperkube[1488]: I0408 16:55:49.050304    1488 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-controller-manager", Name:"controller-manager-djw6l", UID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", APIVersion:"v1", ResourceVersion:"8845", FieldPath:""}): type: 'Warning' reason: 'FailedSync' error determining status: rpc error: code = Unknown desc = container with ID starting with d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c not found: ID does not exist

kuberuntime_container.go is in the kubelet. I don't know if that area is where the root cause sits, but hopefully the SMEs there can find the data I do not know how to find, and re-route some more if needed.
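
For reference, the journal grep in 6) is nothing fancy; roughly, against the downloaded masters-journal file from the CI artifacts:

$ grep d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c masters-journal \
    | grep -i 'killing\|destroyed\|died\|not found'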

Comment 4 Gabe Montero 2020-05-11 16:13:05 UTC
But bottom line ... I think OCM-O is responding correctly to conditions underneath it.

Comment 8 W. Trevor King 2020-06-11 20:18:41 UTC
Searching [1] turned up this 4.5.0-rc.1 -> 4.6.0-0.nightly-2020-06-11-041445 job [2].  The job succeeded, but shows the same message flapping:

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/32114/artifacts/e2e-aws-upgrade/container-logs/test.log | grep 'openshift-controller-manager.*Progressing' | tail -n7
Jun 11 05:06:30.449 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 0, desired number scheduled is 3" to ""
Jun 11 05:06:33.344 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: number available is 0, desired number available > 1\nProgressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3",Available changed from True to False ("Available: no daemon pods available on any node.")
Jun 11 05:06:33.389 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: number available is 0, desired number available > 1\nProgressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to "",Available message changed from "Available: no daemon pods available on any node." to ""
Jun 11 05:06:38.058 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: number available is 0, desired number available > 1",Available message changed from "" to "Available: no daemon pods available on any node."
Jun 11 05:06:50.273 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: number available is 0, desired number available > 1" to "",Available message changed from "Available: no daemon pods available on any node." to ""
Jun 11 05:07:10.766 W clusteroperator/openshift-controller-manager changed Progressing to False
Jun 11 05:07:10.766 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing changed from True to False (""),Available changed from False to True ("")

[1]: https://search.svc.ci.openshift.org/?context=0&search=openshift-controller-manager.*Progressing.*updated%20number%20scheduled%20is%20.*,%20desired%20number%20scheduled%20is%20.*
[2]: https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/32114

Comment 11 Seth Jennings 2020-08-10 16:06:27 UTC
Ryan is on leave

Comment 12 Seth Jennings 2020-08-13 19:24:49 UTC
I'm not seeing this in search.ci any more.  Is this still an issue?

Comment 13 W. Trevor King 2020-08-14 04:20:08 UTC
The search from comment 8 still hits five release-openshift-origin-installer-e2e-aws-upgrade jobs in the past 24h. The most recent of those is [1], from 4.4.17 to 4.5.0-0.ci-2020-08-13-154728, which has:

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1293938304575606784/artifacts/e2e-aws-upgrade/container-logs/test.log | grep 'clusteroperator/openshift-controller-manager changed: Progressing message changed'
Aug 13 16:40:14.730 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: observed generation is 9, desired generation is 10.\nProgressing: openshiftcontrollermanagers.operator.openshift.io/cluster: observed generation is 3, desired generation is 4." to ""
Aug 13 16:40:35.577 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"
Aug 13 16:40:42.016 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to ""
Aug 13 16:40:42.154 E ns/openshift-controller-manager-operator pod/openshift-controller-manager-operator-6f6978d49f-69nrs node/ip-10-0-181-10.us-west-1.compute.internal container/operator container exited with code 255 (Error): eNotYetAchieved","status":"True","type":"Progressing"},{"lastTransitionTime":"2020-08-13T16:15:35Z","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2020-08-13T16:09:10Z","reason":"NoData","status":"Unknown","type":"Upgradeable"}]}}\nI0813 16:40:35.522934       1 event.go:281] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"e02f700a-0b4c-4cf8-a0a0-70a06d66e788", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"\nI0813 16:40:39.418676       1 httplog.go:90] GET /metrics: (8.332331ms) 200 [Prometheus/2.15.2 10.131.0.18:47714]\nI0813 16:40:41.220019       1 cmd.go:83] Received SIGTERM or SIGINT signal, shutting down controller.\nI0813 16:40:41.220609       1 dynamic_serving_content.go:144] Shutting down serving-cert::/var/run/secrets/serving-cert/tls.crt::/var/run/secrets/serving-cert/tls.key\nI0813 16:40:41.221109       1 status_controller.go:212] Shutting down StatusSyncer-openshift-controller-manager\nI0813 16:40:41.221205       1 config_observer_controller.go:160] Shutting down ConfigObserver\nI0813 16:40:41.221756       1 operator.go:135] Shutting down OpenShiftControllerManagerOperator\nI0813 16:40:41.221855       1 configmap_cafile_content.go:226] Shutting down client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file\nI0813 16:40:41.221874       1 configmap_cafile_content.go:226] Shutting down client-ca::kube-system::extension-apiserver-authentication::client-ca-file\nI0813 16:40:41.221892       1 tlsconfig.go:234] Shutting down DynamicServingCertificateController\nI0813 16:40:41.222294       1 secure_serving.go:222] Stopped listening on [::]:8443\nF0813 16:40:41.222319       1 builder.go:243] stopped\n
Aug 13 16:40:45.223 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: observed generation is 11, desired generation is 12.\nProgressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"
Aug 13 16:40:45.237 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: observed generation is 11, desired generation is 12.\nProgressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to ""
Aug 13 16:40:49.861 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"
Aug 13 16:41:02.562 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to "" (2 times)
Aug 13 16:41:02.748 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to "" (3 times)
Aug 13 16:41:22.568 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3"
Aug 13 16:41:41.933 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3" to ""
Aug 13 16:42:41.401 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: observed generation is 9, desired generation is 10.\nProgressing: openshiftcontrollermanagers.operator.openshift.io/cluster: observed generation is 3, desired generation is 4." to ""
Aug 13 16:42:41.421 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"
Aug 13 16:42:41.427 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to ""
Aug 13 16:42:41.458 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: observed generation is 11, desired generation is 12.\nProgressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"
Aug 13 16:42:41.458 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: observed generation is 11, desired generation is 12.\nProgressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to ""
Aug 13 16:42:41.465 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3"
Aug 13 16:42:41.481 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to "" (2 times)
Aug 13 16:42:41.482 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 1, desired number scheduled is 3" to "" (3 times)
Aug 13 16:42:41.509 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3"
Aug 13 16:42:41.524 I ns/openshift-controller-manager-operator deployment/openshift-controller-manager-operator reason/OperatorStatusChanged Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3" to ""

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1293938304575606784

Comment 14 Seth Jennings 2020-08-31 19:43:12 UTC
We are getting a CRI-O context deadline exceeded error:

=====
hyperkube[1488]: I0408 15:58:31.904320    1488 kubelet.go:1913] SyncLoop (ADD, "api"): "installer-5-ci-op-kf7v5-m-0.c.openshift-gce-devel-ci.internal_openshift-kube-scheduler(727e4913-dea1-46a9-be4d-2241ba2135de), controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20), revision-pruner-2-ci-op-kf7v5-m-0.c.openshift-gce-devel-ci.internal_openshift-kube-apiserver(21a31168-c613-4131-bf96-f3bf50849234)"

hyperkube[1488]: I0408 16:02:32.250249    1488 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-controller-manager", Name:"controller-manager-djw6l", UID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", APIVersion:"v1", ResourceVersion:"8845", FieldPath:""}): type: 'Warning' reason: 'FailedCreatePodSandBox' Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded

hyperkube[1488]: E0408 16:02:45.053045    1488 remote_runtime.go:105] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error reserving pod name k8s_controller-manager-djw6l_openshift-controller-manager_54ad2139-7ea0-45e6-9f1f-bffc0423ce20_0 for id e83ab4c81071f62f257959f79bf831edacfc22823d221b04aace770516b432a8: name is reserved

hyperkube[1488]: I0408 16:04:17.091380    1488 kubelet.go:1920] SyncLoop (UPDATE, "api"): "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)"

hyperkube[1488]: I0408 16:04:18.053807    1488 kubelet.go:1958] SyncLoop (PLEG): "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)", event: &pleg.PodLifecycleEvent{ID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", Type:"ContainerStarted", Data:"704c170bc201337717891a85ab4c774aa5aaaf50ebcc10dcbca900b6dfa843ee"}

hyperkube[1488]: I0408 16:04:20.827458    1488 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-controller-manager", Name:"controller-manager-djw6l", UID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", APIVersion:"v1", ResourceVersion:"8845", FieldPath:"spec.containers{controller-manager}"}): type: 'Normal' reason: 'Pulled' Successfully pulled image "registry.svc.ci.openshift.org/ci-op-0jz0k2zj/stable-initial@sha256:fb906cc9ff9e369a8546d24ee4a0d57bc597657b6fe8fa91c8ce7b67067142cb"

hyperkube[1488]: I0408 16:04:22.126260    1488 kubelet.go:1958] SyncLoop (PLEG): "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)", event: &pleg.PodLifecycleEvent{ID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", Type:"ContainerStarted", Data:"d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c"}

crio[1396]: 2020-04-08T16:04:47Z [error] SetNetworkStatus: failed to query the pod controller-manager-djw6l in out of cluster comm: Get https://[api-int.ci-op-0jz0k2zj-28de9.origin-ci-int-gce.dev.openshift.com]:6443/api/v1/namespaces/openshift-controller-manager/pods/controller-manager-djw6l: dial tcp 10.0.0.2:6443: i/o timeout

crio[1396]: 2020-04-08T16:04:47Z [error] Multus: error unsetting the networks status: SetNetworkStatus: failed to query the pod controller-manager-djw6l in out of cluster comm: Get https://[api-int.ci-op-0jz0k2zj-28de9.origin-ci-int-gce.dev.openshift.com]:6443/api/v1/namespaces/openshift-controller-manager/pods/controller-manager-djw6l: dial tcp 10.0.0.2:6443: i/o timeout

hyperkube[1488]: I0408 16:05:18.372502    1488 kubelet.go:1958] SyncLoop (PLEG): "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)", event: &pleg.PodLifecycleEvent{ID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", Type:"ContainerDied", Data:"704c170bc201337717891a85ab4c774aa5aaaf50ebcc10dcbca900b6dfa843ee"} <-- sandbox container dies?

hyperkube[1488]: I0408 16:05:30.708104    1488 kubelet.go:1929] SyncLoop (DELETE, "api"): "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)"
hyperkube[1488]: I0408 16:05:37.050375    1488 kubelet_pods.go:933] Pod "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)" is terminated, but some containers are still running
hyperkube[1488]: I0408 16:05:47.050189    1488 kubelet_pods.go:933] Pod "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)" is terminated, but some containers are still running
crio[1396]: time="2020-04-08 16:05:48.392323735Z" level=warning msg="Stop container \"d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c\" timed out: failed to wait process, timeout reached after 30 seconds"
systemd[1]: crio-d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c.scope: Consumed 197ms CPU time
hyperkube[1488]: I0408 16:05:48.437977    1488 manager.go:1007] Destroyed container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod54ad2139_7ea0_45e6_9f1f_bffc0423ce20.slice/crio-d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c.scope" (aliases: [k8s_controller-manager_controller-manager-djw6l_openshift-controller-manager_54ad2139-7ea0-45e6-9f1f-bffc0423ce20_0 d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c], namespace: "crio")
systemd[1]: crio-conmon-d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c.scope: Consumed 64ms CPU time
hyperkube[1488]: I0408 16:05:48.442970    1488 manager.go:1007] Destroyed container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod54ad2139_7ea0_45e6_9f1f_bffc0423ce20.slice/crio-conmon-d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c.scope" (aliases: [], namespace: "")
hyperkube[1488]: I0408 16:05:49.233663    1488 logs.go:309] Finish parsing log file "/var/log/pods/openshift-controller-manager_controller-manager-djw6l_54ad2139-7ea0-45e6-9f1f-bffc0423ce20/controller-manager/0.log"
hyperkube[1488]: I0408 16:05:49.234498    1488 kubelet.go:1958] SyncLoop (PLEG): "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)", event: &pleg.PodLifecycleEvent{ID:"54ad2139-7ea0-45e6-9f1f-bffc0423ce20", Type:"ContainerDied", Data:"d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c"}
hyperkube[1488]: I0408 16:05:49.234564    1488 kubelet_pods.go:1358] Generating status for "controller-manager-djw6l_openshift-controller-manager(54ad2139-7ea0-45e6-9f1f-bffc0423ce20)"
crio[1396]: time="2020-04-08 16:05:49.690924243Z" level=info msg="stopped container d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c: openshift-controller-manager/controller-manager-djw6l/controller-manager" id=0ae4029d-566c-4523-bc5c-3333e8eaf349
hyperkube[1488]: I0408 16:05:49.691228    1488 kuberuntime_container.go:606] Container "cri-o://d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c" exited normally
crio[1396]: time="2020-04-08 16:05:49.692722138Z" level=info msg="attempting to run pod sandbox with infra container: openshift-controller-manager/controller-manager-djw6l/POD" id=be7c4627-20b0-428a-93d7-9f7d3ca43a98
crio[1396]: time="2020-04-08 16:05:49.727598113Z" level=info msg="Removed container d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c: openshift-controller-manager/controller-manager-djw6l/controller-manager" id=f801c967-5fd0-4a37-bdd8-4bacfc68a18a
======
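
For a live cluster, the equivalent kubelet and CRI-O journals can be pulled per node with oc adm node-logs (no full must-gather needed); a sketch, assuming oc access, with NODE standing in for the master that ran controller-manager-djw6l:

$ NODE=<master-node-name>   # hypothetical placeholder for the affected master
$ oc adm node-logs "$NODE" -u kubelet | grep controller-manager-djw6l
$ oc adm node-logs "$NODE" -u crio | grep d0bea1ad1c42ee47d04737621ee482ef0931c9769661c4766adab367ae26d74c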

Comment 15 Seth Jennings 2020-08-31 19:45:15 UTC
Could be a dup of bz1785399

Comment 16 Peter Hunt 2020-09-11 20:01:56 UTC
I didn't have a chance to look more into this this sprint, I will try again next sprint

Comment 18 Peter Hunt 2020-10-02 19:38:48 UTC
Sorry, still haven't had time. I will try next sprint. thanks for the patience

Comment 21 jamo luhrsen 2020-12-23 22:41:33 UTC
As I was digging through current 4.7 troubles in our release dashboard, I noticed that https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#release-openshift-ocp-installer-e2e-gcp-ovn-4.7 had
a couple of jobs with "failed to initialize the cluster: Cluster operator openshift-controller-manager is still updating". I found this bz, and it looks like at least this job is failing because of it:
  https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-ovn-4.7/1339617837525766144

At least, I noticed symptoms similar to those @wking pointed out in the original description:

curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-ovn-4.7/1339617837525766144/artifacts/e2e-gcp/pods/openshift-cluster-version_cluster-version-operator-854584df5c-sg7kt_cluster-version-operator.log | grep 'Running sync.*in state\|Result of work'

I1217 17:35:41.980112       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 0
I1217 17:35:42.643551       1 task_graph.go:555] Result of work: []
I1217 17:38:32.939215       1 task_graph.go:555] Result of work: [Cluster operator authentication is reporting a failure: WellKnownReadyControllerDegraded: kube-apiserver oauth endpoint https://10.0.0.3:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance) Cluster operator openshift-controller-manager is still updating Cluster operator console is reporting a failure: RouteHealthDegraded: route not yet available, https://console-openshift-console.apps.ci-op-q5pxttvh-fdcb7.origin-ci-int-gce.dev.openshift.com/health returns '503 Service Unavailable']
I1217 17:38:58.007577       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 1
I1217 17:38:58.096844       1 task_graph.go:555] Result of work: []
I1217 17:41:48.965792       1 task_graph.go:555] Result of work: [Cluster operator authentication is reporting a failure: WellKnownReadyControllerDegraded: kube-apiserver oauth endpoint https://10.0.0.3:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance) Cluster operator openshift-controller-manager is still updating]
I1217 17:42:36.896004       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 2
I1217 17:42:36.991528       1 task_graph.go:555] Result of work: []
I1217 17:45:27.854609       1 task_graph.go:555] Result of work: [Cluster operator authentication is reporting a failure: WellKnownReadyControllerDegraded: need at least 3 kube-apiservers, got 2 Cluster operator openshift-controller-manager is still updating]
I1217 17:46:58.098056       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 3
I1217 17:46:58.239145       1 task_graph.go:555] Result of work: []
I1217 17:49:49.055175       1 task_graph.go:555] Result of work: [Cluster operator openshift-controller-manager is still updating]
I1217 17:53:06.975585       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 4
I1217 17:53:07.084166       1 task_graph.go:555] Result of work: []
I1217 17:55:57.932759       1 task_graph.go:555] Result of work: [Cluster operator openshift-controller-manager is still updating]
I1217 17:58:51.429709       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 5
I1217 17:58:51.560286       1 task_graph.go:555] Result of work: []
I1217 18:01:42.387211       1 task_graph.go:555] Result of work: [Cluster operator openshift-controller-manager is still updating]
I1217 18:04:58.708849       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 6
I1217 18:04:58.805430       1 task_graph.go:555] Result of work: []
I1217 18:07:49.668136       1 task_graph.go:555] Result of work: [Cluster operator openshift-controller-manager is still updating]
I1217 18:11:11.294447       1 sync_worker.go:517] Running sync 4.7.0-0.nightly-2020-12-17-170130 (force=false) on generation 1 in state Initializing at attempt 7
I1217 18:11:11.403215       1 task_graph.go:555] Result of work: []
I1217 18:14:02.251859       1 task_graph.go:555] Result of work: [Cluster operator openshift-controller-manager is still updating]

cat ./namespaces/openshift-controller-manager/apps/daemonsets.yaml | grep -A8 '^  status:'

  status:
    currentNumberScheduled: 3
    desiredNumberScheduled: 3
    numberAvailable: 2
    numberMisscheduled: 0
    numberReady: 2
    numberUnavailable: 1
    observedGeneration: 9
    updatedNumberScheduled: 2



grep 'Progressing message changed' namespaces/openshift-controller-manager-operator/pods/openshift-controller-manager-operator-7576c6f6f7-smvg6/openshift-controller-manager-operator/openshift-controller-manager-operator/logs/current.log | tail -n4
2020-12-17T18:15:27.220717443Z I1217 18:15:27.220620       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"8dcdde51-7abe-4bef-8967-8f099b95ae80", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3"
2020-12-17T18:15:46.335035345Z I1217 18:15:46.334958       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"8dcdde51-7abe-4bef-8967-8f099b95ae80", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3" to ""
2020-12-17T18:16:07.215627824Z I1217 18:16:07.215548       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"8dcdde51-7abe-4bef-8967-8f099b95ae80", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "" to "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3"
2020-12-17T18:16:26.337121808Z I1217 18:16:26.337043       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"8dcdde51-7abe-4bef-8967-8f099b95ae80", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: updated number scheduled is 2, desired number scheduled is 3" to ""


All just FYI, and hopefully this puts this bug back in the queue to get figured out.

Comment 24 Ryan Phillips 2021-03-08 22:06:17 UTC
Closing... This appears to be cleared up in 4.6 and above.

