Description of problem:

Job link: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1526139736576495616

The DNS cluster operator was stuck and stopped updating its co object:
{code}
[akaris@linux analysis-2]$ omg get co | grep -v 'True False False'
NAME    VERSION    AVAILABLE    PROGRESSING    DEGRADED    SINCE
dns     4.10.14    True         True           False       2h0m
{code}
It looks like the openshift-dns-operator somehow stopped updating or reconciling. The last update line in the operator's logs shows it clearing the Progressing condition on the DNS "default" status (transitioning from "Have 5 up-to-date DNS pods, want 6." to "All DNS and node-resolver pods are available"); however, the co object shows something completely different. The line from the log, indented for clarity:
{code}
2022-05-16T11:20:41.082720910Z time="2022-05-16T11:20:41Z" level=info msg="updated DNS default status:
old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{
  v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"},
  v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 53, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 up-to-date DNS pods, want 6.\"},
  v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"},
  v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}
}},
new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{
  v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"},
  v1.OperatorCondition{Type:\"Progressing\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 20, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.\"},
  v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"},
  v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}
}}"
{code}
But co DNS shows:
{code}
- lastTransitionTime: '2022-05-16T11:20:41Z'
  message: 'Upgrading operator to "4.11.0-0.ci-2022-05-16-095559". Upgrading coredns to "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236". Upgrading kube-rbac-proxy to "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c".'
  reason: Upgrading
  status: 'True'
  type: Progressing
{code}
That Progressing message, however, does not match any of the log messages from the new operator pod. I'm speculating that maybe it's because the operator intermittently lost its watches and then never recovered (but I'm not sure about this part; it's pure speculation and unlikely):
{code}
(...)
2022-05-16T11:20:39.979874343Z W0516 11:20:39.979768 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2022-05-16T11:20:39.979881804Z W0516 11:20:39.979769 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2022-05-16T11:20:39.979881804Z W0516 11:20:39.979787 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2022-05-16T11:20:39.979881804Z W0516 11:20:39.979788 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
(...)
{code}
The current time when the must-gather was taken is shown below, and we never saw an update from the dns operator after 11:20:
{code}
[akaris@linux analysis-2]$ omg get events -n default -o yaml | grep time | sort | tail -1
time: '2022-05-16T13:18:57Z'
{code}
Last log line in the DNS operator:
{code}
2022-05-16T11:20:41.576468249Z time="2022-05-16T11:20:41Z" level=info msg="reconciling request: /default"
{code}

OpenShift release version:
{code}
[akaris@linux analysis-2]$ omg get clusterversion
NAME      VERSION    AVAILABLE    PROGRESSING    SINCE    STATUS
version              True         True           2h5m     Working towards 4.11.0-0.ci-2022-05-16-095559: 658 of 802 done (82% complete)
{code}

Cluster Platform: AWS

How reproducible: Seen in CI

Full output:
==============================================================================================================================
{code}
[akaris@linux analysis-2]$ omg get co | grep -v 'True False False'
NAME    VERSION    AVAILABLE    PROGRESSING    DEGRADED    SINCE
dns     4.10.14    True         True           False       2h0m
[akaris@linux analysis-2]$ omg get pods -n openshift-dns
NAME                  READY    STATUS     RESTARTS    AGE
dns-default-bdx6d     2/2      Running    0           2h1m
dns-default-dc8s5     2/2      Running    0           2h0m
dns-default-dlqkw     2/2      Running    0           2h1m
dns-default-nl6qp     2/2      Running    0           2h2m
dns-default-rvb6f     2/2      Running    0           2h3m
dns-default-rzqkx     2/2      Running    0           2h2m
node-resolver-62crv   1/1      Running    0           2h3m
node-resolver-jc4zh   1/1      Running    0           2h3m
node-resolver-l88nt   1/1      Running    0           2h3m
node-resolver-m9d2l   1/1      Running    0           2h3m
node-resolver-vcgqh   1/1      Running    0           2h3m
node-resolver-x89ph   1/1      Running    0           2h3m
{code}
{code}
[akaris@linux analysis-2]$ omg get clusterversion
NAME      VERSION    AVAILABLE    PROGRESSING    SINCE    STATUS
version              True         True           2h5m     Working towards 4.11.0-0.ci-2022-05-16-095559: 658 of 802 done (82% complete)
{code}
Current time is:
{code}
[akaris@linux analysis-2]$ omg get events -n default -o yaml | grep time | sort | tail -1 time:
'2022-05-16T13:18:57Z' {code} {code} conditions: - lastTransitionTime: '2022-05-16T10:20:05Z' message: DNS "default" is available. reason: AsExpected status: 'True' type: Available - lastTransitionTime: '2022-05-16T11:20:41Z' message: 'Upgrading operator to "4.11.0-0.ci-2022-05-16-095559". Upgrading coredns to "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236". Upgrading kube-rbac-proxy to "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c".' reason: Upgrading status: 'True' type: Progressing - lastTransitionTime: '2022-05-16T10:28:41Z' reason: DNSNotDegraded status: 'False' type: Degraded - lastTransitionTime: '2022-05-16T10:19:50Z' message: 'DNS default is upgradeable: DNS Operator can be upgraded' reason: DNSUpgradeable status: 'True' type: Upgradeable {code} The operator is simply still reporting an upgrading message. {code} openshift-dns-operator dns-operator-57597d499b-xvlk2 2/2 Running 0 2h4m 10.128.0.94 ip-10-0-181-237.ec2.internal [akaris@linux analysis-2]$ {code} {code} [akaris@linux analysis-2]$ omg get pod -o json -n openshift-dns-operator dns-operator-57597d499b-xvlk2 | jq '.status' { "conditions": [ { "lastProbeTime": null, "lastTransitionTime": "2022-05-16T11:16:37Z", "status": "True", "type": "Initialized" }, { "lastProbeTime": null, "lastTransitionTime": "2022-05-16T11:16:43Z", "status": "True", "type": "Ready" }, { "lastProbeTime": null, "lastTransitionTime": "2022-05-16T11:16:43Z", "status": "True", "type": "ContainersReady" }, { "lastProbeTime": null, "lastTransitionTime": "2022-05-16T11:16:37Z", "status": "True", "type": "PodScheduled" } ], "containerStatuses": [ { "containerID": "cri-o://4e7547fe92aa0ee700d6cd1521addb394200344d5e2b2c4ffd119a4d31ff3de7", "image": "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:6e03ab982d4cf1242b8d567ebe61299976ff80258ea6b19fb01e621f10f6fe1e", "imageID": "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:6e03ab982d4cf1242b8d567ebe61299976ff80258ea6b19fb01e621f10f6fe1e", "lastState": {}, "name": "dns-operator", "ready": true, "restartCount": 0, "started": true, "state": { "running": { "startedAt": "2022-05-16T11:16:42Z" } } }, { "containerID": "cri-o://40099408a4052788d2cb2fdf1d91c333908e2a9ce434e94a1cea0ab3256d47fa", "image": "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c", "imageID": "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c", "lastState": {}, "name": "kube-rbac-proxy", "ready": true, "restartCount": 0, "started": true, "state": { "running": { "startedAt": "2022-05-16T11:16:42Z" } } } ], "hostIP": "10.0.181.237", "phase": "Running", "podIP": "10.128.0.94", "podIPs": [ { "ip": "10.128.0.94" } ], "qosClass": "Burstable", "startTime": "2022-05-16T11:16:37Z" } {code} {code} [akaris@linux analysis-2]$ omg logs -n openshift-dns-operator dns-operator-57597d499b-xvlk2 -c dns-operator | tail -n 30 2022-05-16T11:19:01.757993648Z time="2022-05-16T11:19:01Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the 
DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 1, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 3 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 1, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-16T11:19:01.760380220Z time="2022-05-16T11:19:01Z" level=info msg="reconciling request: /default" 2022-05-16T11:19:16.689256992Z time="2022-05-16T11:19:16Z" level=info msg="reconciling request: /default" 2022-05-16T11:19:16.770284730Z time="2022-05-16T11:19:16Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 1, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP 
address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 16, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-16T11:19:16.772765985Z time="2022-05-16T11:19:16Z" level=info msg="reconciling request: /default" 2022-05-16T11:19:38.164577501Z time="2022-05-16T11:19:38Z" level=info msg="reconciling request: /default" 2022-05-16T11:19:38.281651967Z time="2022-05-16T11:19:38Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 16, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 38, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-16T11:19:38.283408807Z time="2022-05-16T11:19:38Z" level=info msg="reconciling request: /default" 2022-05-16T11:19:53.252341596Z time="2022-05-16T11:19:53Z" level=info msg="reconciling request: /default" 2022-05-16T11:19:53.395557074Z time="2022-05-16T11:19:53Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", 
Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 38, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 53, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-16T11:19:53.397425072Z time="2022-05-16T11:19:53Z" level=info msg="reconciling request: /default" 2022-05-16T11:20:39.979748444Z W0516 11:20:39.979711 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979819120Z W0516 11:20:39.979806 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979827076Z W0516 11:20:39.979807 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979832918Z W0516 11:20:39.979711 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ClusterOperator ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979864281Z W0516 11:20:39.979828 1 reflector.go:442] 
sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979864281Z W0516 11:20:39.979711 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979874343Z W0516 11:20:39.979743 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DNS ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979874343Z W0516 11:20:39.979735 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979874343Z W0516 11:20:39.979753 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979874343Z W0516 11:20:39.979768 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979881804Z W0516 11:20:39.979769 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979881804Z W0516 11:20:39.979787 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:39.979881804Z W0516 11:20:39.979788 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-16T11:20:40.964951157Z time="2022-05-16T11:20:40Z" level=info msg="reconciling request: /default" 2022-05-16T11:20:41.082720910Z time="2022-05-16T11:20:41Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 19, 53, 0, 
time.Local), Reason:\"Reconciling\", Message:\"Have 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 28, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 16, 11, 20, 41, 0, time.Local), Reason:\"AsExpected\", Message:\"All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 20, 5, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 16, 10, 19, 50, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-16T11:20:41.084576254Z time="2022-05-16T11:20:41Z" level=info msg="reconciling request: /default" 2022-05-16T11:20:41.171878423Z time="2022-05-16T11:20:41Z" level=info msg="reconciling request: /default" 2022-05-16T11:20:41.576468249Z time="2022-05-16T11:20:41Z" level=info msg="reconciling request: /default" [akaris@linux analysis-2]$ {code} Around the 11:21 mark, tons of pods on the cluster see that same http connection issue: {code} [akaris@linux analysis-2]$ grep 'unable to decode an event from the watch stream' -RI | grep '2022-05-16T11:21' | wc -l 360 {code} Something else that's weird - all of the pods are actually updated to the requested version: {code} - lastTransitionTime: '2022-05-16T11:20:41Z' message: 'Upgrading operator to "4.11.0-0.ci-2022-05-16-095559". Upgrading coredns to "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236". Upgrading kube-rbac-proxy to "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c".' 
{code}
When we look at all of the pods, they are updated to exactly those containers:
{code}
[akaris@linux analysis-2]$ omg get pods -n openshift-dns | awk '/dns-default/ {print $1}' | while read p ; do omg get pod -n openshift-dns $p -o json | jq '.spec.containers[] | .image'; done
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236"
"registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c"
{code}
For reference: https://github.com/openshift/cluster-dns-operator/blob/release-4.11/pkg/operator/controller/status/controller.go#L479
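To make it concrete why the combination of a stale "Upgrading" message and fully updated pods points at the status controller no longer reconciling, here is a minimal Go sketch of how a status controller of this kind typically derives its Progressing condition by comparing the desired operand images and rollout counters against what is actually running. This is illustrative only, not the cluster-dns-operator's actual code; all identifiers in it are made up.
{code}
// progressing_sketch.go: a self-contained illustration, NOT the operator's
// actual code. It mimics the general shape of a status controller that sets
// Progressing=True with an "Upgrading ..." message while operand images or
// the DaemonSet rollout lag the desired state, and False once they match.
package main

import "fmt"

// operandState holds the inputs such a controller typically compares.
// All field names here are made up for the sketch.
type operandState struct {
	desiredCoreDNSImage string // image the new release payload asks for
	runningCoreDNSImage string // image the dns-default pods actually run
	updatedPods         int    // up-to-date pods in the DaemonSet rollout
	desiredPods         int    // desired pod count
}

// progressingCondition returns whether the operator should report
// Progressing=True and the message it would publish.
func progressingCondition(s operandState) (bool, string) {
	if s.runningCoreDNSImage != s.desiredCoreDNSImage {
		return true, fmt.Sprintf("Upgrading coredns to %q.", s.desiredCoreDNSImage)
	}
	if s.updatedPods < s.desiredPods {
		return true, fmt.Sprintf("Have %d up-to-date DNS pods, want %d.", s.updatedPods, s.desiredPods)
	}
	return false, "Desired and current state match."
}

func main() {
	// The state captured in the must-gather: all 6 dns-default pods already
	// run the desired 4.11 images (digests truncated here).
	s := operandState{
		desiredCoreDNSImage: "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371...",
		runningCoreDNSImage: "registry.ci.openshift.org/ocp/4.11-2022-05-16-095559@sha256:c371...",
		updatedPods:         6,
		desiredPods:         6,
	}
	fmt.Println(progressingCondition(s)) // false Desired and current state match.
}
{code}
Given the state in the must-gather, any reconcile pass over it should flip Progressing back to False, so the message persisting for roughly two hours suggests the status controller never reconciled the co object again after 11:20:41.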
We can see the exact same issue in run 1526731236049948672 https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn-upgrade/1526731236049948672 {code} [akaris@linux analysis-1526731236049948672]$ omg get co NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE authentication 4.11.0-0.nightly-2022-05-18-010528 True False False 2h10m baremetal 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m cloud-controller-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 3h14m cloud-credential 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m cluster-autoscaler 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m config-operator 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m console 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m csi-snapshot-controller 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m dns 4.11.0-0.nightly-2022-05-11-054135 True True False 2h3m etcd 4.11.0-0.nightly-2022-05-18-010528 True False False 2h30m image-registry 4.11.0-0.nightly-2022-05-18-010528 True False False 2h11m ingress 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m insights 4.11.0-0.nightly-2022-05-18-010528 True False False 3h4m kube-apiserver 4.11.0-0.nightly-2022-05-18-010528 True False False 2h26m kube-controller-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 2h19m kube-scheduler 4.11.0-0.nightly-2022-05-18-010528 True False False 2h19m kube-storage-version-migrator 4.11.0-0.nightly-2022-05-18-010528 True False False 2h10m machine-api 4.11.0-0.nightly-2022-05-18-010528 True False False 2h14m machine-approver 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m machine-config 4.11.0-0.nightly-2022-05-11-054135 True False False 3h10m marketplace 4.11.0-0.nightly-2022-05-18-010528 True False False 3h10m monitoring 4.11.0-0.nightly-2022-05-18-010528 True False False 2h10m network 4.11.0-0.nightly-2022-05-18-010528 True False False 1h55m node-tuning 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m openshift-apiserver 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m openshift-controller-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m openshift-samples 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m operator-lifecycle-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m operator-lifecycle-manager-catalog 4.11.0-0.nightly-2022-05-18-010528 True False False 2h7m operator-lifecycle-manager-packageserver 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m service-ca 4.11.0-0.nightly-2022-05-18-010528 True False False 2h11m storage 4.11.0-0.nightly-2022-05-18-010528 True False False 2h5m [akaris@linux analysis-1526731236049948672]$ killall omg [akaris@linux analysis-1526731236049948672]$ omg get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version True True 2h21m Working towards 4.11.0-0.nightly-2022-05-18-010528: 658 of 802 done (82% complete) [akaris@linux analysis-1526731236049948672]$ omg get co dns -o yaml apiVersion: config.openshift.io/v1 kind: ClusterOperator metadata: annotations: include.release.openshift.io/ibm-cloud-managed: 'true' include.release.openshift.io/self-managed-high-availability: 'true' include.release.openshift.io/single-node-developer: 'true' creationTimestamp: '2022-05-18T01:25:25Z' generation: 1 managedFields: - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:annotations: .: {} f:include.release.openshift.io/ibm-cloud-managed: {} 
f:include.release.openshift.io/self-managed-high-availability: {} f:include.release.openshift.io/single-node-developer: {} f:ownerReferences: .: {} k:{"uid":"46bbbb00-9d1f-4d2f-80ac-f874fea89e79"}: {} f:spec: {} manager: Go-http-client operation: Update time: '2022-05-18T01:25:25Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:extension: {} manager: Go-http-client operation: Update subresource: status time: '2022-05-18T01:25:26Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: f:conditions: {} f:relatedObjects: {} f:versions: {} manager: dns-operator operation: Update subresource: status time: '2022-05-18T01:36:55Z' name: dns ownerReferences: - apiVersion: config.openshift.io/v1 kind: ClusterVersion name: version uid: 46bbbb00-9d1f-4d2f-80ac-f874fea89e79 resourceVersion: '61304' uid: 06ec1647-6916-4e4b-af6f-1a6f6ccac50b spec: {} status: conditions: - lastTransitionTime: '2022-05-18T01:37:14Z' message: DNS "default" is available. reason: AsExpected status: 'True' type: Available - lastTransitionTime: '2022-05-18T02:44:17Z' message: 'Upgrading operator to "4.11.0-0.nightly-2022-05-18-010528". Upgrading coredns to "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b". Upgrading kube-rbac-proxy to "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109".' reason: Upgrading status: 'True' type: Progressing - lastTransitionTime: '2022-05-18T01:49:27Z' reason: DNSNotDegraded status: 'False' type: Degraded - lastTransitionTime: '2022-05-18T01:36:56Z' message: 'DNS default is upgradeable: DNS Operator can be upgraded' reason: DNSUpgradeable status: 'True' type: Upgradeable extension: null relatedObjects: - group: '' name: openshift-dns-operator resource: namespaces - group: operator.openshift.io name: default resource: dnses - group: '' name: openshift-dns resource: namespaces versions: - name: operator version: 4.11.0-0.nightly-2022-05-11-054135 - name: coredns version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:596c58ad0fb3a58712b27b051a95571d630374dc26d5a00afa7245b8c327de07 - name: openshift-cli version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9c25547a9165593735b7dacfbf6abbcaeb1ffc4cb941d2e0c0b65bea946bc008 - name: kube-rbac-proxy version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a175aec15d91bafafc84946593f9432736c3ef8643f1118fad49beb47d54cf57 {code} We can see the same here, all dns pods are already updated: {code} [akaris@linux analysis-1526731236049948672]$ omg get pods -n openshift-dns | awk '/dns-default/ {print $1}' | while read p ; do omg get pod -n openshift-dns $p -o json | jq '.spec.containers[] | .image'; done "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" 
"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" [akaris@linux analysis-1526731236049948672]$ omg get pods -n openshift-dns NAME READY STATUS RESTARTS AGE dns-default-kmfjw 2/2 Running 0 2h4m dns-default-n2tf6 2/2 Running 0 2h5m dns-default-pq2zp 2/2 Running 0 2h5m dns-default-tcgn7 2/2 Running 0 2h6m dns-default-wtf8z 2/2 Running 0 2h7m dns-default-z6hhn 2/2 Running 0 2h4m node-resolver-kkjsz 1/1 Running 0 2h7m node-resolver-mjnk9 1/1 Running 0 2h7m node-resolver-ngqr2 1/1 Running 0 2h7m node-resolver-w426l 1/1 Running 0 2h7m node-resolver-x7wls 1/1 Running 0 2h7m node-resolver-zxj8t 1/1 Running 0 2h7m {code} And the logs: {code} [akaris@linux analysis-1526731236049948672]$ omg logs -n openshift-dns-operator dns-operator-67f99d6557-2g6g5 -c dns-operator | tail -n 30 2022-05-18T02:42:36.954999590Z time="2022-05-18T02:42:36Z" level=info msg="reconciling request: /default" 2022-05-18T02:42:54.845966990Z time="2022-05-18T02:42:54Z" level=info msg="reconciling request: /default" 2022-05-18T02:42:54.932624410Z time="2022-05-18T02:42:54Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 42, 36, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 42, 54, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), 
Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:42:54.935050953Z time="2022-05-18T02:42:54Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:15.631198953Z time="2022-05-18T02:43:15Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:15.745417942Z time="2022-05-18T02:43:15Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 42, 54, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 15, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:43:15.748076288Z time="2022-05-18T02:43:15Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:15.973438911Z time="2022-05-18T02:43:15Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:16.067891155Z time="2022-05-18T02:43:16Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 15, 0, time.Local), 
Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 16, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:43:16.070968109Z time="2022-05-18T02:43:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791860 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791862 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791884 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791933 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791956 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791963 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an 
error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791967 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ClusterOperator ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791979 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DNS ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791993 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.792001 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.792003 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.792009 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792118238Z W0518 02:44:15.792073 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:16.672301571Z time="2022-05-18T02:44:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:16.747104965Z time="2022-05-18T02:44:16Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 16, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", 
Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 44, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:44:16.750168018Z time="2022-05-18T02:44:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:16.813141408Z time="2022-05-18T02:44:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:17.098547647Z time="2022-05-18T02:44:17Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:17.164908296Z time="2022-05-18T02:44:17Z" level=info msg="reconciling request: /default" {code}
And the same here - run 1525510596311650304] https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1525510596311650304 {code} [akaris@linux analysis-1525510596311650304]$ omg get co | grep -v 'True False False' NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE dns 4.10.14 True True False 1h58m [akaris@linux analysis-1525510596311650304]$ omg get co dns -o yaml apiVersion: config.openshift.io/v1 kind: ClusterOperator metadata: annotations: include.release.openshift.io/ibm-cloud-managed: 'true' include.release.openshift.io/self-managed-high-availability: 'true' include.release.openshift.io/single-node-developer: 'true' creationTimestamp: '2022-05-14T16:32:20Z' generation: 1 managedFields: - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:annotations: .: {} f:include.release.openshift.io/ibm-cloud-managed: {} f:include.release.openshift.io/self-managed-high-availability: {} f:include.release.openshift.io/single-node-developer: {} f:ownerReferences: .: {} k:{"uid":"8df8df3a-7725-47b7-9326-50bdbed53979"}: {} f:spec: {} manager: Go-http-client operation: Update time: '2022-05-14T16:32:20Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:extension: {} manager: Go-http-client operation: Update subresource: status time: '2022-05-14T16:32:20Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: f:conditions: {} f:relatedObjects: {} f:versions: {} manager: dns-operator operation: Update subresource: status time: '2022-05-14T16:38:12Z' name: dns ownerReferences: - apiVersion: config.openshift.io/v1 kind: ClusterVersion name: version uid: 8df8df3a-7725-47b7-9326-50bdbed53979 resourceVersion: '57174' uid: 68846a48-eed5-442f-8e5c-a04e9e1cc21e spec: {} status: conditions: - lastTransitionTime: '2022-05-14T16:38:30Z' message: DNS "default" is available. reason: AsExpected status: 'True' type: Available - lastTransitionTime: '2022-05-14T17:35:42Z' message: 'Upgrading operator to "4.11.0-0.ci-2022-05-14-160619". Upgrading coredns to "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236". Upgrading kube-rbac-proxy to "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c".' 
reason: Upgrading status: 'True' type: Progressing - lastTransitionTime: '2022-05-14T16:45:16Z' reason: DNSNotDegraded status: 'False' type: Degraded - lastTransitionTime: '2022-05-14T16:38:12Z' message: 'DNS default is upgradeable: DNS Operator can be upgraded' reason: DNSUpgradeable status: 'True' type: Upgradeable extension: null relatedObjects: - group: '' name: openshift-dns-operator resource: namespaces - group: operator.openshift.io name: default resource: dnses - group: '' name: openshift-dns resource: namespaces versions: - name: operator version: 4.10.14 - name: coredns version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ceb0d1d2015b87e9daf3e57b93f5464f15a1386a6bcab5442b7dba594b058b24 - name: openshift-cli version: registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:b634a8ede5ffec8e4068475de9746424e34f73416959a241592736fd1cdf5ab8 - name: kube-rbac-proxy version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:08e8b4004edaeeb125ced09ab2c4cd6d690afaf3a86309c91a994dec8e3ccbf3 [akaris@linux analysis-1525510596311650304]$ omg get pods -n openshift-dns | awk '/dns-default/ {print $1}' | while read p ; do omg get pod -n openshift-dns $p -o json | jq '.spec.containers[] | .image'; done "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" [akaris@linux analysis-1525510596311650304]$ omg get pods -n openshift-dns NAME READY STATUS RESTARTS AGE dns-default-249dg 2/2 Running 0 2h1m dns-default-czz5x 2/2 Running 0 1h59m dns-default-dcjm7 2/2 Running 0 1h58m dns-default-l4vc9 2/2 Running 0 2h0m dns-default-mprr6 2/2 Running 0 2h2m dns-default-vjpn2 2/2 Running 0 2h0m node-resolver-4r7tx 1/1 Running 0 2h2m node-resolver-754hp 1/1 Running 0 2h2m node-resolver-c5dnq 1/1 Running 0 2h2m node-resolver-jf5l4 1/1 Running 0 2h2m node-resolver-sx4nq 1/1 Running 0 2h2m node-resolver-zstzk 1/1 Running 0 2h2m [akaris@linux analysis-1525510596311650304]$ omg logs -n openshift-dns-operator dns-operator-67f99d6557-2g6g5 -c dns-operator | tail -n 30 [ERROR] Pod directory not found: 
/home/akaris/cases/dns-operator/analysis-1525510596311650304/registry-ci-openshift-org-ocp-4-11-2022-05-14-160619-sha256-090ae5109f7a1d071e12a49ae62460328b1bbe39e4bf4a3ff909f35629ae07a2/namespaces/openshift-dns-operator/pods/dns-operator-67f99d6557-2g6g5 [akaris@linux analysis-1525510596311650304]$ omg get pods -n openshift-dns-operator NAME READY STATUS RESTARTS AGE dns-operator-5d5bf79f5d-5llxf 2/2 Running 0 2h2m [akaris@linux analysis-1525510596311650304]$ omg logs -n openshift-dns-operator dns-operator-5d5bf79f5d-5llxf | tail -n 30 [ERROR] This pod has more than one containers: ['dns-operator', 'kube-rbac-proxy'] Use -c/--container to specify the container [akaris@linux analysis-1525510596311650304]$ omg logs -n openshift-dns-operator dns-operator-5d5bf79f5d-5llxf -c dns-operator | tail -n 30 2022-05-14T17:34:02.952228784Z time="2022-05-14T17:34:02Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 33, 41, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 3 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 2, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:34:02.955049734Z time="2022-05-14T17:34:02Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:20.918978128Z time="2022-05-14T17:34:20Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:21.008638550Z time="2022-05-14T17:34:21Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 
16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 2, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 21, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:34:21.013241525Z time="2022-05-14T17:34:21Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:42.326082681Z time="2022-05-14T17:34:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:42.467891086Z time="2022-05-14T17:34:42Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 21, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service 
has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 42, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:34:42.472562810Z time="2022-05-14T17:34:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:41.511932339Z W0514 17:35:41.511893 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512072961Z W0514 17:35:41.512053 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ClusterOperator ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512106215Z W0514 17:35:41.512089 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512139893Z W0514 17:35:41.512117 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512167522Z W0514 17:35:41.512148 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512232053Z time="2022-05-14T17:35:41Z" level=error msg="failed to ensure default dns Get \"https://172.30.0.1:443/apis/operator.openshift.io/v1/dnses/default\": http2: client connection lost" 2022-05-14T17:35:41.512260744Z W0514 17:35:41.512247 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512318475Z W0514 17:35:41.512304 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DNS ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512340839Z W0514 17:35:41.512325 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server 
("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512352962Z W0514 17:35:41.512348 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512381130Z W0514 17:35:41.512371 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512427727Z W0514 17:35:41.512415 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512458488Z W0514 17:35:41.512451 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512496706Z W0514 17:35:41.512477 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:42.365704114Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.451387428Z time="2022-05-14T17:35:42Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 42, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 35, 42, 0, time.Local), 
Reason:\"AsExpected\", Message:\"All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:35:42.454446312Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.572214603Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.904078610Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.949765309Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:43.033849849Z time="2022-05-14T17:35:43Z" level=info msg="reconciling request: /default" [akaris@linux analysis-1525510596311650304]$ omg get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version True True 2h3m Working towards 4.11.0-0.ci-2022-05-14-160619: 658 of 802 done (82% complete) [akaris@linux analysis-1525510596311650304]$ {code}
Just some further info - we can indeed see that the DNS object of name "default" is correctly updated: ~~~ [akaris@linux analysis-2]$ cat registry-ci-openshift-org-ocp-4-11-2022-05-16-095559-sha256-090ae5109f7a1d071e12a49ae62460328b1bbe39e4bf4a3ff909f35629ae07a2/cluster-scoped-resources/operator.openshift.io/dnses/default.yaml --- apiVersion: operator.openshift.io/v1 kind: DNS metadata: creationTimestamp: "2022-05-16T10:19:50Z" finalizers: - dns.operator.openshift.io/dns-controller generation: 1 managedFields: - apiVersion: operator.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:finalizers: .: {} v:"dns.operator.openshift.io/dns-controller": {} f:spec: .: {} f:logLevel: {} f:nodePlacement: {} f:operatorLogLevel: {} f:upstreamResolvers: .: {} f:policy: {} f:upstreams: {} manager: dns-operator operation: Update time: "2022-05-16T10:19:50Z" - apiVersion: operator.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:clusterDomain: {} f:clusterIP: {} f:conditions: {} manager: dns-operator operation: Update subresource: status time: "2022-05-16T10:19:50Z" name: default resourceVersion: "57484" uid: 56af9aae-2124-4840-a378-2b3847073df6 spec: logLevel: Normal nodePlacement: {} operatorLogLevel: Normal upstreamResolvers: policy: Sequential upstreams: - port: 53 type: SystemResolvConf status: clusterDomain: cluster.local clusterIP: 172.30.0.10 conditions: - lastTransitionTime: "2022-05-16T10:28:41Z" message: Enough DNS pods are available, and the DNS service has a cluster IP address. reason: AsExpected status: "False" type: Degraded - lastTransitionTime: "2022-05-16T11:20:41Z" message: All DNS and node-resolver pods are available, and the DNS service has a cluster IP address. reason: AsExpected status: "False" type: Progressing - lastTransitionTime: "2022-05-16T10:20:05Z" message: The DNS daemonset has available pods, and the DNS service has a cluster IP address. reason: AsExpected status: "True" type: Available - lastTransitionTime: "2022-05-16T10:19:50Z" message: DNS Operator can be upgraded reason: AsExpected status: "True" type: Upgradeable ~~~
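As a side note, the mismatch is easy to confirm programmatically: the Progressing condition on dnses/default and on the dns ClusterOperator should normally agree. Below is a minimal Go sketch (not part of the operator; it uses the Kubernetes dynamic client, all names are my own, and it assumes a reachable cluster with a kubeconfig in the default location rather than a must-gather) that prints both conditions side by side:
{code}
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

// printProgressing fetches a cluster-scoped object and prints its
// "Progressing" status condition, if present.
func printProgressing(ctx context.Context, dyn dynamic.Interface, gvr schema.GroupVersionResource, name string) error {
	obj, err := dyn.Resource(gvr).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	conditions, _, err := unstructured.NestedSlice(obj.Object, "status", "conditions")
	if err != nil {
		return err
	}
	for _, c := range conditions {
		cond, ok := c.(map[string]interface{})
		if !ok || cond["type"] != "Progressing" {
			continue
		}
		fmt.Printf("%s/%s: Progressing=%v since %v: %v\n",
			gvr.Resource, name, cond["status"], cond["lastTransitionTime"], cond["message"])
	}
	return nil
}

func main() {
	// Load the kubeconfig from the default location (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	dyn, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	ctx := context.Background()

	// Condition kept up to date by the DNS operator's reconcile loop.
	dnses := schema.GroupVersionResource{Group: "operator.openshift.io", Version: "v1", Resource: "dnses"}
	// Condition maintained by the operator's separate status controller.
	clusterOperators := schema.GroupVersionResource{Group: "config.openshift.io", Version: "v1", Resource: "clusteroperators"}

	if err := printProgressing(ctx, dyn, dnses, "default"); err != nil {
		panic(err)
	}
	if err := printProgressing(ctx, dyn, clusterOperators, "dns"); err != nil {
		panic(err)
	}
}
{code}
In the runs analyzed here the two disagree: dnses/default reports Progressing=False, while co/dns is stuck at Progressing=True with reason Upgrading.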
So this code path, which syncs the status of the DNS "default" object, works: https://github.com/openshift/cluster-dns-operator/blob/d50df32df68f53c1d47db8f5e51a8b27c402f278/pkg/operator/controller/dns_status.go#L36
This one, which syncs the ClusterOperator status, does not: https://github.com/openshift/cluster-dns-operator/blob/d50df32df68f53c1d47db8f5e51a8b27c402f278/pkg/operator/controller/status/controller.go#L175
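For context on why the second path can go stale: the ClusterOperator status is maintained by a separate controller (the status package linked above), which, like any controller-runtime controller, is typically only re-run when a watch event arrives or a previous reconcile asks to be requeued. If its informer watches die silently, as the "http2: client connection lost" warnings above suggest, and nothing re-triggers it, co/dns simply stops being updated even though the DNS object keeps getting reconciled. The sketch below is a hypothetical illustration of that pattern, not the actual cluster-dns-operator code; the RequeueAfter is one defensive option (a periodic resync) so the status cannot stay stale indefinitely:
{code}
package main

import (
	"context"
	"time"

	configv1 "github.com/openshift/api/config/v1"
	operatorv1 "github.com/openshift/api/operator/v1"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

// statusReconciler is a stand-in for a ClusterOperator status controller.
// It only runs when a watch event arrives for the resources it watches,
// or when a previous result asked to be requeued.
type statusReconciler struct {
	client client.Client
}

func (r *statusReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// A real controller would recompute the conditions (Available,
	// Progressing, Degraded, ...) here and patch clusteroperators/dns.
	//
	// Requeueing after a fixed interval acts as a safety net: even if the
	// informer watches silently stop delivering events (the suspected
	// failure mode in this bug), the status is still refreshed periodically.
	return ctrl.Result{RequeueAfter: time.Minute}, nil
}

func main() {
	scheme := runtime.NewScheme()
	_ = configv1.AddToScheme(scheme)
	_ = operatorv1.AddToScheme(scheme)

	ctrl.SetLogger(zap.New())
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Scheme: scheme})
	if err != nil {
		panic(err)
	}

	// Trigger reconciles from changes to the DNS CR; a real status
	// controller would also watch daemonsets, services, and so on.
	if err := ctrl.NewControllerManagedBy(mgr).
		For(&operatorv1.DNS{}).
		Complete(&statusReconciler{client: mgr.GetClient()}); err != nil {
		panic(err)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
{code}
Whatever the actual fix turns out to be, the point of the sketch is only that a purely event-driven status controller has no built-in way to recover if its watches stop and nothing else requeues it.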
Possibly related to https://github.com/openshift/cluster-dns-operator/pull/318.
In Prow CI, the jobs are passing for the mentioned profiles: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1536577259970760704 https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn-upgrade/1536550068805439488 Hence, marking as verified.
The issue was introduced in 4.11 by https://github.com/openshift/cluster-dns-operator/pull/318 and was fixed before we shipped a release with the issue, so no doc text is needed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069