Bug 2093236
| Summary: | DNS operator stopped reconciling after 4.10 to 4.11 upgrade / 4.11 nightly to 4.11 nightly upgrade | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Andreas Karis <akaris> |
| Component: | Networking | Assignee: | Andrew McDermott <amcdermo> |
| Networking sub component: | DNS | QA Contact: | Melvin Joseph <mjoseph> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | aos-bugs, hongli, mmasters |
| Version: | 4.11 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.11.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-08-10 11:16:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Andreas Karis
2022-06-03 11:33:03 UTC
We can see the exact same issue in run 1526731236049948672 https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn-upgrade/1526731236049948672 {code} [akaris@linux analysis-1526731236049948672]$ omg get co NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE authentication 4.11.0-0.nightly-2022-05-18-010528 True False False 2h10m baremetal 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m cloud-controller-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 3h14m cloud-credential 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m cluster-autoscaler 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m config-operator 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m console 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m csi-snapshot-controller 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m dns 4.11.0-0.nightly-2022-05-11-054135 True True False 2h3m etcd 4.11.0-0.nightly-2022-05-18-010528 True False False 2h30m image-registry 4.11.0-0.nightly-2022-05-18-010528 True False False 2h11m ingress 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m insights 4.11.0-0.nightly-2022-05-18-010528 True False False 3h4m kube-apiserver 4.11.0-0.nightly-2022-05-18-010528 True False False 2h26m kube-controller-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 2h19m kube-scheduler 4.11.0-0.nightly-2022-05-18-010528 True False False 2h19m kube-storage-version-migrator 4.11.0-0.nightly-2022-05-18-010528 True False False 2h10m machine-api 4.11.0-0.nightly-2022-05-18-010528 True False False 2h14m machine-approver 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m machine-config 4.11.0-0.nightly-2022-05-11-054135 True False False 3h10m marketplace 4.11.0-0.nightly-2022-05-18-010528 True False False 3h10m monitoring 4.11.0-0.nightly-2022-05-18-010528 True False False 2h10m network 4.11.0-0.nightly-2022-05-18-010528 True False False 1h55m node-tuning 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m openshift-apiserver 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m openshift-controller-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m openshift-samples 4.11.0-0.nightly-2022-05-18-010528 True False False 2h12m operator-lifecycle-manager 4.11.0-0.nightly-2022-05-18-010528 True False False 3h11m operator-lifecycle-manager-catalog 4.11.0-0.nightly-2022-05-18-010528 True False False 2h7m operator-lifecycle-manager-packageserver 4.11.0-0.nightly-2022-05-18-010528 True False False 2h9m service-ca 4.11.0-0.nightly-2022-05-18-010528 True False False 2h11m storage 4.11.0-0.nightly-2022-05-18-010528 True False False 2h5m [akaris@linux analysis-1526731236049948672]$ killall omg [akaris@linux analysis-1526731236049948672]$ omg get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version True True 2h21m Working towards 4.11.0-0.nightly-2022-05-18-010528: 658 of 802 done (82% complete) [akaris@linux analysis-1526731236049948672]$ omg get co dns -o yaml apiVersion: config.openshift.io/v1 kind: ClusterOperator metadata: annotations: include.release.openshift.io/ibm-cloud-managed: 'true' include.release.openshift.io/self-managed-high-availability: 'true' include.release.openshift.io/single-node-developer: 'true' creationTimestamp: '2022-05-18T01:25:25Z' generation: 1 managedFields: - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:annotations: .: {} f:include.release.openshift.io/ibm-cloud-managed: {} 
f:include.release.openshift.io/self-managed-high-availability: {} f:include.release.openshift.io/single-node-developer: {} f:ownerReferences: .: {} k:{"uid":"46bbbb00-9d1f-4d2f-80ac-f874fea89e79"}: {} f:spec: {} manager: Go-http-client operation: Update time: '2022-05-18T01:25:25Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:extension: {} manager: Go-http-client operation: Update subresource: status time: '2022-05-18T01:25:26Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: f:conditions: {} f:relatedObjects: {} f:versions: {} manager: dns-operator operation: Update subresource: status time: '2022-05-18T01:36:55Z' name: dns ownerReferences: - apiVersion: config.openshift.io/v1 kind: ClusterVersion name: version uid: 46bbbb00-9d1f-4d2f-80ac-f874fea89e79 resourceVersion: '61304' uid: 06ec1647-6916-4e4b-af6f-1a6f6ccac50b spec: {} status: conditions: - lastTransitionTime: '2022-05-18T01:37:14Z' message: DNS "default" is available. reason: AsExpected status: 'True' type: Available - lastTransitionTime: '2022-05-18T02:44:17Z' message: 'Upgrading operator to "4.11.0-0.nightly-2022-05-18-010528". Upgrading coredns to "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b". Upgrading kube-rbac-proxy to "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109".' reason: Upgrading status: 'True' type: Progressing - lastTransitionTime: '2022-05-18T01:49:27Z' reason: DNSNotDegraded status: 'False' type: Degraded - lastTransitionTime: '2022-05-18T01:36:56Z' message: 'DNS default is upgradeable: DNS Operator can be upgraded' reason: DNSUpgradeable status: 'True' type: Upgradeable extension: null relatedObjects: - group: '' name: openshift-dns-operator resource: namespaces - group: operator.openshift.io name: default resource: dnses - group: '' name: openshift-dns resource: namespaces versions: - name: operator version: 4.11.0-0.nightly-2022-05-11-054135 - name: coredns version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:596c58ad0fb3a58712b27b051a95571d630374dc26d5a00afa7245b8c327de07 - name: openshift-cli version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9c25547a9165593735b7dacfbf6abbcaeb1ffc4cb941d2e0c0b65bea946bc008 - name: kube-rbac-proxy version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a175aec15d91bafafc84946593f9432736c3ef8643f1118fad49beb47d54cf57 {code} We can see the same here, all dns pods are already updated: {code} [akaris@linux analysis-1526731236049948672]$ omg get pods -n openshift-dns | awk '/dns-default/ {print $1}' | while read p ; do omg get pod -n openshift-dns $p -o json | jq '.spec.containers[] | .image'; done "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" 
"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:26956e07a594b8665740d9cff7d9c30361ce8dbb1523a996c3aadf95ae77363b" "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:090ae5e7554012e1c0f1925f8dd7a02e110cb98f94d8774d3e17039115b8a109" [akaris@linux analysis-1526731236049948672]$ omg get pods -n openshift-dns NAME READY STATUS RESTARTS AGE dns-default-kmfjw 2/2 Running 0 2h4m dns-default-n2tf6 2/2 Running 0 2h5m dns-default-pq2zp 2/2 Running 0 2h5m dns-default-tcgn7 2/2 Running 0 2h6m dns-default-wtf8z 2/2 Running 0 2h7m dns-default-z6hhn 2/2 Running 0 2h4m node-resolver-kkjsz 1/1 Running 0 2h7m node-resolver-mjnk9 1/1 Running 0 2h7m node-resolver-ngqr2 1/1 Running 0 2h7m node-resolver-w426l 1/1 Running 0 2h7m node-resolver-x7wls 1/1 Running 0 2h7m node-resolver-zxj8t 1/1 Running 0 2h7m {code} And the logs: {code} [akaris@linux analysis-1526731236049948672]$ omg logs -n openshift-dns-operator dns-operator-67f99d6557-2g6g5 -c dns-operator | tail -n 30 2022-05-18T02:42:36.954999590Z time="2022-05-18T02:42:36Z" level=info msg="reconciling request: /default" 2022-05-18T02:42:54.845966990Z time="2022-05-18T02:42:54Z" level=info msg="reconciling request: /default" 2022-05-18T02:42:54.932624410Z time="2022-05-18T02:42:54Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 42, 36, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 42, 54, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), 
Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:42:54.935050953Z time="2022-05-18T02:42:54Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:15.631198953Z time="2022-05-18T02:43:15Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:15.745417942Z time="2022-05-18T02:43:15Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 42, 54, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 15, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:43:15.748076288Z time="2022-05-18T02:43:15Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:15.973438911Z time="2022-05-18T02:43:15Z" level=info msg="reconciling request: /default" 2022-05-18T02:43:16.067891155Z time="2022-05-18T02:43:16Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 15, 0, time.Local), 
Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 16, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:43:16.070968109Z time="2022-05-18T02:43:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791860 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791862 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791884 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.791945235Z W0518 02:44:15.791933 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791956 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791963 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an 
error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791967 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ClusterOperator ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791979 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DNS ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.791993 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.792001 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.792003 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792019036Z W0518 02:44:15.792009 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:15.792118238Z W0518 02:44:15.792073 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-18T02:44:16.672301571Z time="2022-05-18T02:44:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:16.747104965Z time="2022-05-18T02:44:16Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 43, 16, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", 
Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 49, 27, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 18, 2, 44, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 37, 14, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 18, 1, 36, 56, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-18T02:44:16.750168018Z time="2022-05-18T02:44:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:16.813141408Z time="2022-05-18T02:44:16Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:17.098547647Z time="2022-05-18T02:44:17Z" level=info msg="reconciling request: /default" 2022-05-18T02:44:17.164908296Z time="2022-05-18T02:44:17Z" level=info msg="reconciling request: /default" {code} And the same here - run 1525510596311650304] https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1525510596311650304 {code} [akaris@linux analysis-1525510596311650304]$ omg get co | grep -v 'True False False' NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE dns 4.10.14 True True False 1h58m [akaris@linux analysis-1525510596311650304]$ omg get co dns -o yaml apiVersion: config.openshift.io/v1 kind: ClusterOperator metadata: annotations: include.release.openshift.io/ibm-cloud-managed: 'true' include.release.openshift.io/self-managed-high-availability: 'true' include.release.openshift.io/single-node-developer: 'true' creationTimestamp: '2022-05-14T16:32:20Z' generation: 1 managedFields: - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:annotations: .: {} f:include.release.openshift.io/ibm-cloud-managed: {} f:include.release.openshift.io/self-managed-high-availability: {} f:include.release.openshift.io/single-node-developer: {} f:ownerReferences: .: {} k:{"uid":"8df8df3a-7725-47b7-9326-50bdbed53979"}: {} f:spec: {} manager: Go-http-client operation: Update time: '2022-05-14T16:32:20Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:extension: {} manager: Go-http-client operation: Update subresource: status time: '2022-05-14T16:32:20Z' - apiVersion: config.openshift.io/v1 fieldsType: FieldsV1 fieldsV1: f:status: f:conditions: {} f:relatedObjects: {} f:versions: {} manager: dns-operator operation: Update subresource: status time: '2022-05-14T16:38:12Z' name: dns ownerReferences: - apiVersion: config.openshift.io/v1 kind: ClusterVersion name: version uid: 8df8df3a-7725-47b7-9326-50bdbed53979 resourceVersion: '57174' uid: 68846a48-eed5-442f-8e5c-a04e9e1cc21e spec: {} status: conditions: - 
lastTransitionTime: '2022-05-14T16:38:30Z' message: DNS "default" is available. reason: AsExpected status: 'True' type: Available - lastTransitionTime: '2022-05-14T17:35:42Z' message: 'Upgrading operator to "4.11.0-0.ci-2022-05-14-160619". Upgrading coredns to "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236". Upgrading kube-rbac-proxy to "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c".' reason: Upgrading status: 'True' type: Progressing - lastTransitionTime: '2022-05-14T16:45:16Z' reason: DNSNotDegraded status: 'False' type: Degraded - lastTransitionTime: '2022-05-14T16:38:12Z' message: 'DNS default is upgradeable: DNS Operator can be upgraded' reason: DNSUpgradeable status: 'True' type: Upgradeable extension: null relatedObjects: - group: '' name: openshift-dns-operator resource: namespaces - group: operator.openshift.io name: default resource: dnses - group: '' name: openshift-dns resource: namespaces versions: - name: operator version: 4.10.14 - name: coredns version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ceb0d1d2015b87e9daf3e57b93f5464f15a1386a6bcab5442b7dba594b058b24 - name: openshift-cli version: registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:b634a8ede5ffec8e4068475de9746424e34f73416959a241592736fd1cdf5ab8 - name: kube-rbac-proxy version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:08e8b4004edaeeb125ced09ab2c4cd6d690afaf3a86309c91a994dec8e3ccbf3 [akaris@linux analysis-1525510596311650304]$ omg get pods -n openshift-dns | awk '/dns-default/ {print $1}' | while read p ; do omg get pod -n openshift-dns $p -o json | jq '.spec.containers[] | .image'; done "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:c371180cfc6ba0c7ccf4d1b5da89beee8b2ea575e6e89bc89f06280884753236" "registry.ci.openshift.org/ocp/4.11-2022-05-14-160619@sha256:5b01b4dccbca6d9f3526d861b92cb64885a3bd748a508bd1228ec10170a4485c" [akaris@linux analysis-1525510596311650304]$ omg get pods -n openshift-dns NAME READY STATUS RESTARTS AGE dns-default-249dg 2/2 Running 0 2h1m dns-default-czz5x 2/2 Running 0 1h59m dns-default-dcjm7 2/2 Running 0 1h58m dns-default-l4vc9 2/2 Running 0 
2h0m dns-default-mprr6 2/2 Running 0 2h2m dns-default-vjpn2 2/2 Running 0 2h0m node-resolver-4r7tx 1/1 Running 0 2h2m node-resolver-754hp 1/1 Running 0 2h2m node-resolver-c5dnq 1/1 Running 0 2h2m node-resolver-jf5l4 1/1 Running 0 2h2m node-resolver-sx4nq 1/1 Running 0 2h2m node-resolver-zstzk 1/1 Running 0 2h2m [akaris@linux analysis-1525510596311650304]$ omg logs -n openshift-dns-operator dns-operator-67f99d6557-2g6g5 -c dns-operator | tail -n 30 [ERROR] Pod directory not found: /home/akaris/cases/dns-operator/analysis-1525510596311650304/registry-ci-openshift-org-ocp-4-11-2022-05-14-160619-sha256-090ae5109f7a1d071e12a49ae62460328b1bbe39e4bf4a3ff909f35629ae07a2/namespaces/openshift-dns-operator/pods/dns-operator-67f99d6557-2g6g5 [akaris@linux analysis-1525510596311650304]$ omg get pods -n openshift-dns-operator NAME READY STATUS RESTARTS AGE dns-operator-5d5bf79f5d-5llxf 2/2 Running 0 2h2m [akaris@linux analysis-1525510596311650304]$ omg logs -n openshift-dns-operator dns-operator-5d5bf79f5d-5llxf | tail -n 30 [ERROR] This pod has more than one containers: ['dns-operator', 'kube-rbac-proxy'] Use -c/--container to specify the container [akaris@linux analysis-1525510596311650304]$ omg logs -n openshift-dns-operator dns-operator-5d5bf79f5d-5llxf -c dns-operator | tail -n 30 2022-05-14T17:34:02.952228784Z time="2022-05-14T17:34:02Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 33, 41, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 3 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 2, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:34:02.955049734Z 
time="2022-05-14T17:34:02Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:20.918978128Z time="2022-05-14T17:34:20Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:21.008638550Z time="2022-05-14T17:34:21Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 2, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 21, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:34:21.013241525Z time="2022-05-14T17:34:21Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:42.326082681Z time="2022-05-14T17:34:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:34:42.467891086Z time="2022-05-14T17:34:42Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 21, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 4 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, 
v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 42, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:34:42.472562810Z time="2022-05-14T17:34:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:41.511932339Z W0514 17:35:41.511893 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512072961Z W0514 17:35:41.512053 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ClusterOperator ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512106215Z W0514 17:35:41.512089 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512139893Z W0514 17:35:41.512117 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512167522Z W0514 17:35:41.512148 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512232053Z time="2022-05-14T17:35:41Z" level=error msg="failed to ensure default dns Get \"https://172.30.0.1:443/apis/operator.openshift.io/v1/dnses/default\": http2: client connection lost" 2022-05-14T17:35:41.512260744Z W0514 17:35:41.512247 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from 
succeeding 2022-05-14T17:35:41.512318475Z W0514 17:35:41.512304 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DNS ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512340839Z W0514 17:35:41.512325 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512352962Z W0514 17:35:41.512348 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512381130Z W0514 17:35:41.512371 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512427727Z W0514 17:35:41.512415 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512458488Z W0514 17:35:41.512451 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:41.512496706Z W0514 17:35:41.512477 1 reflector.go:442] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding 2022-05-14T17:35:42.365704114Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.451387428Z time="2022-05-14T17:35:42Z" level=info msg="updated DNS default status: old: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 34, 42, 0, time.Local), Reason:\"Reconciling\", Message:\"Have 5 available DNS pods, want 6.\\nHave 5 up-to-date DNS pods, want 6.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can 
be upgraded\"}}}, new: v1.DNSStatus{ClusterIP:\"172.30.0.10\", ClusterDomain:\"cluster.local\", Conditions:[]v1.OperatorCondition{v1.OperatorCondition{Type:\"Degraded\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 45, 16, 0, time.Local), Reason:\"AsExpected\", Message:\"Enough DNS pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Progressing\", Status:\"False\", LastTransitionTime:time.Date(2022, time.May, 14, 17, 35, 42, 0, time.Local), Reason:\"AsExpected\", Message:\"All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Available\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 30, 0, time.Local), Reason:\"AsExpected\", Message:\"The DNS daemonset has available pods, and the DNS service has a cluster IP address.\"}, v1.OperatorCondition{Type:\"Upgradeable\", Status:\"True\", LastTransitionTime:time.Date(2022, time.May, 14, 16, 38, 12, 0, time.Local), Reason:\"AsExpected\", Message:\"DNS Operator can be upgraded\"}}}" 2022-05-14T17:35:42.454446312Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.572214603Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.904078610Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:42.949765309Z time="2022-05-14T17:35:42Z" level=info msg="reconciling request: /default" 2022-05-14T17:35:43.033849849Z time="2022-05-14T17:35:43Z" level=info msg="reconciling request: /default" [akaris@linux analysis-1525510596311650304]$ omg get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version True True 2h3m Working towards 4.11.0-0.ci-2022-05-14-160619: 658 of 802 done (82% complete) [akaris@linux analysis-1525510596311650304]$ {code} Just some further info - we can indeed see that the DNS object of name "default" is correctly updated:
~~~
[akaris@linux analysis-2]$ cat registry-ci-openshift-org-ocp-4-11-2022-05-16-095559-sha256-090ae5109f7a1d071e12a49ae62460328b1bbe39e4bf4a3ff909f35629ae07a2/cluster-scoped-resources/operator.openshift.io/dnses/default.yaml
---
apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
  creationTimestamp: "2022-05-16T10:19:50Z"
  finalizers:
  - dns.operator.openshift.io/dns-controller
  generation: 1
  managedFields:
  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"dns.operator.openshift.io/dns-controller": {}
      f:spec:
        .: {}
        f:logLevel: {}
        f:nodePlacement: {}
        f:operatorLogLevel: {}
        f:upstreamResolvers:
          .: {}
          f:policy: {}
          f:upstreams: {}
    manager: dns-operator
    operation: Update
    time: "2022-05-16T10:19:50Z"
  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:clusterDomain: {}
        f:clusterIP: {}
        f:conditions: {}
    manager: dns-operator
    operation: Update
    subresource: status
    time: "2022-05-16T10:19:50Z"
  name: default
  resourceVersion: "57484"
  uid: 56af9aae-2124-4840-a378-2b3847073df6
spec:
  logLevel: Normal
  nodePlacement: {}
  operatorLogLevel: Normal
  upstreamResolvers:
    policy: Sequential
    upstreams:
    - port: 53
      type: SystemResolvConf
status:
  clusterDomain: cluster.local
  clusterIP: 172.30.0.10
  conditions:
  - lastTransitionTime: "2022-05-16T10:28:41Z"
    message: Enough DNS pods are available, and the DNS service has a cluster IP address.
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2022-05-16T11:20:41Z"
    message: All DNS and node-resolver pods are available, and the DNS service has a cluster IP address.
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2022-05-16T10:20:05Z"
    message: The DNS daemonset has available pods, and the DNS service has a cluster IP address.
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2022-05-16T10:19:50Z"
    message: DNS Operator can be upgraded
    reason: AsExpected
    status: "True"
    type: Upgradeable
~~~
So the DNS status update path works: https://github.com/openshift/cluster-dns-operator/blob/d50df32df68f53c1d47db8f5e51a8b27c402f278/pkg/operator/controller/dns_status.go#L36

This path, which updates the ClusterOperator status, does not: https://github.com/openshift/cluster-dns-operator/blob/d50df32df68f53c1d47db8f5e51a8b27c402f278/pkg/operator/controller/status/controller.go#L175

Possibly related to https://github.com/openshift/cluster-dns-operator/pull/318.

From Prow CI, the jobs for the mentioned profiles are passing:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1536577259970760704
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn-upgrade/1536550068805439488
Hence marking as verified.

The issue was introduced in 4.11 by https://github.com/openshift/cluster-dns-operator/pull/318 and was fixed before we shipped a release with the issue, so no doc text is needed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
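For background on why the dns clusteroperator keeps reporting Progressing=True here even though every DNS pod already runs the new images: the ClusterOperator's versions[] and its Progressing condition are only updated by the operator's status controller (the second link above), so if that controller stops reconciling, the old operator version stays in versions[] and the CVO keeps waiting on it. The sketch below illustrates that general pattern only; it is not the cluster-dns-operator's actual code, and the computeStatus helper, reasons, and messages are hypothetical.
~~~
package main

import (
	"fmt"

	configv1 "github.com/openshift/api/config/v1"
)

// computeStatus is a hypothetical helper showing the usual contract between a
// cluster operator's reported versions and its Progressing condition:
// Progressing stays True, and the previously reported versions are kept, until
// the observed operand versions match what the new release payload expects.
// If the controller that runs this logic stops reconciling, the symptom is
// exactly what the must-gathers above show: all operand pods updated, but the
// clusteroperator still Progressing=True with stale versions[].
func computeStatus(observed, desired []configv1.OperandVersion) (configv1.ClusterOperatorStatusCondition, []configv1.OperandVersion) {
	want := map[string]string{}
	for _, v := range desired {
		want[v.Name] = v.Version
	}

	upToDate := true
	for _, v := range observed {
		if want[v.Name] != v.Version {
			upToDate = false
			break
		}
	}

	if !upToDate {
		return configv1.ClusterOperatorStatusCondition{
			Type:    configv1.OperatorProgressing,
			Status:  configv1.ConditionTrue,
			Reason:  "Upgrading",
			Message: "Upgrading operator and operands to the new release payload.",
		}, observed // keep the old versions until the rollout completes
	}

	return configv1.ClusterOperatorStatusCondition{
		Type:    configv1.OperatorProgressing,
		Status:  configv1.ConditionFalse,
		Reason:  "AsExpected",
		Message: "All operands are running the expected versions.",
	}, desired // bump versions[] so the CVO can consider this operator done
}

func main() {
	observed := []configv1.OperandVersion{{Name: "operator", Version: "4.11.0-0.nightly-2022-05-11-054135"}}
	desired := []configv1.OperandVersion{{Name: "operator", Version: "4.11.0-0.nightly-2022-05-18-010528"}}

	cond, _ := computeStatus(observed, desired)
	fmt.Printf("Progressing=%s (%s)\n", cond.Status, cond.Reason) // Progressing=True (Upgrading)
}
~~~
On a live cluster the equivalent check is to compare `oc get clusteroperator dns -o jsonpath='{.status.versions}'` against the images actually running in the openshift-dns pods, which is what the omg commands against the must-gather data do above.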