*** Bug 1930917 has been marked as a duplicate of this bug. ***
Waiting for https://github.com/openshift/kubernetes/pull/581 to be merged so that I can rebase https://github.com/openshift/sdn/pull/261.
Here's an update on where this backport stands. The 4.6.z backport is waiting on two things: verification of the 4.7.z backport, and passing CI tests. The 4.7.z backport got delayed by a process issue unrelated to the fix itself. CI tests are failing due to an issue with our CI infrastructure: one of the CI jobs verifies changes on GCP, and we are currently having general issues with GCE API rate limiting, again unrelated to the fix itself. I anticipate that the 4.7.z backport will be verified this week. Then the 4.6.z backport can be verified next week, and shipped in the fast/stable channels approximately 2.5 weeks from now. (There is a possibility that the 4.6.z backport will be available in the candidate-4.6 channel a little earlier.)
*** Bug 1921797 has been marked as a duplicate of this bug. ***
Verified in 4.6.0-0.nightly-2021-03-06-050044

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-03-06-050044   True        False         24m     Cluster version is 4.6.0-0.nightly-2021-03-06-050044

# In first terminal, create a test pod, rsh into it, and run an infinite loop to measure DNS lookup time
$ oc create -f https://raw.githubusercontent.com/openshift/verification-tests/master/testdata/networking/aosqe-pod-for-ping.json
pod/hello-pod created

$ oc rsh hello-pod
/ # while (true); do time getent hosts kubernetes.default.svc.cluster.local; sleep 1; done
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.02s
user    0m 0.00s
sys     0m 0.00s
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
<---- snip----->

# In second terminal, reboot one of the master nodes
$ oc get node
NAME                                        STATUS   ROLES    AGE   VERSION
ci-ln-947k4ft-f76d1-9nnw9-master-0          Ready    master   57m   v1.19.0+2f3101c
ci-ln-947k4ft-f76d1-9nnw9-master-1          Ready    master   57m   v1.19.0+2f3101c
ci-ln-947k4ft-f76d1-9nnw9-master-2          Ready    master   57m   v1.19.0+2f3101c
ci-ln-947k4ft-f76d1-9nnw9-worker-b-drbnn    Ready    worker   49m   v1.19.0+2f3101c
ci-ln-947k4ft-f76d1-9nnw9-worker-c-fwrs8    Ready    worker   49m   v1.19.0+2f3101c
ci-ln-947k4ft-f76d1-9nnw9-worker-d-6jpxz    Ready    worker   49m   v1.19.0+2f3101c

$ oc debug node/ci-ln-947k4ft-f76d1-9nnw9-master-1
Starting pod/ci-ln-947k4ft-f76d1-9nnw9-master-1-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.0.4
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# reboot

# Monitor DNS lookup delay in the first terminal for 3-5 minutes while the master node is rebooting; make sure there is no delay
<---- snip----->
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
172.30.0.1        kubernetes.default.svc.cluster.local  kubernetes.default.svc.cluster.local
real    0m 0.01s
user    0m 0.00s
sys     0m 0.00s
<---- snip----->

[jechen@jechen ~]$ oc -n openshift-dns get ds/dns-default -oyaml
<---- snip----->
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 3      <--- verified fix with https://github.com/openshift/cluster-dns-operator/pull/236
          successThreshold: 1
          timeoutSeconds: 3     <--- verified fix with https://github.com/openshift/cluster-dns-operator/pull/236
<---- snip----->
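For reference, a quicker way to spot-check the probe settings without paging through the full daemonset YAML could be the jsonpath query below; it assumes the CoreDNS container in ds/dns-default is named "dns" (adjust the filter if the container name differs on your cluster):

$ oc -n openshift-dns get ds/dns-default \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="dns")].readinessProbe}{"\n"}'

The output should show the same readinessProbe values as above, i.e. periodSeconds: 3 and timeoutSeconds: 3 once the fix from cluster-dns-operator PR 236 is in place.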
Sorry, I set the wrong status; changed it to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.21 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0753