Description of problem:
The OpenShift DNS deployment on clusters using Calico SDN does not keep DNS traffic local to the node or zone. A backend is selected at random, so DNS traffic can cross zones, which raises request latency and sometimes causes failures.
One option is to ensure topology hints are enabled in 4.11+, which aims to keep traffic within a zone boundary whenever possible. Ideally traffic would stay on the node, but keeping it within a zone is already a major improvement over the current DNS topology.
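To judge whether zone-local DNS backends exist at all, it can help to compare node zones with DNS pod placement. A minimal sketch, assuming nodes carry the standard `topology.kubernetes.io/zone` label and that `oc` is logged in to the cluster; the function names are only illustrative:

```shell
# Show every node together with its zone label.
node_zones() {
  oc get nodes -L topology.kubernetes.io/zone
}

# Show the pods in the openshift-dns namespace and the node each one runs on.
dns_pod_placement() {
  oc -n openshift-dns get pods -o wide
}
```

Running `node_zones` and then `dns_pod_placement` shows whether each zone has at least one DNS pod that zone-aware routing could prefer.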
OpenShift release version:
All OpenShift releases
Any provider using Calico SDN (IBM ROKS, IBM Cloud Satellite)
Steps to Reproduce (in detail):
1. Set up tcpdump
2. Send DNS requests from a pod on the node in a multi-zone cluster
3. Watch as the requests are distributed to different zones over time
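The steps above could be scripted roughly as follows. This is a sketch under assumptions, not the exact procedure used for this report: the function names are hypothetical, the capture filter assumes DNS traffic appears on port 53 (service) or 5353 (DNS pod), and the pod image is assumed to ship `nslookup`.

```shell
# Step 1: capture DNS traffic. Run this where tcpdump is available with
# access to the node's network, for example a node debug shell.
capture_dns() {
  # -nn disables name resolution so the capture does not generate DNS traffic itself.
  tcpdump -i any -nn 'port 53 or port 5353'
}

# Step 2: send COUNT lookups (default 20) from pod POD in namespace NS.
send_queries() {
  ns=$1
  pod=$2
  count=${3:-20}
  i=0
  while [ "$i" -lt "$count" ]; do
    oc -n "$ns" exec "$pod" -- nslookup kubernetes.default.svc.cluster.local
    i=$((i + 1))
  done
}
```

For example, `send_queries my-namespace my-pod 50` (placeholder names) issues 50 lookups while `capture_dns` runs on the node; the destination addresses in the capture can then be matched against DNS pod IPs in other zones.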
Actual results:
DNS requests can leave the node and zone and reach any random backend DNS pod
Expected results:
When possible, DNS requests stay local to the node and/or zone
Impact of the problem:
Higher latency of DNS requests
Increased failures of DNS requests
I did my best to write up doc text for this BZ. Please feel free to suggest or make corrections.
Verified with 4.11.0-0.ci-2022-06-20-211630 (since the latest available nightly build is 5 days old); the annotation "service.kubernetes.io/topology-aware-hints: auto" is added to the dns-default service.
$ oc -n openshift-dns get svc/dns-default -oyaml
Checked with the latest nightly build 4.11.0-0.nightly-2022-06-21-151125; it passed as well.
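The same verification can be narrowed to the annotation itself instead of reading the full YAML. A minimal sketch, assuming `oc` is logged in to the cluster; the helper names are hypothetical, and the last function only prints the EndpointSlices so that any `hints` entries can be inspected by eye:

```shell
# Print only the value of the topology-aware-hints annotation on dns-default.
hints_annotation() {
  oc -n openshift-dns get svc dns-default \
    -o jsonpath='{.metadata.annotations.service\.kubernetes\.io/topology-aware-hints}'
}

# Report whether the annotation has the value described in this report ("auto").
check_hints() {
  value=$(hints_annotation)
  if [ "$value" = "auto" ]; then
    echo "topology-aware hints enabled"
  else
    echo "topology-aware hints not enabled (annotation value: '${value}')"
    return 1
  fi
}

# Print the EndpointSlices behind dns-default; zone hints, if populated,
# appear under each endpoint's "hints" field.
endpoint_hints() {
  oc -n openshift-dns get endpointslices \
    -l kubernetes.io/service-name=dns-default -o yaml
}
```

`check_hints` returns a non-zero status when the annotation is missing or has another value, so it can be used in a script.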
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.