To fix bug 1936589, we configured cluster DNS to honor TTL values of up to 15 minutes from upstream resolvers and to cap higher TTL values at 15 minutes. Before that change, TTL values were capped at 30 seconds. Capping TTL values for NXDOMAIN responses at 15 minutes instead of 30 seconds causes long delays (15 minutes) when provisioning service load balancers, including the default ingress load balancer that is provisioned when a cluster is installed. Bug 1936589 is verified but has not shipped. This new BZ is marked as a blocker because we want to fix the problem introduced by the fix for bug 1936589 before that fix ships.
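The capping behavior described above can be modeled in a few lines. This is only a simplified sketch, not the CoreDNS implementation: it assumes the intended end state is a 15-minute cap for positive answers and a 30-second cap for negative (NXDOMAIN) answers. The names `effective_ttl`, `POSITIVE_CAP_SECONDS`, and `NEGATIVE_CAP_SECONDS` are hypothetical and chosen for illustration.

```python
# Simplified model of per-response TTL capping in a caching resolver.
# Assumption: positive answers are capped at 15 minutes and negative
# answers at 30 seconds. All names here are illustrative only.

POSITIVE_CAP_SECONDS = 900  # 15 minutes
NEGATIVE_CAP_SECONDS = 30


def effective_ttl(upstream_ttl: int, negative: bool) -> int:
    """Return the TTL the cache would honor for one upstream response."""
    cap = NEGATIVE_CAP_SECONDS if negative else POSITIVE_CAP_SECONDS
    return min(upstream_ttl, cap)


if __name__ == "__main__":
    # Positive answers keep their upstream TTL up to the 15-minute cap.
    print(effective_ttl(300, negative=False))
    print(effective_ttl(3600, negative=False))
    # A negative answer is re-queried after at most 30 seconds, so a newly
    # created name does not stay cached as nonexistent for 15 minutes.
    print(effective_ttl(900, negative=True))
```

With a single 15-minute cap applied to both kinds of response, the last call would return 900 instead of 30, which corresponds to the provisioning delay reported here.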
Verified with the cluster launched by cluster-bot (launch openshift/cluster-dns-operator#256) and passed.

$ oc get clusterversion
NAME      VERSION                                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.ci.test-2021-04-02-005725-ci-ln-9c610wt   True        False         59m     Cluster version is 4.6.0-0.ci.test-2021-04-02-005725-ci-ln-9c610wt

$ oc -n openshift-dns get cm/dns-default -oyaml
apiVersion: v1
data:
  Corefile: |
    .:5353 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            policy sequential
        }
        cache 900 {
            denial 9984 30
        }
        reload
    }

### check the TTL of positive response
sh-4.4# dig stackoverflow.com

; <<>> DiG 9.11.13-RedHat-9.11.13-6.el8_2.1 <<>> stackoverflow.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10317
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;stackoverflow.com.        IN    A

;; ANSWER SECTION:
stackoverflow.com.    300    IN    A    151.101.65.69
stackoverflow.com.    300    IN    A    151.101.129.69

### check the TTL of negative response
sh-4.4# dig nxdomain.google.com

; <<>> DiG 9.11.13-RedHat-9.11.13-6.el8_2.1 <<>> nxdomain.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 24399
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nxdomain.google.com.    IN    A

;; AUTHORITY SECTION:
google.com.    27    IN    SOA    ns1.google.com. dns-admin.google.com. 366215971 900 900 1800 60
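In the Corefile above, `cache 900` sets the overall maximum TTL in seconds, and in `denial 9984 30` the first number is the denial cache capacity and the second is the maximum TTL for negative responses. The manual check of the dig output could be scripted roughly as follows. This is a sketch under the assumption that each resource-record line has the TTL as its second whitespace-separated field, as in the output above. The helper names `record_ttl` and `within_cap` are hypothetical.

```python
# Sketch: read the TTL from dig resource-record lines and compare it with
# the caps configured in the Corefile (900 s overall, 30 s for denials).
# Helper names are illustrative and not part of any existing tool.


def record_ttl(line: str) -> int:
    """Return the TTL, the second field of a dig resource-record line."""
    fields = line.split()
    return int(fields[1])


def within_cap(line: str, cap_seconds: int) -> bool:
    """True if the record's TTL does not exceed the given cap."""
    return record_ttl(line) <= cap_seconds


if __name__ == "__main__":
    # Record lines copied from the dig output in the verification comment.
    positive = "stackoverflow.com. 300 IN A 151.101.65.69"
    negative = ("google.com. 27 IN SOA ns1.google.com. "
                "dns-admin.google.com. 366215971 900 900 1800 60")

    print(within_cap(positive, 900))  # positive answer vs. 15-minute cap
    print(within_cap(negative, 30))   # SOA in NXDOMAIN reply vs. 30-second cap
```

A TTL below the cap is consistent with the configuration but does not prove the cap on its own, because the upstream TTL may already have been lower. Repeating the negative query and confirming that the TTL never exceeds 30 gives a stronger check.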
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.25 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:1153