Bug 1954760 - CoreDNS's "errors" plugin is not enabled for custom upstream resolvers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: DNS
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Miciah Dashiel Butler Masters
QA Contact: jechen
URL:
Whiteboard:
Depends On: 1953609
Blocks:
Reported: 2021-04-28 18:14 UTC by OpenShift BugZilla Robot
Modified: 2021-05-20 11:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-20 11:52:25 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-dns-operator pull 271 0 None open [release-4.6] Bug 1954760: Enable errors plugin for custom upstream resolvers 2021-05-03 15:33:55 UTC
Red Hat Product Errata RHBA-2021:1521 0 None None None 2021-05-20 11:52:31 UTC

Comment 2 jechen 2021-05-06 22:56:49 UTC
Verified in 4.6.0-0.nightly-2021-05-06-185359

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-05-06-185359   True        False         2m1s    Cluster version is 4.6.0-0.nightly-2021-05-06-185359

1. Set up a custom nameserver
$ oc adm new-project mydns
Created project mydns


$ oc -n mydns create configmap coredns --from-file=./test2/Corefile
configmap/coredns created

$ oc -n mydns create deployment coredns --image=openshift/origin-coredns:latest --replicas=0 --port=5353 -- coredns --conf=/etc/coredns/Corefile
deployment.apps/coredns created


$ oc -n mydns set volume deployments/coredns --add --mount-path=/etc/coredns --type=configmap --configmap-name=coredns
info: Generated volume name: volume-fhfnx
deployment.apps/coredns volume updated


$ oc -n mydns scale deployments/coredns --replicas=1
deployment.apps/coredns scaled


$ oc -n mydns expose deployments/coredns --port=5353 --protocol=UDP
service/coredns exposed

$ oc -n mydns get services
NAME      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
coredns   ClusterIP   172.30.249.112   <none>        5353/UDP   11s


2. Configured the cluster DNS to forward queries for the redhat.com zone to this custom nameserver
$ oc patch dns.operator/default --type=merge --patch='{"spec":{"servers":[{"name":"mydns","zones":["redhat.com"],"forwardPlugin":{"upstreams":["172.30.249.112:5353"]}}]}}'
dns.operator.openshift.io/default patched
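
The merge patch in step 2 is easy to mistype because of the nested quoting. As a minimal sketch, a small helper can build the same JSON from three arguments. The function name dns_forward_patch is hypothetical and not part of oc; it only prints the patch body and assumes a single zone and a single upstream with no characters that need JSON escaping.

```shell
# Hypothetical helper: print a merge patch for dns.operator/default that
# forwards one zone to one upstream resolver.
#   $1 = server name, $2 = zone, $3 = upstream address (host:port)
dns_forward_patch() {
  printf '{"spec":{"servers":[{"name":"%s","zones":["%s"],"forwardPlugin":{"upstreams":["%s"]}}]}}\n' \
    "$1" "$2" "$3"
}

# Example use with the values from step 2 (requires a logged-in oc session):
#   oc patch dns.operator/default --type=merge \
#     --patch="$(dns_forward_patch mydns redhat.com 172.30.249.112:5353)"
```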


3. Created a test pod and, from that pod, continuously ran nslookup for a name in the zone that the custom nameserver serves
$ oc create -f https://raw.githubusercontent.com/openshift/verification-tests/master/testdata/networking/aosqe-pod-for-ping.json
pod/hello-pod created


$ oc rsh hello-pod
/ # while :; do nslookup www.redhat.com; sleep 0.5; done
Server:		172.30.0.10
Address:	172.30.0.10:53

Non-authoritative answer:
www.redhat.com	canonical name = ds-www.redhat.com.edgekey.net
ds-www.redhat.com.edgekey.net	canonical name = ds-www.redhat.com.edgekey.net.globalredir.akadns.net
ds-www.redhat.com.edgekey.net.globalredir.akadns.net	canonical name = e3396.dscx.akamaiedge.net
Name:	e3396.dscx.akamaiedge.net
Address: 23.59.99.64

Non-authoritative answer:
www.redhat.com	canonical name = ds-www.redhat.com.edgekey.net
ds-www.redhat.com.edgekey.net	canonical name = ds-www.redhat.com.edgekey.net.globalredir.akadns.net
ds-www.redhat.com.edgekey.net.globalredir.akadns.net	canonical name = e3396.dscx.akamaiedge.net
Name:	e3396.dscx.akamaiedge.net
Address: 2600:1408:9000:481::d44
Name:	e3396.dscx.akamaiedge.net
Address: 2600:1408:9000:496::d44

Server:		172.30.0.10
Address:	172.30.0.10:53
<--snip-->


4. Scaled the custom nameserver deployment down to 0 replicas
$ oc -n mydns scale deployments/coredns --replicas=0
deployment.apps/coredns scaled


5. After step 4, nslookup from the test pod failed as expected
$ oc rsh hello-pod
/ # while :; do nslookup www.redhat.com; sleep 0.5; done
;; connection timed out; no servers could be reached

Server:		172.30.0.10
Address:	172.30.0.10:53

** server can't find www.redhat.com: SERVFAIL

*** Can't find www.redhat.com: No answer

Server:		172.30.0.10
Address:	172.30.0.10:53

** server can't find www.redhat.com: SERVFAIL

** server can't find www.redhat.com: SERVFAIL

<--snip-->


6. Verified that the DNS pods started logging errors
$ for pod in $(oc -n openshift-dns get pods -o name)
> do oc -n openshift-dns logs -c dns $pod
> done
<--snip-->
.:5353
[INFO] plugin/reload: Running configuration MD5 = d587e9b1a9e78220deb55ca080b8f672
CoreDNS-1.6.6
linux/amd64, go1.15.7, 
[INFO] Reloading
[INFO] plugin/health: Going into lameduck mode for 20s
[INFO] plugin/reload: Running configuration MD5 = 9ce9e29fc6a4173f57ac6b0c9f5b5721
[INFO] Reloading complete
.:5353
[INFO] plugin/reload: Running configuration MD5 = d587e9b1a9e78220deb55ca080b8f672
CoreDNS-1.6.6
linux/amd64, go1.15.7, 
[INFO] Reloading
[INFO] plugin/health: Going into lameduck mode for 20s
[INFO] plugin/reload: Running configuration MD5 = 9ce9e29fc6a4173f57ac6b0c9f5b5721
[INFO] Reloading complete
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:47949->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:51175->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:55914->172.30.249.112:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:41773->172.30.249.112:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:45510->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:35381->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:47768->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:59455->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:45804->172.30.249.112:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:50049->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:47915->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:50498->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:37584->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:55970->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:52351->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:53309->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:35827->172.30.249.112:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:54926->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:55623->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:44810->172.30.249.112:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:38616->172.30.249.112:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:49047->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:56183->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.12:59278->172.30.249.112:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.12:54129->172.30.249.112:5353: i/o timeout
<--snip-->
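
The error lines in step 6 repeat many times, so a per-reason count can be easier to read than the raw log. The following is a rough sketch, assuming the log line layout shown above ("[ERROR] plugin/errors: <rcode> <name> <type>: read udp <src>-><dst>: <reason>"). The function name summarize_dns_errors is hypothetical; it reads log text on stdin and prints one line per name, query type, and failure reason.

```shell
# Hypothetical helper: count "errors" plugin log lines by name, query type,
# and failure reason. Assumes the field layout seen in the logs above.
summarize_dns_errors() {
  grep -F '[ERROR] plugin/errors:' |
    awk '{
      qtype = $5
      sub(/:$/, "", qtype)            # "A:" -> "A"
      reason = ""
      for (i = 9; i <= NF; i++)       # everything after the src->dst field
        reason = reason (i > 9 ? " " : "") $i
      count[$4 " " qtype " " reason]++
    }
    END { for (k in count) print count[k], k }' |
    sort -rn
}

# Example use (requires a logged-in oc session):
#   for pod in $(oc -n openshift-dns get pods -o name); do
#     oc -n openshift-dns logs -c dns "$pod"
#   done | summarize_dns_errors
```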


7. Verified that the server block for the custom nameserver in the cluster DNS Corefile now includes the "errors" plugin
$ oc -n openshift-dns get configmaps/dns-default -o yaml
apiVersion: v1
data:
  Corefile: |
    # mydns
    redhat.com:5353 {
        forward . 172.30.249.112:5353
        errors    # <-- verified: expected result, fixed by https://github.com/openshift/cluster-dns-operator/pull/271
    }
    .:5353 {
        errors
        health {
            lameduck 20s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            policy sequential
        }
        cache 900 {
            denial 9984 30
        }
        reload
    }
<--snip-->
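
The check in step 7 was done by eye. As a sketch only, the same check can be scripted: the hypothetical function check_errors_plugin below reads a Corefile on stdin, tracks brace depth to find each top-level server block, and reports whether a bare "errors" directive appears directly inside it. It assumes one directive per line, as in the generated Corefile above, and is not a full Corefile parser. It exits non-zero if any server block lacks the plugin.

```shell
# Hypothetical helper: report, per top-level server block of a Corefile read
# from stdin, whether the "errors" plugin is enabled. Exits 1 if any block
# is missing it.
check_errors_plugin() {
  awk '
    { sub(/#.*/, "") }                    # drop comments
    /^[[:space:]]*$/ { next }             # skip blank lines
    {
      line = $0
      opens  = gsub(/[{]/, "{", line)     # count opening braces
      closes = gsub(/[}]/, "}", line)     # count closing braces
      if (depth == 0 && opens > 0) {      # start of a server block
        name = $0
        sub(/[[:space:]]*[{].*/, "", name)
        sub(/^[[:space:]]+/, "", name)
        found = 0
      }
      if (depth == 1 && $1 == "errors") found = 1
      depth += opens - closes
      if (depth == 0 && closes > 0) {     # end of a server block
        print name ": " (found ? "errors enabled" : "errors MISSING")
        if (!found) missing = 1
      }
    }
    END { exit (missing ? 1 : 0) }
  '
}

# Example use (requires a logged-in oc session):
#   oc -n openshift-dns get configmaps/dns-default \
#     -o jsonpath='{.data.Corefile}' | check_errors_plugin
```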

Comment 5 errata-xmlrpc 2021-05-20 11:52:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.29 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1521

