1953609 – CoreDNS's "errors" plugin is not enabled for custom upstream resolvers

Summary: CoreDNS's "errors" plugin is not enabled for custom upstream resolvers

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.8
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.7.z
Assignee:	Miciah Dashiel Butler Masters
QA Contact:	jechen
Docs Contact:
URL:
Whiteboard:
Depends On:	1934905
Blocks:	1954760

Reported:	2021-04-26 13:46 UTC by OpenShift BugZilla Robot
Modified:	2022-08-04 22:39 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-05-19 15:16:51 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-dns-operator pull 268	0	None	open	[release-4.7] Bug 1953609: Enable errors plugin for custom upstream resolvers	2021-04-26 13:46:26 UTC
Red Hat Product Errata	RHBA-2021:1550	0	None	None	None	2021-05-19 15:17:04 UTC

Comment 4 jechen 2021-05-03 17:17:45 UTC

Verified in 4.7.0-0.nightly-2021-05-01-081439

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-05-01-081439   True        False         109m    Cluster version is 4.7.0-0.nightly-2021-05-01-081439

1. Set up a custom nameserver
$ oc adm new-project mydns
Created project mydns

$ oc -n mydns create configmap coredns --from-file=./test2/Corefile
configmap/coredns created

$ oc -n mydns create deployment coredns --image=openshift/origin-coredns:latest --replicas=0 --port=5353 -- coredns --conf=/etc/coredns/Corefile
deployment.apps/coredns created

$ oc -n mydns set volume deployments/coredns --add --mount-path=/etc/coredns --type=configmap --configmap-name=coredns
info: Generated volume name: volume-xhsx6
deployment.apps/coredns volume updated

$ oc -n mydns scale deployments/coredns --replicas=1
deployment.apps/coredns scaled

$ oc -n mydns expose deployments/coredns --port=5353 --protocol=UDP
service/coredns exposed

$ oc -n mydns get services
NAME      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
coredns   ClusterIP   172.30.168.121   <none>        5353/UDP   10s
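
The contents of ./test2/Corefile are not shown in this report. The sketch below writes a hypothetical stand-in that listens on port 5353 and answers www.redhat.com with 1.2.3.4, which matches the nslookup output recorded later in this comment. The use of the "hosts" plugin, and its availability in the openshift/origin-coredns image, are assumptions and not taken from the report.

```shell
#!/bin/sh
# Hypothetical stand-in for ./test2/Corefile; the real file is not shown in this report.
# write_corefile PATH: write a minimal Corefile to PATH.
write_corefile() {
    cat > "$1" <<'EOF'
.:5353 {
    hosts {
        1.2.3.4 www.redhat.com
    }
    errors
    log
}
EOF
}

# Create the path that the "oc create configmap" command above reads from.
mkdir -p ./test2
write_corefile ./test2/Corefile
```

Any Corefile that serves the test zone on port 5353 would do for this verification; only the forwarding target and the returned address matter for the later steps.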


2. Configured cluster DNS to forward queries for the redhat.com zone to this custom nameserver
$ oc patch dns.operator/default --type=merge --patch='{"spec":{"servers":[{"name":"mydns","zones":["redhat.com"],"forwardPlugin":{"upstreams":["172.30.168.121:5353"]}}]}}'
dns.operator.openshift.io/default patched
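
The patch above hard-codes the service IP from this cluster. A minimal sketch of building the same JSON from its parts follows; the function name build_dns_patch is hypothetical, and the values passed to it are the ones used in this report.

```shell
#!/bin/sh
# Build the merge patch for dns.operator/default from its parts.
# Arguments: server name, zone, upstream IP, upstream port.
# build_dns_patch is a hypothetical helper, not part of oc.
build_dns_patch() {
    printf '{"spec":{"servers":[{"name":"%s","zones":["%s"],"forwardPlugin":{"upstreams":["%s:%s"]}}]}}\n' \
        "$1" "$2" "$3" "$4"
}

# Values from this report; on another cluster the service IP will differ.
build_dns_patch mydns redhat.com 172.30.168.121 5353
```

On a live cluster the IP could be read with oc -n mydns get service coredns -o jsonpath='{.spec.clusterIP}' and the function output passed to the --patch flag of the oc patch command shown above.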

3. Created a test pod and, from the pod, continuously ran nslookup for a name in the zone served by the custom nameserver; the lookups succeeded
$ oc create -f https://raw.githubusercontent.com/openshift/verification-tests/master/testdata/networking/aosqe-pod-for-ping.json
pod/hello-pod created


$ oc rsh hello-pod
/ # while :; do nslookup www.redhat.com; sleep 0.5; done
Server:		172.30.0.10
Address:	172.30.0.10:53

Name:	www.redhat.com
Address: 1.2.3.4

Server:		172.30.0.10
Address:	172.30.0.10:53

Name:	www.redhat.com
Address: 1.2.3.4

Server:		172.30.0.10
Address:	172.30.0.10:53

^C
/ # exit
command terminated with exit code 130


4. Scaled the custom nameserver deployment down to 0 replicas
$ oc -n mydns scale deployments/coredns --replicas=0
deployment.apps/coredns scaled

5. After step 4, nslookup from the test pod failed as expected
$ oc rsh hello-pod
/ # while :; do nslookup www.redhat.com; sleep 0.5; done
;; connection timed out; no servers could be reached

Server:		172.30.0.10
Address:	172.30.0.10:53

** server can't find www.redhat.com: SERVFAIL

** server can't find www.redhat.com: SERVFAIL

Server:		172.30.0.10
Address:	172.30.0.10:53

^C
/ # exit
command terminated with exit code 130


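6. Checked the logs of the cluster DNS pods; the "errors" plugin logged the failed queries to the custom upstream
$ for pod in $(oc -n openshift-dns get pods -o name)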
> do oc -n openshift-dns logs -c dns $pod
> done
.:5353
[INFO] plugin/reload: Running configuration MD5 = 88abbc1ca2ae7b733e12d8821a5b24b8
CoreDNS-1.6.6
<--snip-->
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:36385->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:59034->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:50547->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:59893->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:42549->172.30.168.121:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:33987->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:48401->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:39898->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:50438->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:46909->172.30.168.121:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:47045->172.30.168.121:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:55262->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:57168->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:47617->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:59960->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:53228->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:56946->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:48361->172.30.168.121:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:40388->172.30.168.121:5353: read: connection refused
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:33089->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:49169->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:51940->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:55137->172.30.168.121:5353: i/o timeout
.:5353
<---snip--->
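
The log output above mixes "i/o timeout" and "connection refused" failures. As a rough aid for reading longer logs, the sketch below counts "errors" plugin lines per upstream and failure type. The function name summarize_dns_errors is hypothetical, and the pattern assumes the line format shown in this comment.

```shell
#!/bin/sh
# Summarize CoreDNS "errors" plugin lines read from stdin.
# Output: one line per (upstream, failure) pair, prefixed with its count,
# most frequent first. Assumes the log line format shown in this report.
summarize_dns_errors() {
    grep -F '[ERROR] plugin/errors:' |
        sed -E 's/^.*->([0-9.]+:[0-9]+): (.*)$/\1 \2/' |
        sort | uniq -c | sort -rn
}

# Example with three lines copied from the log above.
summarize_dns_errors <<'EOF'
[ERROR] plugin/errors: 2 www.redhat.com. AAAA: read udp 10.131.0.6:36385->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:59034->172.30.168.121:5353: i/o timeout
[ERROR] plugin/errors: 2 www.redhat.com. A: read udp 10.131.0.6:42549->172.30.168.121:5353: read: connection refused
EOF
```

The same function can be fed from the per-pod oc logs loop shown above by piping the loop output into it.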

7. Verified that the server block for the custom upstream resolver in the cluster DNS Corefile includes the "errors" plugin
$ oc -n openshift-dns get configmaps/dns-default -o yaml
apiVersion: v1
data:
  Corefile: |
    # mydns
    redhat.com:5353 {
        forward . 172.30.168.121:5353
        errors                <-- verified the fix with https://github.com/openshift/cluster-dns-operator/pull/268
        bufsize 1232
    }
    .:5353 {
        bufsize 1232
        errors                
        health {
            lameduck 20s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus 127.0.0.1:9153
        forward . /etc/resolv.conf {
            policy sequential
        }
        cache 900 {
            denial 9984 30
        }
        reload
    }
kind: ConfigMap
<---snip--->
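
The check above was done by reading the ConfigMap by eye. A minimal sketch of automating it follows: it reads a Corefile on stdin and prints every top-level server block that has no "errors" directive. The function name blocks_missing_errors is hypothetical, and the parser assumes the brace layout shown in this comment (one opening or closing brace per line).

```shell
#!/bin/sh
# Print the name of every top-level Corefile server block that lacks "errors".
# Reads a Corefile on stdin; returns non-zero if any such block is found.
# Assumes the brace layout used in this report.
blocks_missing_errors() {
    awk '
        BEGIN { depth = 0; missing = 0 }
        # Strip comments, then skip lines that are blank.
        { sub(/#.*/, "") }
        /^[ \t]*$/ { next }
        {
            line = $0
            opens = gsub(/[{]/, "{", line)
            closes = gsub(/[}]/, "}", line)
            # A brace opened at depth 0 starts a new server block.
            if (depth == 0 && opens > 0) { name = $1; found = 0 }
            # Only directives directly inside the server block count.
            if (depth == 1 && $1 == "errors") found = 1
            depth += opens - closes
            # Back at depth 0: the server block just ended.
            if (depth == 0 && closes > 0 && !found) { print name; missing = 1 }
        }
        END { exit missing }
    '
}

# Example using the custom-upstream block from the ConfigMap above.
if blocks_missing_errors <<'EOF'
redhat.com:5353 {
    forward . 172.30.168.121:5353
    errors
    bufsize 1232
}
EOF
then
    echo "all server blocks enable errors"
else
    echo "the server blocks listed above do not enable errors"
fi
```

On a live cluster the input could come from oc -n openshift-dns get configmaps/dns-default -o jsonpath='{.data.Corefile}'.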

Comment 5 Siddharth Sharma 2021-05-10 17:59:33 UTC

The fix for this bug will ship as part of the next z-stream release, 4.7.11, on May 19th, because 4.7.10 was dropped due to a blocker: https://bugzilla.redhat.com/show_bug.cgi?id=1958518.

Comment 9 errata-xmlrpc 2021-05-19 15:16:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.11 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1550
