Bug 1949361

Summary:	CoreDNS resolution failure for external hostnames with "A: dns: overflow unpacking uint16"
Product:	OpenShift Container Platform	Reporter:	Arnab Ghosh <arghosh>
Component:	Networking	Assignee:	Stephen Greene <sgreene>
Networking sub component:	DNS	QA Contact:	Hongan Li <hongli>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	amcdermo, aos-bugs, dahernan, fgiloux, mapandey, mjoseph, palonsor, rcernin, rsales, sgreene
Version:	4.5
Target Milestone:	---
Target Release:	4.8.0
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Cause: The cluster's upstream resolver returns a DNS response that exceeds 512 bytes via UDP. Consequence: CoreDNS may return SERVFAIL and/or log various error messages, sometimes forcing the client to retry over TCP. Fix: Enable the CoreDNS bufsize plugin with a UDP buffer size of 1232 bytes. Result: CoreDNS is less likely to return SERVFAIL or present any runtime errors when handling large DNS responses via UDP. Also, UDP packet fragmentation is less likely to occur.	Story Points:	---
Clone Of:
Clones:	1953097		Environment:
Last Closed:	2021-07-27 23:00:24 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1953097

Comment 2 David Hernández Fernández 2021-04-14 09:40:39 UTC

The issue was that the response coming from the DNS server was larger than 512 bytes. The DNS server was configured to return more than 512 bytes, but according to the following pull request, the cluster (CoreDNS) should compress the response if it is larger than 512 bytes.

We thought it would be solved by https://github.com/coredns/coredns/pull/2225/commits but that is not enough. The issue was solved by modifying the DNS server.

Comment 4 Andrew McDermott 2021-04-15 16:14:19 UTC

(In reply to David Hernández Fernández from comment #2)
> The issue was that the response coming from the DNS server was larger than
> 512 bytes. The DNS server was configured to return more than 512 bytes, but
> according to the following pull request, the cluster (CoreDNS) should
> compress the response if it is larger than 512 bytes.
> 
> We thought it would be solved by
> https://github.com/coredns/coredns/pull/2225/commits but that is not enough.
> The issue was solved by modifying the DNS server.

Would it be possible to get the exact configuration (bind? zone file?) from the upstream resolver setup? This would save us a lot of time.
Also, what exactly was changed on the DNS server so that this issue no longer occurs?

Comment 5 Rafael Sales 2021-04-16 03:46:28 UTC

Hi, we don't have access to this configuration, but I know what they did.

They configured the record with a large number of IP addresses because they wanted round-robin load balancing through the DNS server.

So when we queried this FQDN, the response was larger than 512 bytes and CoreDNS could not handle it.

After this, the company responsible for this DNS server reduced the record to a few IP addresses, and the response came back at 475 bytes, which CoreDNS can handle.
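
To illustrate why a record with many IPs crosses the 512-byte line, here is a rough sketch of the wire-format arithmetic for a response carrying only A records. It assumes every answer name is compressed to a 2-byte pointer and that there are no authority, additional, or EDNS0 OPT records, so a real response will be somewhat larger. The hostname app.example.com and the function names are hypothetical; the actual record in this case is not known.

```python
# Rough size estimate for a DNS response containing only A records.
# Assumptions (not from the bug report): answer names are compressed to a
# 2-byte pointer, and there are no authority/additional/OPT records.

DNS_HEADER_BYTES = 12
CLASSIC_UDP_LIMIT = 512   # UDP limit when no larger EDNS0 size is in effect
BUFSIZE_LIMIT = 1232      # value configured by the fix for this bug

# name pointer (2) + type (2) + class (2) + TTL (4) + rdlength (2) + IPv4 (4)
A_RECORD_BYTES = 2 + 2 + 2 + 4 + 2 + 4


def encoded_name_length(fqdn: str) -> int:
    """Length of the uncompressed wire encoding of a domain name."""
    labels = [label for label in fqdn.rstrip(".").split(".") if label]
    # one length byte per label, plus the terminating root byte
    return sum(1 + len(label) for label in labels) + 1


def a_response_size(fqdn: str, record_count: int) -> int:
    """Estimated response size in bytes for record_count A records."""
    question = encoded_name_length(fqdn) + 4  # qname + qtype + qclass
    return DNS_HEADER_BYTES + question + record_count * A_RECORD_BYTES


def max_a_records(fqdn: str, limit: int = CLASSIC_UDP_LIMIT) -> int:
    """Largest number of A records that still fits within limit bytes."""
    question = encoded_name_length(fqdn) + 4
    return (limit - DNS_HEADER_BYTES - question) // A_RECORD_BYTES


if __name__ == "__main__":
    name = "app.example.com"  # hypothetical hostname
    for count in (10, 29, 30, 74):
        print(count, "records ->", a_response_size(name, count), "bytes")
    print("fits in 512:", max_a_records(name))
    print("fits in 1232:", max_a_records(name, BUFSIZE_LIMIT))
```

Under these assumptions each extra IP costs 16 bytes, so a round-robin record with roughly 30 or more addresses will not fit in a 512-byte UDP response.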

Comment 19 Hongan Li 2021-04-26 03:03:13 UTC

Verified with 4.8.0-0.nightly-2021-04-25-195440; passed.

$ oc -n openshift-dns get cm/dns-default -oyaml
apiVersion: v1
data:
  Corefile: |
    # test
    foo.bar:5353 {
        forward . 192.168.11.11
        errors
        bufsize 1232
    }
    .:5353 {
        bufsize 1232
        errors
        health {
            lameduck 20s
        }
<---snip--->
kind: ConfigMap
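
For context on what a 1232-byte UDP buffer size means on the wire, the sketch below builds a DNS query that carries an EDNS0 OPT record, whose CLASS field holds the advertised UDP payload size. This only illustrates the EDNS0 mechanism; it is not CoreDNS code, and the function names and query ID are hypothetical. The name foo.bar is taken from the Corefile above.

```python
import struct

OPT_RECORD_TYPE = 41  # EDNS0 OPT pseudo-record type


def build_query(fqdn: str, udp_payload_size: int = 1232, query_id: int = 0x1234) -> bytes:
    """Build an A query with one EDNS0 OPT record in the additional section."""
    # Header: id, flags (RD set), 1 question, 0 answers, 0 authority, 1 additional
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    labels = fqdn.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    # OPT record: root name, type 41, CLASS = UDP payload size,
    # TTL = extended RCODE/flags (0 here), RDLENGTH = 0 (no options)
    opt = b"\x00" + struct.pack("!HHIH", OPT_RECORD_TYPE, udp_payload_size, 0, 0)
    return header + question + opt


def advertised_udp_size(packet: bytes) -> int:
    """Read the UDP payload size back out of a packet built by build_query.

    Assumes the OPT record is the final 11 bytes of the packet, which holds
    for build_query but not for arbitrary DNS messages.
    """
    opt_type, udp_size = struct.unpack("!HH", packet[-10:-6])
    if opt_type != OPT_RECORD_TYPE:
        raise ValueError("packet does not end with an OPT record")
    return udp_size


if __name__ == "__main__":
    packet = build_query("foo.bar")
    print(len(packet), "byte query advertising", advertised_udp_size(packet), "bytes")
```

Sending such a query to the upstream resolver (for example with a UDP socket) and checking the TC flag in the reply is one way to confirm whether large answers are still being truncated; that step is left out here because it depends on the environment.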

Comment 22 errata-xmlrpc 2021-07-27 23:00:24 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438