Bug 1949361 - CoreDNS resolution failure for external hostnames with "A: dns: overflow unpacking uint16"
Summary: CoreDNS resolution failure for external hostnames with "A: dns: overflow unpa...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Stephen Greene
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks: 1953097
 
Reported: 2021-04-14 06:00 UTC by Arnab Ghosh
Modified: 2022-11-09 06:11 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The cluster's upstream resolver returns a DNS response that exceeds 512 bytes over UDP.
Consequence: CoreDNS may return SERVFAIL and/or log various error messages, sometimes forcing the client to retry over TCP.
Fix: Enable the CoreDNS bufsize plugin with a UDP buffer size of 1232 bytes.
Result: CoreDNS is less likely to return SERVFAIL or log runtime errors when handling large DNS responses over UDP. UDP packet fragmentation is also less likely to occur.
Clone Of:
: 1953097 (view as bug list)
Environment:
Last Closed: 2021-07-27 23:00:24 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-dns-operator pull 266 | 0 | None | closed | Bug 1949361: Corefile: Enable bufsize plugin for all servers | 2021-04-26 14:50:06 UTC
Red Hat Knowledge Base (Solution) | 5984291 | 0 | None | None | None | 2021-04-22 03:45:12 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 23:00:47 UTC

Comment 2 David Hernández Fernández 2021-04-14 09:40:39 UTC
The issue was that the response coming from the DNS server was larger than 512 bytes. The DNS server was configured to send responses over 512 bytes, but according to the following pull request, the cluster (CoreDNS) should compress the response if it is larger than 512 bytes.

We thought it would be solved by https://github.com/coredns/coredns/pull/2225/commits, but that is not enough. The issue was resolved by modifying the DNS server.

Comment 4 Andrew McDermott 2021-04-15 16:14:19 UTC
(In reply to David Hernández Fernández from comment #2)
> The issue was that the response coming from the DNS server was larger than
> 512 bytes. The DNS server was configured to send responses over 512 bytes,
> but according to the following pull request, the cluster (CoreDNS) should
> compress the response if it is larger than 512 bytes.
> 
> We thought it would be solved by
> https://github.com/coredns/coredns/pull/2225/commits, but that is not
> enough. The issue was resolved by modifying the DNS server.

Would it be possible to get the exact configuration (bind? zone file?) from the upstream resolver setup? This would save us a lot of time. Also, what exactly was changed in the DNS server so that this issue no longer occurs?

Comment 5 Rafael Sales 2021-04-16 03:46:28 UTC
Hi, we don't have access to this configuration, but I know what they did.

They configured the record with many IP addresses because they wanted to do round-robin load balancing through the DNS server.

So when we query this FQDN, the response is over 512 bytes and CoreDNS can't handle it.

After this, the company responsible for the DNS server changed the record to have fewer IP addresses; the response is now 475 bytes, and CoreDNS can handle it.
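
For reference, the arithmetic behind the 512-byte limit can be sketched roughly as below. This is only an estimate under stated assumptions: the hostname `lb.example.com` is hypothetical (the real record name is not given in this bug), every answer is an A record whose owner name is compressed to a 2-byte pointer, and the response carries no EDNS0 OPT, authority, or additional records.

```python
# Rough wire-size estimate for a DNS response holding N A records for one name.
# Assumptions (not from the bug report): hypothetical hostname, full name
# compression in the answer section, no OPT/authority/additional records.

HEADER = 12  # fixed DNS header size in bytes
# Per A record: name pointer (2) + type (2) + class (2) + TTL (4)
# + rdlength (2) + IPv4 address (4)
A_RECORD = 2 + 2 + 2 + 4 + 2 + 4


def encoded_name_len(name: str) -> int:
    """Wire length of a name: one length byte per label, plus the root byte."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def response_size(name: str, n_records: int) -> int:
    """Estimated response size for n_records A records."""
    question = encoded_name_len(name) + 4  # qtype (2) + qclass (2)
    return HEADER + question + n_records * A_RECORD


def max_records(name: str, limit: int) -> int:
    """Largest number of A records that fits within `limit` bytes."""
    question = encoded_name_len(name) + 4
    return (limit - HEADER - question) // A_RECORD


name = "lb.example.com"  # hypothetical
print(max_records(name, 512))   # 30 records fit in a classic 512-byte UDP reply
print(max_records(name, 1232))  # 75 records fit with a 1232-byte buffer
```

Under these assumptions, a round-robin record with a few dozen addresses is already enough to cross 512 bytes, which is consistent with the behaviour described in this comment.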

Comment 19 Hongan Li 2021-04-26 03:03:13 UTC
Verified with 4.8.0-0.nightly-2021-04-25-195440; passed.

$ oc -n openshift-dns get cm/dns-default -oyaml
apiVersion: v1
data:
  Corefile: |
    # test
    foo.bar:5353 {
        forward . 192.168.11.11
        errors
        bufsize 1232
    }
    .:5353 {
        bufsize 1232
        errors
        health {
            lameduck 20s
        }
<---snip--->
kind: ConfigMap
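
The same check can be scripted. The following is a minimal sketch, not part of the operator or of the QA procedure: the helper name `server_blocks_missing_bufsize` is hypothetical, the parser only understands the simple brace layout shown in the output above, and the sample Corefile is a shortened, closed-off variant of that output (the real one is truncated at the snip marker).

```python
from typing import List


def server_blocks_missing_bufsize(corefile: str, size: int = 1232) -> List[str]:
    """Return labels of top-level server blocks that lack `bufsize <size>`.

    Hypothetical helper; handles only one-directive-per-line Corefiles.
    """
    missing = []
    depth = 0      # current brace nesting level
    label = None   # label of the server block being scanned
    found = False  # whether the directive was seen at the block's top level
    for raw in corefile.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if depth == 0 and line.endswith("{"):
            label = line[:-1].strip()
            found = False
            depth = 1
            continue
        # Only count the directive directly inside a server block,
        # not inside a nested plugin block such as `health { ... }`.
        if depth == 1 and line.split() == ["bufsize", str(size)]:
            found = True
        depth += line.count("{") - line.count("}")
        if depth == 0 and label is not None:
            if not found:
                missing.append(label)
            label = None
    return missing


# Shortened, hypothetical Corefile modelled on the output above.
sample = """
# test
foo.bar:5353 {
    forward . 192.168.11.11
    errors
    bufsize 1232
}
.:5353 {
    bufsize 1232
    errors
    health {
        lameduck 20s
    }
}
"""
print(server_blocks_missing_bufsize(sample))  # [] means every block has it
```

The Corefile text could be obtained from the ConfigMap shown above and passed to the function as a string; an empty list means every server block enables the plugin with the expected size.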

Comment 22 errata-xmlrpc 2021-07-27 23:00:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

