Bug 1949361
Summary: | CoreDNS resolution failure for external hostnames with "A: dns: overflow unpacking uint16" | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Arnab Ghosh <arghosh> | |
Component: | Networking | Assignee: | Stephen Greene <sgreene> | |
Networking sub component: | DNS | QA Contact: | Hongan Li <hongli> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | amcdermo, aos-bugs, dahernan, fgiloux, mapandey, mjoseph, palonsor, rcernin, rsales, sgreene | |
Version: | 4.5 | |||
Target Milestone: | --- | |||
Target Release: | 4.8.0 | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause:
The cluster's upstream resolver returns a DNS response that exceeds 512 bytes over UDP.
Consequence:
CoreDNS may return SERVFAIL and/or log various error messages, sometimes forcing the client to retry over TCP.
Fix:
Enable the CoreDNS bufsize plugin with a UDP buffer size of 1232 bytes.
Result:
CoreDNS is less likely to return SERVFAIL or present any runtime errors when handling large DNS responses via UDP. Also, UDP packet fragmentation is less likely to occur.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1953097 (view as bug list) | Environment: | ||
Last Closed: | 2021-07-27 23:00:24 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1953097 |
Comment 2
David Hernández Fernández
2021-04-14 09:40:39 UTC
(In reply to David Hernández Fernández from comment #2)
> The issue was response header coming from dns server end is more than 512
> bits. The dns was configured over 512 bits, but according to following
> pull-request, cluster(CoreDNS) should compress header if it is more than 512
> bits
>
> We think it should be solved with
> https://github.com/coredns/coredns/pull/2225/commits but it is not enough.
> The issue was solved modifying the dns server.

Would it be possible to get the exact configuration (bind? zone file?) from the upstream resolver setup - this would save us a lot of time. Also, what exactly was changed in the dns server so that this issue no longer occurs?

Hi, we don't have access to this configuration, but I know what they did. They configured the record with a lot of IPs, because they wanted round-robin load balancing through the DNS server. So when we queried this FQDN, the response came to over 512 bytes and CoreDNS couldn't handle it. After this, the company responsible for this DNS server changed the record to have fewer IPs; the response then came to 475 bytes, which CoreDNS can handle.

Verified with 4.8.0-0.nightly-2021-04-25-195440 and passed.

$ oc -n openshift-dns get cm/dns-default -oyaml
apiVersion: v1
data:
  Corefile: |
    # test
    foo.bar:5353 {
        forward . 192.168.11.11
        errors
        bufsize 1232
    }
    .:5353 {
        bufsize 1232
        errors
        health {
            lameduck 20s
        }
<---snip--->
kind: ConfigMap

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
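
To illustrate why a round-robin record with many IPs crosses the 512-byte UDP limit, here is a rough size estimate in Python. It is only a sketch under stated assumptions: the answer section holds nothing but A records, each answer name is compressed to a 2-byte pointer, and there are no authority or additional records apart from an optional 11-byte EDNS0 OPT record. The hostname `lb.example.com` and the function names are hypothetical; the actual record and response layout from this bug are not known.

```python
# Rough wire-size estimate for a DNS response carrying N A records.
# Assumptions (not taken from the bug report): compressed answer names,
# A records only, no authority/additional data except an optional OPT record.

HEADER_BYTES = 12                        # fixed DNS header
A_RECORD_BYTES = 2 + 2 + 2 + 4 + 2 + 4   # name pointer, type, class, TTL, rdlength, IPv4
EDNS0_OPT_BYTES = 11                     # empty OPT pseudo-record (root name + fixed fields)


def question_bytes(fqdn: str) -> int:
    """Wire size of the question section for a single name."""
    labels = [label for label in fqdn.rstrip(".").split(".") if label]
    name_len = sum(1 + len(label) for label in labels) + 1  # length-prefixed labels + root byte
    return name_len + 4                                     # plus QTYPE and QCLASS


def response_bytes(fqdn: str, a_records: int, edns: bool = False) -> int:
    """Estimated response size for the given number of A records."""
    opt = EDNS0_OPT_BYTES if edns else 0
    return HEADER_BYTES + question_bytes(fqdn) + a_records * A_RECORD_BYTES + opt


def max_a_records(fqdn: str, limit: int, edns: bool = False) -> int:
    """Largest number of A records that fits within `limit` bytes."""
    opt = EDNS0_OPT_BYTES if edns else 0
    return (limit - HEADER_BYTES - question_bytes(fqdn) - opt) // A_RECORD_BYTES


if __name__ == "__main__":
    name = "lb.example.com"  # hypothetical round-robin record
    print(max_a_records(name, 512))               # classic UDP limit, no EDNS0
    print(max_a_records(name, 1232, edns=True))   # with bufsize 1232
```

Under these assumptions, the hypothetical name fits 30 A records in 512 bytes and 74 in 1232 bytes, so a record set sized for DNS load balancing can overflow the classic limit while still fitting within the buffer size that the fix configures.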