Bug 2043046
Summary: | nslookup reporting Truncated, retrying in TCP mode errors | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Andy Bartlett <andbartl> |
Component: | Networking | Assignee: | aos-network-edge-staff <aos-network-edge-staff> |
Networking sub component: | DNS | QA Contact: | Melvin Joseph <mjoseph> |
Status: | CLOSED NOTABUG | Docs Contact: | |
Severity: | urgent | | |
Priority: | unspecified | CC: | amcdermo, aos-bugs, gspence, hongli, mmasters, pedro.magos |
Version: | 4.8 | | |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-03-01 16:03:13 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Andy Bartlett
2022-01-20 14:31:06 UTC
The description of this report doesn't explicitly state what the expected behavior is, but based on the summary of the report ("nslookup reporting Truncated, retrying in TCP mode errors"), I assume the customer is concerned about the "Truncated, retrying in TCP mode" messages from nslookup.

The basic DNS protocol allows query and response sizes of up to 512 bytes when using UDP; it is necessary to use TCP or an extension to the DNS protocol to accommodate larger queries or responses. The "truncated" messages from nslookup indicate that nslookup attempted to perform a lookup using UDP, the response was truncated (probably because the response was more than 512 bytes), and so nslookup retried the request using TCP. Afterwards, nslookup provided a response, which indicates that the TCP query succeeded, so the lookup ultimately completed with no problems. This is exactly the expected behavior for a DNS query over UDP that elicits a large response. (In case you are curious, here is the relevant code in nslookup: <https://gitlab.isc.org/isc-projects/bind9/-/blob/7267c3932362fe100ee2717b8a8ada1d21ce7987/bin/dig/dighost.c#L3850-3851>.)

In addition to TCP, there is the EDNS standard (cf. <https://en.wikipedia.org/wiki/Extension_Mechanisms_for_DNS>, <https://datatracker.ietf.org/doc/html/rfc2671>, and <https://datatracker.ietf.org/doc/html/rfc6891>). Some resolvers and nameservers use this standard to send queries and responses larger than 512 bytes using UDP. However, support for EDNS is not universal. In particular, we have discovered that Go does not support EDNS (cf. bug 1949361, bug 1953518, bug 1966116, bug 1991067, <https://github.com/golang/go/issues/6464>, <https://github.com/golang/go/issues/11070>), which means that using EDNS to send more than 512 bytes over UDP causes problems for Go-based operators and builds.

Because of this issue, we use CoreDNS's "bufsize" plugin (cf. <https://coredns.io/plugins/bufsize/>) to restrict UDP queries to 512 bytes (cf. <https://github.com/openshift/cluster-dns-operator/blob/0fcb6e5e330c26bb9d2ee32e9a63e87515c58784/pkg/operator/controller/controller_dns_configmap.go#L45-L52>, <https://github.com/openshift/machine-config-operator/blob/45d7287d05bcbcc8ef892e6613db3f02df05fd43/templates/common/on-prem/files/coredns-corefile.yaml#L7>). This restriction means that resolvers must retry with TCP whenever a UDP response is truncated. However, it is the best solution we have found to maximize compatibility with resolvers and nameservers that implement the basic DNS protocol but may not implement EDNS. It does require that firewalls be properly configured to allow TCP port 53 and that resolvers properly retry with TCP when UDP does not work, but compliant resolvers should all do this.

Does that resolve the matter? Is any further action required on this BZ?

Closing because the described behavior is expected.

Hi Miciah,

Yup, thanks for closing this; the problem seems to be resolved at the customer.

Regards,
Andy

Can we reopen? We have exactly the same issue, but removing the 'bufsize 512' is not changing the UDP payload size ... We have a WebLogic Operator and WebLogic containers that need to resolve names that don't exist at pod boot.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
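
For readers who want to see the truncate-then-retry behavior discussed in this bug at the wire level, here is a minimal Python sketch. It is not taken from nslookup, CoreDNS, or OpenShift; the function names (build_query, is_truncated, frame_for_tcp, next_transport) are hypothetical and exist only for this illustration. It assumes the RFC 1035 message layout: a 12-byte header whose flags field carries the TC (truncated) bit, and a two-byte length prefix on messages sent over TCP. No network traffic is sent.

```python
import struct

TC_FLAG = 0x0200          # TC (truncated) bit in the DNS header flags field
HEADER_LEN = 12           # fixed size of the DNS message header
CLASSIC_UDP_LIMIT = 512   # maximum UDP message size without EDNS


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (one question, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in a DNS message header."""
    if len(message) < HEADER_LEN:
        raise ValueError("message shorter than a DNS header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message


def next_transport(udp_response: bytes) -> str:
    """Decide what a compliant resolver does after receiving a UDP response."""
    return "tcp" if is_truncated(udp_response) else "done"


# Hypothetical responses: header only, with and without the TC bit set.
truncated_reply = struct.pack("!HHHHHH", 0x1234, 0x8300, 1, 0, 0, 0)
complete_reply = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
```

With these helpers, next_transport(truncated_reply) selects "tcp", which corresponds to the point at which nslookup prints "Truncated, retrying in TCP mode"; the same query would then be resent as frame_for_tcp(build_query(...)) over TCP port 53.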