Bug 1967766

Summary: DNS SRV request which worked in 4.7.9 stopped working in 4.7.11
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Networking Assignee: Stephen Greene <sgreene>
Networking sub component: DNS QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: high CC: aos-bugs, breit, dyocum, mjoseph, mmasters, pducai, sgreene, wking
Version: 4.6 Keywords: Regression
Target Milestone: ---   
Target Release: 4.7.z   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The fix for Bug 1953097 enabled the CoreDNS bufsize plugin with a size of 1232 bytes. Some primitive DNS resolvers cannot receive DNS response messages over UDP that are larger than 512 bytes. DNS resolvers that retry lookups over TCP (such as dig) are not affected by this bug.
Consequence: Some DNS resolvers (such as Go's internal DNS library) are unable to receive long DNS responses from openshift-dns.
Fix: Set the CoreDNS bufsize to 512 bytes for all servers.
Result: DNS clients that require UDP DNS messages not to exceed 512 bytes function as expected.
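The size limits mentioned in the Doc Text can be illustrated with a small sketch. It models the general UDP size rule from RFC 1035 and RFC 6891 (512 bytes without EDNS0, otherwise the size the client advertises), capped by a server-side buffer size. This is a simplified model under those assumptions, not the implementation of the CoreDNS bufsize plugin, and the function names are hypothetical.

```python
from typing import Optional

# RFC 1035 limit for a DNS message carried over UDP without EDNS0.
CLASSIC_UDP_LIMIT = 512


def effective_udp_limit(client_edns_size: Optional[int], server_bufsize: int) -> int:
    """Largest UDP response size in this simplified model.

    client_edns_size is the payload size the client advertised via EDNS0,
    or None if the query carried no OPT record.
    server_bufsize is the cap configured on the server (for example 512 or 1232).
    """
    if client_edns_size is None:
        # No EDNS0: the classic 512-byte limit applies.
        return CLASSIC_UDP_LIMIT
    # RFC 6891: advertised values below 512 are treated as 512.
    advertised = max(client_edns_size, CLASSIC_UDP_LIMIT)
    return min(advertised, server_bufsize)


def must_truncate(response_len: int, client_edns_size: Optional[int], server_bufsize: int) -> bool:
    """True if the response does not fit and the client would need to retry over TCP."""
    return response_len > effective_udp_limit(client_edns_size, server_bufsize)


if __name__ == "__main__":
    # A hypothetical 900-byte SRV response to a client advertising 4096 bytes.
    for bufsize in (1232, 512):
        print(bufsize, must_truncate(900, 4096, bufsize))
```

With a server-side cap of 512 bytes, the model never allows a UDP response above 512 bytes, whatever the client advertises, which is the property the fix relies on.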
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-06-29 04:19:45 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1966116    
Bug Blocks: 1970140    

Comment 1 Hongan Li 2021-06-07 02:15:07 UTC
Verified with 4.7.0-0.ci.test-2021-06-07-011518-ci-ln-435xggk-latest on a cluster launched by cluster-bot and passed.

$ oc get clusterversion
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.ci.test-2021-06-07-011518-ci-ln-435xggk-latest   True        False         25m     Cluster version is 4.7.0-0.ci.test-2021-06-07-011518-ci-ln-435xggk-latest

$ oc -n openshift-dns get cm/dns-default -oyaml
apiVersion: v1
data:
  Corefile: |
    # test
    mytest.ocp:5353 {
        forward . 192.168.1.1
        errors
        bufsize 512
    }
    .:5353 {
        bufsize 512
        errors
        health {
            lameduck 20s
        }
        ready
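The check performed by eye on the ConfigMap above (every server block carries `bufsize 512`) could be scripted roughly as follows. This is a minimal sketch with a hypothetical helper, `bufsize_by_server`, and a deliberately naive line-based parser. The sample text mirrors the output above, with a closing brace added because the pasted output is cut off after `ready`.

```python
from typing import Dict, Optional

# Sample Corefile text; the final "}" is added here because the
# output pasted in the comment is truncated.
SAMPLE_COREFILE = """\
# test
mytest.ocp:5353 {
    forward . 192.168.1.1
    errors
    bufsize 512
}
.:5353 {
    bufsize 512
    errors
    health {
        lameduck 20s
    }
    ready
}
"""


def bufsize_by_server(corefile: str) -> Dict[str, Optional[int]]:
    """Map each top-level server block to its bufsize value (None if absent)."""
    sizes: Dict[str, Optional[int]] = {}
    depth = 0
    server = None
    for raw in corefile.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            if depth == 0:
                # A brace opened at depth 0 starts a new server block.
                server = line[:-1].strip()
                sizes[server] = None
            depth += 1
            continue
        if line == "}":
            depth -= 1
            continue
        tokens = line.split()
        # Only directives directly inside a server block count.
        if depth == 1 and tokens[0] == "bufsize" and len(tokens) > 1:
            sizes[server] = int(tokens[1])
    return sizes


if __name__ == "__main__":
    result = bufsize_by_server(SAMPLE_COREFILE)
    for name, size in result.items():
        print(name, size)
    # Report any server block that is not capped at 512 bytes.
    bad = [name for name, size in result.items() if size != 512]
    print("all servers use bufsize 512:", not bad)
```

In practice the input would come from the `Corefile` key of the `dns-default` ConfigMap rather than from an inline string.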

Comment 3 Hongan Li 2021-06-10 11:41:39 UTC
PR has been merged into 4.7.0-0.nightly-2021-06-09-233032, moving to verified

Comment 4 Miciah Dashiel Butler Masters 2021-06-11 13:42:59 UTC
*** Bug 1970889 has been marked as a duplicate of this bug. ***

Comment 8 OpenShift Automated Release Tooling 2021-06-17 12:29:08 UTC
OpenShift engineering has decided not to ship Red Hat OpenShift Container Platform 4.7.17 due to a regression: https://bugzilla.redhat.com/show_bug.cgi?id=1973006. All the fixes that were part of 4.7.17 will now be part of 4.7.18, which is planned to be available in the candidate channel on June 23, 2021 and in the fast channel on June 28.

Comment 12 Stephen Greene 2021-06-22 13:50:34 UTC
*** Bug 1970577 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2021-06-29 04:19:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.18 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2502