Bug 1943826
| Summary: | CoreDNS caches NXDOMAIN responses for up to 900 seconds | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | OpenShift BugZilla Robot <openshift-bugzilla-robot> |
| Component: | Networking | Assignee: | Stephen Greene <sgreene> |
| Networking sub component: | DNS | QA Contact: | Arvind iyengar <aiyengar> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | urgent | CC: | aiyengar, amcdermo, aos-bugs, dofinn, hongli, jeder, otuchfel |
| Version: | 4.6 | Keywords: | ServiceDeliveryBlocker, ServiceDeliveryImpact |
| Target Milestone: | --- | ||
| Target Release: | 4.7.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause:
Bug 1936587 set the global CoreDNS cache max TTL to 900 seconds.
Consequence:
NXDOMAIN responses received from upstream resolvers are cached for up to 900 seconds.
Fix:
Explicitly cache negative DNS responses for a maximum of 30 seconds.
Result:
Resolving domains that are in the process of being published no longer takes a minimum of 15 minutes.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-04-12 23:22:57 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1943578 | ||
| Bug Blocks: | 1944245 | ||
Merged in 4.7.0-0.nightly-2021-04-01-052823; moving to VERIFIED per comment 1 (the bot should have verified this, but it seems to have missed this one).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.6 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1075
Verified in "4.7.0-0.ci.test-2021-03-31-042927-ci-ln-sv7y39b". With this payload, the additional 30-second TTL for negative records is set by default, alongside the 900-second TTL for positive records, in the cache plugin section:

-----
Defaulting container name to dns.
Use 'oc describe pod/dns-default-chmn9 -n openshift-dns' to see all of the containers in this pod.
.:5353 {
    errors
    health {
        lameduck 20s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus 127.0.0.1:9153
    forward . /etc/resolv.conf {
        policy sequential
    }
    cache 900 {
        denial 9984 30
    }
    reload
}
-----
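
For readers less familiar with the cache plugin, the effect of `cache 900 { denial 9984 30 }` can be pictured as a per-category cap on the TTL that comes back from upstream. The following is a minimal sketch of that idea, not CoreDNS source code: the function name `effective_ttl` and its parameters are hypothetical, and it ignores details such as any minimum TTL floor or cache capacity (the `9984` value).

```python
# Hypothetical model of TTL capping in a caching resolver.
# This is an illustration only, not the CoreDNS implementation.

POSITIVE_MAX_TTL = 900  # seconds, from "cache 900"
NEGATIVE_MAX_TTL = 30   # seconds, from "denial 9984 30"


def effective_ttl(upstream_ttl: int, is_negative: bool,
                  positive_cap: int = POSITIVE_MAX_TTL,
                  negative_cap: int = NEGATIVE_MAX_TTL) -> int:
    """Return how long a response would stay cached, in seconds.

    The upstream TTL is clamped to the cap for its category:
    negative responses (such as NXDOMAIN) use negative_cap,
    all other responses use positive_cap.
    """
    cap = negative_cap if is_negative else positive_cap
    return min(upstream_ttl, cap)


if __name__ == "__main__":
    # An NXDOMAIN whose upstream TTL allows 3600 seconds of negative caching.
    print(effective_ttl(3600, is_negative=True))                    # with the fix
    print(effective_ttl(3600, is_negative=True, negative_cap=900))  # before the fix
    print(effective_ttl(3600, is_negative=False))                   # positive record
```

Under these assumptions, a newly published domain that was first looked up too early stops returning a cached NXDOMAIN after at most 30 seconds rather than 900, while positive answers keep the 900-second cap.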