Bug 1747840

Summary: [Azure] the TTL for *.apps DNS record should not be zero on Azure platform
Product: OpenShift Container Platform Reporter: Hongan Li <hongli>
Component: Networking Assignee: Dan Mace <dmace>
Networking sub component: router QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, esimard
Version: 4.2.0   
Target Milestone: ---   
Target Release: 4.2.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2019-10-16 06:39:32 UTC Type: Bug

Description Hongan Li 2019-09-02 03:20:19 UTC
Description of problem:
I checked the TTL of the *.apps DNS record on different platforms and found that it is 0 seconds on Azure, but 60 seconds on AWS and 600 seconds on GCP.

A TTL of 0 seconds may cause performance issues, because resolvers cannot cache the record and must query upstream on every lookup. I have found that name resolution sometimes fails when curling a route repeatedly on Azure.
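
To illustrate the cost of a zero TTL, here is a minimal sketch of a caching resolver. The `TTLCache` class, the `count_upstream_queries` helper, the example name and the 192.0.2.10 address are all hypothetical and only model the caching rule; this is not the resolver used by the cluster.

```python
class TTLCache:
    """Toy DNS cache (hypothetical, for illustration only)."""

    def __init__(self, upstream):
        self.upstream = upstream      # callable: name -> (address, ttl_seconds)
        self.entries = {}             # name -> (address, expires_at)
        self.upstream_queries = 0

    def resolve(self, name, now):
        cached = self.entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]          # entry still fresh: served from cache
        # Cache miss or expired entry: ask upstream and remember the answer.
        self.upstream_queries += 1
        address, ttl = self.upstream(name)
        self.entries[name] = (address, now + ttl)
        return address


def count_upstream_queries(ttl, lookups=60, interval=1):
    """Resolve one name every `interval` seconds and count upstream queries."""
    cache = TTLCache(lambda name: ("192.0.2.10", ttl))
    for i in range(lookups):
        cache.resolve("console.apps.example.test", now=i * interval)
    return cache.upstream_queries


if __name__ == "__main__":
    for ttl in (0, 30, 60):
        print(ttl, count_upstream_queries(ttl))
```

With 60 lookups one second apart, a TTL of 0 sends all 60 lookups upstream, while a TTL of 30 needs 2 upstream queries and a TTL of 60 needs 1.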

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-31-102058

How reproducible:
100%

Steps to Reproduce:
1. Install a 4.2 cluster on Azure.
2. Run: dig console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com


Actual results:
$ dig console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com

; <<>> DiG 9.11.9-RedHat-9.11.9-1.fc30 <<>> console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35106
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com. IN A

;; ANSWER SECTION:
console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com. 0 IN A 52.141.217.141

;; Query time: 52 msec
;; SERVER: 10.72.17.5#53(10.72.17.5)
;; WHEN: Mon Sep 02 11:03:25 CST 2019
;; MSG SIZE  rcvd: 122


Expected results:
The *.apps record should use a TTL of 60 or 300 seconds.

Additional info:
Not sure whether 60s or 300s is better, but we should keep the same default value on all platforms if possible.
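
The manual dig check in this report could be scripted to compare TTLs across platforms. Below is a minimal sketch, assuming dig's default answer layout (name, TTL, class, type, data); the `parse_answer_ttls` helper is hypothetical, and `SAMPLE` is trimmed from the output in "Actual results".

```python
SAMPLE = """\
;; QUESTION SECTION:
;console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com. IN A

;; ANSWER SECTION:
console-openshift-console.apps.hongli-az058.qe.azure.devcluster.openshift.com. 0 IN A 52.141.217.141

;; Query time: 52 msec
"""


def parse_answer_ttls(dig_output):
    """Return {(name, record_type): ttl} for records in dig's ANSWER SECTION."""
    ttls = {}
    in_answer = False
    for line in dig_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(";; ANSWER SECTION:"):
            in_answer = True
            continue
        if not in_answer:
            continue
        if not stripped or stripped.startswith(";"):
            in_answer = False         # blank line or next section ends the answers
            continue
        fields = stripped.split()
        # Expected layout: name, TTL, class, type, data...
        if len(fields) >= 5 and fields[1].isdigit():
            name, ttl, _record_class, record_type = fields[:4]
            ttls[(name, record_type)] = int(ttl)
    return ttls


if __name__ == "__main__":
    for (name, record_type), ttl in parse_answer_ttls(SAMPLE).items():
        print(name, record_type, ttl)
```

Note that the TTL seen through a caching resolver counts down from the configured value, so the authoritative name server should be queried when the exact setting matters.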

Comment 2 Dan Mace 2019-09-03 20:40:57 UTC
Great catch, thanks for the report. We've chosen 30s as the new default, and existing records should get migrated during an upgrade (except for AWS, where we don't control the TTL in this case, because the TTL for alias records in Route53 is not configurable).
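
The rule described in this comment can be sketched as follows. This is only an illustration of the stated behavior (30s default, AWS alias records left alone); the names `NEW_DEFAULT_TTL`, `FIXED_TTL_PLATFORMS`, `target_ttl` and `needs_update` are hypothetical and are not taken from the operator's code.

```python
# Assumption: values come from the comment above; all names are hypothetical.
NEW_DEFAULT_TTL = 30              # seconds
FIXED_TTL_PLATFORMS = {"aws"}     # alias record TTL is not configurable there


def target_ttl(platform):
    """Return the TTL the *.apps record should have, or None if it cannot be set."""
    if platform.lower() in FIXED_TTL_PLATFORMS:
        return None
    return NEW_DEFAULT_TTL


def needs_update(platform, current_ttl):
    """True when an existing record should be rewritten during an upgrade."""
    wanted = target_ttl(platform)
    return wanted is not None and current_ttl != wanted


if __name__ == "__main__":
    for platform, ttl in (("azure", 0), ("gcp", 600), ("aws", 60)):
        print(platform, ttl, needs_update(platform, ttl))
```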

Comment 4 Hongan Li 2019-09-09 03:02:35 UTC
Verified with 4.2.0-0.nightly-2019-09-08-180038; the issue has been fixed.

$ dig console-openshift-console.apps.hongli-az038.qe.azure.devcluster.openshift.com

; <<>> DiG 9.11.9-RedHat-9.11.9-1.fc30 <<>> console-openshift-console.apps.hongli-az038.qe.azure.devcluster.openshift.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37063
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;console-openshift-console.apps.hongli-az038.qe.azure.devcluster.openshift.com. IN A

;; ANSWER SECTION:
console-openshift-console.apps.hongli-az038.qe.azure.devcluster.openshift.com. 30 IN A 13.86.100.124

The TTL has also been changed to 30s on GCP:

$ dig console-openshift-console.apps.hongli-gcp038.qe.gcp.devcluster.openshift.com

; <<>> DiG 9.11.9-RedHat-9.11.9-1.fc30 <<>> console-openshift-console.apps.hongli-gcp038.qe.gcp.devcluster.openshift.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 293
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;console-openshift-console.apps.hongli-gcp038.qe.gcp.devcluster.openshift.com. IN A

;; ANSWER SECTION:
console-openshift-console.apps.hongli-gcp038.qe.gcp.devcluster.openshift.com. 30 IN A 35.225.130.254

Comment 5 errata-xmlrpc 2019-10-16 06:39:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922