Bug 1901648

Summary: "do you need to set up custom dns" tooltip inaccurate
Product: OpenShift Container Platform Reporter: Erik M Jacobs <ejacobs>
Component: Networking Assignee: Miheer Salunke <misalunk>
Networking sub component: router QA Contact: Arvind iyengar <aiyengar>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: high CC: amcdermo, aos-bugs, evadla, hongli, jokerman, mmasters, sgreene, spadgett, wking
Version: 4.6   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The issue arose in the following use case:
1. Create an Ingress with a custom domain.
2. The OpenShift Ingress controller updates the Ingress status with the router canonical hostname.
3. Use external-dns to sync with Route 53.
Consequence: The canonical router hostname did not exist in DNS, because OpenShift does not create it. OpenShift creates the *.apps.<cluster_name>.<base_domain> DNS record, not apps.<cluster_name>.<base_domain>, so the canonical router hostname was wrong.
Fix: The canonical router hostname is now set to router-default.apps.<cluster_name>.<base_domain>.
Result: The canonical router hostname is router-default.apps.<cluster_name>.<base_domain>, which falls under the *.apps wildcard record that OpenShift creates.
Release note: Administrators whose automation takes the canonical host name and prepends a wildcard or a subdomain should be aware that the canonical router hostname is now set to router-<ingress-controller-name>.<ingress-controller-domain>, for example router-default.apps.<cluster_name>.<base_domain>. The format is router- + ingress controller name + . + ingress controller domain name.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 22:34:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Erik M Jacobs 2020-11-25 18:01:36 UTC
There are two tooltips on the Route page in the management console.

Router Canonical Hostname --

The documentation for UPI DNS requirements, and the IPI installer itself, do not do anything with the router canonical hostname. While the console says this hostname is "optional", it's not referred to anywhere, and likely is not configured in ANY OpenShift environment because we don't even suggest doing so:

https://docs.openshift.com/container-platform/4.6/installing/installing_bare_metal/installing-bare-metal.html#installation-dns-user-infra_installing-bare-metal

The only thing that is documented and guaranteed is a *.apps DNS entry, which is created by the ingress operator (in IPI) or by the user (in UPI).

The second tooltip is "Do you need to set up custom DNS?"

The entry reads as follows, for a sample route:

---
To use a custom route, you must update your DNS provider by creating a canonical name (CNAME) record. Your CNAME record should point to your custom domain console-latest.apps.cluster-195f.195f.example.opentlc.com, to the OpenShift canonical router hostname, apps.cluster-195f.195f.example.opentlc.com, as the alias.
---

Note that, as above, apps itself is not configured. The wildcard is configured. Routes are, by default, exposed as:

{servicename}-{namespacename}.{cluster-base-domain}

Since this {servicename}-{namespacename} paradigm is guaranteed to work because of the requirements/prereqs for a *.apps wildcard, the above text should be amended to recommend the CNAME point to the default route that would be created. Otherwise, it is extremely/highly likely that the instructions, as provided, will not work.
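
The difference between the wildcard record and the bare apps name can be sketched in a few lines of Python. This is a simplified illustration, not how any DNS server is implemented: the helper name `covered_by_wildcard` is hypothetical, and it ignores the case where a closer explicit record blocks the wildcard.

```python
def covered_by_wildcard(name: str, wildcard: str) -> bool:
    """Return True if `name` would be answered by the wildcard record `wildcard`.

    Simplified sketch: a wildcard such as *.apps.example.com covers names
    that have at least one extra label in front of apps.example.com, but
    never apps.example.com itself.
    """
    if not wildcard.startswith("*."):
        raise ValueError("wildcard must start with '*.'")
    parent = wildcard[2:].rstrip(".").lower()
    name = name.rstrip(".").lower()
    # Require the parent as a suffix plus at least one non-empty label before it.
    return name.endswith("." + parent) and len(name) > len(parent) + 1


wildcard = "*.apps.cluster-195f.195f.example.opentlc.com"

# A default route hostname sits under the wildcard.
print(covered_by_wildcard(
    "console-latest.apps.cluster-195f.195f.example.opentlc.com", wildcard))

# The bare "apps" name shown in the tooltip does not.
print(covered_by_wildcard(
    "apps.cluster-195f.195f.example.opentlc.com", wildcard))
```

The first call reports that the default route name is covered. The second reports that the hostname in the tooltip is not, which is why a CNAME pointing at it is unlikely to resolve.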

This also applies to the dev console.

Comment 1 Harish Govindarajulu 2020-12-04 19:36:03 UTC
Did not get a chance to look into this. Will work on it in the next sprint

Comment 2 Jakub Hadvig 2020-12-23 16:13:14 UTC
We did not have time to fix this issue this sprint. Will reevaluate and try to fix in next sprint.

Comment 3 Samuel Padgett 2021-01-06 17:07:06 UTC
Andrew, can you give us guidance on what to show here?

Erik, note that we *only* show this message if a router canonical hostname is set:

https://github.com/spadgett/console/blob/fc81b5feb552de8b2771d155ed9719e8a84f93c5/frontend/public/components/routes.tsx#L305

Comment 4 Erik M Jacobs 2021-01-07 17:12:45 UTC
But we show that if it is set *in the cluster*, and not *in the world*.

On the systems where I've seen that, a `dig` of the displayed hostname doesn't actually resolve (NXDOMAIN).

I don't think the installer actually sets a canonical hostname in the DNS provider with IPI. I'm checking, though.

Comment 5 Erik M Jacobs 2021-01-07 19:44:54 UTC
I just validated that the router's canonical hostname DNS entry is *not* created during an AWS IPI install:

---
dig apps.cluster-57b8.57b8.example.opentlc.com

; <<>> DiG 9.11.25-RedHat-9.11.25-2.fc33 <<>> apps.cluster-57b8.57b8.example.opentlc.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 23587
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;apps.cluster-57b8.57b8.example.opentlc.com. IN A

;; Query time: 175 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Thu Jan 07 14:38:15 EST 2021
;; MSG SIZE  rcvd: 71
---

This is a 4.6.3 cluster. I am not seeing the "custom dns" tooltip, but clicking the "Router Canonical Hostname" words brings up:

---
CanonicalHostname is the external host name for the router that can be used as a CNAME for the host requested for this route. This value is optional and may not be set in all cases.
---

I've validated that's because this particular route uses the default name for the route hostname, and does not have a "custom" route hostname.

I created a route for "www.cnn.com" and then I did, in fact, see the "Do you need to set up custom DNS?" tooltip. As expected, it makes the same claim about canonical hostname:

---
To use a custom route, you must update your DNS provider by creating a canonical name (CNAME) record. Your CNAME record should point to your custom domain www.cnn.com, to the OpenShift canonical router hostname, apps.cluster-57b8.57b8.example.opentlc.com, as the alias.
---

There is no DNS entry for "apps.cluster...", so this would never work.
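
For anyone who wants to script the same check rather than run `dig` by hand, a minimal Python sketch follows. It assumes that a failed lookup through the local resolver is a good enough stand-in for NXDOMAIN. The function name `resolves` and the injectable `lookup` parameter are illustrative choices, not part of any OpenShift tooling.

```python
import socket
from typing import Callable


def resolves(hostname: str,
             lookup: Callable[..., object] = socket.getaddrinfo) -> bool:
    """Return True if `hostname` resolves through the given lookup function.

    `lookup` defaults to socket.getaddrinfo; it is injectable so the check
    can be exercised without network access.
    """
    try:
        lookup(hostname, None)
    except socket.gaierror:
        # The resolver could not find the name (or the lookup failed).
        return False
    return True


# Example usage against the hostnames from this comment (needs real DNS):
#   resolves("apps.cluster-57b8.57b8.example.opentlc.com")
#   resolves("anything.apps.cluster-57b8.57b8.example.opentlc.com")
```

Note that `socket.gaierror` also covers temporary resolver failures, so a False result here is weaker evidence than the explicit NXDOMAIN status in the `dig` output above.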

Comment 6 Samuel Padgett 2021-01-08 14:03:45 UTC
My reading of the API doc for `routerCanonicalHostname` is that it's meant exactly for this purpose. If the value is getting set automatically to a hostname that won't work, we need the Network Edge team to take a look.

Andrew, if this is a console bug, let us know. We'd need guidance on what to do instead. Thanks!

Comment 7 Andrew McDermott 2021-03-23 17:30:30 UTC
*** Bug 1940545 has been marked as a duplicate of this bug. ***

Comment 11 Arvind iyengar 2021-05-17 07:29:20 UTC
Verified in "4.8.0-0.ci.test-2021-05-17-053804-ci-ln-c6w7zi2-latest". With this payload, the `routerCanonicalHostname` field in route resources is populated with a name in the `router-default.apps.<clustername>.<base-domain>` format, as intended.
------
oc get clusterversion             
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2021-05-17-053804-ci-ln-c6w7zi2-latest   True        False         70m     Cluster version is 4.8.0-0.ci.test-2021-05-17-053804-ci-ln-c6w7zi2-latest


oc get route service-secure -o yaml  
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    openshift.io/host.generated: "true"
  creationTimestamp: "2021-05-17T07:03:40Z"
  labels:
    name: service-secure
  name: service-secure
  namespace: test1
  resourceVersion: "49866"
  uid: 19a3e019-8c59-41df-b25d-6e9361233ab0
spec:
  host: service-secure-test1.apps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com
  port:
    targetPort: https
  to:
    kind: Service
    name: service-secure
    weight: 100
  wildcardPolicy: None
status:
  ingress:
  - conditions:
    - lastTransitionTime: "2021-05-17T07:03:40Z"
      status: "True"
      type: Admitted
    host: service-secure-test1.apps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com
    routerCanonicalHostname: router-default.apps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com <--
    routerName: default
    wildcardPolicy: None
  - conditions:
    - lastTransitionTime: "2021-05-17T07:03:40Z"
      status: "True"
      type: Admitted
    host: service-secure-test1.apps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com
    routerCanonicalHostname: router-internalapps.internalapps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com <--
    routerName: internalapps
    wildcardPolicy: None
------
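
To pull the same fields out programmatically, a small Python sketch is shown below. It assumes the route has already been loaded into a dict, for example from `oc get route service-secure -o json`. The helper name `canonical_hostnames` is hypothetical, and the sample dict is a trimmed copy of the status above.

```python
from typing import Dict


def canonical_hostnames(route: dict) -> Dict[str, str]:
    """Map each routerName to its routerCanonicalHostname, if one is set."""
    result: Dict[str, str] = {}
    for ingress in route.get("status", {}).get("ingress", []):
        canonical = ingress.get("routerCanonicalHostname")
        if canonical:  # the field is optional and may be absent
            result[ingress["routerName"]] = canonical
    return result


# Trimmed copy of the route status shown above.
route = {
    "status": {
        "ingress": [
            {
                "routerName": "default",
                "routerCanonicalHostname": "router-default.apps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com",
            },
            {
                "routerName": "internalapps",
                "routerCanonicalHostname": "router-internalapps.internalapps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com",
            },
        ]
    }
}

for router, hostname in canonical_hostnames(route).items():
    print(f"{router}: {hostname}")
```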

Comment 12 Miciah Dashiel Butler Masters 2021-06-01 18:41:50 UTC
The fix warrants a release note in case any administrators have automation that takes the canonical host name and prepends a wildcard or a subdomain.  Miheer, please make sure we don't lose track of that.
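
As a rough illustration of why such automation is affected, the Python sketch below builds the new canonical hostname from the format described in this bug (router- + ingress controller name + . + ingress controller domain) and shows what happens when a wildcard is prepended. Both helper names are hypothetical. Recovering the domain by dropping the first label is an assumption that only holds while the hostname follows this format.

```python
def canonical_router_hostname(controller_name: str, controller_domain: str) -> str:
    """Build the canonical hostname in the post-fix format."""
    return f"router-{controller_name}.{controller_domain}"


def ingress_domain_from_canonical(canonical: str) -> str:
    """Recover the ingress domain by dropping the leading 'router-<name>' label.

    Assumption: `canonical` follows the post-fix format; this is not an
    official API and would return the wrong value for pre-fix hostnames.
    """
    first_label, _, rest = canonical.partition(".")
    if not first_label.startswith("router-") or not rest:
        raise ValueError(f"unexpected canonical hostname: {canonical!r}")
    return rest


domain = "apps.ci-ln-c6w7zi2-f76d1.origin-ci-int-gce.dev.openshift.com"
canonical = canonical_router_hostname("default", domain)

# Automation written against the old value (the bare domain) would now produce
# a wildcard under router-default rather than under the ingress domain.
print("*." + canonical)

# Stripping the first label yields the wildcard the automation likely intended.
print("*." + ingress_domain_from_canonical(canonical))
```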

Comment 14 Hongan Li 2021-06-07 09:00:05 UTC
The PR has been merged into 4.8.0-0.nightly-2021-06-03-014152 but robot didn't move it to verified, so moving to verified manually.

Also retested with 4.8.0-0.nightly-2021-06-07-023220, and the CanonicalHostname has been updated to router-<ingress-controller-name>.apps.<cluster_name>.<base_domain>

see: 
routerCanonicalHostname: router-default.apps.ci-ln-x6hpgik-f76d1.origin-ci-int-gce.dev.openshift.com

Comment 17 errata-xmlrpc 2021-07-27 22:34:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438