Bug 1853418

Summary: 'openshift-install' should not throw an x509 error if the user provides a 'baseDomain' with a trailing dot
Product: OpenShift Container Platform Reporter: David Roble <droble>
Component: Installer Assignee: Pierre Prinetti <pprinett>
Installer sub component: OpenShift on OpenStack QA Contact: weiwei jiang <wjiang>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: low CC: annair, egarcia, jflynn, m.andre, pprinett, swasthan, tsze, xtian
Version: 4.6 Keywords: Reopened, UpcomingSprint
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:11:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1847179    
Attachments:
Description | Flags
SSL error viewed via Firefox | none
verified log with basedomain have dot | none

Comment 2 David Roble 2020-07-06 19:01:15 UTC
On the bootstrap node I can resolve the API endpoint hostname and ping the endpoint, and I am able to reproduce the SSL certificate error via the CLI using curl:

[core@interop-hwpzm-bootstrap ~]$ ping api.interop.interop.oasis.css-qe.com
PING api.interop.interop.oasis.css-qe.com (172.24.0.5) 56(84) bytes of data.
64 bytes from api-int.interop.interop.oasis.css-qe.com (172.24.0.5): icmp_seq=1 ttl=64 time=0.092 ms
64 bytes from api-int.interop.interop.oasis.css-qe.com (172.24.0.5): icmp_seq=2 ttl=64 time=0.440 ms
64 bytes from api-int.interop.interop.oasis.css-qe.com (172.24.0.5): icmp_seq=3 ttl=64 time=0.060 ms
64 bytes from api-int.interop.interop.oasis.css-qe.com (172.24.0.5): icmp_seq=4 ttl=64 time=0.102 ms
64 bytes from api-int.interop.interop.oasis.css-qe.com (172.24.0.5): icmp_seq=5 ttl=64 time=0.153 ms
64 bytes from api-int.interop.interop.oasis.css-qe.com (172.24.0.5): icmp_seq=6 ttl=64 time=0.054 ms
^C
--- api.interop.interop.oasis.css-qe.com ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 133ms
rtt min/avg/max/mdev = 0.054/0.150/0.440/0.133 ms

[core@interop-hwpzm-bootstrap ~]$ curl https://api.interop.interop.oasis.css-qe.com:6443
curl: (60) SSL certificate problem: self signed certificate in certificate chain
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

Comment 3 David Roble 2020-07-06 19:02:04 UTC
Created attachment 1700064 [details]
SSL error viewed via Firefox

Comment 4 David Roble 2020-07-07 20:07:07 UTC
I believe I have found the issue.  In my 'install-config.yaml' I had the following domain config:

baseDomain: interop.oasis.css-qe.com.


Notice the trailing dot in the domain name.  I am using Google Cloud DNS for this project, and I copied the hostname I am using from the Google web interface, and it included the trailing dot.  This never jumped out at me, since the trailing dot is used in every DNS lookup behind the scenes, but is typically hidden from the user.  Web browsers and DNS utilities like 'host' and 'dig' work with the dot just fine.

At this point I think that 'openshift-install' should not throw an x509 error if the user provides a 'baseDomain' with a trailing dot, since in my view that is a valid DNS name.
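
For illustration, the behaviour requested above could be sketched roughly as follows. This is a minimal sketch in Go, not the installer's actual code: the helper name `normalizeBaseDomain` is hypothetical, and it assumes that dropping a single trailing dot (the DNS root label) before the domain is used to build hostnames and certificates is enough.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeBaseDomain is a hypothetical helper (not part of openshift-install).
// It trims surrounding whitespace and removes one trailing dot, so that a
// fully qualified name such as "example.com." is treated like "example.com".
func normalizeBaseDomain(baseDomain string) string {
	return strings.TrimSuffix(strings.TrimSpace(baseDomain), ".")
}

func main() {
	// The value from the install-config.yaml in this report.
	fmt.Println(normalizeBaseDomain("interop.oasis.css-qe.com."))
}
```

A domain without a trailing dot passes through unchanged, so applying the helper unconditionally to user input would be harmless under these assumptions.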

Comment 5 Pierre Prinetti 2020-07-09 14:21:06 UTC
Thanks for reporting.

The installer could indeed ignore the trailing dot.

Lowering the priority as it is not considered a functional issue.

Comment 8 Anoop 2020-07-09 18:57:35 UTC
This issue was observed with 4.6. The patch you have linked here is for 4.4. Not sure why this is 'NOTABUG' for 4.6.

Comment 9 egarcia 2020-07-10 18:43:07 UTC
Ah, my bad. The linked code does not actually sanitize the input downstream in the installer.

Comment 14 To Hung Sze 2020-09-10 14:01:01 UTC
Note to self, to be covered in test case ocp-24404 for gcp regression.

Comment 17 weiwei jiang 2020-09-28 03:38:22 UTC
Checked with 4.6.0-0.nightly-2020-09-27-075304, and verified.

The verification log is attached.

Comment 18 weiwei jiang 2020-09-28 03:39:25 UTC
Created attachment 1717173 [details]
verified log with basedomain have dot

Comment 19 weiwei jiang 2020-09-28 03:40:43 UTC
The install-config.yaml used for verification:

---
apiVersion: v1
baseDomain: wjiang.shiftstack.io.
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: bmocp
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.0.0/18
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    apiVIP: 192.168.0.5
    cloud: shiftstack
    computeFlavor: m4.xlarge
    externalDNS: null
    externalNetwork: public
    ingressVIP: 192.168.0.7
    lbFloatingIP: 10.46.43.177
publish: External
pullSecret: 'xxxxx'
sshKey: |
  ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCbI/Ls4UkisLh0bz/YHLdw4N8edQ0cQzN9U92DR1lgqA7/Ex0TM4UYmpqPzAaGziURiL4m1Z2s9w7HU9OsYU9c3LrUWuxXiGL7kUdnhZ0haV5AwZqQtoOF+nWToQ4rsrGNhJatH8Bh+hKOocf3LmsB8tAOuAh2WZbv7KHRFoCH/oFRNHHPR979/b2jrMJQJgMZOU5OzwM4/jNo0RfXNHQPAjdn1sJVKfsUKDCdrhwKasi/viRf/JM2f+A7BLVeIl4+92XJU21WTQp0OmzBm47vCi+k7MKNh3aEVpnGVhQOTEBdWjTT/3QGxoEDrOvzx7omPDNusXj5l84Pdeg6fmrx

Comment 22 errata-xmlrpc 2020-10-27 16:11:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196