Bug 1467790 - Start and enable node failed due to node has 64 characters hostname
Status: POST
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assigned To: Michael Gugino
QA Contact: Johnny Liu
Depends On:
Blocks:
Reported: 2017-07-05 03:29 EDT by Wenkai Shi
Modified: 2017-08-16 08:51 EDT (History)

Type: Bug
Description Wenkai Shi 2017-07-05 03:29:53 EDT
Description of problem:
The "Start and enable node" task fails when the node has a 64-character hostname, which breaks the installation. According to https://bugzilla.redhat.com/show_bug.cgi?id=1211856#c7, it seems the installer should fail early if an instance's hostname has more than 64 characters. However, atomic-openshift-node.service requires a name of no more than 63 characters, so a 64-character hostname passes that point and only fails later, when the node service starts.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.133-1.git.0.950bb48.el7

How reproducible:
100%

Steps to Reproduce:
1. Prepare instances with a 64-character hostname
2. Install OCP

Actual results:
# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
...
TASK [openshift_node : Start and enable node] **********************************
Wednesday 05 July 2017  06:05:01 +0000 (0:00:00.075)       0:13:13.322 ******** 
FAILED - RETRYING: TASK: openshift_node : Start and enable node (1 retries left).
fatal: [qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com]: FAILED! => {
    "attempts": 1, 
    "changed": false, 
    "failed": true
}

MSG:

Unable to start service atomic-openshift-node: Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.
...

Expected results:
Installation succeeds, or the installer fails early with a clear message about the hostname length.

Additional info:
# journalctl -xe -u atomic-openshift-node
...
Jul 05 06:53:56 qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com atomic-openshift-node[35512]: E0705 06:53:56.800726   35512 kubelet_node_status.go:101] Unable to register node "qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com" with API server: Node "qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com" is invalid: metadata.labels: Invalid value: "qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com": must be no more than 63 characters
...

# echo -n "qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com" | wc -c 
64
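
The journal message above points at metadata.labels, which is consistent with the kubelet copying the node name into the kubernetes.io/hostname label, and Kubernetes label values are limited to 63 characters. A minimal sketch of that rule in Python follows. The function name label_value_problems and the message wording are illustrative assumptions, not code from Kubernetes or openshift-ansible.

```python
import re

# Kubernetes label values may be at most 63 characters long. A non-empty value
# must start and end with an alphanumeric character and may contain
# '-', '_' and '.' in between.
MAX_LABEL_VALUE_LEN = 63
LABEL_VALUE_RE = re.compile(r"([A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?)?")


def label_value_problems(value):
    """Return a list of reasons why `value` is not a valid label value.

    Hypothetical helper for illustration only; an empty list means valid.
    """
    problems = []
    if len(value) > MAX_LABEL_VALUE_LEN:
        problems.append(
            "must be no more than %d characters (got %d)"
            % (MAX_LABEL_VALUE_LEN, len(value))
        )
    if LABEL_VALUE_RE.fullmatch(value) is None:
        problems.append(
            "must start and end with an alphanumeric character and contain "
            "only alphanumerics, '-', '_' or '.'"
        )
    return problems


# Hostname taken from this report.
hostname = "qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com"
print(len(hostname), label_value_problems(hostname))
```

For the hostname in this report only the length rule is violated; the same name with one character fewer would pass both checks.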
Comment 1 Tim Bielawa 2017-07-10 11:02:21 EDT
I'm wondering, is this actually a bug/limitation in OCP/Kube rather than the installer?
Comment 2 Scott Dodson 2017-07-20 10:37:29 EDT
Let's handle this in the sanitize_inventory role.
Comment 3 Michael Gugino 2017-08-15 21:08:47 EDT
While I agree it may be an issue in OpenShift, I have submitted the following PR to work around it on our end:

https://github.com/openshift/openshift-ansible/pull/5100

I don't believe sanitize_inventory is the ideal place for this check, as we can potentially populate other names as the nodename.
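
To illustrate the point about other names: a check that only looks at the inventory hostname can miss the name the node actually registers with. The sketch below is not the code from the PR above. It assumes a hypothetical data shape (inventory name mapped to host variables plus the discovered hostname) and assumes that an openshift_hostname override, when set, takes precedence over the discovered hostname.

```python
MAX_NODE_NAME_LEN = 63  # limit reported by the API server in this bug


def effective_node_name(host_vars, discovered_hostname):
    """Pick the name the node would register with.

    Assumption for illustration: an 'openshift_hostname' override wins over
    the hostname discovered on the machine.
    """
    return host_vars.get("openshift_hostname") or discovered_hostname


def find_overlong_node_names(hosts):
    """Return {inventory_name: node_name} for every name over the limit.

    `hosts` maps an inventory name to a (host_vars, discovered_hostname)
    tuple; this shape is hypothetical and chosen only for the example.
    """
    offenders = {}
    for inventory_name, (host_vars, discovered) in hosts.items():
        name = effective_node_name(host_vars, discovered)
        if len(name) > MAX_NODE_NAME_LEN:
            offenders[inventory_name] = name
    return offenders


hosts = {
    # Short inventory name, but the discovered hostname is 64 characters.
    "node1": ({}, "qe-weshi-master-registry-router-nfs-1.westus2.cloudapp.azure.com"),
    # Long discovered hostname, but a short override is configured.
    "node2": ({"openshift_hostname": "node2.example.com"}, "x" * 70),
}
for inventory_name, name in find_overlong_node_names(hosts).items():
    print("%s: node name %r has %d characters, limit is %d"
          % (inventory_name, name, len(name), MAX_NODE_NAME_LEN))
```

In this example node1 is flagged even though its inventory name is short, while node2 passes because of the override, which is why the check needs to run on the resolved node name.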
