Bug 1878065

Summary: GCP IPI installation fails when the cluster name has 19 characters
Product: OpenShift Container Platform Reporter: Yang Yang <yanyang>
Component: RHCOS Assignee: Ben Howard <behoward>
Status: CLOSED DUPLICATE QA Contact: Michael Nguyen <mnguyen>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.6 CC: aos-bugs, bbreard, behoward, imcleod, jligon, jokerman, kgarriso, nstielau
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-09-14 19:33:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Journalctl -xe none

Description Yang Yang 2020-09-11 08:50:16 UTC
Created attachment 1714529
Journalctl -xe

Description of problem:
Worker nodes do not join the cluster. The kubelet reports that the node name exceeds 63 characters and that the node is not found:

Sep 11 08:38:59 auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal hyperkube[39487]: E0911 08:38:59.117233   39487 kubelet_node_status.go:92] Unable to register node "auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal" with API server: Node "auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal" is invalid: metadata.labels: Invalid value: "auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal": must be no more than 63 characters
Sep 11 08:38:59 auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal hyperkube[39487]: I0911 08:38:59.089636   39487 nodeinfomanager.go:403] Failed to publish CSINode: nodes "auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal" not found
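
The length limit in the first log line can be confirmed with a quick check. This is only a sketch for illustration: the 63-character limit is taken from the error message itself, the node name is copied from the log, and the helper name `exceeds_label_limit` is hypothetical, not part of any OpenShift or Kubernetes tooling.

```python
# Limit quoted in the kubelet error: "must be no more than 63 characters".
MAX_LABEL_VALUE_LEN = 63

# Node name copied verbatim from the kubelet log above.
node_name = "auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal"


def exceeds_label_limit(value: str, limit: int = MAX_LABEL_VALUE_LEN) -> bool:
    """Return True if value is too long to be accepted as a label value.

    Hypothetical helper for illustration; it only checks length, not the
    allowed character set.
    """
    return len(value) > limit


print(len(node_name), exceeds_label_limit(node_name))  # prints "64 True"
```

The worker's fully qualified name is 64 characters, one over the limit, which matches the rejection in the log.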


Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-09-173545

How reproducible:
Always

Steps to Reproduce:
1. Install a GCP IPI cluster with cluster name auto-yanyang-935787

Actual results:
Installation fails
level=info msg="Cluster operator monitoring Available is False with : "
level=fatal msg="failed to initialize the cluster: Some cluster operators are still updating: authentication, console, image-registry, ingress, kube-storage-version-migrator, monitoring"

Worker nodes have not joined the cluster:
# oc get node 
NAME                                                         STATUS   ROLES    AGE     VERSION
auto-yanyang-935787-t2hhk-master-0.c.openshift-qe.internal   Ready    master   4h38m   v1.19.0-rc.2+fc4c489
auto-yanyang-935787-t2hhk-master-1.c.openshift-qe.internal   Ready    master   4h38m   v1.19.0-rc.2+fc4c489
auto-yanyang-935787-t2hhk-master-2.c.openshift-qe.internal   Ready    master   4h38m   v1.19.0-rc.2+fc4c489
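
A rough sketch of why the masters register while the workers do not: it compares the length of one master name from the listing above and the worker name from the kubelet log against the 63-character limit quoted in the error. The function name `headroom` is hypothetical, and the closing remark about shorter cluster names assumes the other name components keep the lengths seen here.

```python
# Limit quoted in the kubelet error message.
MAX_LABEL_VALUE_LEN = 63

# Names copied from the "oc get node" output and the kubelet log.
master = "auto-yanyang-935787-t2hhk-master-0.c.openshift-qe.internal"
worker = "auto-yanyang-935787-t2hhk-worker-a-vxb7c.c.openshift-qe.internal"


def headroom(name: str, limit: int = MAX_LABEL_VALUE_LEN) -> int:
    """Return how many characters are left before the limit (negative if over)."""
    return limit - len(name)


for name in (master, worker):
    print(f"{len(name):3d} chars, headroom {headroom(name):3d}: {name}")
```

The master name has 5 characters to spare, while the worker name is 1 character over. Under the assumption stated above, a cluster name of 18 characters or fewer would keep the worker name within the limit.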

Expected results:
Installation passes

Additional info:

Comment 2 Ben Howard 2020-09-11 19:28:40 UTC
This should have been fixed. The attached journal is incomplete; can you post the _full_ system journal?

Also, what RHCOS image did you provision this IPI install with? Unless you were using an RHCOS 4.6 image, I could see the DHCP client in initramfs triggering this. Either way, I need a lot more information.

Comment 5 Yang Yang 2020-09-14 06:33:52 UTC
The full system journal is available at https://drive.google.com/file/d/1Xey1URfsuKGhULH3rX_l7aiTw3HnH7JI/view?usp=sharing
I'm using rhcos-46-82-202008260918-0-gcp-x86-64.

Comment 6 Ben Howard 2020-09-14 19:33:17 UTC

*** This bug has been marked as a duplicate of bug 1872885 ***

Comment 7 Yang Yang 2020-09-15 02:25:11 UTC
Ben Howard,

This bug was reported against 4.6, but the duplicate, bug 1872885, is against 4.4. Since you closed this one as a duplicate, does that mean it will only be fixed in 4.6?

Comment 8 Red Hat Bugzilla 2023-09-14 06:08:21 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days