Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1339086

Summary: GCE Cloud Provider not working - Delete old Node fails
Product: OpenShift Container Platform
Reporter: Lutz Lange <llange>
Component: Node
Assignee: Seth Jennings <sjenning>
Status: CLOSED DUPLICATE
QA Contact: DeShuai Ma <dma>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: agoldste, aos-bugs, jokerman, llange, mmccomas, wmeng
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-01 13:32:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lutz Lange 2016-05-24 06:25:07 UTC
Description of problem:
    I can't set the cloud-provider to gce, nor get access to the GCE PDs, with OSE 3.2.

    Back in OSE 3.1 it was possible to access GCE PDs without setting the cloud-provider to gce in master-config.yaml and node-config.yaml. I could not get that working with OSE 3.2. There might have been a misalignment in my config, as the master got the cloud-provider settings in my first run while node-config.yaml was lacking them. I went and corrected this in node-config.yaml.
This brings the cluster into a non-operational state.

Version-Release number of selected component (if applicable):
atomic-openshift-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-clients-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-master-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-sdn-ovs-3.2.0.44-1.git.0.a4463d9.el7.x86_64
tuned-profiles-atomic-openshift-node-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-node-3.2.0.44-1.git.0.a4463d9.el7.x86_64

How reproducible:
Configure an OSE cluster on GCE without setting cloud-provider to "gce".
Then change the config, adding the cloud-provider attributes for the master and nodes.

Actual results:
May 24 06:15:08 tmaster atomic-openshift-node: I0524 06:15:08.179277   43193 kubelet.go:1134] Attempting to register node tmaster.c.jens-walkthrough.internal
May 24 06:15:08 tmaster atomic-openshift-node: E0524 06:15:08.185229   43193 kubelet.go:1173] Previously "tmaster.c.jens-walkthrough.internal" had externalID "2844582603662585040"; now it is "tmaster.c.jens-walkthrough.internal"; will delete and recreate.
May 24 06:15:08 tmaster atomic-openshift-node: E0524 06:15:08.186261   43193 kubelet.go:1175] Unable to delete old node: User "system:node:tmaster.c.jens-walkthrough.internal" cannot delete nodes at the cluster scope

[root@tmaster ~]# oc get nodes
NAME                                  STATUS     AGE
tmaster.c.jens-walkthrough.internal   NotReady   1d
tnode1.c.jens-walkthrough.internal    NotReady   1d
tnode2.c.jens-walkthrough.internal    NotReady   1d
tnode3.c.jens-walkthrough.internal    NotReady   1d

Expected results:
All nodes should report Ready.

Comment 1 Weihua Meng 2016-05-24 08:13:33 UTC
I met the same thing with AWS before.
Did you follow these steps, and they did not work?
https://docs.openshift.org/latest/install_config/configuring_aws.html#aws-applying-configuration-changes
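The procedure behind that link amounts to evacuating each affected node, deleting its stale Node object, and restarting the node service so the kubelet re-registers with the new externalID. A dry-run sketch (the commands are only echoed here, not executed; the node name is one from this report):

```shell
# Hypothetical dry run: echo the documented steps rather than run them.
NODE="tnode1.c.jens-walkthrough.internal"

# 1. Stop scheduling onto the node and evacuate its pods.
echo "oadm manage-node $NODE --schedulable=false"
echo "oadm manage-node $NODE --evacuate"

# 2. Delete the stale Node object (its externalID no longer matches).
echo "oc delete node $NODE"

# 3. Restart the node service; the kubelet re-registers itself.
echo "systemctl restart atomic-openshift-node"
```

Repeat per node; the delete is what clears the "Unable to delete old node" error, since the node itself lacks cluster-scope delete permission.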

Comment 2 Lutz Lange 2016-05-24 09:26:24 UTC
Thank you for that pointer. The nodes came back and report ready now. I'll investigate if the storage part is working now.

Comment 3 Andy Goldstein 2016-05-24 14:46:39 UTC
Any update on this? Is storage working?

Comment 4 Lutz Lange 2016-05-24 16:55:28 UTC
Deleting the nodes removed all the labels as well. So now I have to relabel to get my workloads up and running...

Will work on this more tomorrow.
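For anyone hitting the same thing: labels can be re-applied per node with `oc label node`. A dry-run sketch (the label key/value is a made-up example, not one from this cluster):

```shell
# Hypothetical dry run: echo the relabel command instead of running it.
NODE="tnode1.c.jens-walkthrough.internal"

# region=primary is an illustrative label, not one from this report.
echo "oc label node $NODE region=primary"
```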

Comment 5 Lutz Lange 2016-05-26 15:19:10 UTC
I finally got my nodes configured....

But attaching GCE PDs does not work in my case. It looks like OSE (kube) tries to call the API with the node's full hostname instead of the short GCE instance name.

I do get this error :

May 26 15:15:40 tnode1 atomic-openshift-node: W0526 15:15:40.173126    1380 gce_util.go:183] Retrying attach for GCE PD "t-log" (retry count=3).
May 26 15:15:40 tnode1 atomic-openshift-node: E0526 15:15:40.623928    1380 gce_util.go:187] Error attaching PD "t-log": googleapi: Error 400: Invalid value 'tnode1.c.jens-walkthrough.internal'. Values must match the following regular expression: '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?', invalidParameter.

What is the recommended workaround for this?
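For what it's worth, the regular expression in that 400 error is the GCE instance-name pattern, and the node's FQDN can never match it because dots are not in the allowed character set. This can be checked directly (the pattern rewritten as an anchored POSIX ERE, since grep -E has no `(?:...)` groups):

```shell
# GCE instance-name pattern from the API error, as an anchored POSIX ERE.
RE='^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$'

# The FQDN the node reports -- the dots make it fail the pattern.
echo "tnode1.c.jens-walkthrough.internal" | grep -Eq "$RE" && echo match || echo "no match"

# The bare GCE instance name passes.
echo "tnode1" | grep -Eq "$RE" && echo match || echo "no match"
```

One plausible workaround, assuming the node-config.yaml nodeName field is honored here, would be registering each node under its short instance name (e.g. tnode1) instead of the FQDN, but I'm not certain that is the supported route.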

Comment 6 Andy Goldstein 2016-06-01 13:32:08 UTC
Lutz, based on your most recent comment, this is a dupe of bug 1318230

*** This bug has been marked as a duplicate of bug 1318230 ***