Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1339086

Summary: GCE Cloud Provider not working - Delete old Node fails
Product: OpenShift Container Platform
Reporter: Lutz Lange <llange>
Component: Node
Assignee: Seth Jennings <sjenning>
Status: CLOSED DUPLICATE
QA Contact: DeShuai Ma <dma>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: agoldste, aos-bugs, jokerman, llange, mmccomas, wmeng
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-01 13:32:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lutz Lange 2016-05-24 06:25:07 UTC
Description of problem:
    I can't set the cloud-provider to gce, nor get access to the GCE PDs, with OSE 3.2.

    Back in OSE 3.1 it was possible to access GCE PDs without setting the cloud-provider to gce in master-config.yaml and node-config.yaml. I could not get that working with OSE 3.2. There might have been a misalignment in my config, as the master got the cloud-provider settings in my first run while node-config.yaml was lacking them. I went and corrected this in node-config.yaml.
This brings the cluster into a non-operational state.

Version-Release number of selected component (if applicable):
atomic-openshift-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-clients-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-master-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-sdn-ovs-3.2.0.44-1.git.0.a4463d9.el7.x86_64
tuned-profiles-atomic-openshift-node-3.2.0.44-1.git.0.a4463d9.el7.x86_64
atomic-openshift-node-3.2.0.44-1.git.0.a4463d9.el7.x86_64

How reproducible:
Configure an OSE cluster on GCE without setting cloud-provider to "gce".
Then change the config, adding the cloud-provider attributes for the master and nodes.

Actual results:
May 24 06:15:08 tmaster atomic-openshift-node: I0524 06:15:08.179277   43193 kubelet.go:1134] Attempting to register node tmaster.c.jens-walkthrough.internal
May 24 06:15:08 tmaster atomic-openshift-node: E0524 06:15:08.185229   43193 kubelet.go:1173] Previously "tmaster.c.jens-walkthrough.internal" had externalID "2844582603662585040"; now it is "tmaster.c.jens-walkthrough.internal"; will delete and recreate.
May 24 06:15:08 tmaster atomic-openshift-node: E0524 06:15:08.186261   43193 kubelet.go:1175] Unable to delete old node: User "system:node:tmaster.c.jens-walkthrough.internal" cannot delete nodes at the cluster scope

[root@tmaster ~]# oc get nodes
NAME                                  STATUS     AGE
tmaster.c.jens-walkthrough.internal   NotReady   1d
tnode1.c.jens-walkthrough.internal    NotReady   1d
tnode2.c.jens-walkthrough.internal    NotReady   1d
tnode3.c.jens-walkthrough.internal    NotReady   1d

Expected results:
All nodes should report Ready.

Comment 1 Weihua Meng 2016-05-24 08:13:33 UTC
I met the same thing with AWS before.
Did you follow these steps, and they did not work?
https://docs.openshift.org/latest/install_config/configuring_aws.html#aws-applying-configuration-changes
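The procedure behind that link amounts to evacuating each affected node, deleting its stale Node object, and restarting the node service so the kubelet re-registers with the new externalID. A dry-run sketch (the commands are only echoed here, not executed; the node name is one from this report):

```shell
# Hypothetical dry run: echo the documented steps rather than run them.
NODE="tnode1.c.jens-walkthrough.internal"

# 1. Stop scheduling onto the node and evacuate its pods.
echo "oadm manage-node $NODE --schedulable=false"
echo "oadm manage-node $NODE --evacuate"

# 2. Delete the stale Node object (its externalID no longer matches).
echo "oc delete node $NODE"

# 3. Restart the node service; the kubelet re-registers itself.
echo "systemctl restart atomic-openshift-node"
```

Repeat per node; the delete is what clears the "Unable to delete old node" error, since the node itself lacks cluster-scope delete permission.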

Comment 2 Lutz Lange 2016-05-24 09:26:24 UTC
Thank you for that pointer. The nodes came back and report ready now. I'll investigate if the storage part is working now.

Comment 3 Andy Goldstein 2016-05-24 14:46:39 UTC
Any update on this? Is storage working?

Comment 4 Lutz Lange 2016-05-24 16:55:28 UTC
Deleting the nodes removed all the labels as well. So now I have to relabel to get my workloads up and running...

Will work on this more tomorrow.
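For anyone hitting the same thing: labels can be re-applied per node with `oc label node`. A dry-run sketch (the label key/value is a made-up example, not one from this cluster):

```shell
# Hypothetical dry run: echo the relabel command instead of running it.
NODE="tnode1.c.jens-walkthrough.internal"

# region=primary is an illustrative label, not one from this report.
echo "oc label node $NODE region=primary"
```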

Comment 5 Lutz Lange 2016-05-26 15:19:10 UTC
I finally got my nodes configured....

But attaching GCE PDs does not work in my case. It looks like OSE (kube) tries to call the API with the node's full hostname instead of the short GCE instance name.

I do get this error :

May 26 15:15:40 tnode1 atomic-openshift-node: W0526 15:15:40.173126    1380 gce_util.go:183] Retrying attach for GCE PD "t-log" (retry count=3).
May 26 15:15:40 tnode1 atomic-openshift-node: E0526 15:15:40.623928    1380 gce_util.go:187] Error attaching PD "t-log": googleapi: Error 400: Invalid value 'tnode1.c.jens-walkthrough.internal'. Values must match the following regular expression: '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?', invalidParameter.

What is the recommended workaround for this?
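For what it's worth, the regular expression in that 400 error is the GCE instance-name pattern, and the node's FQDN can never match it because dots are not in the allowed character set. This can be checked directly (the pattern rewritten as an anchored POSIX ERE, since grep -E has no `(?:...)` groups):

```shell
# GCE instance-name pattern from the API error, as an anchored POSIX ERE.
RE='^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$'

# The FQDN the node reports -- the dots make it fail the pattern.
echo "tnode1.c.jens-walkthrough.internal" | grep -Eq "$RE" && echo match || echo "no match"

# The bare GCE instance name passes.
echo "tnode1" | grep -Eq "$RE" && echo match || echo "no match"
```

One plausible workaround, assuming the node-config.yaml nodeName field is honored here, would be registering each node under its short instance name (e.g. tnode1) instead of the FQDN, but I'm not certain that is the supported route.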

Comment 6 Andy Goldstein 2016-06-01 13:32:08 UTC
Lutz, based on your most recent comment, this is a dupe of bug 1318230

*** This bug has been marked as a duplicate of bug 1318230 ***