Bug 1318230 - Pod is always in ContainerCreating state because the GCE PD could not be attached
Summary: Pod is always in ContainerCreating state because the GCE PD could not be attached
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Seth Jennings
QA Contact: DeShuai Ma
URL:
Whiteboard:
Duplicates: 1339086 (view as bug list)
Depends On:
Blocks:
 
Reported: 2016-03-16 10:21 UTC by Qixuan Wang
Modified: 2020-03-11 15:03 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-26 17:22:26 UTC
Target Upstream Version:
Embargoed:


Attachments
node configuration (1.10 KB, text/plain)
2016-03-18 09:15 UTC, Jan Safranek

Description Qixuan Wang 2016-03-16 10:21:16 UTC
Description of problem:
On a GCE environment, create a PD and a pod that uses a GCE persistent volume. The pod cannot run because the GCE PD could not be attached. The node log shows "Error attaching PD "mypd-1": googleapi: Error 400: Invalid value 'lxia-ose32.c.openshift-gce-devel.internal'. Values must match the following regular expression: '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?', invalidParameter". It seems the instance ID is resolved to the wrong value, so the Google API rejects it.
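
For illustration only (not part of the reported steps), the failing value can be checked against the regular expression from the error message; a plain group is used below because grep -E does not support the '(?:...)' syntax:

# echo 'lxia-ose32.c.openshift-gce-devel.internal' | grep -qEx '[a-z]([-a-z0-9]{0,61}[a-z0-9])?' || echo rejected
rejected
# echo 'lxia-ose32' | grep -qEx '[a-z]([-a-z0-9]{0,61}[a-z0-9])?' && echo accepted
accepted

The FQDN is rejected because of the dots; the bare instance name is accepted.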
  

Version-Release number of selected component (if applicable):
openshift v3.2.0.3
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Set up a GCE environment with ansible, configure the cloud provider on the master and node, and restart the services.
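
For reference, a minimal sketch of the node-side cloud-provider setting this step refers to (the real files are generated by ansible; the fragment below is illustrative, not the exact configuration used in this report):

node-config.yaml (fragment, illustrative):
kubeletArguments:
  cloud-provider:
  - "gce"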

2. Create pd, pod
# gcloud compute disks create --size=500GB --zone=us-central1-a my-data-disk


# vi test-pd.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: gcr.io/google_containers/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    # This GCE PD must already exist.
    gcePersistentDisk:
      pdName: my-data-disk
      fsType: ext4

# oc create -f test-pd.yaml


3. Check pod 


Actual results:
3. 
[root@ose-32-dma-master us]# oc describe pod test-pd
Name:		test-pd
Namespace:	qwang1
Image(s):	gcr.io/google_containers/test-webserver
Node:		ose-32-dma-node-1.c.openshift-gce-devel.internal/10.240.0.11
Start Time:	Wed, 16 Mar 2016 05:39:04 -0400
Labels:		<none>
Status:		Pending
Reason:		
Message:	
IP:		
Controllers:	<none>
Containers:
  test-container:
    Container ID:	
    Image:		gcr.io/google_containers/test-webserver
    Image ID:		
    Port:		
    QoS Tier:
      cpu:		BestEffort
      memory:		BestEffort
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Environment Variables:
Conditions:
  Type		Status
  Ready 	False 
Volumes:
  test-volume:
    Type:	GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
    PDName:	my-data-disk
    FSType:	ext4
    Partition:	0
    ReadOnly:	false
  default-token-v216z:
    Type:	Secret (a secret that should populate this volume)
    SecretName:	default-token-v216z
Events:
  FirstSeen	LastSeen	Count	From								SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----								-------------	--------	------		-------
  4m		4m		1	{default-scheduler }								Normal		Scheduled	Successfully assigned test-pd to ose-32-dma-node-1.c.openshift-gce-devel.internal
  3m		21s		4	{kubelet ose-32-dma-node-1.c.openshift-gce-devel.internal}			Warning		FailedMount	Unable to mount volumes for pod "test-pd_qwang1(ec2af27a-eb5a-11e5-8d6e-42010af00009)": Could not attach GCE PD "my-data-disk". Timeout waiting for mount paths to be created.
  3m		21s		4	{kubelet ose-32-dma-node-1.c.openshift-gce-devel.internal}			Warning		FailedSync	Error syncing pod, skipping: Could not attach GCE PD "my-data-disk". Timeout waiting for mount paths to be created.


[root@ose-32-dma-node-1 ~]# journalctl -f -u atomic-openshift-node
Mar 16 05:39:46 ose-32-dma-node-1.c.openshift-gce-devel.internal atomic-openshift-node[19926]: W0316 05:39:46.210981   19926 gce_util.go:176] Retrying attach for GCE PD "my-data-disk" (retry count=8).
Mar 16 05:39:46 ose-32-dma-node-1.c.openshift-gce-devel.internal atomic-openshift-node[19926]: E0316 05:39:46.366276   19926 gce_util.go:180] Error attaching PD "my-data-disk": googleapi: Error 400: Invalid value 'ose-32-dma-node-1.c.openshift-gce-devel.internal'. Values must match the following regular expression: '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?', invalidParameter



Expected results:
GCE PD should be attached.

Additional info:

Comment 1 Jan Safranek 2016-03-18 09:15:11 UTC
Created attachment 1137724 [details]
node configuration

Comment 2 Jan Safranek 2016-03-18 09:42:18 UTC
From the node config.yaml:

> nodeName: ose-32-dma-node-1.c.openshift-gce-devel.internal

This tells the GCE cloud provider to use "ose-32-dma-node-1.c.openshift-gce-devel.internal" as the GCE instance name. GCE does not allow dots in instance names -> error "Values must match the following regular expression ..."

IMO we should fix the ansible playbook to put the GCE instance name here. It may be quite different from the hostname.
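
As a manual workaround (a sketch only, using the instance name from this report's environment), the node config can point at the bare GCE instance name, followed by a restart of the atomic-openshift-node service:

node-config.yaml (fragment):
nodeName: ose-32-dma-node-1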

Comment 3 Jan Safranek 2016-03-18 10:00:29 UTC
Umm, reading OpenShift sources:

    // NodeName is the value used to identify this particular node in the cluster.  If possible, this should be your fully qualified hostname.
    // If you're describing a set of static nodes to the master, this value must match one of the values in the list
    NodeName string

And in node_config.go:
   cfg.NodeName = options.NodeName

That's KubeletConfig.NodeName. It's not only the "value used to identify this particular node in the cluster"; it's also used to identify the instance in the external cloud, and thus must be equal to the GCE or OpenStack instance name.

Comment 4 Jason DeTiberus 2016-03-21 20:47:05 UTC
Jan,

Does this behave right when the NodeName is not defined in the config file?

If so, we can probably update the config to not set NodeName if the user has not overridden the value of openshift_hostname for the node.

Comment 5 Qixuan Wang 2016-03-22 05:39:30 UTC
If nodeName isn't defined in the node config file, the node service won't work.

Comment "nodeName: ose-32-dma-node-1.c.openshift-gce-devel.internal" or set "nodeName: null", restart atomic-openshift-node service:

Mar 22 01:28:45 ose-32-dma-node-1.c.openshift-gce-devel.internal systemd[1]: Starting Atomic OpenShift Node...
Mar 22 01:28:45 ose-32-dma-node-1.c.openshift-gce-devel.internal atomic-openshift-node[5699]: Invalid NodeConfig /etc/origin/node/node-config.yaml
Mar 22 01:28:45 ose-32-dma-node-1.c.openshift-gce-devel.internal atomic-openshift-node[5699]: nodeName: Required value

Did I get your point?

Comment 6 Qixuan Wang 2016-03-22 05:43:56 UTC
Hmm, I don't think so. Perhaps I need to change node_config.go. Please ignore the above comments.

Comment 7 Jan Safranek 2016-03-22 11:57:40 UTC
Jason,

when NodeName is not defined in kubelet config, kcfg.Hostname is used instead in most cloud operations:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/server.go#L584

To get things even more complicated, GCE PD volume plugin ignores any NodeName and uses machine hostname (or hostname-override) as instance name:
https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/gce_pd/gce_util.go#L187

All this is very confusing and probably buggy; every volume plugin and cloud provider works slightly differently.

Conclusion: with the current code, the GCE instance name must be the same as the hostname. Either the ansible scripts or the GCE volume plugin (or both) must be fixed.
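
A quick way to see the mismatch on an affected node (illustrative check, output based on this report's environment): compare the instance name reported by the GCE metadata server with the node's hostname:

# curl -s -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/name
ose-32-dma-node-1
# hostname -f
ose-32-dma-node-1.c.openshift-gce-devel.internal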

Comment 8 Jason DeTiberus 2016-03-22 19:58:59 UTC
Jan,

I think ideally, we would want all of the cloud providers to ignore the value of NodeName altogether. It is possible to get the data needed to query the API directly from the host metadata, rather than relying on a user-provided setting matching the value queried from the cloud provider API.

That said, we can definitely work around it in ansible until we can get the upstream code decoupled from the NodeName.

Comment 9 Jan Safranek 2016-03-23 10:53:36 UTC
(In reply to Jason DeTiberus from comment #8)
> I think ideally, we would want all of the cloud providers to ignore the
> value of NodeName all together.

Agreed, cloud providers need some refactoring in Kubernetes 1.3. What's the right BZ component & team to take care of this?

Comment 10 Jason DeTiberus 2016-03-23 17:37:52 UTC
Good question on component and team. I suspect it might be best to use the Kubernetes component.

Andy, would your team be the proper team to take a look at fixes to the cloud providers to decouple the NodeName setting from the api lookup?

Jan, We'll probably want to clone this bug for the upstream changes and keep this current bug around for implementing a workaround for 3.2.

Comment 11 Andy Goldstein 2016-03-23 17:41:53 UTC
Yes, Origin product, Kubernetes component for cloud provider code in Kube for post 3.2 work.

Comment 12 Liang Xia 2016-04-19 10:40:51 UTC
Could someone look into this, since it is blocking storage testing?

Comment 13 Seth Jennings 2016-04-19 19:02:48 UTC
I can confirm that using the instance name rather than the FQDN of the instance for the nodeName works around this issue.

Had a similar situation for Openstack cloudprovider:
https://bugzilla.redhat.com/show_bug.cgi?id=1321964

Opened a Trello card to get the cloudprovider stuff detached from the nodeName in the Openstack case:
https://trello.com/c/dyHpMQw9/335-as-a-user-i-want-to-the-installer-to-configure-for-the-openstack-cloudprovider-without-having-to-manually-edit-the-node-configs

I believe that setting openshift_hostname=<instance name> in the installer should avoid the need to manually edit node-config.yaml after installation.
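
For example, a hypothetical inventory fragment (host name and value are illustrative):

[nodes]
ose-32-dma-node-1.c.openshift-gce-devel.internal openshift_hostname=ose-32-dma-node-1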

But, yes, we need to get all the cloudprovider code away from using the nodeName.

Comment 14 Paul Morie 2016-04-19 19:50:20 UTC
Changing the semantics of cloud provider and nodename is a very involved task -- not one that can be done for 3.2.

Seth's workaround is the solution for now.

Comment 15 Liang Xia 2016-04-20 07:48:53 UTC
We can get the workaround working, so we removed the testblocker keyword and lowered the Severity/Priority.

Assigning the bug back since we still need to get the issue fixed, either in code or in the ansible playbook.

Comment 16 Andy Goldstein 2016-06-01 13:32:08 UTC
*** Bug 1339086 has been marked as a duplicate of this bug. ***

Comment 17 Andy Goldstein 2016-08-08 20:17:14 UTC
This won't make 3.3 and needs deeper redesign work with the Kube community. We'll cover that work in Trello.

Comment 18 Derek Carr 2016-10-26 17:22:26 UTC
The current feature is working as designed: the node name and instance name must match. If we want to remove this constraint, and we have a compelling reason for removing said constraint, please open a new RFE that captures that detail. Until then, closing this bug as working as designed.

Comment 19 Aleksandar Kostadinov 2016-12-06 12:10:41 UTC
Just for reference, bug 1367201 is the RFE to decouple the node and machine names.

