+++ This bug was initially created as a clone of Bug #1315995 +++

Description of problem:
When creating OpenShift pods with the AWS EBS plugin, it is only possible to attach /dev/xvdb - /dev/xvdp to an OpenShift node, and thus only to start a number of pods equal to the number of allocated EBS devices (/dev/xvdb - /dev/xvdp). The OpenShift pods use AWS EBS as persistent storage.

Version-Release number of selected component (if applicable):
# OpenShift packages installed
# rpm -qa | grep atomic-opensh
tuned-profiles-atomic-openshift-node-3.1.1.911-1.git.0.14f4c71.el7.x86_64
atomic-openshift-sdn-ovs-3.1.1.911-1.git.0.14f4c71.el7.x86_64
atomic-openshift-clients-3.1.1.911-1.git.0.14f4c71.el7.x86_64
atomic-openshift-node-3.1.1.911-1.git.0.14f4c71.el7.x86_64
atomic-openshift-master-3.1.1.911-1.git.0.14f4c71.el7.x86_64
atomic-openshift-3.1.1.911-1.git.0.14f4c71.el7.x86_64

root@ip-172-31-7-106: ~ # uname -a
Linux ip-172-31-7-106.us-west-2.compute.internal 3.10.0-327.el7.x86_64 #1 SMP Thu Oct 29 17:29:29 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
On a functional OpenShift environment (tested with two OpenShift nodes) with the above packages, run

# python create_ebs_pod.py --volumesize=1 --image=r7perffio --tagprefix=openshift_test --minpod=1 --maxpod=44 --pvfile=pv.json --pvcfile=pvc.json --podfile=pod.json

This creates 43 pods, each with one 1 GB EBS device attached, so each OSE node should get approximately 20 pods. However, not all of those pods will start; pods start only until the available devices (/dev/xvdb - /dev/xvdp) are exhausted.

create_ebs_pod.py, pv.json, pvc.json and pod.json can be found at:
https://github.com/ekuric/openshift/tree/master/_aws

Steps to Reproduce:
Please see the step above.

Actual results:
In tests with the awsElasticBlockStore plugin, the last device attached to the Amazon instance (acting as an OpenShift node) was /dev/xvdp, which means only devices /dev/xvdb - /dev/xvdp were available for pods to use as persistent storage. After that, no more pods could be started.

Expected results:
To be able to attach more EBS devices to an Amazon instance when using the awsElasticBlockStore plugin, and thus to start more OpenShift pods.

Additional info:
The log messages are listed in [1] below. The AWS cloud provider code says:

// See: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/block-device-mapping-concepts.html
devices := []mountDevice{}
for c := 'f'; c <= 'p'; c++ {
	devices = append(devices, mountDevice(fmt.Sprintf("%c", c)))
}

pod file used: https://github.com/ekuric/openshift/blob/master/_aws/pod.json
pv file used: https://github.com/ekuric/openshift/blob/master/_aws/pv.json
pvc file used: https://github.com/ekuric/openshift/blob/master/_aws/pvc.json

[1]
Mar 8 08:24:23 ip-172-31-7-105 atomic-openshift-node: E0308 08:24:23.508726 12979 pod_workers.go:138] Error syncing pod 8f3bf98c-e529-11e5-b35d-028bb7d6e433, skipping: Could not attach EBS Disk "vol-9652ec60". Timeout waiting for mount paths to be created.
Mar 8 08:24:23 ip-172-31-7-105 atomic-openshift-node: W0308 08:24:23.656873 12979 aws_util.go:167] Retrying attach for EBS Disk "vol-9b53ed6d" (retry count=7).
Mar 8 08:24:23 ip-172-31-7-105 atomic-openshift-node: W0308 08:24:23.770064 12979 aws.go:950] Unexpected EBS DeviceName: "/dev/sda1"
Mar 8 08:24:23 ip-172-31-7-105 atomic-openshift-node: W0308 08:24:23.770114 12979 aws.go:983] Could not assign a mount device (all in use?).
mappings=map[b:vol-6936fc9f f:vol-b954ea4f j:vol-4154eab7 l:vol-1b54eaed m:vol-9855eb6e p:vol-4155ebb7 a1:vol-a236fc54 g:vol-c654ea30 h:vol-2754ead1 i:vol-6e54ea98 k:vol-7a54ea8c n:vol-2c55ebda o:vol-ad55eb5b], valid=[f g h i j k l m n

--- Additional comment from Jan Safranek on 2016-03-09 23:05:50 CST ---

This is the root cause:
https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L903

Kubernetes uses only devices /dev/xvd[f-p] or /dev/sd[f-p], as suggested by
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html#available-ec2-device-names:
"Recommended for EBS Volumes: ... /dev/sd[f-p]"

This leaves us with 11 EBS volumes, which is quite few. IMO we could use more, up to the 40 volumes supported by Amazon.

As a related bug, when Kubernetes runs out of device names, 'kubectl describe pod' shows a cryptic error:
'Value (/dev/xvd) for parameter device is invalid. /dev/xvd is not a valid EBS device name.'
IMO we should print something like 'Too many devices attached, only 11 devices are supported by AWS.'

Note to self: look at https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L1054 and try to return an error there.

--- Additional comment from Jan Safranek on 2016-03-14 23:43:08 CST ---

Created PR https://github.com/kubernetes/kubernetes/pull/22942 and am waiting for upstream feedback.

--- Additional comment from Jan Safranek on 2016-03-16 20:43:36 CST ---

The first part, raising the limit to 39, has been merged into Kubernetes 1.2. Admins can adjust the limit by setting the environment variable "KUBE_MAX_PD_VOLS" in the scheduler process (openshift-master); however, kubelet will refuse to attach more than 39 volumes anyway. 'oc describe pod' will show a clear message that too many volumes are attached and the pod can't be started.
https://github.com/kubernetes/kubernetes/pull/22942

The second part, allowing kubelet to attach more than 39 volumes, is still open and I'm working on it. Tracked here: https://github.com/kubernetes/kubernetes/issues/22994
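For illustration, below is a minimal, self-contained Go sketch of the device-name allocation described in the root-cause comment above. It is not the actual aws.go code; the helper names (assignDevice, inUse) are made up for the example. It only demonstrates why the 'f'..'p' range caps a node at 11 attached EBS volumes.

// Standalone sketch, not the real pkg/cloudprovider/providers/aws/aws.go code.
package main

import "fmt"

type mountDevice string

// assignDevice picks the first free suffix in the allowed range.
// With the range 'f'..'p' there are only 11 candidates, so at most
// 11 EBS volumes can be attached before allocation fails.
func assignDevice(inUse map[mountDevice]bool) (mountDevice, error) {
	for c := 'f'; c <= 'p'; c++ {
		d := mountDevice(fmt.Sprintf("%c", c))
		if !inUse[d] {
			return d, nil
		}
	}
	return "", fmt.Errorf("could not assign a mount device (all in use?)")
}

func main() {
	inUse := map[mountDevice]bool{}
	for i := 1; ; i++ {
		d, err := assignDevice(inUse)
		if err != nil {
			// Fails on the 12th attempt, matching the behavior seen in the node logs.
			fmt.Printf("attach #%d failed: %v\n", i, err)
			break
		}
		inUse[d] = true
		fmt.Printf("attach #%d -> /dev/xvd%s\n", i, d)
	}
}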
https://bugzilla.redhat.com/show_bug.cgi?id=1315995 has more info; check the ongoing debugging status there...
Restoring needinfo flag, sorry.
Status update:
- In Origin 3.2, there is a hard limit of 39 devices now.
- I have a PR pending upstream to remove this limit and rely only on the scheduler to schedule at most 39 (configurable) volumes to a node: https://github.com/kubernetes/kubernetes/pull/23254
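As a rough illustration of the configurability mentioned above, here is a hypothetical Go sketch of how a scheduler-side cap could honor the KUBE_MAX_PD_VOLS environment variable, falling back to the 39-volume default. The function and constant names are invented for this example; this is not the actual Kubernetes scheduler code.

package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultMaxEBSVolumes is the 39-volume default mentioned in the comments above.
const defaultMaxEBSVolumes = 39

// maxPDVolumes returns the per-node volume cap, honoring KUBE_MAX_PD_VOLS
// when it is set to a positive integer.
func maxPDVolumes() int {
	if v := os.Getenv("KUBE_MAX_PD_VOLS"); v != "" {
		if n, err := strconv.Atoi(v); err == nil && n > 0 {
			return n
		}
	}
	return defaultMaxEBSVolumes
}

func main() {
	fmt.Println("per-node EBS volume cap:", maxPDVolumes())
}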
Are we waiting on 23254 or can this bug go to ON_QA?
Ok, let's move it to MODIFIED state for 3.2 - there is a hard limit of 39 volumes in kubelet, which is way better than what we have today. Jeremy & Elvir, if you want this configurable, please file a new bug for 3.3 or later.
Verified on
openshift v3.2.0.43
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

There is no limit on volumes on AWS now. I created 55 PVs, PVCs and pods, and all the pods are running.
Created attachment 1155962
test log
Correction: the test in comment 9 was performed on a cluster with 2 nodes. I disabled one node so that all pods were scheduled to the other node; that node can have as many as 39 pods with EBS volumes. When the 40th pod was created, I got the error 'Could not attach EBS Disk "aws://us-east-1d/vol-719f5ed4": Too many EBS volumes attached to node ip-172-18-12-231.ec2.internal'. So this bug is fixed now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:1094