Bug 1608625 - Number of NVMe disks attachable is lower than max predefined count for EBS
Summary: Number of NVMe disks attachable is lower than max predefined count for EBS
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.10.z
Assignee: Hemant Kumar
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On: 1602054 1608626
Blocks:
 
Reported: 2018-07-26 01:15 UTC by Hemant Kumar
Modified: 2023-10-06 17:51 UTC
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1602054
Environment:
Last Closed: 2018-09-22 04:55:14 UTC
Target Upstream Version:
Embargoed:




Links
GitHub kubernetes/kubernetes issue 59015 (closed): Number of NVMe disks attachable is lower than max predefined count for EBS - last updated 2021-01-06 09:22:57 UTC
Red Hat Product Errata RHBA-2018:2660 - last updated 2018-09-22 04:56:04 UTC

Description Hemant Kumar 2018-07-26 01:15:28 UTC
+++ This bug was initially created as a clone of Bug #1602054 +++

Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Every time.


Steps to Reproduce:
1. Set up an AWS M5 node
2. Attach more than 27 EBS volumes to pods

Actual results:

The volume attach operations fail.

Expected results:

Success.

Additional info:

See: https://github.com/kubernetes/kubernetes/issues/59015

Fixed in Kube 1.11 by: https://github.com/kubernetes/kubernetes/pull/64154

--- Additional comment from Hemant Kumar on 2018-07-17 14:38:40 EDT ---

It may be possible to fix this

--- Additional comment from Hemant Kumar on 2018-07-17 15:18:09 EDT ---

Sorry, my comment got submitted before I could finish typing. It is probably possible to fix this without needing the volume limit feature in 3.9. It is not super clean, but the EC2 instance type is available as a label on the node object, so the scheduler can potentially look at that label and deduce the volume attach limit from it rather than relying on hardcoded values.

I will try and open a PR for it.
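For anyone following along, here is a minimal Go sketch of that idea (this is not the actual upstream patch; the maxEBSVolumesForNode helper and the main() demo are illustrative, and I am assuming the standard beta.kubernetes.io/instance-type node label of that era; only the 39 and 25 limits come from this bug and the linked issue):

package main

import (
    "fmt"
    "strings"
)

const (
    defaultMaxEBSVolumes = 39 // pre-existing hardcoded EBS attach limit
    nitroMaxEBSVolumes   = 25 // lower limit for NVMe-based (M5/C5) instance types
)

// maxEBSVolumesForNode derives the volume attach limit from the node's
// instance-type label instead of relying on a single hardcoded value.
func maxEBSVolumesForNode(nodeLabels map[string]string) int {
    instanceType := nodeLabels["beta.kubernetes.io/instance-type"]
    family := strings.SplitN(instanceType, ".", 2)[0]
    switch family {
    case "m5", "c5":
        return nitroMaxEBSVolumes
    default:
        return defaultMaxEBSVolumes
    }
}

func main() {
    labels := map[string]string{"beta.kubernetes.io/instance-type": "m5.xlarge"}
    fmt.Println(maxEBSVolumesForNode(labels)) // prints 25
}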

--- Additional comment from Hemant Kumar on 2018-07-19 15:38:03 EDT ---

upstream PR - https://github.com/kubernetes/kubernetes/pull/66397

Comment 2 Hemant Kumar 2018-08-21 14:34:09 UTC
PR for origin - https://github.com/openshift/origin/pull/20608

Comment 4 Chao Yang 2018-09-12 05:12:13 UTC
Passed on 
openshift v3.10.45
kubernetes v1.10.0+b81c8f8

Created 27 dynamic PVCs and pods; only 25 pods are running.

[root@ip-172-18-9-136 test]# oc describe pods mypod26
Name:         mypod26
Namespace:    test
Node:         <none>
Labels:       name=frontendhttp
Annotations:  openshift.io/scc=anyuid
Status:       Pending
IP:           
Containers:
  myfrontend:
    Image:        aosqe/hello-openshift
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /tmp from aws (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-mqv98 (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  aws:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  ebs26
    ReadOnly:   false
  default-token-mqv98:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-mqv98
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/compute=true
Tolerations:     <none>
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  26s (x26 over 6m)  default-scheduler  0/3 nodes are available: 1 node(s) exceed max volume count, 2 node(s) didn't match node selector.

Comment 6 errata-xmlrpc 2018-09-22 04:55:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2660

Comment 7 Anand Paladugu 2020-06-24 02:26:51 UTC
@Hemant

I have an OCP 3.11 customer facing this issue. He is now running some pods with NVMe-type volumes, and it looks like OCP is still treating the attachment limit as 39 and failing to attach volumes, so deployments are failing.

1. Are these changes in OCP 3.10 available in OCP 3.11?
2. Are the changes only valid for M5 nodes?

From the code changes, I see that a new default limit was added for M5 nodes. My customer is running R5 and R4 instances in AWS.

Thanks

Anand

Comment 8 Hemant Kumar 2020-06-24 21:59:18 UTC
Yes, in 3.11 out of the box the new default limit of 25 was only added for the M5 and C5 node types. All other node types, including R5 and R4, still use the instance limit of 39 and hence can have failing deployments. The customer could define the `KUBE_MAX_PD_VOLS` environment variable on the scheduler and set it to 25, which would globally change the maximum attach limit for all node types to 25. Please let me know if that workaround works for the customer.
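For reference, a minimal Go sketch of what that override does (this mirrors the behaviour described above, not the actual scheduler source; the maxAttachableVolumes helper and the main() demo are illustrative, while the `KUBE_MAX_PD_VOLS` variable and the 25/39 limits come from this bug):

package main

import (
    "fmt"
    "log"
    "os"
    "strconv"
)

// maxAttachableVolumes returns the value of KUBE_MAX_PD_VOLS when it is set
// and valid, otherwise the per-instance-type default (39, or 25 for M5/C5).
func maxAttachableVolumes(defaultLimit int) int {
    raw := os.Getenv("KUBE_MAX_PD_VOLS")
    if raw == "" {
        return defaultLimit
    }
    limit, err := strconv.Atoi(raw)
    if err != nil || limit <= 0 {
        log.Printf("ignoring invalid KUBE_MAX_PD_VOLS=%q, falling back to %d", raw, defaultLimit)
        return defaultLimit
    }
    return limit
}

func main() {
    os.Setenv("KUBE_MAX_PD_VOLS", "25") // simulate the workaround set on the scheduler
    fmt.Println(maxAttachableVolumes(39)) // prints 25
}

Because the variable is read by the scheduler process itself, the value applies to every node type, which is why setting it to 25 also lowers the limit on non-NVMe instances.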

