Description of problem:
Pods use dynamically provisioned EBS PVCs. On a single compute node, at most 52 pods reach Running status. The rest remain in Pending status at the PVC provisioning stage.

Version-Release number of selected component (if applicable):
oc v3.6.172.0.0

How reproducible:
Cluster: 2 masters, 1 infra, 2 compute, 1 lb.
KUBE_MAX_PD_VOLS=260 is set in /etc/sysconfig/atomic-openshift-master-controllers on both master nodes.

Steps to Reproduce:
1. Mark one of the compute nodes SchedulingDisabled.
2. Create 60 pods with PVC volumes.

Actual results:
52 pods are in Running status. The remaining 8 are in Pending status.

Expected results:
60 running pods.

Master Log:
Sep 6 16:09:15 ip-172-31-27-240 atomic-openshift-master-controllers: E0906 16:09:15.912766 23136 attacher.go:73] Error attaching volume "aws://us-west-2b/vol-064280bb36606dab9": Too many EBS volumes attached to node ip-172-31-50-156.us-west-2.compute.internal.

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
Related upstream issues:
https://github.com/kubernetes/kubernetes/issues/41453
https://github.com/openshift/origin/issues/13025
https://github.com/kubernetes/kubernetes/pull/41455
@bchilds - re-opening this to ask if this should be documented. I could not find any mention of this limitation/magic number in the openshift doc - just the upstream issues (e.g. https://github.com/kubernetes/kubernetes/issues/41453) which hinted that it had been addressed. If you don't feel it warrants documentation, feel free to close it again. Thanks.
Yep, that is correct - the environment variable KUBE_MAX_PD_VOLS does allow one to go beyond the default limit of 39. I have opened a documentation PR for this - https://github.com/openshift/openshift-docs/pull/7002
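For anyone double-checking which limit is actually in effect on a master, here is a rough sketch of reading the value from the contents of a sysconfig-style file such as /etc/sysconfig/atomic-openshift-master-controllers. The helper name `effective_max_pd_vols` and the sample file contents (including the `OPTIONS` line) are made up for illustration; only the variable name and the default of 39 come from this thread. It only parses text and says nothing about how the controllers themselves read the variable.

```python
import re

# Default limit quoted in this thread; used when the variable is not set.
DEFAULT_MAX_PD_VOLS = 39

# Matches lines like KUBE_MAX_PD_VOLS=260, optionally quoted or exported.
_PATTERN = re.compile(r'^(?:export\s+)?KUBE_MAX_PD_VOLS=["\']?(\d+)["\']?\s*$')


def effective_max_pd_vols(sysconfig_text, default=DEFAULT_MAX_PD_VOLS):
    """Return the KUBE_MAX_PD_VOLS value found in the text, or the default."""
    value = default
    for raw_line in sysconfig_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        match = _PATTERN.match(line)
        if match:
            value = int(match.group(1))  # a later assignment overrides an earlier one
    return value


# Hypothetical file contents, for illustration only.
sample = "OPTIONS=--loglevel=2\nKUBE_MAX_PD_VOLS=260\n"
print(effective_max_pd_vols(sample))
```

With the sample above, the helper reports the configured value rather than the default; with the variable absent or commented out, it falls back to 39.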
Thanks for clarification, @Hemant.
PR 7002 has not been merged yet, so I cannot verify the documentation yet.
Verified on this page: https://docs.openshift.com/container-platform/3.6/install_config/persistent_storage/persistent_storage_aws.html It looks good to me. Thanks for updating the doc.
Also verified on origin doc (latest) https://docs.openshift.org/latest/install_config/persistent_storage/persistent_storage_aws.html
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1233