Description of problem:
The maximum number of attached EBS volumes is counted separately for the CSI driver and the in-tree plugin. As a result, a pod can be scheduled successfully but then hang on volume attachment.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-07-210458

How reproducible:
Always

Steps to Reproduce:
1. Launched a 4.6 cluster on OCP
2. Created 25 PVCs provisioned by ebs.csi.aws.com
3. Created a Pod (pod1) consuming these PVCs and scheduled this pod to worker1
4. Created 25 PVCs provisioned by kubernetes.io/aws-ebs after pod1 was running
5. Created a Pod (pod2) consuming these PVCs and scheduled this pod to worker2

Actual results:
pod2 is scheduled successfully, but hangs on volume attachment. The following event is reported repeatedly:
Warning FailedMount 5m5s kubelet, ip-10-0-74-88.ap-northeast-1.compute.internal Unable to attach or mount volumes: unmounted volumes=[local20 local9 local10 local16 local5 local19 local21 local2 local3 local8 local25 local11 local1 local18 local22 local14 local6 local7 local23 local4 local13 local15 local12 local24], unattached volumes=[local20 local9 local10 local16 local5 local19 default-token-52hmc local21 local2 local3 local8 local25 local11 local1 local18 local22 local14 local6 local7 local23 local4 local13 local15 local17 local12 local24]: timed out waiting for the condition

Expected results:
pod2 cannot be scheduled.

Master Log:
Node Log (of failed PODs):
PV Dump:
PVC Dump:
StorageClass Dump (if StorageClass used by PV/PVC):
Additional info:
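For anyone re-running the reproduction, the PVC and Pod manifests from the steps above could be generated with something like the following. This is only a sketch: the StorageClass names passed in, the container image, the node names, and the helper names (make_pvcs, make_pod, render) are assumptions for illustration and are not taken from this report.

```python
import json

# Hypothetical image; any image that can run `sleep` would do.
IMAGE = "registry.example.com/busybox:latest"


def make_pvcs(prefix, storage_class, count=25, size="1Gi"):
    """Return `count` PVC manifests that request the given StorageClass."""
    return [
        {
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {"name": f"{prefix}-{i}"},
            "spec": {
                "accessModes": ["ReadWriteOnce"],
                "storageClassName": storage_class,
                "resources": {"requests": {"storage": size}},
            },
        }
        for i in range(1, count + 1)
    ]


def make_pod(name, node, pvcs):
    """Return a Pod manifest that mounts every PVC and is pinned to `node`."""
    volumes, mounts = [], []
    for i, pvc in enumerate(pvcs, start=1):
        vol = f"local{i}"  # volume names follow the localN pattern seen in the event
        volumes.append(
            {"name": vol,
             "persistentVolumeClaim": {"claimName": pvc["metadata"]["name"]}}
        )
        mounts.append({"name": vol, "mountPath": f"/mnt/{vol}"})
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pin the pod to one node via the standard hostname label.
            "nodeSelector": {"kubernetes.io/hostname": node},
            "containers": [
                {"name": "app", "image": IMAGE,
                 "command": ["sleep", "3600"], "volumeMounts": mounts}
            ],
            "volumes": volumes,
        },
    }


def render(items):
    """Wrap manifests in a v1 List and serialize to JSON."""
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)
```

For example, render(make_pvcs("csi-pvc", "gp2-csi")) would produce the CSI claims if the CSI StorageClass is named gp2-csi on the cluster (an assumption); the resulting JSON can be saved to a file and applied with oc apply -f.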
This is a current limitation of OCP and the AWS EBS CSI driver. A cluster admin should use either in-tree volumes or CSI volumes, but not both at the same time. This should be documented as a limitation of the AWS EBS CSI driver in our docs.
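To illustrate why mixing the two is a problem, here is a rough model of per-plugin counting versus counting against a single node-wide limit. It is a sketch for a single node that carries both kinds of volume, not the scheduler's actual code; the function names and the limit of 25 used in the example are placeholders chosen for illustration.

```python
def fits_separate(attached, plugin, requested, limits):
    """Per-plugin check: only volumes of the same plugin count against its limit."""
    return attached.get(plugin, 0) + requested <= limits[plugin]


def fits_combined(attached, requested, node_limit):
    """Node-wide check: every attached volume counts against one shared limit."""
    return sum(attached.values()) + requested <= node_limit


# Placeholder numbers: 25 CSI volumes already attached, 25 in-tree volumes requested.
attached = {"ebs.csi.aws.com": 25}
limits = {"ebs.csi.aws.com": 25, "kubernetes.io/aws-ebs": 25}

separate_ok = fits_separate(attached, "kubernetes.io/aws-ebs", 25, limits)
combined_ok = fits_combined(attached, 25, node_limit=25)
print(separate_ok, combined_ok)
```

With these placeholder numbers the per-plugin check admits the second pod while the node-wide check rejects it, which is the same shape as the failure in this bug: scheduling succeeds, and the attach then fails at the instance level.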
Created https://github.com/openshift/openshift-docs/pull/25219 Moving to QE and SME review.
Feedback applied, awaiting second review by SME and QE before merge: https://github.com/openshift/openshift-docs/pull/25348#issuecomment-689819227
Docs are live for 4.5 and 4.6: https://docs.openshift.com/container-platform/4.5/storage/persistent_storage/persistent-storage-aws.html#maximum-number-of-ebs-volumes-on-a-node_persistent-storage-aws
Waiting for an answer from the SME on whether this denotes a support status change or not.
Also opened separate PRs for 4.3 and 4.4 that do not include the note about CSI, because CSI is not supported until 4.5:
- https://github.com/openshift/openshift-docs/pull/25413
- https://github.com/openshift/openshift-docs/pull/25411
Confirmed with the Storage team that we have not removed KUBE_MAX_PD_VOLS support for in-tree plug-ins. According to Hemant, configuring it is tricky: it might be possible by modifying the scheduler's pod spec to set the environment variable, but that would have to be supported by the scheduler operator. Closing BZ.
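For context, an override such as KUBE_MAX_PD_VOLS is typically read along the following lines. This is an illustrative sketch only: the function name, the fallback behaviour for invalid values, and the default of 39 used in the example are assumptions, not a description of what the scheduler shipped in OCP actually does.

```python
import os


def max_volumes_from_env(default, env=None, name="KUBE_MAX_PD_VOLS"):
    """Return the volume limit from the environment, or `default` if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(name, "")
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric override: ignore it and keep the default.
        return default
    # Zero or negative limits make no sense here; keep the default.
    return value if value > 0 else default


# Example with an explicit mapping instead of the real process environment.
limit = max_volumes_from_env(39, env={"KUBE_MAX_PD_VOLS": "20"})
print(limit)
```

Whatever the exact parsing, the variable has to be present in the scheduler process environment, which is why the scheduler operator would need to support setting it.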