Bug 1318346

Summary: Device limitation doesn't work if KUBE_MAX_PD_VOLS > default volumes
Product: OpenShift Container Platform Reporter: Qixuan Wang <qixuan.wang>
Component: Node    Assignee: Solly Ross <sross>
Status: CLOSED WONTFIX QA Contact: Qixuan Wang <qixuan.wang>
Severity: low Docs Contact:
Priority: low    
Version: 3.1.0    CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-02 19:35:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Qixuan Wang 2016-03-16 15:23:01 UTC
Description of problem:
GCE instances can have at most 16 PD volumes attached, but if KUBE_MAX_PD_VOLS > 16, I can still create pods and attach GCE PDs beyond that limit. I think the getMaxVols function in kubernetes/plugin/pkg/scheduler/algorithmprovider/defaults/defaults.go should check for this condition. The same applies to AWS EBS.


Version-Release number of selected component (if applicable):
openshift v3.2.0.3
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Setup GCE environment with ansible and configure cloud provider on master and node.

2. Enable the "MaxGCEPDVolumeCount" predicate in scheduler.json
# cat /home/master/scheduler.json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "MaxGCEPDVolumeCount"}
  ]
}

3. Add env "KUBE_MAX_PD_VOLS=100" in /etc/sysconfig/atomic-openshift-master
# cat /etc/sysconfig/atomic-openshift-master
KUBE_MAX_PD_VOLS=100

4. Restart master service
# service atomic-openshift-master restart

5. Create many GCE Persistent Disks
# for i in $(seq 1 19); do gcloud compute disks create --size=1GB --zone=us-central1-a qwang-pd-$i; done

6. Create many pods with GCE PD
# cat pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: test-pd-1
spec:
  containers:
  - image: gcr.io/google_containers/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    gcePersistentDisk:
      pdName: my-data-disk-1
      fsType: ext4

# for i in {1..19}; do sed -e 's/test-pd-1/test-pd-'$i'/g' -e 's/my-data-disk-1/my-data-disk-'$i'/g' pod.yaml | oc create -f -; done

7. Check pods status
# oc get pod -o wide


Actual results:
7. [root@ose-32-dma-master us]# oc get pod -o wide
NAME         READY     STATUS              RESTARTS   AGE       NODE
test-pd-1    0/1       ContainerCreating   0          19s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-10   0/1       ContainerCreating   0          16s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-11   0/1       ContainerCreating   0          16s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-12   0/1       ContainerCreating   0          15s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-13   0/1       ContainerCreating   0          15s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-14   0/1       ContainerCreating   0          15s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-15   0/1       ContainerCreating   0          14s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-16   0/1       ContainerCreating   0          14s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-17   0/1       ContainerCreating   0          14s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-18   0/1       ContainerCreating   0          13s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-19   0/1       ContainerCreating   0          13s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-2    0/1       ContainerCreating   0          19s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-3    0/1       ContainerCreating   0          18s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-4    0/1       ContainerCreating   0          18s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-5    0/1       ContainerCreating   0          18s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-6    0/1       ContainerCreating   0          17s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-7    0/1       ContainerCreating   0          17s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-8    0/1       ContainerCreating   0          17s       ose-32-dma-node-1.c.openshift-gce-devel.internal
test-pd-9    0/1       ContainerCreating   0          16s       ose-32-dma-node-1.c.openshift-gce-devel.internal


Expected results:
7. GCE instances can attach a maximum of 16 devices, so if the value of KUBE_MAX_PD_VOLS > 16, the scheduler should fall back to DefaultMaxGCEPDVolumes = 16. The 17th pod should therefore stay Pending with a FailedScheduling event like this:

Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  54s		23s		6	{default-scheduler }			Warning		FailedScheduling	pod (test-pd-17) failed to fit in any node
fit failure on node (ose-32-dma-node-1.c.openshift-gce-devel.internal): MaxGCEPDVolumeCount


Additional info:

Comment 1 Qixuan Wang 2016-03-16 15:32:53 UTC
Sorry, step 5 should use my-data-disk-$i instead of qwang-pd-$i:
# for i in $(seq 1 19); do gcloud compute disks create --size=1GB --zone=us-central1-a  my-data-disk-$i; done

Comment 2 Solly Ross 2016-03-16 15:39:10 UTC
Hmm... I'm not so sure about this.  It seems to me that it would be unfortunate if AWS or GCE raised the max number (either globally, or in a new environment type, or something of the sort), and we had to release a patch to Kube and OpenShift to support that.  With the current behavior, we could simply instruct people to set the limit env variable accordingly.

Comment 4 Qixuan Wang 2016-03-18 02:40:19 UTC
I agree we could simply instruct people to set the limit env variable and note it in the documentation. Thanks. Downgrading the severity level.

Comment 5 Solly Ross 2017-03-02 19:35:13 UTC
Closing as WONTFIX per the discussion on the linked issues.