QE checked the bug on version v3.9.69 with the steps below, and the number of volumes that can be attached to a single OpenStack instance is limited to 26.

@Hemant Kumar, could you help confirm whether this is expected?

1. Update the nodes so that only one remains schedulable.
# oc adm manage-node --schedulable xxx

2. Enable the predicate via scheduler.json.
# grep -i cinder /etc/origin/master/scheduler.json -A4 -B4
{
    "name": "MaxAzureDiskVolumeCount"
},
{
    "name": "MaxCinderVolumeCount"
},
{
    "name": "MatchInterPodAffinity"
},

3. Restart the API and controller services.

4. Keep creating PVCs and pods.
# oc get pods
NAME      READY     STATUS              RESTARTS   AGE
mypod01   1/1       Running             0          18m
mypod02   1/1       Running             0          18m
mypod03   1/1       Running             0          17m
mypod04   1/1       Running             0          17m
mypod05   1/1       Running             0          17m
mypod06   1/1       Running             0          17m
mypod07   1/1       Running             0          17m
mypod08   1/1       Running             0          17m
mypod09   1/1       Running             0          17m
mypod10   1/1       Running             0          17m
mypod11   1/1       Running             0          17m
mypod12   1/1       Running             0          17m
mypod13   1/1       Running             0          16m
mypod14   1/1       Running             0          16m
mypod15   1/1       Running             0          16m
mypod16   1/1       Running             0          16m
mypod17   1/1       Running             0          16m
mypod18   1/1       Running             0          16m
mypod19   1/1       Running             0          16m
mypod20   1/1       Running             0          16m
mypod21   1/1       Running             0          16m
mypod22   1/1       Running             0          16m
mypod23   1/1       Running             0          15m
mypod24   1/1       Running             0          15m
mypod25   1/1       Running             0          15m
mypod26   0/1       ContainerCreating   0          15m
mypod27   0/1       ContainerCreating   0          15m

# oc describe pod mypod26
Name:         mypod26
Namespace:    bz1668893
Node:         qe-chaoyang-node-registry-router-1/10.0.77.49
Start Time:   Tue, 12 Feb 2019 01:00:18 -0500
Labels:       <none>
Annotations:  openshift.io/scc=anyuid
Status:       Pending
IP:
Containers:
  dynamic:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    Port:           80/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/pv from dynamic (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bs4hl (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  dynamic:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mypvc26
    ReadOnly:   false
  default-token-bs4hl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bs4hl
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/compute=true
Tolerations:     <none>
Events:
  Type     Reason                 Age                From                                         Message
  ----     ------                 ----               ----                                         -------
  Normal   Scheduled              16m                default-scheduler                            Successfully assigned mypod26 to qe-chaoyang-node-registry-router-1
  Normal   SuccessfulMountVolume  16m                kubelet, qe-chaoyang-node-registry-router-1  MountVolume.SetUp succeeded for volume "default-token-bs4hl"
  Warning  FailedAttachVolume     1m (x15 over 16m)  attachdetach-controller                      AttachVolume.Attach failed for volume "pvc-795e2daf-2e8b-11e9-b868-fa163eb7596d" : failed to attach 66ece5e3-34c6-4408-8c95-4f2de0859adc volume to 27947252-a153-431d-9c27-e1bd4bc5ebbc compute: Internal Server Error
  Warning  FailedMount            30s (x7 over 14m)  kubelet, qe-chaoyang-node-registry-router-1  Unable to mount volumes for pod "mypod26_bz1668893(79a63ad0-2e8b-11e9-b868-fa163eb7596d)": timeout expired waiting for volumes to attach/mount for pod "bz1668893"/"mypod26". list of unattached/unmounted volumes=[dynamic]
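To check how many pods actually reached Running in a listing like the one from step 4, the STATUS column can be tallied. The following is a small Python sketch only: `count_statuses` is a hypothetical helper, the sample text is a shortened excerpt of the listing in this comment, and it assumes the default `oc get pods` column layout with no spaces inside STATUS values.

```python
from collections import Counter


def count_statuses(listing):
    """Count pods per STATUS in the text printed by `oc get pods`."""
    # Drop blank lines, then locate the STATUS column from the header row.
    lines = [line for line in listing.splitlines() if line.strip()]
    status_col = lines[0].split().index("STATUS")
    # Every remaining row contributes the value found in that column.
    return Counter(line.split()[status_col] for line in lines[1:])


# Shortened excerpt of the listing from step 4 (last four pods only).
sample = """\
NAME      READY     STATUS              RESTARTS   AGE
mypod24   1/1       Running             0          15m
mypod25   1/1       Running             0          15m
mypod26   0/1       ContainerCreating   0          15m
mypod27   0/1       ContainerCreating   0          15m
"""

print(count_statuses(sample))
```

Run against the full listing, the same helper would give the number of pods whose volumes attached successfully versus those still waiting on an attach.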
QE tried again on version v3.9.69 with the steps below.

1. Update the nodes so that only one remains schedulable.
# oc adm manage-node --schedulable xxx
# oc get nodes
NAME                                STATUS                     ROLES     AGE       VERSION
qe-lxia-39-master-etcd-nfs-1        Ready,SchedulingDisabled   master    17m       v1.9.1+a0ce1bc657
qe-lxia-39-node-registry-router-1   Ready                      compute   17m       v1.9.1+a0ce1bc657

2. Enable the predicate via scheduler.json.
# grep -i cinder /etc/origin/master/scheduler.json -A4 -B4
{
    "name": "MaxAzureDiskVolumeCount"
},
{
    "name": "MaxCinderVolumeCount"
},
{
    "name": "MatchInterPodAffinity"
},

3. Set KUBE_MAX_PD_VOLS=3.
# grep -i vol /etc/sysconfig/atomic-openshift-master-controllers
KUBE_MAX_PD_VOLS=3

4. Restart the API and controller services.

5. Create 4 PVCs and pods.
# oc get pods mypod{1..4}
NAME      READY     STATUS    RESTARTS   AGE
mypod1    1/1       Running   0          7m
mypod2    1/1       Running   0          6m
mypod3    1/1       Running   0          6m
mypod4    0/1       Pending   0          5m

# oc describe pod mypod4
Name:         mypod4
Namespace:    default
Node:         <none>
Labels:       <none>
Annotations:  openshift.io/scc=anyuid
Status:       Pending
IP:
Containers:
  dynamic:
    Image:        aosqe/hello-openshift
    Port:         80/TCP
    Environment:  <none>
    Mounts:
      /mnt/ocp_pv from dynamic (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-l9mpz (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  dynamic:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mypvc4
    ReadOnly:   false
  default-token-l9mpz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-l9mpz
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  14s (x22 over 5m)  default-scheduler  0/2 nodes are available: 1 MaxVolumeCount, 1 NodeUnschedulable.
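The result above is what a simple volume-count check would produce: with a limit of 3, the first three pods fit on the node and the fourth is rejected. The following is a minimal Python sketch of that idea, not the scheduler's actual code. The names `fits_volume_limit` and `schedule`, and the modelling of volumes as sets of claim names, are assumptions made for illustration.

```python
def fits_volume_limit(attached, requested, max_volumes):
    """Return True if the node can take the pod without exceeding max_volumes.

    attached:  volume names already in use on the node
    requested: volume names the new pod needs
    Volumes that are already on the node are not counted a second time.
    """
    attached = set(attached)
    new_volumes = set(requested) - attached
    return len(attached) + len(new_volumes) <= max_volumes


def schedule(pods, max_volumes):
    """Place pods one by one on a single node; return (placed, rejected)."""
    attached, placed, rejected = set(), [], []
    for name, volumes in pods:
        if fits_volume_limit(attached, volumes, max_volumes):
            attached |= set(volumes)
            placed.append(name)
        else:
            rejected.append(name)
    return placed, rejected


# Illustrative limit, matching KUBE_MAX_PD_VOLS=3 from step 3.
limit = 3
# Four pods, each with its own claim, as in step 5.
pods = [(f"mypod{i}", {f"mypvc{i}"}) for i in range(1, 5)]
placed, rejected = schedule(pods, limit)
print("placed:", placed)
print("rejected:", rejected)
```

In this model a pod that fails the check is never assigned to the node, which corresponds to mypod4 staying Pending with a FailedScheduling event instead of being scheduled and then failing at attach time.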
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0331