Bug 1668893
| Summary: | 3.9 Clarification on KUBE_MAX_PD_VOLS for OpenShift/OpenStack Integration | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Hemant Kumar <hekumar> |
| Component: | Storage | Assignee: | Hemant Kumar <hekumar> |
| Status: | CLOSED ERRATA | QA Contact: | Liang Xia <lxia> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.9.0 | CC: | agogala, aos-bugs, aos-storage-staff, hekumar |
| Target Milestone: | --- | ||
| Target Release: | 3.9.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1659442 | Environment: | |
| Last Closed: | 2019-02-20 08:46:56 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1659442, 1669543, 1669544 | ||
| Bug Blocks: | |||
QE tried again on version v3.9.69 with the steps below.
1. Update the nodes so that only one remains schedulable.
# oc adm manage-node --schedulable xxx
# oc get nodes
NAME STATUS ROLES AGE VERSION
qe-lxia-39-master-etcd-nfs-1 Ready,SchedulingDisabled master 17m v1.9.1+a0ce1bc657
qe-lxia-39-node-registry-router-1 Ready compute 17m v1.9.1+a0ce1bc657
2. Enable the MaxCinderVolumeCount predicate via scheduler.json.
# grep -i cinder /etc/origin/master/scheduler.json -A4 -B4
{
"name": "MaxAzureDiskVolumeCount"
},
{
"name": "MaxCinderVolumeCount"
},
{
"name": "MatchInterPodAffinity"
},
3. Set KUBE_MAX_PD_VOLS=3
# grep -i vol /etc/sysconfig/atomic-openshift-master-controllers
KUBE_MAX_PD_VOLS=3
4. Restart the API and controller services.
5. Create 4 PVC/pod pairs.
# oc get pods mypod{1..4}
NAME READY STATUS RESTARTS AGE
mypod1 1/1 Running 0 7m
mypod2 1/1 Running 0 6m
mypod3 1/1 Running 0 6m
mypod4 0/1 Pending 0 5m
# oc describe pod mypod4
Name: mypod4
Namespace: default
Node: <none>
Labels: <none>
Annotations: openshift.io/scc=anyuid
Status: Pending
IP:
Containers:
dynamic:
Image: aosqe/hello-openshift
Port: 80/TCP
Environment: <none>
Mounts:
/mnt/ocp_pv from dynamic (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-l9mpz (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
dynamic:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mypvc4
ReadOnly: false
default-token-l9mpz:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-l9mpz
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 14s (x22 over 5m) default-scheduler 0/2 nodes are available: 1 MaxVolumeCount, 1 NodeUnschedulable.
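The result above (three pods Running, mypod4 Pending with a MaxVolumeCount failure) is what a per-node cap of 3 volumes would produce. As a rough illustration only, the Python sketch below models that kind of check; it is not the scheduler's actual code. The function names, the `default` value of 16, and the fallback to the default when KUBE_MAX_PD_VOLS is unset or invalid are all assumptions made for this example.

```python
import os


def max_volumes_from_env(default, env=os.environ):
    """Read KUBE_MAX_PD_VOLS from env; fall back to default (assumed behaviour)."""
    raw = env.get("KUBE_MAX_PD_VOLS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    # A zero or negative limit is treated as invalid in this sketch.
    return value if value > 0 else default


def fits_max_volume_count(attached_volumes, new_pod_volumes, max_volumes):
    """Return True if adding the pod keeps the node at or below max_volumes unique volumes."""
    # A volume already attached to the node is not counted twice.
    unique_after = set(attached_volumes) | set(new_pod_volumes)
    return len(unique_after) <= max_volumes


# Replay the test from this comment: limit 3, one claim per pod (mypvc1..mypvc4).
limit = max_volumes_from_env(default=16, env={"KUBE_MAX_PD_VOLS": "3"})
attached = []
for i in range(1, 5):
    claim = f"mypvc{i}"
    if fits_max_volume_count(attached, [claim], limit):
        attached.append(claim)
        print(f"mypod{i}: schedulable")
    else:
        print(f"mypod{i}: Pending (MaxVolumeCount)")
```

With the limit set to 3, the sketch reports the first three pods as schedulable and the fourth as Pending, mirroring the `oc get pods` output in this comment.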
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0331
QE checked the bug on version v3.9.69 with the steps below; the number of volumes that can be attached to a single OpenStack instance appears to be limited to 26. @Hemant Kumar, could you confirm whether this is expected?
1. Update the nodes so that only one remains schedulable.
# oc adm manage-node --schedulable xxx
2. Enable the MaxCinderVolumeCount predicate via scheduler.json.
# grep -i cinder /etc/origin/master/scheduler.json -A4 -B4
{
"name": "MaxAzureDiskVolumeCount"
},
{
"name": "MaxCinderVolumeCount"
},
{
"name": "MatchInterPodAffinity"
},
3. Restart the API and controller services.
4. Keep creating PVCs and pods.
# oc get pods
NAME READY STATUS RESTARTS AGE
mypod01 1/1 Running 0 18m
mypod02 1/1 Running 0 18m
mypod03 1/1 Running 0 17m
mypod04 1/1 Running 0 17m
mypod05 1/1 Running 0 17m
mypod06 1/1 Running 0 17m
mypod07 1/1 Running 0 17m
mypod08 1/1 Running 0 17m
mypod09 1/1 Running 0 17m
mypod10 1/1 Running 0 17m
mypod11 1/1 Running 0 17m
mypod12 1/1 Running 0 17m
mypod13 1/1 Running 0 16m
mypod14 1/1 Running 0 16m
mypod15 1/1 Running 0 16m
mypod16 1/1 Running 0 16m
mypod17 1/1 Running 0 16m
mypod18 1/1 Running 0 16m
mypod19 1/1 Running 0 16m
mypod20 1/1 Running 0 16m
mypod21 1/1 Running 0 16m
mypod22 1/1 Running 0 16m
mypod23 1/1 Running 0 15m
mypod24 1/1 Running 0 15m
mypod25 1/1 Running 0 15m
mypod26 0/1 ContainerCreating 0 15m
mypod27 0/1 ContainerCreating 0 15m
# oc describe pod mypod26
Name: mypod26
Namespace: bz1668893
Node: qe-chaoyang-node-registry-router-1/10.0.77.49
Start Time: Tue, 12 Feb 2019 01:00:18 -0500
Labels: <none>
Annotations: openshift.io/scc=anyuid
Status: Pending
IP:
Containers:
dynamic:
Container ID:
Image: aosqe/hello-openshift
Image ID:
Port: 80/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/mnt/pv from dynamic (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-bs4hl (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
dynamic:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mypvc26
ReadOnly: false
default-token-bs4hl:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-bs4hl
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.kubernetes.io/compute=true
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16m default-scheduler Successfully assigned mypod26 to qe-chaoyang-node-registry-router-1
Normal SuccessfulMountVolume 16m kubelet, qe-chaoyang-node-registry-router-1 MountVolume.SetUp succeeded for volume "default-token-bs4hl"
Warning FailedAttachVolume 1m (x15 over 16m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-795e2daf-2e8b-11e9-b868-fa163eb7596d" : failed to attach 66ece5e3-34c6-4408-8c95-4f2de0859adc volume to 27947252-a153-431d-9c27-e1bd4bc5ebbc compute: Internal Server Error
Warning FailedMount 30s (x7 over 14m) kubelet, qe-chaoyang-node-registry-router-1 Unable to mount volumes for pod "mypod26_bz1668893(79a63ad0-2e8b-11e9-b868-fa163eb7596d)": timeout expired waiting for volumes to attach/mount for pod "bz1668893"/"mypod26". list of unattached/unmounted volumes=[dynamic]
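In the `oc get pods` listing from this comment, 25 pods are Running and 2 are stuck in ContainerCreating. For longer listings, a small helper can tally the STATUS column rather than counting by eye. The Python sketch below is a hypothetical convenience script, not part of any OpenShift tooling; it assumes the default five-column `oc get pods` layout (NAME, READY, STATUS, RESTARTS, AGE), and the sample text is a shortened excerpt of the listing.

```python
from collections import Counter


def tally_pod_status(oc_get_pods_output):
    """Count pods per STATUS value in plain `oc get pods` text output."""
    counts = Counter()
    for line in oc_get_pods_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a five-column pod row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        counts[fields[2]] += 1
    return counts


# Shortened excerpt of the listing in this comment.
sample = """
NAME READY STATUS RESTARTS AGE
mypod24 1/1 Running 0 15m
mypod25 1/1 Running 0 15m
mypod26 0/1 ContainerCreating 0 15m
mypod27 0/1 ContainerCreating 0 15m
"""

for status, count in sorted(tally_pod_status(sample).items()):
    print(f"{status}: {count}")
```

In practice the text would come from saving the command output to a file or piping it into the script; the excerpt is inlined here only to keep the sketch self-contained.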