Bug 1734673
| Summary: | Rescheduling of pod causes local volume to fail to mount | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Liang Xia <lxia> |
| Component: | kube-scheduler | Assignee: | Maciej Szulik <maszulik> |
| Status: | CLOSED ERRATA | QA Contact: | Qin Ping <piqin> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.2.0 | CC: | aos-bugs, aos-storage-staff, bchilds, chaoyang, geliu, hekumar, maszulik, mfojtik |
| Target Milestone: | --- | ||
| Target Release: | 4.2.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-10-16 06:34:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
The project was cleaned up by our automation framework, so I don't have the bound PVC YAML.
Provide the YAML which is used to create the PVC.
---
{
    "kind": "PersistentVolumeClaim",
    "apiVersion": "v1",
    "metadata": {
        "name": "mypvc"
    },
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "volumeMode": "Block",
        "resources": {
            "requests": {
                "storage": "1Gi"
            }
        }
    }
}
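For reference, a claim like the JSON above can be created and watched until it binds with something along the following lines. This is only a sketch; the file name pvc.json is illustrative.

# Sketch: create the claim from the JSON above (pvc.json is an assumed local file name)
$ oc create -f pvc.json
# Watch the claim until it binds against one of the local block PVs
$ oc get pvc mypvc -w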
Maybe it looks like the one below (taken from another project, so it might differ).
$ oc get pvc mypvc -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2019-07-31T08:25:27Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    name: dynamic-pvc
  name: mypvc
  namespace: t5
  resourceVersion: "936299"
  selfLink: /api/v1/namespaces/t5/persistentvolumeclaims/mypvc
  uid: c0676594-b36c-11e9-8232-000d3a92e41c
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-block-sc
  volumeMode: Block
  volumeName: local-pv-158dfe47
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  phase: Bound
Got this reproduced.
$ oc get pvc,pod,pv
NAME                          STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS     AGE
persistentvolumeclaim/mypvc   Bound    local-pv-7f58a50f   1Gi        RWO            local-block-sc   136m

NAME        READY   STATUS              RESTARTS   AGE
pod/mypod   0/1     ContainerCreating   0          137m

NAME                                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM         STORAGECLASS          REASON   AGE
persistentvolume/local-pv-158dfe47   2Gi        RWO            Delete           Available                 local-block-sc                 136m
persistentvolume/local-pv-7f58a50f   1Gi        RWO            Delete           Bound       1i893/mypvc   local-block-sc                 21h
persistentvolume/local-pv-c692d3f2   2Gi        RWO            Delete           Available                 local-filesystem-sc            22h
persistentvolume/local-pv-f012ba9e   1Gi        RWO            Delete           Available                 local-filesystem-sc            24h
$ oc describe pod
Name:               mypod
Namespace:          1i893
Priority:           0
PriorityClassName:  <none>
Node:               qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx/10.0.32.5
Start Time:         Thu, 01 Aug 2019 11:37:59 +0800
Labels:             <none>
Annotations:        openshift.io/scc: restricted
Status:             Pending
IP:
Containers:
  mycontainer:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-tktlc (ro)
    Devices:
      /dev/myblock from myvolume
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  myvolume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mypvc
    ReadOnly:   false
  default-token-tktlc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-tktlc
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                       From                                                        Message
  ----     ------            ----                      ----                                                        -------
  Warning  FailedScheduling  137m (x2 over 137m)       default-scheduler                                           AssumePod failed: pod ba98a0cd-b40d-11e9-a378-000d3a92e02d is in the cache, so can't be assumed
  Warning  FailedScheduling  137m                      default-scheduler                                           Binding rejected: Operation cannot be fulfilled on pods/binding "mypod": pod mypod is already assigned to node "qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx"
  Warning  FailedScheduling  137m                      default-scheduler                                           Operation cannot be fulfilled on persistentvolumes "local-pv-158dfe47": the object has been modified; please apply your changes to the latest version and try again
  Warning  FailedScheduling  137m                      default-scheduler                                           AssumePod failed: pod ba98a0cd-b40d-11e9-a378-000d3a92e02d is in the cache, so can't be assumed
  Warning  FailedScheduling  137m (x2 over 137m)       default-scheduler                                           AssumePod failed: pod ba98a0cd-b40d-11e9-a378-000d3a92e02d is in the cache, so can't be assumed
  Warning  FailedScheduling  137m                      default-scheduler                                           pod "1i893/mypod" does not exist any more
  Normal   Scheduled         137m                      default-scheduler                                           Successfully assigned 1i893/mypod to qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx
  Warning  FailedMount       2m30s (x80214 over 137m)  kubelet, qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx  MapVolume.NodeAffinity check failed for volume "local-pv-7f58a50f" : No matching NodeSelectorTerms
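The FailedMount event above is the visible symptom of the mis-schedule: the pod was bound to a node that does not satisfy the node affinity of the PV backing mypvc. A quick way to confirm the mismatch with the objects from this reproduction (a sketch, not part of the original report):

# Node the pod was actually scheduled to
$ oc get pod mypod -o jsonpath='{.spec.nodeName}{"\n"}'

# Node(s) the bound local PV may be used from, per its node affinity
$ oc get pv local-pv-7f58a50f -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms}{"\n"}'

# If the hostname in the PV's node affinity does not match the pod's node,
# the kubelet reports the "No matching NodeSelectorTerms" error shown above.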
Typo in last message - "The entire point of pods that use local-storage is that they can't be rescheduled on another Node. I am curious how did we manage to reschedule them!"

I got this to reproduce in my own Amazon cluster too. Getting multiple schedulers to start acting as "leaders" (even if they don't hold the leader lock) requires a loss of connectivity from the scheduler to the api-server. The loss of connection to the api-server causes the leader lock lease to move around. For example:

# When the cluster first came up
scheduler A --> waiting on lease lock
scheduler B --> has leader lock
scheduler C --> waiting on lease lock

# Loss of connectivity to api-server
scheduler A --> new leader
scheduler B --> lost leader lock (because of api-server connectivity)
scheduler C --> waiting on lease lock

At this point, both scheduler A and scheduler B can schedule pods: scheduler A is the new leader, while scheduler B, although no longer the current leader, still has its `Run` loop running. It is still watching pods/PVCs etc. and doing everything a scheduler does. Unlike controller-managers (https://github.com/kubernetes/kubernetes/blob/master/cmd/kube-controller-manager/app/controllermanager.go#L282), schedulers don't crash when they lose the leader lock (https://github.com/kubernetes/kubernetes/blob/master/cmd/kube-scheduler/app/server.go#L265).

Upstream PR: https://github.com/kubernetes/kubernetes/pull/81306

Verified with: 4.2.0-0.nightly-2019-08-29-170426
No multiple schedulers are in active status (see the sketch below for one way to check this).

*** Bug 1734612 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922
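As a rough way to double-check the "no multiple active schedulers" verification above, one can inspect the scheduler leader-election record and the scheduler pods' logs. This is only a sketch: it assumes the upstream default lock (an endpoints object named kube-scheduler in kube-system, with the holder recorded in the control-plane.alpha.kubernetes.io/leader annotation); the lock object, namespace, and pod names can differ on an OpenShift cluster.

# Assumption: upstream default endpoints lock; adjust namespace/object for your cluster
$ oc -n kube-system get endpoints kube-scheduler -o yaml | grep control-plane.alpha.kubernetes.io/leader

# Tail each scheduler pod's log and check that only the current leader is
# actively scheduling (pod names and namespace are environment specific)
$ for p in $(oc get pods -n openshift-kube-scheduler -o name); do \
    echo "== $p"; oc logs -n openshift-kube-scheduler --all-containers "$p" --tail=20; done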
Description of problem:
Pod is rescheduled due to some conflict, which causes the local volume to fail to mount.

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-07-28-222114
local-storage-operator.v4.2.0

How reproducible:
Not sure

Steps to Reproduce:
1. Deploy local-storage-operator with several disks.
2. Create pod and pvc (see the reconstructed manifest sketch after this report).
3. Check the pod.

Actual results:
Pod in ContainerCreating status.

Expected results:
Pod is up and running with local volume.

Additional info:
$ oc describe pod
Name:               mypod
Namespace:          6bgu5
Priority:           0
PriorityClassName:  <none>
Node:               qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb/10.0.32.4
Start Time:         Wed, 31 Jul 2019 13:38:24 +0800
Labels:             <none>
Annotations:        openshift.io/scc: restricted
Status:             Pending
IP:
Containers:
  mycontainer:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pdwkv (ro)
    Devices:
      /dev/myblock from myvolume
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  myvolume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mypvc
    ReadOnly:   false
  default-token-pdwkv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pdwkv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                    From                                                        Message
  ----     ------            ----                   ----                                                        -------
  Warning  FailedScheduling  2m21s                  default-scheduler                                           Operation cannot be fulfilled on persistentvolumes "local-pv-7f58a50f": the object has been modified; please apply your changes to the latest version and try again
  Warning  FailedScheduling  2m21s                  default-scheduler                                           AssumePod failed: pod 6926f0bd-b355-11e9-8232-000d3a92e41c is in the cache, so can't be assumed
  Warning  FailedScheduling  2m20s                  default-scheduler                                           pod "6bgu5/mypod" does not exist any more
  Warning  FailedScheduling  2m14s (x2 over 2m14s)  default-scheduler                                           AssumePod failed: pod 6926f0bd-b355-11e9-8232-000d3a92e41c is in the cache, so can't be assumed
  Warning  FailedScheduling  2m13s (x2 over 2m14s)  default-scheduler                                           AssumePod failed: pod 6926f0bd-b355-11e9-8232-000d3a92e41c is in the cache, so can't be assumed
  Normal   Scheduled         2m13s                  default-scheduler                                           Successfully assigned 6bgu5/mypod to qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb
  Warning  FailedScheduling  2m13s                  default-scheduler                                           Binding rejected: Operation cannot be fulfilled on pods/binding "mypod": pod mypod is already assigned to node "qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb"
  Warning  FailedMount       2m9s (x25 over 2m12s)  kubelet, qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb  MapVolume.NodeAffinity check failed for volume "local-pv-158dfe47" : No matching NodeSelectorTerms

$ oc get pv local-pv-158dfe47 local-pv-7f58a50f -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      pv.kubernetes.io/provisioned-by: local-volume-provisioner-qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx-740f8302-b1a4-11e9-9ac3-000d3a92e440
    creationTimestamp: "2019-07-31T06:54:16Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      storage.openshift.com/local-volume-owner-name: local-disks
      storage.openshift.com/local-volume-owner-namespace: local-storage
    name: local-pv-158dfe47
    resourceVersion: "910222"
    selfLink: /api/v1/persistentvolumes/local-pv-158dfe47
    uid: 033566b5-b360-11e9-a378-000d3a92e02d
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 2Gi
    local:
      path: /mnt/local-storage/local-block-sc/sdd
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx
    persistentVolumeReclaimPolicy: Delete
    storageClassName: local-block-sc
    volumeMode: Block
  status:
    phase: Available
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      pv.kubernetes.io/provisioned-by: local-volume-provisioner-qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb-8790b985-b1a4-11e9-bde8-000d3a92e02d
    creationTimestamp: "2019-07-31T06:53:25Z"
    finalizers:
    - kubernetes.io/pv-protection
    labels:
      storage.openshift.com/local-volume-owner-name: local-disks
      storage.openshift.com/local-volume-owner-namespace: local-storage
    name: local-pv-7f58a50f
    resourceVersion: "909927"
    selfLink: /api/v1/persistentvolumes/local-pv-7f58a50f
    uid: e50c7cfa-b35f-11e9-a378-000d3a92e02d
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 1Gi
    local:
      path: /mnt/local-storage/local-block-sc/sdd
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb
    persistentVolumeReclaimPolicy: Delete
    storageClassName: local-block-sc
    volumeMode: Block
  status:
    phase: Available
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

$ oc get nodes --show-labels
NAME                                                STATUS   ROLES    AGE    VERSION             LABELS
qe-lxia-0728-222114-d82x6-master-0                  Ready    master   2d5h   v1.14.0+2e9d4a117   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_D4s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=centralus,failure-domain.beta.kubernetes.io/zone=centralus-2,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lxia-0728-222114-d82x6-master-0,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
qe-lxia-0728-222114-d82x6-master-1                  Ready    master   2d5h   v1.14.0+2e9d4a117   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_D4s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=centralus,failure-domain.beta.kubernetes.io/zone=centralus-1,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lxia-0728-222114-d82x6-master-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
qe-lxia-0728-222114-d82x6-master-2                  Ready    master   2d5h   v1.14.0+2e9d4a117   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_D4s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=centralus,failure-domain.beta.kubernetes.io/zone=centralus-3,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lxia-0728-222114-d82x6-master-2,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx   Ready    worker   2d5h   v1.14.0+2e9d4a117   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_D2s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=centralus,failure-domain.beta.kubernetes.io/zone=centralus-1,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lxia-0728-222114-d82x6-worker-centralus1-hnlcx,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb   Ready    worker   2d5h   v1.14.0+2e9d4a117   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_D2s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=centralus,failure-domain.beta.kubernetes.io/zone=centralus-2,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lxia-0728-222114-d82x6-worker-centralus2-g79gb,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
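For reference, a pod matching the oc describe output above (raw block device /dev/myblock backed by claim mypvc, image aosqe/hello-openshift) would look roughly like the following. This is a reconstruction from the describe output, not the exact manifest used in the test.

$ oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: mycontainer
    image: aosqe/hello-openshift
    ports:
    - containerPort: 80
    # Raw block device exposed inside the container (the PVC uses volumeMode: Block)
    volumeDevices:
    - name: myvolume
      devicePath: /dev/myblock
  volumes:
  - name: myvolume
    persistentVolumeClaim:
      claimName: mypvc
EOF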