Description of problem:
When redeploying the pod, the container is stuck at ContainerCreating; the volume is not unbound from the PVC when the previous pod is deleted:

Jun 19 14:25:51 osbeta-rtp-master0 atomic-openshift-master-api: I0619 14:25:51.116038 89051 trace.go:61] Trace "Delete /api/v1/namespaces/fi-gf-or/persistentvolumeclaims/gv1-pvc-rhub1" (started 2017-06-19 14:25:50.716046621 -0400 EDT):
Jun 19 14:25:51 osbeta-rtp-master0 atomic-openshift-master-api: "Delete /api/v1/namespaces/fi-gf-or/persistentvolumeclaims/gv1-pvc-rhub1" [399.861166ms] [258.542µs] END
Jun 20 23:21:04 osbeta-rtp-master0 atomic-openshift-master-controllers: I0620 23:21:04.716009 73128 pv_controller.go:389] synchronizing PersistentVolume[pvc-dc15f42e-52cc-11e7-b8f8-fa163ee805fd]: phase: Failed, bound to: "fi-gf-or/gv1-pvc-rhub1 (uid: dc15f42e-52cc-11e7-b8f8-fa163ee805fd)", boundByController: true

Version-Release number of selected component (if applicable):
- oc v3.4.1.18
- OSP 8 - Liberty
- What Cinder API version(s) is being used (v1, v2, v3)? v1 and v2

How reproducible:
Sometimes

Steps to Reproduce:
1. Deploy a pod using the dynamic Cinder storage
2. Re-deploy the pod

Actual results:
Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "rhub1"/"fi-gf-or". list of unattached/unmounted volumes=[log]

Expected results:
Pod is correctly deployed.

Additional info:
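For reference, a hedged sketch of the reproduction sequence in `oc` terms; the manifest filenames and the pod name are assumptions inferred from the names appearing in the report, not commands the customer confirmed running:

```shell
# Illustrative only -- manifest filenames and the pod name are guesses from the report.
oc create -f gv1-pvc-rhub1.yaml   # PVC; a Cinder PV is dynamically provisioned and bound
oc create -f rhub1-pod.yaml       # pod mounting the PVC starts normally
oc delete pod rhub1               # "re-deploy": delete the pod...
oc create -f rhub1-pod.yaml       # ...and create it again
oc get pod rhub1                  # sometimes stays in ContainerCreating
```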
The pvc definition:

$ cat gv1-pvc-rhub1.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.alpha.kubernetes.io/storage-class: dynamic
  name: gv1-pvc-rhub1
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
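The pod spec itself was not attached; a minimal manifest of the kind that would consume this claim might look like the following (pod name, image, and mount path are assumptions; only the claim name, namespace, and the volume name "log" come from the report):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rhub1                # assumed from the event message "rhub1"/"fi-gf-or"
  namespace: fi-gf-or
spec:
  containers:
  - name: rhub1
    image: registry.example.com/rhub1:latest   # placeholder image
    volumeMounts:
    - name: log
      mountPath: /var/log/rhub1                # assumed mount path
  volumes:
  - name: log                                  # matches "unmounted volumes=[log]"
    persistentVolumeClaim:
      claimName: gv1-pvc-rhub1
```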
> When redeploying the pod, container stands at ContainerCreating, the volume is not unbound of the pvc when the previous pod is deleted

This makes no sense to me. PVCs are not bound to pods, they're bound to PVs. Why would the customer unbind their PVC from a PV? How does it relate to redeploying a pod? Can you please get some steps to reproduce from the customer? What did they create first, what second, and how did they "redeploy" their pod?

In the log in the support ticket I can see this:

Jun 20 23:27:34 osbeta-rtp-master0 atomic-openshift-master-controllers: I0620 23:27:34.737333 73128 pv_controller.go:436] synchronizing PersistentVolume[pvc-dc15f42e-52cc-11e7-b8f8-fa163ee805fd]: claim fi-gf-or/gv1-pvc-rhub1 has different UID, the old one must have been deleted

That means that they created a PVC and it got bound to a dynamically provisioned PV. Then they deleted the PVC and created a new one with the same name. While there is nothing really bad about this, it shows that some not-so-standard actions were performed, probably in an attempt to fix the problem.

In addition, this dynamically provisioned PV should have been automatically deleted when they deleted the first PVC. It is 'Failed' instead, meaning that deletion did not succeed, but I don't know why.

To sum it up, there is something weird going on and we need further details. Please get:

oc get pod -o yaml && oc describe pod (so we can see what pod uses what PVC and the states of the pods)
oc get pvc -o yaml && oc describe pvc (so we can see all the PVCs referenced by the pods and their states)
oc get pv -o yaml && oc describe pv (so we can see the PVs and their states)

Full logs from the master and from the node that can't run the problematic pod would be very helpful too - the snippet the customer attached shows that something odd is going on with the PVCs, but it does not show why the node can't run the pod.
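The requested dumps could be collected in one pass with something like the sketch below; it needs a logged-in `oc` session, and the namespace is taken from the report's log lines, so adjust it if the pod lives elsewhere:

```shell
# Sketch for collecting the requested state -- namespace fi-gf-or is an
# assumption taken from the log lines in the report.
NS=fi-gf-or

oc get pod -n "$NS" -o yaml   > pods.yaml
oc describe pod -n "$NS"      > pods-describe.txt
oc get pvc -n "$NS" -o yaml   > pvcs.yaml
oc describe pvc -n "$NS"      > pvcs-describe.txt
oc get pv -o yaml             > pvs.yaml          # PVs are cluster-scoped
oc describe pv                > pvs-describe.txt
```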
The tarball with YAML files that the customer provided is nice, but it does not show the *current* state of the system, only the initial one; that's why we need oc get * -o yaml.