Description of problem: While investigating a container that was stuck in the ContainerCreating state, I noticed that the volume it was trying to use was attached to another node. When I logged into that other node, I saw that the volume was indeed attached but completely unmounted. The last line of the node logs looks like:

operation_executor.go:1267] UnmountDevice succeeded for volume "kubernetes.io/aws-ebs/aws://us-east-2a/vol-03b5d7dbf226280fa" (spec.Name: "pvc-66da286a-2bb2-11e7-9954-02e52a0be43d").

Back on the controller, I could see that no detach had been attempted for this volume, and as a result the pod was stuck in the ContainerCreating state. The controller had an uptime of 21 hours, so it had not been restarted since the volume was attached to the node in question.
I think the "device is busy" error was confusing and wasn't really the source of the problem in this case. The root cause is that a terminated pod doesn't detach its volumes in Kubernetes. I have opened an upstream bug to track this: https://github.com/kubernetes/kubernetes/issues/45191
I have opened a PR that fixes this - https://github.com/kubernetes/kubernetes/pull/45286
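To make the expected behavior concrete, here is a minimal Go sketch of the idea: pods that have reached a terminal phase (Succeeded or Failed) should no longer count as needing their volumes, so an attachment that is only held by such a pod becomes eligible for detach. This is not the code from the PR. The types `Pod` and `Attachment` and the functions `isTerminated`, `desiredAttachments`, and `attachmentsToDetach` are hypothetical stand-ins invented for illustration. Only the phase names match the values Kubernetes reports.

```go
package main

import "fmt"

// PodPhase mirrors the phase strings Kubernetes reports for a pod.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// Pod is a hypothetical, trimmed-down stand-in for the real API object.
type Pod struct {
	Name    string
	Node    string // empty until the pod is scheduled
	Phase   PodPhase
	Volumes []string
}

// Attachment is a hypothetical record of a volume attached to a node.
type Attachment struct {
	Volume string
	Node   string
}

// isTerminated reports whether the pod has finished and will not run again.
func isTerminated(p Pod) bool {
	return p.Phase == PodSucceeded || p.Phase == PodFailed
}

// desiredAttachments returns, per volume, the nodes that still need it.
// Terminated and unscheduled pods contribute nothing.
func desiredAttachments(pods []Pod) map[string]map[string]bool {
	desired := make(map[string]map[string]bool)
	for _, p := range pods {
		if isTerminated(p) || p.Node == "" {
			continue
		}
		for _, v := range p.Volumes {
			if desired[v] == nil {
				desired[v] = make(map[string]bool)
			}
			desired[v][p.Node] = true
		}
	}
	return desired
}

// attachmentsToDetach lists current attachments that no live pod requires.
func attachmentsToDetach(attached []Attachment, pods []Pod) []Attachment {
	desired := desiredAttachments(pods)
	var out []Attachment
	for _, a := range attached {
		// Indexing a missing inner map yields false, so this is safe.
		if !desired[a.Volume][a.Node] {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	vol := "vol-03b5d7dbf226280fa"
	pods := []Pod{
		// Finished pod on node-a: it must not keep the volume pinned there.
		{Name: "test-pod", Node: "node-a", Phase: PodSucceeded, Volumes: []string{vol}},
		// New pod on node-b waiting for the same volume.
		{Name: "next-pod", Node: "node-b", Phase: PodPending, Volumes: []string{vol}},
	}
	attached := []Attachment{{Volume: vol, Node: "node-a"}}
	fmt.Println(attachmentsToDetach(attached, pods))
}
```

In this sketch the attachment on node-a is reported for detach because its only consumer has completed, which is the step that was missing in the scenario described in this bug. The node names are made up for the example.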
Origin PR opened: https://github.com/openshift/origin/pull/14191
The test passed in the container env.

oc version
oc v3.6.121
kubernetes v1.6.1+5115d708d7

1. Create the PVC

    {
      "kind": "PersistentVolumeClaim",
      "apiVersion": "v1",
      "metadata": {
        "name": "ebsc",
        "annotations": {
          "volume.beta.kubernetes.io/storage-class": "gp2"
        }
      },
      "spec": {
        "accessModes": [
          "ReadWriteOnce"
        ],
        "resources": {
          "requests": {
            "storage": "1Gi"
          }
        }
      }
    }

2. Create the pod

    kind: Pod
    apiVersion: v1
    metadata:
      name: test-pod
    spec:
      containers:
      - name: test-pod
        image: gcr.io/google_containers/busybox:1.24
        command:
        - "/bin/sh"
        args:
        - "-c"
        - "touch /mnt/SUCCESS && exit 0 || exit 1"
        volumeMounts:
        - name: ebs-pvc
          mountPath: "/mnt"
      restartPolicy: "Never"
      volumes:
      - name: ebs-pvc
        persistentVolumeClaim:
          claimName: ebsc

After the pod reaches Completed, the EBS volume becomes available in the AWS web console.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716