Bug 1446788 - Volume failed to detach even after unmount is successful on the node
Summary: Volume failed to detach even after unmount is successful on the node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.5.0
Hardware: All
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Hemant Kumar
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks: 1450215
Reported: 2017-04-28 22:17 UTC by Hemant Kumar
Modified: 2017-08-16 19:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: OpenShift did not attempt a detach operation for pods that had completed or terminated but had not been deleted from the API server. Consequence: Volumes could be left attached to old nodes, which prevented their reuse in other pods. Fix: Volumes are now detached for pods that have completed or terminated. Result: Volumes for terminated or completed pods are detached automatically, and users are free to reuse them in other pods.
Clone Of:
: 1450215 (view as bug list)
Environment:
Last Closed: 2017-08-10 05:21:25 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2017:1716 | 0 | normal | SHIPPED_LIVE | Red Hat OpenShift Container Platform 3.6 RPM Release Advisory | 2017-08-10 09:02:50 UTC

Description Hemant Kumar 2017-04-28 22:17:23 UTC
Description of problem:

While investigating a container that was stuck in the ContainerCreating state, I noticed that the volume it was trying to use was attached to another node.

When I logged into that other node, I saw that the volume was indeed attached but completely unmounted. The last line of the node log looks like:

operation_executor.go:1267] UnmountDevice succeeded for volume "kubernetes.io/aws-ebs/aws://us-east-2a/vol-03b5d7dbf226280fa" (spec.Name: "pvc-66da286a-2bb2-11e7-9954-02e52a0be43d").
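
When correlating node logs with the controller, it can help to pull the EBS volume ID and the PV name out of a message like the one above. The following is only a rough sketch: the regular expression is derived from this single log line, and the function and group names are made up for illustration, not part of any Kubernetes tooling. It also assumes the AWS EBS volume naming seen here.

```python
import re

# Pattern derived from the single kubelet message quoted in this report.
# The group names are illustrative only.
UNMOUNT_RE = re.compile(
    r'UnmountDevice succeeded for volume "(?P<volume>[^"]+)"\s*'
    r'\(spec\.Name: "(?P<spec_name>[^"]+)"\)'
)


def parse_unmount_line(line):
    """Return (plugin, volume_id, spec_name), or None if the line does not match.

    Assumes the AWS EBS form "<plugin>/aws://<zone>/<volume-id>".
    """
    match = UNMOUNT_RE.search(line)
    if match is None:
        return None
    full_name = match.group("volume")
    # Split "kubernetes.io/aws-ebs/aws://us-east-2a/vol-..." into the plugin
    # name and the zone/volume part, then keep only the volume ID.
    plugin, _, location = full_name.partition("/aws://")
    volume_id = location.rsplit("/", 1)[-1]
    return plugin, volume_id, match.group("spec_name")


line = (
    'operation_executor.go:1267] UnmountDevice succeeded for volume '
    '"kubernetes.io/aws-ebs/aws://us-east-2a/vol-03b5d7dbf226280fa" '
    '(spec.Name: "pvc-66da286a-2bb2-11e7-9954-02e52a0be43d").'
)
print(parse_unmount_line(line))
```

The extracted volume ID can then be searched for in the controller log to check whether a detach was ever attempted.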

Going back to the controller, I could see that no detach had been attempted for this volume, and as a result the pod was stuck in the ContainerCreating state.

The controller had an uptime of 21 hours, so it had not been restarted since the volume was attached to the node in question.





Comment 6 Hemant Kumar 2017-05-01 22:05:17 UTC
I think the "device is busy" error was confusing and wasn't really the source of the problem in this case. The root cause is that Kubernetes does not detach the volumes of a terminated pod. I have opened an upstream bug to track this - https://github.com/kubernetes/kubernetes/issues/45191
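
To make the root cause concrete, here is a simplified model of the decision involved. It is not the actual Kubernetes controller code: the data shapes (plain dicts, a claim-name-to-node map) and both function names are assumptions chosen for illustration. The only Kubernetes facts relied on are that "Succeeded" and "Failed" are the terminal pod phases. The idea is that a terminated pod should no longer count as a user of its volume, so the volume becomes eligible for detach even though the pod object still exists in the API server.

```python
# Terminal pod phases in Kubernetes.
TERMINAL_PHASES = {"Succeeded", "Failed"}


def pod_is_terminated(pod):
    """True if the (simplified, dict-based) pod has reached a terminal phase."""
    return pod.get("status", {}).get("phase") in TERMINAL_PHASES


def volumes_to_detach(attached, pods):
    """Return the sorted claim names that are attached but no longer needed.

    attached: hypothetical map of claim name -> node it is attached to.
    pods: list of simplified pod dicts that still exist in the API server.
    """
    in_use = set()
    for pod in pods:
        # Terminated pods must not keep their volumes attached.
        if pod_is_terminated(pod):
            continue
        node = pod["spec"].get("nodeName")
        for volume in pod["spec"].get("volumes", []):
            claim = volume.get("persistentVolumeClaim")
            if claim:
                in_use.add((node, claim["claimName"]))
    return sorted(
        name for name, node in attached.items() if (node, name) not in in_use
    )


def make_pod(phase):
    """Build a minimal pod dict that uses claim "ebsc" on node "node-a"."""
    return {
        "spec": {
            "nodeName": "node-a",
            "volumes": [
                {"name": "ebs-pvc", "persistentVolumeClaim": {"claimName": "ebsc"}}
            ],
        },
        "status": {"phase": phase},
    }


attached = {"ebsc": "node-a"}
print(volumes_to_detach(attached, [make_pod("Running")]))
print(volumes_to_detach(attached, [make_pod("Succeeded")]))
```

In the buggy behaviour described in this report, the check for terminated pods was effectively missing, so a completed pod kept its volume attached until the pod object was deleted.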

Comment 9 Hemant Kumar 2017-05-04 00:40:27 UTC
I have opened a PR that fixes this - https://github.com/kubernetes/kubernetes/pull/45286

Comment 11 Hemant Kumar 2017-05-15 13:55:09 UTC
PR opened https://github.com/openshift/origin/pull/14191

Comment 14 Chao Yang 2017-06-27 07:18:55 UTC
Test passed in a containerized environment.
oc version
oc v3.6.121
kubernetes v1.6.1+5115d708d7

1. Create the PVC:
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "ebsc",
    "annotations": {
        "volume.beta.kubernetes.io/storage-class": "gp2"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "1Gi"
      }
    }
  }
}
2. Create the pod:
kind: Pod
apiVersion: v1
metadata:
  name: test-pod
spec:
  containers:
  - name: test-pod
    image: gcr.io/google_containers/busybox:1.24
    command:
      - "/bin/sh"
    args:
      - "-c"
      - "touch /mnt/SUCCESS && exit 0 || exit 1"
    volumeMounts:
      - name: ebs-pvc
        mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: ebs-pvc
      persistentVolumeClaim:
        claimName: ebsc
After the pod reaches the Completed state, the EBS volume becomes available in the AWS web console.
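
The check above was done by hand in the AWS web console. If it needed to be repeated, the wait could be scripted roughly as below. This is a sketch under assumptions: `wait_for_state` is a hypothetical helper, and `get_state` stands for any callable that returns the current volume state as a string ("in-use", "available", and so on). The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.

```python
import time


def wait_for_state(get_state, wanted="available", timeout=120.0, interval=5.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns `wanted` or `timeout` seconds pass.

    Returns True if the wanted state was observed, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated state sequence standing in for real lookups.
states = iter(["in-use", "in-use", "available"])
print(wait_for_state(lambda: next(states), sleep=lambda seconds: None))
```

With boto3, `get_state` could for instance wrap `describe_volumes(VolumeIds=[...])` on an EC2 client and read the `State` field of the returned volume. That choice of tooling is an assumption and was not part of the original verification.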

Comment 16 errata-xmlrpc 2017-08-10 05:21:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716

