Bug 1446788 - Volume failed to detach even after unmount is successful on the node
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.5.0
Hardware: All
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Hemant Kumar
QA Contact: chaoyang
Depends On:
Blocks: 1450215
Reported: 2017-04-28 18:17 EDT by Hemant Kumar
Modified: 2017-08-16 15 EDT
CC: 6 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: OpenShift does not attempt a detach operation for pods that are completed or terminated but not yet deleted from the API server. Consequence: Volumes can be left attached to old nodes, preventing reuse of the volume in other pods. Fix: Implement support for detaching volumes of pods that are completed or terminated. Result: With this fix, volumes of terminated or completed pods are detached automatically, and users are free to reuse such volumes in other pods.
Story Points: ---
Clone Of:
Clones: 1450215
Environment:
Last Closed: 2017-08-10 01:21:25 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
  Tracker ID:   Red Hat Product Errata RHEA-2017:1716
  Priority:     normal
  Status:       SHIPPED_LIVE
  Summary:      Red Hat OpenShift Container Platform 3.6 RPM Release Advisory
  Last Updated: 2017-08-10 05:02:50 EDT

Description Hemant Kumar 2017-04-28 18:17:23 EDT
Description of problem:

While investigating one of the containers that was stuck in ContainerCreating state, I noticed that the volume it is trying to use is attached to another node.

When I logged into this other node, I saw that the volume is indeed attached but completely unmounted. The last line of the node logs looks like:

operation_executor.go:1267] UnmountDevice succeeded for volume "kubernetes.io/aws-ebs/aws://us-east-2a/vol-03b5d7dbf226280fa" (spec.Name: "pvc-66da286a-2bb2-11e7-9954-02e52a0be43d").

Jumping back to the controller, I could see that no detach had been attempted for this volume, and as a result the pod was stuck in ContainerCreating state.

The controller had an uptime of 21 hours, so it had not been restarted since the volume was attached to the node in question.
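For anyone debugging the same symptom, the attach state can be cross-checked from the node object and from AWS directly. A rough sketch (the pod and node names are placeholders; the volume ID is the one from the log above):

# Attach/mount failures for the stuck pod show up in its events
oc describe pod <pod-name>

# The node status lists the volumes the attach/detach controller believes are attached
oc get node <node-name> -o jsonpath='{.status.volumesAttached}'

# Confirm the actual attachment state on the AWS side
aws ec2 describe-volumes --region us-east-2 --volume-ids vol-03b5d7dbf226280fa --query 'Volumes[0].Attachments'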




Comment 6 Hemant Kumar 2017-05-01 18:05:17 EDT
I think the "device is busy" error was confusing and wasn't really the source of the problem in this case. The root cause here is that Kubernetes does not detach volumes for terminated pods. I have opened an upstream bug to track this - https://github.com/kubernetes/kubernetes/issues/45191
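To make the root cause concrete: the pod has finished (phase Succeeded or Failed) but still exists in the API server, so the controller never detaches its volume. A quick way to observe this (pod and node names are placeholders):

# The pod is terminated but not deleted from the API server
oc get pod <pod-name> -o jsonpath='{.status.phase}'

# Before the fix, the EBS volume stays listed here indefinitely
oc get node <node-name> -o jsonpath='{.status.volumesAttached[*].name}'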
Comment 9 Hemant Kumar 2017-05-03 20:40:27 EDT
I have opened a PR that fixes this - https://github.com/kubernetes/kubernetes/pull/45286
Comment 11 Hemant Kumar 2017-05-15 09:55:09 EDT
PR opened https://github.com/openshift/origin/pull/14191
Comment 14 chaoyang 2017-06-27 03:18:55 EDT
Test passed on a containerized env.
oc version
oc v3.6.121
kubernetes v1.6.1+5115d708d7

1. Create pvc
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "ebsc",
    "annotations": {
        "volume.beta.kubernetes.io/storage-class": "gp2"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "1Gi"
      }
    }
  }
}
2. Create pod
kind: Pod
apiVersion: v1
metadata:
  name: test-pod
spec:
  containers:
  - name: test-pod
    image: gcr.io/google_containers/busybox:1.24
    command:
      - "/bin/sh"
    args:
      - "-c"
      - "touch /mnt/SUCCESS && exit 0 || exit 1"
    volumeMounts:
      - name: ebs-pvc
        mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: ebs-pvc
      persistentVolumeClaim:
        claimName: ebsc
After the pod is Completed, the EBS volume becomes available in the AWS web console.
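For the record, the same verification can be done from the CLI instead of the AWS console; this is just one way to check (the claim name matches the objects above, and the volume ID placeholder has to be filled in from the PV):

# Pod ran to completion
oc get pod test-pod

# Find the EBS volume backing the claim
oc get pv $(oc get pvc ebsc -o jsonpath='{.spec.volumeName}') -o jsonpath='{.spec.awsElasticBlockStore.volumeID}'

# With the fix, the volume should report state "available" and no attachments
aws ec2 describe-volumes --volume-ids <vol-id> --query 'Volumes[0].[State,Attachments]'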
Comment 16 errata-xmlrpc 2017-08-10 01:21:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
