Bug 1450215 - Volume failed to detach even after unmount is successful on the node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.5.1
Hardware: All
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.5.z
Assignee: Hemant Kumar
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On: 1446788
Blocks:
Reported: 2017-05-11 21:05 UTC by Hemant Kumar
Modified: 2017-06-15 18:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: OpenShift did not attempt a detach operation for pods that had completed or terminated but had not been deleted from the API server.
Consequence: Volumes could be left attached to old nodes, preventing their reuse in other pods.
Fix: Volumes are now detached for pods that are completed or terminated.
Result: Volumes for terminated or completed pods are detached automatically, and users are free to reuse them in other pods.
Clone Of: 1446788
Environment:
Last Closed: 2017-06-15 18:39:25 UTC
Target Upstream Version:


Attachments
test log (122.75 KB, application/zip), 2017-06-05 10:42 UTC, Chao Yang


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:1425 0 normal SHIPPED_LIVE OpenShift Container Platform 3.5, 3.4, 3.3, and 3.2 bug fix update 2017-06-15 22:35:53 UTC

Comment 3 Chao Yang 2017-06-05 10:36:22 UTC
1. Create a PV
2. Create a pod named gracepod with terminationGracePeriodSeconds set to 800
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "gracepod",
    "creationTimestamp": null,
    "labels": {
      "name": "graceful"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "hello-openshift",
        "image": "aosqe/sleep",
        "ports": [
          {
            "containerPort": 8080,
            "protocol": "TCP"
          }
        ],
        "resources": {},
        "volumeMounts": [
          {
            "name":"tmp",
            "mountPath":"/tmp"
          }
        ],
        "terminationMessagePath": "/dev/termination-log",
        "imagePullPolicy": "IfNotPresent",
        "securityContext": {
          "capabilities": {},
          "privileged": false
        }
      }
    ],
    "volumes": [
      {
        "name":"tmp",
        "persistentVolumeClaim": {
           "claimName": "ebsc2"
         }
      }
    ],
    "restartPolicy": "Always",
    "dnsPolicy": "ClusterFirst",
    "terminationGracePeriodSeconds": 800,
    "serviceAccount": ""
  },
  "status": {}
}
3. After the pod is running, delete it; the pod status becomes Terminating
4. The EBS volume is still not unmounted after 6 min

[root@ip-172-18-2-49 ~]# oc get pods;mount | grep ebs;date
NAME                       READY     STATUS        RESTARTS   AGE
docker-registry-1-z1hj0    1/1       Running       0          6h
grace40                    1/1       Terminating   0          35m
gracepod                   1/1       Terminating   0          13m
registry-console-1-kgmbh   1/1       Running       0          6h
router-1-6dqc9             1/1       Running       0          6h
/dev/xvdbw on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-0f2676690639f353d type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdbw on /var/lib/origin/openshift.local.volumes/pods/098e54d6-49d3-11e7-b9a7-0e342ab9837c/volumes/kubernetes.io~aws-ebs/pvc-32cf4a70-49cb-11e7-b9a7-0e342ab9837c type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdbf on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-0d823bf4d7b17ebbb type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdbf on /var/lib/origin/openshift.local.volumes/pods/0c83fa4e-49d6-11e7-b9a7-0e342ab9837c/volumes/kubernetes.io~aws-ebs/pvc-9f584687-49d5-11e7-b9a7-0e342ab9837c type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdba on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-0d4744fec18cda97b type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdba on /var/lib/origin/openshift.local.volumes/pods/f621de0b-49d6-11e7-b9a7-0e342ab9837c/volumes/kubernetes.io~aws-ebs/pvc-ef00c93b-49d6-11e7-b9a7-0e342ab9837c type ext4 (rw,relatime,seclabel,data=ordered)
Mon Jun  5 06:15:59 EDT 2017

[root@ip-172-18-2-49 ~]# oc get pods;mount | grep ebs;date
NAME                       READY     STATUS        RESTARTS   AGE
docker-registry-1-z1hj0    1/1       Running       0          6h
grace40                    1/1       Terminating   0          37m
registry-console-1-kgmbh   1/1       Running       0          6h
router-1-6dqc9             1/1       Running       0          6h
/dev/xvdbw on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-0f2676690639f353d type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdbw on /var/lib/origin/openshift.local.volumes/pods/098e54d6-49d3-11e7-b9a7-0e342ab9837c/volumes/kubernetes.io~aws-ebs/pvc-32cf4a70-49cb-11e7-b9a7-0e342ab9837c type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdba on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-0d4744fec18cda97b type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdba on /var/lib/origin/openshift.local.volumes/pods/f621de0b-49d6-11e7-b9a7-0e342ab9837c/volumes/kubernetes.io~aws-ebs/pvc-ef00c93b-49d6-11e7-b9a7-0e342ab9837c type ext4 (rw,relatime,seclabel,data=ordered)
Mon Jun  5 06:18:00 EDT 2017

oc v3.5.5.23
kubernetes v1.5.2+43a9be4
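
The before/after comparison of the two `mount | grep ebs` snapshots above can be automated. The following is a minimal sketch, assuming the mount path layout shown in this comment (a global mount under `plugins/kubernetes.io/aws-ebs/mounts/aws/<zone>/<volume-id>`); the function names are hypothetical and not part of any OpenShift tool.

```python
import re

# Matches the global aws-ebs mount point as it appears in the `mount` output
# above; the last path component is the EBS volume ID. The per-pod bind mounts
# (under .../pods/<uid>/volumes/...) are deliberately not matched.
EBS_MOUNT_RE = re.compile(
    r"^(?P<device>/dev/\S+) on \S*/plugins/kubernetes\.io/aws-ebs/mounts/aws/"
    r"(?P<zone>[^/\s]+)/(?P<volume>vol-[0-9a-f]+) ",
    re.MULTILINE,
)


def mounted_ebs_volumes(mount_output: str) -> dict:
    """Map EBS volume ID -> block device for every global aws-ebs mount."""
    return {
        m.group("volume"): m.group("device")
        for m in EBS_MOUNT_RE.finditer(mount_output)
    }


def unmounted_between(before: str, after: str) -> set:
    """Return the volume IDs mounted in `before` but no longer in `after`."""
    return set(mounted_ebs_volumes(before)) - set(mounted_ebs_volumes(after))
```

Feeding the 06:15:59 and 06:18:00 snapshots to `unmounted_between` would report only the volume that disappeared between them, which makes it easier to see which pod's volume was released.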

Comment 4 Chao Yang 2017-06-05 10:42:22 UTC
Created attachment 1285013
test log

Comment 5 Hemant Kumar 2017-06-05 14:38:54 UTC
Just to clarify: the pod is not actually "terminated" but is "terminating", right? For volumes to be detached (and unmounted), the pod has to be either terminated or completed; it shouldn't be stuck in the "terminating" state.
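
The distinction drawn here can be sketched as a small check over a pod object. This is an illustration only, not the actual controller code: it assumes the JSON shape returned by `oc get pod <name> -o json`, treats the `Succeeded` and `Failed` phases as "completed or terminated", and treats a pod that has `metadata.deletionTimestamp` set but has not reached a terminal phase as "terminating". The function names are hypothetical.

```python
# Pod phases in which no container will run again.
TERMINAL_PHASES = {"Succeeded", "Failed"}


def pod_state(pod: dict) -> str:
    """Classify a pod dict as 'terminated', 'terminating', or 'active'."""
    phase = pod.get("status", {}).get("phase")
    if phase in TERMINAL_PHASES:
        return "terminated"
    # A deletion request has been accepted, but containers may still be
    # running until the grace period expires.
    if pod.get("metadata", {}).get("deletionTimestamp"):
        return "terminating"
    return "active"


def volumes_eligible_for_detach(pod: dict) -> bool:
    """Under the assumption above, only terminated pods release volumes."""
    return pod_state(pod) == "terminated"
```

With a terminationGracePeriodSeconds of 800, a deleted pod whose container ignores the stop signal would stay in the "terminating" branch for up to that long, so its volume is expected to remain mounted during that window.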

Comment 6 Chao Yang 2017-06-06 06:55:19 UTC
I re-tested with a pod whose status is "Completed", and the EBS volume was unmounted.
1. Dynamically create a PVC
2. Create a pod
kind: Pod
apiVersion: v1
metadata:
  name: test-pod
spec:
  containers:
  - name: test-pod
    image: gcr.io/google_containers/busybox:1.24
    command:
      - "/bin/sh"
    args:
      - "-c"
      - "touch /mnt/SUCCESS && exit 0 || exit 1"
    volumeMounts:
      - name: ebs
        mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: ebs
      persistentVolumeClaim:
        claimName: ebsc
3. After the pod status became Completed, the EBS volume was unmounted

[root@ip-172-18-5-46 ~]# mount | grep ebs
/dev/xvdcc on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-002330445c5855177 type ext4 (rw,relatime,seclabel,data=ordered)
/dev/xvdcc on /var/lib/origin/openshift.local.volumes/pods/8ac7612c-4a84-11e7-b146-0ef49bdb7770/volumes/kubernetes.io~aws-ebs/pvc-f3180383-4a83-11e7-b146-0ef49bdb7770 type ext4 (rw,relatime,seclabel,data=ordered)
[root@ip-172-18-5-46 ~]# mount | grep ebs
[root@ip-172-18-5-46 ~]# mount | grep ebs

[root@ip-172-18-1-35 ~]# oc version
oc v3.5.5.23
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-1-35.ec2.internal:8443
openshift v3.5.5.23
kubernetes v1.5.2+43a9be4

Comment 7 Hemant Kumar 2017-06-06 12:41:38 UTC
Not just unmounted, but it should be detached too. If the volume is detached, we should consider this bug to be VERIFIED.

Comment 8 Chao Yang 2017-06-07 05:12:48 UTC
After the pod completed, the EBS volume was detached and became available in the AWS web console.
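
The same check can be made without the web console. The sketch below assumes the JSON produced by `aws ec2 describe-volumes --volume-ids <id> --output json`, in which each entry under `Volumes` carries a `State` field and an `Attachments` list; the helper name `is_detached` is hypothetical.

```python
def is_detached(describe_volumes_response: dict, volume_id: str) -> bool:
    """Return True if the volume is 'available' and has no attachments.

    Raises KeyError if the volume is not present in the response.
    """
    for volume in describe_volumes_response.get("Volumes", []):
        if volume.get("VolumeId") == volume_id:
            return (
                volume.get("State") == "available"
                and not volume.get("Attachments")
            )
    raise KeyError(volume_id)
```

A volume that is unmounted on the node but still attached to the instance would report a `State` of `in-use`, which is the case this bug was about.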

Comment 10 errata-xmlrpc 2017-06-15 18:39:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1425

