Bug 1339154

Summary: Dynamically provisioned PVs/volumes are not deleted after the pod/PVC are deleted.
Product: OpenShift Container Platform
Reporter: Liang Xia <lxia>
Component: Storage
Assignee: Jan Safranek <jsafrane>
Status: CLOSED ERRATA
QA Contact: Jianwei Hou <jhou>
Severity: high
Docs Contact:
Priority: high
Version: 3.2.0
CC: aos-bugs, bchilds, lxia, tdawson
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openshift-3.3.0.16
Doc Type: Bug Fix
Doc Text:
Cause: A race condition in OpenShift code could prevent PersistentVolume objects from being deleted when their reclaim policy was set to 'Delete' and the corresponding PersistentVolumeClaim was deleted.
Consequence:
Fix: PersistentVolume handling was completely rewritten in OpenShift 3.3.
Result: PersistentVolumes are deleted at the end of their lifetime.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-27 09:33:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Liang Xia 2016-05-24 09:11:51 UTC
Description of problem:
Dynamically provisioned GCE PVs/volumes are not deleted after the pod/PVC are deleted.
The PVs change to Released status but are not deleted,
and the volumes remain attached to the nodes.

Version-Release number of selected component (if applicable):
openshift v3.2.0.44
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Create a dynamic PVC.
2. Check that the PV and PVC are bound.
3. Create a pod that uses the PVC.
4. Delete the pod and the PVC.
5. Check the PV and the volumes.
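
For readers who want to repeat step 1, here is a minimal sketch of a claim manifest that asks for dynamic provisioning. The report does not include the manifest that was actually used. The annotation key is an assumption about the alpha dynamic-provisioning mechanism in Kubernetes 1.2, and the claim name, size, and annotation value are hypothetical placeholders.

```python
import json


def make_dynamic_pvc(name, size="1Gi", access_mode="ReadWriteOnce"):
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "kind": "PersistentVolumeClaim",
        "apiVersion": "v1",
        "metadata": {
            "name": name,
            # Assumption (not from the report): alpha annotation that
            # requested dynamic provisioning in Kubernetes 1.2; the value
            # "foo" is an arbitrary placeholder.
            "annotations": {"volume.alpha.kubernetes.io/storage-class": "foo"},
        },
        "spec": {
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical claim name; the JSON can be piped to `oc create -f -`.
    print(json.dumps(make_dynamic_pvc("gcec1"), indent=2))
```
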

Actual results:
The PVs changed to Released status but were not deleted.
The volumes remain attached to the nodes.

# oc get pv
NAME           CAPACITY   ACCESSMODES   STATUS     CLAIM                 REASON    AGE
pv-gce-4r4h8   3Gi        ROX           Released   wu2hl/gcec3-wu2hl               25m
pv-gce-63jz7   3Gi        ROX           Released   uyxke/gcec3-uyxke               17m
pv-gce-75r33   2Gi        RWX           Released   uyxke/gcec2-uyxke               17m
pv-gce-cs9zb   1Gi        RWO           Released   uyxke/gcec1-uyxke               17m
pv-gce-nw2wk   1Gi        RWO           Released   wu2hl/gcec1-wu2hl               25m
pv-gce-rkxd2   2Gi        RWX           Released   wu2hl/gcec2-wu2hl               25m
regpv-volume   17G        RWX           Bound      default/regpv-claim             6h
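
A listing like the one above can be checked mechanically. The following is a rough sketch that picks out PVs stuck in the Released state. It assumes the column order shown above (NAME, CAPACITY, ACCESSMODES, STATUS, ...); the function name is hypothetical.

```python
def released_pvs(oc_get_pv_output):
    """Return the names of PVs whose STATUS column reads 'Released'."""
    names = []
    # Skip the header row, then split each data row on whitespace.
    for line in oc_get_pv_output.strip().splitlines()[1:]:
        fields = line.split()
        # Assumed column order: NAME CAPACITY ACCESSMODES STATUS ...
        if len(fields) >= 4 and fields[3] == "Released":
            names.append(fields[0])
    return names


SAMPLE = """\
NAME           CAPACITY   ACCESSMODES   STATUS     CLAIM                 REASON    AGE
pv-gce-4r4h8   3Gi        ROX           Released   wu2hl/gcec3-wu2hl               25m
regpv-volume   17G        RWX           Bound      default/regpv-claim             6h
"""

if __name__ == "__main__":
    print(released_pvs(SAMPLE))
```
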

# oc describe pv pv-gce-4r4h8
Name:        pv-gce-4r4h8
Labels:       
Status:        Released
Claim:        wu2hl/gcec3-wu2hl
Reclaim Policy:    Delete
Access Modes:    ROX
Capacity:    3Gi
Message:    
Source:
    Type:    GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
    PDName:    kubernetes-dynamic-pv-gce-4r4h8
    FSType:    ext4
    Partition:    0
    ReadOnly:    false

Expected results:
PV/volumes are deleted after pod/PVC are deleted.

Additional info:

Comment 2 Liang Xia 2016-05-24 09:14:29 UTC
On node qe-lxia-2401-node-container-registry-router-2:

# mount | grep dynamic
/dev/sdb on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pv-gce-nw2wk type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sdc on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pv-gce-4r4h8 type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sdd on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pv-gce-rkxd2 type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sde on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pv-gce-cs9zb type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sdf on /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pv-gce-75r33 type ext4 (rw,relatime,seclabel,data=ordered)
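
To cross-check leftover mounts against the Released PVs, the `mount` output above can be parsed. This is a sketch under the assumption that the mount path layout is always `.../kubernetes.io/gce-pd/mounts/<disk name>`, as in the lines above; the function name is hypothetical.

```python
import re

# Matches lines such as:
#   /dev/sdb on /.../kubernetes.io/gce-pd/mounts/<disk> type ext4 (...)
_GCE_PD_MOUNT = re.compile(
    r"^(\S+) on \S*/kubernetes\.io/gce-pd/mounts/(\S+) type "
)


def mounted_gce_disks(mount_output):
    """Map each mounted GCE PD name to the device it is mounted from."""
    disks = {}
    for line in mount_output.splitlines():
        match = _GCE_PD_MOUNT.match(line)
        if match:
            device, disk_name = match.group(1), match.group(2)
            disks[disk_name] = device
    return disks
```

Any disk name returned here that belongs to a Released PV is a volume that was not unmounted from the node.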

Comment 3 Jianwei Hou 2016-05-24 09:19:55 UTC
I can reproduce this too in my containerized environment.

Steps to Reproduce:
1. Create a dynamic PVC.
2. Check that the PV and PVC are bound.
3. Create a pod that uses the PVC.
4. Delete the pod and the PVC.
5. Check the PV and the volumes.

When the pod was deleted in step 4, I could see that the volume was still mounted on my node (checked by running `mount|grep gce`). The PV could not be deleted because the volume was still attached to the instance.

Comment 4 Jianwei Hou 2016-05-24 09:23:05 UTC
This happens with AWS and Cinder too.

Comment 8 Jan Safranek 2016-06-17 14:14:06 UTC
Indeed, Kubernetes cannot delete a PV that is still attached somewhere. Attaching was completely rewritten in Kubernetes 1.3 (waiting to be rebased in OpenShift 3.3) and this bug should already be gone there. Do you have a script that can reproduce it? I could run it with current Kubernetes to be sure.

Liang, Hou, does it happen *only* on containerized OpenShift, or also on one installed from RPM? Also, does it always happen, or only sometimes?

Comment 9 Liang Xia 2016-06-20 09:03:31 UTC
It always happens on containerized OpenShift.
It works fine in the RPM environment.

Comment 11 Jan Safranek 2016-06-23 10:07:01 UTC
This seems to be fixed in Kubernetes 1.3. It is *not* fixed in OpenShift Origin v1.3.0-alpha.2 (it was rebased to an older Kubernetes 1.3-alpha).

Comment 12 Liang Xia 2016-08-09 02:21:23 UTC
Verified on the version below:
openshift v3.3.0.16
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

The PV and volume are deleted shortly (less than 10 sec) after the pod/PVC are deleted.
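
A check like this can be automated by polling until the PV disappears. The following is a minimal sketch, not the procedure used for this verification. The `pv_exists` callable is a hypothetical hook supplied by the caller, and the 10-second default simply mirrors the figure above.

```python
import time


def wait_until_deleted(pv_exists, timeout=10.0, interval=0.5):
    """Poll pv_exists() until it returns False or the timeout expires.

    Returns True if the PV disappeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not pv_exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In practice `pv_exists` might wrap a call such as `oc get pv <name>` and treat a non-zero exit status as "gone". That mapping is an assumption, since the command can also fail for unrelated reasons.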

Comment 14 errata-xmlrpc 2016-09-27 09:33:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933