Bug 1327531

Summary: Dynamically provisioned PV and volume are left behind under rare conditions
Product: OpenShift Container Platform
Reporter: Jianwei Hou <jhou>
Component: Storage
Assignee: Bradley Childs <bchilds>
Status: CLOSED ERRATA
QA Contact: Jianwei Hou <jhou>
Severity: low
Docs Contact:
Priority: low
Version: 3.2.0
CC: aos-bugs, jsafrane, tdawson
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-27 09:38:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jianwei Hou 2016-04-15 10:23:41 UTC
Description of problem:
Create a PVC that dynamically provisions a PV. Under some rare conditions the PV ends up in Failed status. If the PVC is deleted at that point, the PV is left behind.

Version-Release number of selected component (if applicable):
openshift v3.2.0.15
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Use up all of the available OpenStack volume storage quota

2. Create a PVC that requests a dynamically provisioned PV: oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/cinder/dynamic-provisioning/pvc.json

3. PV was created in 'Failed' status with error: {"overLimit": {"message": "VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed Gigabytes quota. Requested 3G, quota is 100G and 100G has been consumed.", "code": 413}}

4. Delete the PVC

5. oc get pv
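
The error body from step 3 is plain JSON, so a test script could check that the failure really is a quota error before moving on to step 4. The following is a minimal sketch: the function name parse_over_limit is hypothetical, and the message pattern is assumed from the single sample in this report, not from any documented OpenStack format.

```python
import json
import re

# Error body copied verbatim from step 3 of this report.
ERROR_BODY = '{"overLimit": {"message": "VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed Gigabytes quota. Requested 3G, quota is 100G and 100G has been consumed.", "code": 413}}'


def parse_over_limit(body):
    """Return (code, requested_gb, quota_gb, consumed_gb), or None if the
    body is not an overLimit error. Hypothetical helper for illustration."""
    over = json.loads(body).get("overLimit")
    if over is None:
        return None
    # Pattern assumed from the one message seen in this report.
    match = re.search(
        r"Requested (\d+)G, quota is (\d+)G and (\d+)G has been consumed",
        over.get("message", ""),
    )
    if match is None:
        # Still an overLimit error, but the sizes could not be extracted.
        return (over.get("code"), None, None, None)
    requested, quota, consumed = (int(g) for g in match.groups())
    return (over.get("code"), requested, quota, consumed)
```

For the body above, parse_over_limit(ERROR_BODY) returns (413, 3, 100, 100), which confirms that the quota was already fully consumed when the 3G volume was requested.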

Actual results:
After step 5: The PV was not deleted.

NAME              CAPACITY   ACCESSMODES   STATUS    CLAIM           REASON    AGE
pv-cinder-axm2y   3Gi        RWO           Failed    jhou1/cinderc             29m

# oc describe pv pv-cinder-axm2y
Name:		pv-cinder-axm2y
Labels:		<none>
Status:		Failed
Claim:		jhou1/cinderc
Reclaim Policy:	Delete
Access Modes:	RWO
Capacity:	3Gi
Message:	Expected HTTP response code [200 201] when accessing [POST http://<hidden>/volumes], but got 413 instead
{"overLimit": {"message": "VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed Gigabytes quota. Requested 3G, quota is 100G and 100G has been consumed.", "code": 413}}
Source:


The same happens if a PV is provisioned but, for some reason, is not successfully attached to the node: when the pod and PVC are deleted, the PV and volume are also left behind. This was discovered in https://bugzilla.redhat.com/show_bug.cgi?id=1313210#c12
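
To spot such leftovers on an affected cluster, one option is to scan the output of `oc get pv -o json` for PVs that are Failed, have reclaim policy Delete, and reference a claim that no longer exists. The sketch below assumes the JSON has already been loaded and that the caller supplies the set of PVCs that still exist; the helper name leftover_pvs and the trimmed SAMPLE object are illustrative, not taken from the affected system.

```python
def leftover_pvs(pv_list, existing_claims):
    """Return names of Failed PVs with reclaim policy Delete whose claim is gone.

    pv_list: parsed output of `oc get pv -o json`
    existing_claims: set of (namespace, name) tuples for PVCs that still exist
    """
    leftovers = []
    for pv in pv_list.get("items", []):
        spec = pv.get("spec", {})
        ref = spec.get("claimRef") or {}
        claim = (ref.get("namespace"), ref.get("name"))
        if (
            pv.get("status", {}).get("phase") == "Failed"
            and spec.get("persistentVolumeReclaimPolicy") == "Delete"
            and claim not in existing_claims
        ):
            leftovers.append(pv["metadata"]["name"])
    return leftovers


# Hand-written sample that mirrors the PV shown in this report.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "pv-cinder-axm2y"},
            "spec": {
                "persistentVolumeReclaimPolicy": "Delete",
                "claimRef": {"namespace": "jhou1", "name": "cinderc"},
            },
            "status": {"phase": "Failed"},
        }
    ]
}
```

With the claim deleted, leftover_pvs(SAMPLE, set()) returns ["pv-cinder-axm2y"]. While the claim jhou1/cinderc still exists, the same call with {("jhou1", "cinderc")} returns an empty list.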

Expected results:
The leftover PV and volume should also be cleaned up when the PVC is deleted.

Additional info:

Comment 1 Jan Safranek 2016-06-03 14:35:29 UTC
This should be fixed by the controller refactoring in Kubernetes 1.3: when provisioning fails, no PV is created, so it cannot enter the Failed state.
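
The difference can be illustrated with a toy model. This is not the Kubernetes controller code: provision_old only reproduces the behaviour observed in this report (a PV object exists even though the backing volume could not be created), provision_new follows the behaviour described above, and all names here are hypothetical.

```python
class QuotaExceeded(Exception):
    """Raised by the stand-in cloud call when the quota is used up."""


def provision_volume(size_gb, free_gb):
    """Hypothetical stand-in for the cloud call that creates the volume."""
    if size_gb > free_gb:
        raise QuotaExceeded("requested %dG, only %dG free" % (size_gb, free_gb))
    return "vol-%dG" % size_gb


def provision_old(pvs, name, size_gb, free_gb):
    # Assumed pre-1.3 flow, inferred from this report: the PV object is
    # recorded first, so a failed cloud call leaves it in Failed state.
    pv = {"phase": "Pending", "volume": None}
    pvs[name] = pv
    try:
        pv["volume"] = provision_volume(size_gb, free_gb)
        pv["phase"] = "Bound"
    except QuotaExceeded as err:
        pv["phase"] = "Failed"
        pv["message"] = str(err)


def provision_new(pvs, name, size_gb, free_gb):
    # Flow described in this comment: the PV object is only recorded
    # after the volume was provisioned successfully.
    try:
        volume = provision_volume(size_gb, free_gb)
    except QuotaExceeded:
        return False
    pvs[name] = {"phase": "Bound", "volume": volume}
    return True
```

With no free quota, provision_old leaves a Failed entry in the dictionary that something must clean up later, while provision_new returns False and leaves the dictionary empty, so there is nothing to be left behind.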

Comment 2 Troy Dawson 2016-09-01 15:57:39 UTC
This is in OSE v3.3.0.28 or newer.

Comment 4 Jianwei Hou 2016-09-02 05:38:31 UTC
Verified on 
openshift v3.3.0.28
kubernetes v1.3.0+507d3a7

Now the PV is not created if provisioning fails.

Comment 6 errata-xmlrpc 2016-09-27 09:38:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933