Bug 1990428

Summary: Deleting the PVC and RBD provisioner leader pod while provisioning is progressing, will leave a stale image
Product: OpenShift Container Platform
Reporter: Mudit Agarwal <muagarwa>
Component: Storage
Assignee: Fabio Bertinatto <fbertina>
Storage sub component: Storage
QA Contact: Wei Duan <wduan>
Status: CLOSED DEFERRED
Docs Contact:
Severity: medium
Priority: medium
CC: aos-bugs, gcharot, jsafrane, mrajanna, muagarwa
Version: 4.6
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-16 13:37:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1962956

Description Mudit Agarwal 2021-08-05 11:41:22 UTC
Description of problem:

Deleting the PVC and the RBD provisioner leader pod while provisioning is in progress leaves a stale RBD image behind. The issue is seen frequently with thick-provisioned volumes.

Version-Release number of selected component (if applicable):

How reproducible:
Always (with thick provisioning)

Steps to Reproduce:
1. Start creating an RBD PVC of size 15 GiB. Use a storage class with thick provisioning enabled.
2. While step 1 is in progress (PVC in Pending state), delete the csi-rbdplugin-provisioner leader pod.
3. Immediately after step 2 (while the PVC is still in Pending state), delete the PVC.
4. Wait for the PVC to get deleted.
5. Wait for the corresponding RBD image to get deleted.
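
For step 1, a minimal sketch of the PVC request, written in Python so it can be generated and piped to `kubectl apply -f -` as JSON. The PVC name `thick-pvc-test`, the storage class name `example-rbd-thick`, and the `ReadWriteOnce` access mode are assumptions for illustration and do not come from this report; substitute the thick-provision-enabled storage class available in the cluster under test.

```python
import json


def build_pvc_manifest(name, storage_class, size="15Gi"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    The field layout follows the core/v1 PersistentVolumeClaim schema.
    The name and storage class passed in are hypothetical placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Assumed access mode; the report does not state one.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, chosen only for this sketch.
    manifest = build_pvc_manifest("thick-pvc-test", "example-rbd-thick")
    print(json.dumps(manifest, indent=2))
```

Steps 2 and 3 then have to be carried out while the PVC created from this manifest is still in Pending state.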

Actual results:
The PVC is deleted, but the RBD image is not.

Expected results:
The RBD image should be deleted.
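
One way to check the actual result against the expected one is to compare the images present in the pool with the images still referenced by PVs. The sketch below assumes both lists have already been collected, for example the pool listing from `rbd ls <pool>` and the referenced names from the PV objects; how they are gathered, and the `csi-vol-*` names in the usage comment, are assumptions and not part of this report.

```python
def find_stale_images(pool_images, referenced_images):
    """Return pool images that no PV references, sorted by name.

    pool_images: image names present in the RBD pool.
    referenced_images: image names still backing an existing PV.
    Both inputs are plain iterables of strings; collecting them is
    left to the caller (assumption of this sketch).
    """
    return sorted(set(pool_images) - set(referenced_images))


if __name__ == "__main__":
    # Hypothetical image names for illustration only.
    in_pool = ["csi-vol-a", "csi-vol-b"]
    in_use = ["csi-vol-a"]
    for image in find_stale_images(in_pool, in_use):
        print(image)
```

After step 5, an empty result would match the expected behaviour; a leftover image from the deleted PVC would match the actual result described above.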

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 4 Jan Safranek 2021-08-10 14:25:14 UTC
This will need some discussion upstream, as it is a hard problem to solve. We need to either sacrifice provisioning speed or accept leaked volumes.

Comment 5 Jan Safranek 2021-08-10 16:12:34 UTC
Upstream issue:
https://github.com/kubernetes-csi/external-provisioner/issues/486

Comment 7 Gregory Charot 2021-12-16 13:37:31 UTC
Closing as deferred: this is not a simple fix and requires design and proper integration. This work is tracked via the RFE https://issues.redhat.com/browse/RFE-2505