Bug 1840759

Summary: [aws-ebs-csi-driver] The volume created by aws ebs csi driver can not be deleted when the cluster is destroyed
Product: OpenShift Container Platform
Reporter: Qin Ping <piqin>
Component: Storage
Assignee: Fabio Bertinatto <fbertina>
Storage sub component: Operators
QA Contact: Qin Ping <piqin>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
CC: aos-bugs, fbertina, jsafrane
Version: 4.5
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Last Closed: 2021-02-24 15:12:15 UTC
Type: Bug

Description Qin Ping 2020-05-27 14:46:23 UTC
Description of problem:
The volume created by the AWS EBS CSI driver is not deleted when the cluster is destroyed.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy the AWS EBS CSI driver with the operator.
2. Create a PVC and Pod using the ebs.csi.aws.com provisioner (AWS EBS CSI driver).
3. Create a PVC and Pod using the kubernetes.io/aws-ebs provisioner (in-tree plug-in).
4. Destroy the cluster.

Actual results:
The volume created by the kubernetes.io/aws-ebs provisioner can be deleted when the cluster is destroyed.
The volume created by the ebs.csi.aws.com provisioner can not be deleted.

Expected results:
The volume created by the ebs.csi.aws.com provisioner should be deleted.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

The installer log shows that the volume created by the in-tree plug-in was deleted:
level=info msg=Deleted arn="arn:aws:ec2:us-east-2:301721915996:volume/vol-0de4f771de91394ef" id=vol-0de4f771de91394ef

The volume tags shown in the AWS web console differ between the two volumes:
vol-07d8c287ced08f952(aws ebs csi driver) CSIVolumeName:pvc-4ec09014-19cf-430e-83b5-ccf4317ba956

vol-0de4f771de91394ef(in-tree plug-in)

Comment 1 Jan Safranek 2020-05-27 15:25:21 UTC
> The volume created by the ebs.csi.aws.com provisioner can not be deleted.

What does it mean? What blocks the deletion? What error message does it show?

Comment 3 Jan Safranek 2020-07-08 13:35:03 UTC
In-tree provisioner creates volumes with these tags:

kubernetes.io/cluster/jsafrane-10235-vgdv6: owned     
Name:                                       jsafrane-10235-vgdv6-dynamic-pvc-447cc711-bb65-4b4d-836d-a822e4e77e43     
kubernetes.io/created-for/pv/name:          pvc-447cc711-bb65-4b4d-836d-a822e4e77e43     
kubernetes.io/created-for/pvc/name:         myclaim     
kubernetes.io/created-for/pvc/namespace:    default     

The first tag seems to be the most important.

The current version of the AWS EBS CSI driver creates only this tag:
CSIVolumeName: pvc-4e4cb311-3907-4192-bc58-cde8ea112392
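
The difference between the two tag sets can be illustrated with a small sketch. It assumes, as suggested here but not confirmed, that cluster teardown selects volumes by the `kubernetes.io/cluster/<cluster id>: owned` tag. The function names are hypothetical and this is not installer code; Python is used only for illustration.

```python
# Illustrative sketch only: assumes cluster teardown selects EBS volumes by the
# "kubernetes.io/cluster/<cluster id>: owned" tag. All names are hypothetical.


def cluster_tag_key(cluster_id: str) -> str:
    """Return the ownership tag key for the given cluster ID."""
    return f"kubernetes.io/cluster/{cluster_id}"


def owned_by_cluster(tags: dict, cluster_id: str) -> bool:
    """True if the tag set marks the volume as owned by the cluster."""
    return tags.get(cluster_tag_key(cluster_id)) == "owned"


CLUSTER_ID = "jsafrane-10235-vgdv6"

# Tag sets copied from this report.
in_tree_tags = {
    "kubernetes.io/cluster/jsafrane-10235-vgdv6": "owned",
    "Name": "jsafrane-10235-vgdv6-dynamic-pvc-447cc711-bb65-4b4d-836d-a822e4e77e43",
    "kubernetes.io/created-for/pv/name": "pvc-447cc711-bb65-4b4d-836d-a822e4e77e43",
    "kubernetes.io/created-for/pvc/name": "myclaim",
    "kubernetes.io/created-for/pvc/namespace": "default",
}
csi_tags = {"CSIVolumeName": "pvc-4e4cb311-3907-4192-bc58-cde8ea112392"}

volumes = {"in-tree": in_tree_tags, "csi": csi_tags}

# Only volumes carrying the ownership tag would be matched for deletion.
matched = [name for name, tags in volumes.items() if owned_by_cluster(tags, CLUSTER_ID)]
print(matched)  # prints ['in-tree']
```

Under that assumption the CSI-provisioned volume is never matched, which is consistent with it being left behind after the cluster is destroyed.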

I tried to pass "--extra-volume-tags=kubernetes.io/cluster/<cluster id>=owned" to the CSI driver, however, this gets blocked by the driver with "Invalid driver options: Invalid extra volume tags: Volume tag key prefix 'kubernetes.io' is reserved". I need to fix the driver first.
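
The rejection described above behaves like a reserved-prefix check on tag keys. The following is a minimal sketch of such a check, not the driver's actual Go code: `parse_extra_volume_tags` is a hypothetical name, the comma-separated `key=value` input format is an assumption, and only the reserved-prefix message is taken from the error quoted in this comment.

```python
# Illustrative sketch only: re-creates the kind of validation that produces
# "Volume tag key prefix 'kubernetes.io' is reserved". The function name and
# the comma-separated key=value format are assumptions.

RESERVED_PREFIX = "kubernetes.io"


def parse_extra_volume_tags(value: str) -> dict:
    """Parse 'k1=v1,k2=v2' into a dict, rejecting reserved tag key prefixes."""
    tags = {}
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"Invalid extra volume tag {pair!r}: expected key=value")
        if key.startswith(RESERVED_PREFIX):
            raise ValueError(f"Volume tag key prefix '{RESERVED_PREFIX}' is reserved")
        tags[key] = val
    return tags


# An ordinary tag is accepted.
accepted = parse_extra_volume_tags("team=storage")

# The cluster ownership tag is rejected because of its key prefix.
try:
    parse_extra_volume_tags("kubernetes.io/cluster/example-cluster-id=owned")
    rejected = False
except ValueError:
    rejected = True
```

With a check of this shape, the ownership tag cannot be supplied as a generic extra tag, so the driver needs a dedicated way to receive the cluster ID.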

Comment 4 Jan Safranek 2020-07-10 12:18:45 UTC
Upstream PR to fix the driver part: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/530
The driver operator still needs to be fixed to pass the --cluster-id to the driver!
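
What the driver could do with a cluster ID can be sketched as follows. This is a hedged illustration, not the code from the upstream PR: `build_volume_tags` and its parameters are hypothetical, and the tag layout assumes the in-tree convention `kubernetes.io/cluster/<cluster id>: owned` described in comment 3.

```python
# Illustrative sketch only: shows how a cluster ID passed to the driver could
# become an ownership tag on new volumes. Names and layout are assumptions.
from typing import Optional


def build_volume_tags(pv_name: str, cluster_id: Optional[str] = None) -> dict:
    """Return the tags for a new volume, adding the ownership tag if an ID is set."""
    tags = {"CSIVolumeName": pv_name}
    if cluster_id:
        tags[f"kubernetes.io/cluster/{cluster_id}"] = "owned"
    return tags


PV_NAME = "pvc-4e4cb311-3907-4192-bc58-cde8ea112392"

# Without a cluster ID, only the CSI tag is set, matching the reported behavior.
without_id = build_volume_tags(PV_NAME)

# With a cluster ID, the volume also carries the ownership tag.
with_id = build_volume_tags(PV_NAME, cluster_id="jsafrane-10235-vgdv6")
```

The operator's part is then limited to knowing the cluster ID and handing it to the driver at start-up.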

Comment 8 Fabio Bertinatto 2020-08-28 13:04:22 UTC
PR https://github.com/openshift/aws-ebs-csi-driver-operator/pull/83 is waiting for review. Once it is approved, I'll submit the library-go changes to the library-go repository.

Also, I'd like to start a discussion about how we're going to solve this problem in other CSI drivers.

Comment 9 Fabio Bertinatto 2020-08-31 11:57:58 UTC
Should this behavior (volumes deleted when the cluster is deleted) be the same for all volumes created by the CSI drivers shipped with OpenShift?

What's the current behavior with oVirt CSI Driver volumes?

Comment 10 Fabio Bertinatto 2020-08-31 13:53:50 UTC
Apparently, non-attached volumes created by the oVirt CSI driver are NOT deleted when the cluster is destroyed (CC @bzlotnik). Created a ticket here: https://bugzilla.redhat.com/show_bug.cgi?id=1874065

Manila ticket: https://bugzilla.redhat.com/show_bug.cgi?id=1820238

Comment 11 Fabio Bertinatto 2020-09-08 08:06:20 UTC
Moving back to ASSIGNED until we have discussed the right approach for all CSI drivers.

Comment 14 Fabio Bertinatto 2020-10-01 08:14:12 UTC
This requires changes in the CSI driver (done), in library-go and in the AWS EBS CSI Driver Operator.

The library-go patch is here: https://github.com/openshift/library-go/pull/909

Once that's merged we need to merge the operator patch here: https://github.com/openshift/aws-ebs-csi-driver-operator/pull/83

Comment 16 Qin Ping 2020-10-20 07:27:00 UTC
Verified with: 4.7.0-0.nightly-2020-10-17-034503

Comment 19 errata-xmlrpc 2021-02-24 15:12:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.