Bug 2209969

Summary: DataImportCron controller does not garbage collect old DVs/PVCs created before CDI DV GC was enabled by default
Product: Container Native Virtualization (CNV) Reporter: Arnon Gilboa <agilboa>
Component: Storage Assignee: Ido Aharon <iaharon>
Status: CLOSED MIGRATED QA Contact: Harel Meir <hmeir>
Severity: high Docs Contact:
Priority: high    
Version: 4.12.1 CC: akalenyu, alitke, dafrank
Target Milestone: ---   
Target Release: 4.14.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: CNV v4.14.1.rhel9-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-12-14 16:15:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Arnon Gilboa 2023-05-25 11:27:08 UTC
Description of problem:
When you have old DIC-created DVs and upgrade to >= 4.12.1, the old DVs and PVCs are never deleted by DIC garbage collection.

Encountered the issue in cnv2.engineering.redhat.com:
$ oc get pvc -n openshift-virtualization-os-images | grep rhel9
rhel9-35d9b2336799            Bound    pvc-4d65b0e2-a8c8-4236-91be-7f0df7868ebc   30Gi       RWX            ocs-storagecluster-ceph-rbd   175d
rhel9-8350135038fe            Bound    pvc-9ee22647-9246-474b-b032-e51f5aea949d   30Gi       RWX            ocs-storagecluster-ceph-rbd   28d
rhel9-87d2b5f15665            Bound    pvc-e139c7b3-a162-4627-94bb-29ef4ba5e939   30Gi       RWX            ocs-storagecluster-ceph-rbd   177d
rhel9-9d0d9575e03e            Bound    pvc-4ddde24f-daae-4e45-b9b7-752278d07c37   30Gi       RWX            ocs-storagecluster-ceph-rbd   70d
rhel9-d1d2fc222d93            Bound    pvc-9f152d58-98ca-474c-b9b0-5a3b9d4615fa   30Gi       RWX            ocs-storagecluster-ceph-rbd   38m

The newer PVCs are labeled with their DataImportCron. The DIC controller's internal GC keeps only the 3 (default) latest labeled PVCs and deletes the older ones.
$ oc get pvc -n openshift-virtualization-os-images rhel9-d1d2fc222d93 -o yaml | grep dataImportCron
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron
$ oc get pvc -n openshift-virtualization-os-images rhel9-8350135038fe -o yaml | grep dataImportCron
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron
$ oc get pvc -n openshift-virtualization-os-images rhel9-9d0d9575e03e -o yaml | grep dataImportCron
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron
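
The retention rule above can be approximated from the command line. This is only a rough sketch, not the controller's actual logic: it assumes "latest" means newest creationTimestamp, and the function name and the KEEP variable are hypothetical. The namespace, label key and cron name are the ones shown in this report.

```shell
#!/bin/sh
# Sketch: print the labeled PVCs that fall outside the "keep latest 3" window.
# Assumption: "latest" is decided by creationTimestamp (oldest listed first).
NS=openshift-virtualization-os-images
LABEL=cdi.kubevirt.io/dataImportCron
CRON=rhel9-image-cron
KEEP=3   # hypothetical variable mirroring the default of 3 mentioned above

gc_candidates() {
  oc get pvc -n "$NS" -l "$LABEL=$CRON" \
     --sort-by=.metadata.creationTimestamp -o name |
    # Buffer all names, then print everything except the last $KEEP entries.
    awk -v keep="$KEEP" '{ a[NR] = $0 } END { for (i = 1; i <= NR - keep; i++) print a[i] }'
}
```

Calling gc_candidates prints nothing when three or fewer labeled PVCs exist. PVCs without the label never appear in this list at all, whatever their age.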

The older PVCs (175d and 177d) don't have this label, so they are not candidates for garbage collection:
$ oc get pvc -n openshift-virtualization-os-images rhel9-35d9b2336799 -o yaml | grep dataImportCron
$ oc get pvc -n openshift-virtualization-os-images rhel9-87d2b5f15665 -o yaml | grep dataImportCron
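
Instead of grepping each PVC individually, the unlabeled ones can be listed in one call with a negated label selector. A minimal sketch, assuming the same namespace and label key as above (the function name is hypothetical):

```shell
#!/bin/sh
# Sketch: list every PVC in the namespace that does not carry the
# DataImportCron label at all.
NS=openshift-virtualization-os-images
LABEL=cdi.kubevirt.io/dataImportCron

list_unlabeled_pvcs() {
  # '!key' in a label selector matches objects on which the key is not set.
  oc get pvc -n "$NS" -l '!'"$LABEL" -o name
}
```

Note that the result includes any unlabeled PVC in the namespace, not only ones originally created by a DataImportCron, so it should be reviewed before acting on it.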

Their DVs are the only ones that still exist (they were created before CDI DV GC was enabled by default in 4.12), and they do have the DataImportCron label:
$ oc get dv -n openshift-virtualization-os-images | grep rhel9
rhel9-35d9b2336799            Succeeded   100.0%                175d
rhel9-87d2b5f15665            Succeeded   100.0%                177d

$ oc get dv -n openshift-virtualization-os-images rhel9-35d9b2336799 -o yaml | grep dataImportCron
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron
$ oc get dv -n openshift-virtualization-os-images rhel9-87d2b5f15665 -o yaml | grep dataImportCron
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron


Version-Release number of selected component (if applicable):
4.12.1

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:
The old DVs and PVCs are never deleted by DIC garbage collection.

Expected results:
The old DVs and PVCs should be deleted by DIC garbage collection.


Additional info:

Comment 1 Adam Litke 2023-09-27 18:01:56 UTC
Arnon, can you help with a cherry-pick PR?  This will need to wait until 4.14.1 to merge since we are frozen for 4.14.0.  Thanks.  Moving back to POST since there is no code merged yet in the target y-stream branch.

Comment 2 Arnon Gilboa 2023-09-28 11:47:42 UTC
Sure Adam. Done.