Description of problem:

If the default storage class does not support snapshots, the boot source images created by the DataImportCron in the openshift-virtualization-os-images namespace are imported as DVs/PVCs. When you switch the default storage class to OCS, you can re-import the images by deleting the old DVs: the DV/PVC is re-imported, a VolumeSnapshot object is created, and the DV/PVC is then removed automatically.

Alex (akalenyu) looked at it and sees 2 issues:

Issue 1: Snapshots are being made out of the previous storage class (when changing the SC from HPP to OCS).
Issue 2: When deleting the old storage class DVs, there may be a race where the snapshot got created but the DV did not get recreated.

Version-Release number of selected component (if applicable):
4.14

How reproducible:
Always

Steps to Reproduce:
1. Have a non-snapshottable default storage class (HPP).

2. See that DVs/PVCs were imported:

$ oc get dv -A
NAMESPACE                            NAME                          PHASE       PROGRESS   RESTARTS   AGE
openshift-virtualization-os-images   centos-stream8-b9b768dcd73b   Succeeded   100.0%                18h
openshift-virtualization-os-images   centos-stream9-362e1f1d9f11   Succeeded   100.0%                18h
openshift-virtualization-os-images   centos7-680e9b4e0fba          Succeeded   100.0%                18h
openshift-virtualization-os-images   fedora-f7cc15256f08           Succeeded   100.0%                18h
openshift-virtualization-os-images   rhel8-0da894200daa            Succeeded   100.0%                18h
openshift-virtualization-os-images   rhel9-b006ef7856b6            Succeeded   100.0%                18h

3. Make HPP non-default (set its storageclass.kubernetes.io/is-default-class annotation to "false"), and make OCS default:

$ oc patch storageclass ocs-storagecluster-ceph-rbd -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

4. Delete one DV:

$ oc delete dv -n openshift-virtualization-os-images rhel9-b006ef7856b6
datavolume.cdi.kubevirt.io "rhel9-b006ef7856b6" deleted

5. The DV did not get recreated (but it should have been); a VolumeSnapshot was created, but it is not Ready:

$ oc get VolumeSnapshot -A
NAMESPACE                            NAME                 READYTOUSE   SOURCEPVC            SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                            SNAPSHOTCONTENT   CREATIONTIME   AGE
openshift-virtualization-os-images   rhel9-b006ef7856b6   false        rhel9-b006ef7856b6                                         ocs-storagecluster-rbdplugin-snapclass                                    13s

$ oc get VolumeSnapshot -n openshift-virtualization-os-images rhel9-b006ef7856b6 -oyaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  annotations:
    cdi.kubevirt.io/storage.import.lastUseTime: "2023-07-27T14:31:32.631870881Z"
  creationTimestamp: "2023-07-27T14:31:32Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  generation: 1
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.14.0
    cdi.kubevirt.io: ""
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron
  name: rhel9-b006ef7856b6
  namespace: openshift-virtualization-os-images
  resourceVersion: "1182048"
  uid: d69181d0-4195-4b3f-91b4-ba3631f05249
spec:
  source:
    persistentVolumeClaimName: rhel9-b006ef7856b6
  volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
status:
  error:
    message: 'Failed to create snapshot content with error snapshot controller failed
      to update rhel9-b006ef7856b6 on API server: cannot get claim from snapshot'
6. See that 2 minutes later, VolumeSnapshots are created for the other images while their old DVs were not yet deleted:

$ oc get VolumeSnapshot -A
NAMESPACE                            NAME                          READYTOUSE   SOURCEPVC                     SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                            SNAPSHOTCONTENT                                    CREATIONTIME   AGE
openshift-virtualization-os-images   centos-stream8-b9b768dcd73b   false        centos-stream8-b9b768dcd73b                                         ocs-storagecluster-rbdplugin-snapclass   snapcontent-8455f2ea-0d70-4998-9fa5-bbc42133b1f5                  23s
openshift-virtualization-os-images   centos-stream9-362e1f1d9f11   false        centos-stream9-362e1f1d9f11                                         ocs-storagecluster-rbdplugin-snapclass   snapcontent-3eec6ff1-f73f-493f-b61b-58abfeec5b65                  23s
openshift-virtualization-os-images   centos7-680e9b4e0fba          false        centos7-680e9b4e0fba                                                ocs-storagecluster-rbdplugin-snapclass   snapcontent-76229453-37ff-40f6-8ce0-94e15a5b912c                  23s
openshift-virtualization-os-images   fedora-f7cc15256f08           false        fedora-f7cc15256f08                                                 ocs-storagecluster-rbdplugin-snapclass   snapcontent-94d05d80-20f5-4861-a7af-344f19842a61                  23s
openshift-virtualization-os-images   rhel8-0da894200daa            false        rhel8-0da894200daa                                                  ocs-storagecluster-rbdplugin-snapclass   snapcontent-df7f9a06-4a2e-41b1-8f04-a16758daf4e8                  23s
openshift-virtualization-os-images   rhel9-b006ef7856b6            false        rhel9-b006ef7856b6                                                  ocs-storagecluster-rbdplugin-snapclass                                                                     2m47s

7. See the YAML of another VolumeSnapshot, whose DV/PVC was not deleted and is still using the non-snapshottable HPP:

spec:
  source:
    persistentVolumeClaimName: centos-stream8-b9b768dcd73b
  volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
status:
  boundVolumeSnapshotContentName: snapcontent-8455f2ea-0d70-4998-9fa5-bbc42133b1f5
  error:
    message: 'Failed to check and update snapshot content: failed to take snapshot
      of the volume pvc-e59ee8cd-57d0-4ecf-906f-0ab7a1f8ba72: "rpc error: code = Internal
      desc = panic runtime error: invalid memory address or nil pointer dereference"'
    time: "2023-07-27T14:33:56Z"
  readyToUse: false

8. To fix the broken VolumeSnapshot of the first deleted DV, delete that VolumeSnapshot:

$ oc delete VolumeSnapshot -n openshift-virtualization-os-images rhel9-b006ef7856b6
volumesnapshot.snapshot.storage.k8s.io "rhel9-b006ef7856b6" deleted

9. This triggers the DV/PVC to re-import on OCS and to create a VolumeSnapshot that becomes ReadyToUse; the DV/PVC is then deleted automatically. (A scripted version of this cleanup is sketched after the Expected results below.)

Actual results:
Re-importing requires additional manual steps.

Expected results:
Re-importing should happen as soon as we switch the storage class and delete the old DVs.
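Until this is fixed, the manual workaround in steps 8-9 can be scripted. This is a minimal sketch, not a general cleanup: it assumes (as step 9 shows) that deleting a CDI-managed VolumeSnapshot that never became ready is enough to trigger a fresh import on the new default storage class. The namespace and the cdi.kubevirt.io label are taken from the objects shown above:

# Delete CDI-managed VolumeSnapshots that never became ready, so the
# import is retried on the current default storage class (see step 9).
NS=openshift-virtualization-os-images
for vs in $(oc get volumesnapshot -n "$NS" -l cdi.kubevirt.io -o name); do
  ready=$(oc get -n "$NS" "$vs" -o jsonpath='{.status.readyToUse}')
  if [ "$ready" != "true" ]; then
    echo "deleting $vs (readyToUse=$ready)"
    oc delete -n "$NS" "$vs"
  fi
done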
Additional info:

We also hit the reverse situation: OCS was the default, and the DataImportCron images were imported and kept as VolumeSnapshots. The default storage class was then changed to HPP; new DVs/PVCs are not created unless the VolumeSnapshots are deleted, and there are reconcile errors in the log.
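For triage in either direction, it helps to check whether the current default storage class actually has a matching VolumeSnapshotClass, since both scenarios come down to the format of the existing boot sources disagreeing with what the default class supports. A minimal sketch (<default-class> is a placeholder for whichever class is marked default on the cluster):

$ oc get storageclass                                                # note which class is marked (default)
$ oc get storageclass <default-class> -o jsonpath='{.provisioner}'   # the CSI provisioner of the default class
$ oc get volumesnapshotclass                                         # snapshots are possible only if some class's DRIVER matches that provisioner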