Bug 1893739
Summary: | Force deletion doesn't work for snapshots if snapshotclass is already deleted
---|---
Product: | OpenShift Container Platform
Component: | Storage
Storage sub component: | Storage
Version: | 4.6
Target Release: | 4.7.0
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Hardware: | Unspecified
OS: | Unspecified
Reporter: | Neha Berry <nberry>
Assignee: | Christian Huffman <chuffman>
QA Contact: | Wei Duan <wduan>
CC: | aos-bugs, assingh, jijoy, jsafrane, mfojtik, mrajanna, ocs-bugs, wduan, xxia
Type: | Bug
Doc Type: | Bug Fix
Last Closed: | 2021-02-24 15:29:19 UTC

Doc Text:
Cause: When creating snapshots that require credentials, deleting the VolumeSnapshotClass would prevent the resulting snapshots from being deleted.
Consequence: Once the VolumeSnapshotClass is deleted, the associated VolumeSnapshots and VolumeSnapshotContents could not be deleted.
Fix: The credentials are fetched from the VolumeSnapshotContent instead of relying on the VolumeSnapshotClass to exist.
Result: VolumeSnapshots and VolumeSnapshotContents that use credentials can now be deleted as long as the secret containing these credentials continues to exist.
Description
Neha Berry
2020-11-02 13:45:14 UTC
Hi Chris,

Does this have anything to do with OCS-based CSI containers?

Anyway, sorry, I was working on something else. I could try to reproduce with the GA'd OCP 4.6 build by Monday (Nov 9) and provide you with the cluster.

Created attachment 1727165
Finalizer-null-approach-to-delete-the-leftovers

(In reply to Jan Safranek from comment #2)
> Yes, removing the finalizer with oc patch is the best workaround.

Yes, the finalizer approach worked:

$ oc patch -n default volumesnapshot/test-cephfs-snapshot --type=merge -p '{"metadata": {"finalizers":null}}'
volumesnapshot.snapshot.storage.k8s.io/test-cephfs-snapshot patched

$ oc patch -n default volumesnapshot/test-rbd-snapshot --type=merge -p '{"metadata": {"finalizers":null}}'; date --utc
volumesnapshot.snapshot.storage.k8s.io/test-rbd-snapshot patched
Fri Nov 6 16:31:41 UTC 2020

$ oc get volumesnapshot -A
No resources found

$ oc get volumesnapshotcontent
NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                                  VOLUMESNAPSHOTCLASS                         VOLUMESNAPSHOT         AGE
snapcontent-45ed1739-4e66-47ae-a725-9f754c8fc418   true         5368709120    Delete           openshift-storage.cephfs.csi.ceph.com   ocs-storagecluster-cephfsplugin-snapclass   test-cephfs-snapshot   85m
snapcontent-730a0fc5-18b5-441e-8b6f-2c448c23300b   true         10737418240   Delete           openshift-storage.rbd.csi.ceph.com      ocs-storagecluster-rbdplugin-snapclass      test-rbd-snapshot      84m

$ oc patch -n default volumesnapshotcontent/snapcontent-45ed1739-4e66-47ae-a725-9f754c8fc418 --type=merge -p '{"metadata": {"finalizers":null}}'; date --utc
volumesnapshotcontent.snapshot.storage.k8s.io/snapcontent-45ed1739-4e66-47ae-a725-9f754c8fc418 patched
Fri Nov 6 16:32:29 UTC 2020

$ oc patch -n default volumesnapshotcontent/snapcontent-730a0fc5-18b5-441e-8b6f-2c448c23300b --type=merge -p '{"metadata": {"finalizers":null}}'; date --utc
volumesnapshotcontent.snapshot.storage.k8s.io/snapcontent-730a0fc5-18b5-441e-8b6f-2c448c23300b patched
Fri Nov 6 16:32:49 UTC 2020

$ oc get volumesnapshotcontent
No resources found

Attached the controller logs for reference.

This issue pertains to deleting the VolumeSnapshotClass when the driver requires credentials. In this case, the driver fails to remove the backend snapshot, which prevents the VolumeSnapshotContent from being deleted, which in turn prevents the VolumeSnapshot from being deleted.

Note that even if this is resolved, we still won't be able to delete a VolumeSnapshot if the secret itself has been deleted, as the driver requires the secret to exist to proceed. At this point I think we can fix the issue with the VolumeSnapshotClass being deleted and attempt to include the description in the VolumeSnapshotContent status message.

The relevant sections are below:
- Here we try to get the class and return nil credentials if it's not found: https://github.com/kubernetes-csi/external-snapshotter/blob/23b415b6aaa7e0eba402bc3984ae2b726b101a80/pkg/sidecar-controller/snapshot_controller.go#L184-L189
- And then we pass these nil credentials into the deletion request: https://github.com/kubernetes-csi/external-snapshotter/blob/23b415b6aaa7e0eba402bc3984ae2b726b101a80/pkg/sidecar-controller/snapshot_controller.go#L341-L350

Since we have nil credentials, the deletion request fails, and then we see this error logged from line 350.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
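
For reference, the body passed to `oc patch --type=merge` in the finalizer workaround is a JSON merge patch, where a `null` value removes the named field. The short Go sketch below only builds and prints that body; `finalizerRemovalPatch` is a hypothetical helper name and nothing here talks to a cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// finalizerRemovalPatch builds the JSON merge patch body used in the
// workaround. A nil map value marshals to JSON null, which in a merge
// patch means "remove this field".
func finalizerRemovalPatch() ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"finalizers": nil,
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := finalizerRemovalPatch()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // {"metadata":{"finalizers":null}}
}
```

Because clearing the finalizers lets the API objects go away without the driver's deletion call succeeding, the backend snapshot may still need to be cleaned up separately.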