Bug 2203086
| Summary: | csi-snap images are not deleted while the VolumeSnapshotClass policy is set to 'Delete' | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | David Vaanunu <dvaanunu> |
| Component: | csi-driver | Assignee: | Niels de Vos <ndevos> |
| Status: | CLOSED COMPLETED | QA Contact: | krishnaram Karthick <kramdoss> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.12 | CC: | mrajanna, muagarwa, ndevos, odf-bz-bot |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-08-10 11:19:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
David Vaanunu
2023-05-11 08:06:27 UTC
It seems that the VolumeSnapshotClass is not correctly configured. The logs from csi-rbdplugin-provisioner/csi-snapshotter contain many messages like the following:

    2023-04-26T04:03:09.673248713Z E0426 04:03:09.673201 1 snapshot_controller_base.go:283] could not sync content "snapcontent-e5c948cd-6435-457b-8d22-67bd35db0398-clone": failed to delete snapshot "snapcontent-e5c948cd-6435-457b-8d22-67bd35db0398-clone", err: failed to delete snapshot content snapcontent-e5c948cd-6435-457b-8d22-67bd35db0398-clone: "rpc error: code = Internal desc = provided secret is empty"

This prevents the VolumeSnapshotContent from being deleted; these objects are expected to remain in the cluster until deletion succeeds. Could you please check:

1. the secrets in the VolumeSnapshotClass
2. the VolumeSnapshotContent objects in the cluster

(A sketch of commands for both checks follows the reproducer script below.)

Not a 4.13 blocker.

I have not been able to reproduce this with simple steps:
---- >% ----
#
# yaml files from github.com/ceph/ceph-csi/examples/rbd/
#
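# Poll the object with "oc get" until the given status field (a Go
# template expression) reaches the expected value.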
oc_wait_status() {
local TEMPLATE="${1}" UNTIL="${2}" OBJ="${3}"
local STATUS=''
while [ "${STATUS}" != "${UNTIL}" ]
do
[ -z "${STATUS}" ] || sleep 1
STATUS=$(oc get --template="{{${TEMPLATE}}}" "${OBJ}")
done
}
create_pvc() {
oc create -f pvc.yaml
oc_wait_status .status.phase Bound persistentvolumeclaim/rbd-pvc
}
create_snapshot() {
oc create -f snapshot.yaml
oc_wait_status .status.readyToUse true volumesnapshot/rbd-pvc-snapshot
}
restore_pvc() {
oc create -f pvc-restore.yaml
oc_wait_status .status.phase Bound persistentvolumeclaim/rbd-pvc-restore
}
cleanup() {
cat pvc.yaml snapshot.yaml pvc-restore.yaml | oc delete -f- --wait
}
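# Main loop: create a PVC, snapshot it, restore the snapshot into a new
# PVC, delete all three objects, and count the completed iterations.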
RUNS=0
while true
do
create_pvc
create_snapshot
restore_pvc
cleanup
RUNS=$((RUNS+1))
echo "run ${RUNS} finished"
sleep 3
done
---- %< ----
There is no growing list of snapshots on the Ceph side after running this for a working day (~5000 iterations).
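For the two checks requested above, and for confirming whether csi-snap images pile up on the Ceph side, something like the following can be used. This is a minimal sketch, not taken from the bug: the openshift-storage namespace, the ocs-storagecluster-cephblockpool pool, and the rook-ceph-tools toolbox deployment are assumptions for a default ODF install, and the `<...>` placeholders need to be filled in for the cluster at hand.
---- >% ----
# 1. Check which secret the VolumeSnapshotClass points at, and that it exists.
oc get volumesnapshotclass
oc get volumesnapshotclass <class-name> -o yaml | grep snapshotter-secret
oc -n <secret-namespace> get secret <secret-name>

# 2. List VolumeSnapshotContent objects that are stuck in the cluster.
oc get volumesnapshotcontent

# 3. Look for leftover csi-snap images on the Ceph side (assumes the
#    rook-ceph-tools toolbox deployment and the default ODF pool name).
oc -n openshift-storage exec deploy/rook-ceph-tools -- \
    rbd ls ocs-storagecluster-cephblockpool | grep csi-snap
---- %< ----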
Hello Madhu, the ODF version was updated to 4.12.5 and the behavior is still the same. I will send you the live cluster info in gchat. Thanks.

Once the policy was changed to 'Delete' and the 'volumesnapshot' & 'volumesnapshotcontent' objects were deleted, the 'csi-snap' was deleted too.
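For reference, a minimal sketch of that cleanup, using the VolumeSnapshotContent name from the error log above; the VolumeSnapshot name and namespace are placeholders. A VolumeSnapshotContent keeps the deletionPolicy it was created with, so changing the class alone does not affect existing objects; the content itself has to be patched:
---- >% ----
# Switch the existing VolumeSnapshotContent to the 'Delete' policy so the
# backing csi-snap image is removed along with it.
oc patch volumesnapshotcontent \
    snapcontent-e5c948cd-6435-457b-8d22-67bd35db0398-clone \
    --type merge -p '{"spec":{"deletionPolicy":"Delete"}}'

# Deleting the VolumeSnapshot now also removes the VolumeSnapshotContent
# and the csi-snap image on the Ceph side.
oc delete volumesnapshot <snapshot-name> -n <namespace>
---- %< ----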