Description of problem:

The issue is observed in an OpenShift Virtualization environment: the snapshot count on the golden image/template image hits the hard limit of 450 when the user creates 450+ VMs from it.

When a VM is created from the golden image/template image, the containerized-data-importer (CDI) does smart cloning [1]:

=> Create a snapshot of the source PVC
=> Create a PVC from the created snapshot
=> Delete the snapshot

So even after the snapshot is deleted, it stays in the trash because it is still linked to the cloned images.

~~~
A cloned image:

# rbd info orion/csi-vol-fdc42829-e939-4cad-be29-d5f6046eefc2 |grep parent
        parent: orion/csi-snap-5a84979d-7ab0-4be2-afd4-c10fdab3dde2@2762fc24-7919-4ea2-b096-d92636352751 (trash 30fdb8fd11e6fc)

# rbd trash ls --pool orion |grep 30fdb8fd11e6fc
30fdb8fd11e6fc csi-snap-5a84979d-7ab0-4be2-afd4-c10fdab3dde2

Source image:

# rbd snap ls orion/csi-vol-577fa68a-4b37-479a-bf58-023a320af82c --all |grep csi-snap-5a84979d-7ab0-4be2-afd4-c10fdab3dde2
  2153  57a5cce4-8561-4093-812a-bf881cae177d  30 GiB  Tue Aug 15 10:49:42 2023  trash (csi-snap-5a84979d-7ab0-4be2-afd4-c10fdab3dde2)
~~~

The CSI driver starts flattening the older snapshots when the snapshot count reaches the soft limit of 250.
However, the flattening fails with the errors below:

~~~
I0815 13:49:48.576387       1 controllerserver.go:557] ID: 2371 Req-ID: snapshot-e57264db-b2c8-4bc7-8126-13fd9df21038 snapshots count 254 on image: orion/csi-vol-577fa68a-4b37-479a-bf58-023a320af82c reached configured soft limit 250
E0815 13:49:48.786336       1 rbd_util.go:823] ID: 2369 Req-ID: snapshot-e4e5635f-7f7c-499e-a287-c58d15c115f4 failed to add task flatten for orion/csi-snap-54fa655a-a088-463f-a2fc-7e747fe5d2b7 : rados: ret=-2, No such file or directory: "[errno 2] RBD image not found (Image orion/csi-snap-54fa655a-a088-463f-a2fc-7e747fe5d2b7 does not exist)"
E0815 13:49:48.797025       1 rbd_util.go:823] ID: 2369 Req-ID: snapshot-e4e5635f-7f7c-499e-a287-c58d15c115f4 failed to add task flatten for orion/csi-snap-d4613468-6990-4ffd-8494-a87d1fd6fa08 : rados: ret=-2, No such file or directory: "[errno 2] RBD image not found (Image orion/csi-snap-d4613468-6990-4ffd-8494-a87d1fd6fa08 does not exist)"
E0815 13:49:48.807744       1 rbd_util.go:771] ID: 2369 Req-ID: snapshot-e4e5635f-7f7c-499e-a287-c58d15c115f4 failed to flatten orion/csi-snap-5b39031a-644c-4579-b58b-5c3357efd15b; err rados: ret=-2, No such file or directory: "[errno 2] RBD image not found (Image orion/csi-snap-5b39031a-644c-4579-b58b-5c3357efd15b does not exist)"
E0815 13:49:48.819930       1 rbd_util.go:823] ID: 2369 Req-ID: snapshot-e4e5635f-7f7c-499e-a287-c58d15c115f4 failed to add task flatten for orion/csi-snap-b16fe0cc-07fd-4b75-a59d-541b96151f40 : rados: ret=-2, No such file or directory: "[errno 2] RBD image not found (Image orion/csi-snap-b16fe0cc-07fd-4b75-a59d-541b96151f40 does not exist)"
~~~

The images that the CSI driver is trying to flatten are in the trash:

~~~
# rbd trash ls --pool orion |egrep "54fa655a|d4613468|5b39031a|b16fe0cc"
30fdb892b3c763 csi-snap-54fa655a-a088-463f-a2fc-7e747fe5d2b7
30fdb8adb44384 csi-snap-5b39031a-644c-4579-b58b-5c3357efd15b
30fdb8b8b7c76f csi-snap-d4613468-6990-4ffd-8494-a87d1fd6fa08
30fdb8e671371f csi-snap-b16fe0cc-07fd-4b75-a59d-541b96151f40
~~~

Subsequently, after more VMs were created from the golden image, the snapshot count hit the hard limit:

~~~
I0815 15:11:45.211253       1 controllerserver.go:536] ID: 3701 Req-ID: snapshot-ecd44693-7d3c-4021-aa39-502857abce2a snapshots count 454 on image: orion/csi-vol-577fa68a-4b37-479a-bf58-023a320af82c reached configured hard limit 450

# rbd info csi-vol-577fa68a-4b37-479a-bf58-023a320af82c --pool orion
rbd image 'csi-vol-577fa68a-4b37-479a-bf58-023a320af82c':
        size 30 GiB in 7680 objects
        order 22 (4 MiB objects)
        snapshot_count: 454   <===
        id: 27cab14fb82d74
        block_name_prefix: rbd_data.27cab14fb82d74
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, operations
        op_features: clone-parent, snap-trash
        flags:
        create_timestamp: Tue Aug 1 09:52:03 2023
        access_timestamp: Tue Aug 15 11:30:17 2023
        modify_timestamp: Tue Aug 1 09:52:03 2023
~~~

So the newer volumesnapshots have to wait indefinitely.

Shouldn't the driver run flatten on the cloned image rather than on the temporary snapshot in the trash?

~~~
# rbd info orion/csi-vol-f6c4ae6d-de88-4840-b278-a226c8fc6942 |grep parent
        parent: orion/csi-snap-37801a60-460d-4ff1-a73d-0fb5f7d7a349@2fdad284-f361-44eb-bcc0-b5f7e2bf11f4 (trash 30fdb86b06e5f9)

# rbd flatten orion/csi-snap-37801a60-460d-4ff1-a73d-0fb5f7d7a349
rbd: error opening image csi-snap-37801a60-460d-4ff1-a73d-0fb5f7d7a349: (2) No such file or directory

# rbd flatten orion/csi-vol-f6c4ae6d-de88-4840-b278-a226c8fc6942
Image flatten: 100% complete...done.
~~~

Version of all relevant components (if applicable):
OpenShift Data Foundation 4.12.5-rhodf

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
The user is not able to create more than 450 VMs from a golden image.

Is there any workaround available to the best of your knowledge?
Users have to manually flatten the cloned image.
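The manual workaround can be scripted. The following is only a minimal sketch, not a tested procedure: it assumes that `rbd info` marks a clone whose parent is in the trash with a `(trash <id>)` suffix on the `parent:` line, as in the output quoted in this report. The function names `has_trashed_parent` and `flatten_trashed_clones` and the `DRY_RUN` variable are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: reads `rbd info` output on stdin and succeeds only
# if the parent line carries a "(trash <id>)" marker.
has_trashed_parent() {
    grep -Eq '^[[:space:]]*parent:.*\(trash [0-9a-f]+\)'
}

# Hypothetical helper: walks every image in the given pool and flattens
# those whose parent is in the trash.
# By default it only prints what it would do; set DRY_RUN=0 to flatten.
flatten_trashed_clones() {
    pool="$1"
    rbd ls "$pool" | while read -r image; do
        if rbd info "$pool/$image" </dev/null | has_trashed_parent; then
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "would flatten $pool/$image"
            else
                rbd flatten "$pool/$image" </dev/null
            fi
        fi
    done
}
```

For example, `flatten_trashed_clones orion` lists the candidate images in the `orion` pool, and `DRY_RUN=0 flatten_trashed_clones orion` flattens them. Flattening copies the parent data into each clone, so the extra capacity usage should be checked before running this on many images.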
Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes, 100%.

Can this issue be reproduced from the UI?
Yes.

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
I reproduced it in OpenShift Virtualization, but I think it should also be easily reproducible by creating snapshots and then cloning images from the snapshots, 450+ times.

I created 450+ VMs using the for loop below:

# for i in {1..460};do virtctl create vm --volume-datasource=src:openshift-virtualization-os-images/rhel9 |oc create -f -;sleep 10;done

Actual results:
The automatic flattening of snapshots is not working.

Expected results:
It should start flattening the snapshots automatically when the number of snapshots of the image reaches 250.

Additional info:
[1] https://github.com/kubevirt/containerized-data-importer/blob/main/doc/smart-clone.md#smart-cloning