Description of problem:
Cinder backup does not clean up the RBD volume snapshot (Ceph)

Version-Release number of selected component (if applicable):
16.2 current

How reproducible:
100%

Steps to Reproduce:
Lab Example:

$ openstack volume create --size 10 test01
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| consistencygroup_id | None                                 |
| created_at          | 2022-08-23T16:19:38.000000           |
| description         | None                                 |
| encrypted           | False                                |
| id                  | 03ccce5c-0572-4855-9bab-0bb706d2793f |
| migration_status    | None                                 |
| multiattach         | False                                |
| name                | test01                               |
| properties          |                                      |
| replication_status  | None                                 |
| size                | 10                                   |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| status              | creating                             |
| type                | tripleo                              |
| updated_at          | None                                 |
| user_id             | 13deedfd3f1a4b609794840cc1f9367e     |
+---------------------+--------------------------------------+

$ openstack volume list |grep test01
| 03ccce5c-0572-4855-9bab-0bb706d2793f | test01 | available | 10 | |

- Verify volume in ceph:

# rbd du -p volumes volume-03ccce5c-0572-4855-9bab-0bb706d2793f
NAME                                          PROVISIONED  USED
volume-03ccce5c-0572-4855-9bab-0bb706d2793f   10 GiB       0 B

- Create backup

$ openstack volume backup create test01
+-------+--------------------------------------+
| Field | Value                                |
+-------+--------------------------------------+
| id    | bd86b8c6-ba93-4a3f-ab84-b4954fa46d2f |
| name  | None                                 |
+-------+--------------------------------------+

- Now a snapshot exists in the volumes pool:

# rbd du -p volumes volume-03ccce5c-0572-4855-9bab-0bb706d2793f
NAME                                                                  PROVISIONED  USED
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.snap.1661271680.6115808   10 GiB       0 B
volume-03ccce5c-0572-4855-9bab-0bb706d2793f                           10 GiB       0 B
<TOTAL>                                                               10 GiB       0 B

- Backup in backups pool:

# rbd du -p backups
NAME                                                                                                              PROVISIONED  USED
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.backup.bd86b8c6-ba93-4a3f-ab84-b4954fa46d2f.snap.1661271680.6115808   10 GiB       0 B
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.backup.bd86b8c6-ba93-4a3f-ab84-b4954fa46d2f                           10 GiB       0 B
<TOTAL>                                                                                                           10 GiB       0 B

- Remove backup:

$ openstack volume backup delete bd86b8c6-ba93-4a3f-ab84-b4954fa46d2f

- Snapshot still exists in the volumes pool:

# rbd du -p volumes volume-03ccce5c-0572-4855-9bab-0bb706d2793f
NAME                                                                  PROVISIONED  USED
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.snap.1661271680.6115808   10 GiB       0 B
volume-03ccce5c-0572-4855-9bab-0bb706d2793f                           10 GiB       0 B
<TOTAL>                                                               10 GiB       0 B

- Backup has been removed from the backups pool:

# rbd du -p backups
(nil)

- Create 2nd backup:

$ openstack volume backup create test01
+-------+--------------------------------------+
| Field | Value                                |
+-------+--------------------------------------+
| id    | 23a44964-e1b2-4e23-8730-b82e338d0086 |
| name  | None                                 |
+-------+--------------------------------------+

- Now we have multiple snapshots in the volumes pool:

# rbd du -p volumes volume-03ccce5c-0572-4855-9bab-0bb706d2793f
NAME                                                                  PROVISIONED  USED
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.snap.1661271680.6115808   10 GiB       0 B
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.snap.1661271809.5704668   10 GiB       0 B
volume-03ccce5c-0572-4855-9bab-0bb706d2793f                           10 GiB       0 B
<TOTAL>                                                               10 GiB       0 B

- And a single backup:

# rbd du -p backups
NAME                                                                                                              PROVISIONED  USED
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.backup.23a44964-e1b2-4e23-8730-b82e338d0086.snap.1661271809.5704668   10 GiB       0 B
volume-03ccce5c-0572-4855-9bab-0bb706d2793f.backup.23a44964-e1b2-4e23-8730-b82e338d0086                           10 GiB       0 B
<TOTAL>                                                                                                           10 GiB       0 B

- These snapshots continue to build up in the volumes pool with each additional backup.

- I don't see any errors in the cinder volume or backup logs:

# grep -i error /var/log/containers/cinder/cinder-volume.log
(nil)
# grep -i error /var/log/containers/cinder/cinder-backup.log
(nil)
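For anyone who wants to check their own pools for the same buildup, here is a minimal Python sketch that compares the names listed in the two pools and reports volume snapshots with no matching backup snapshot. It only does string matching on names in the form shown in the `rbd du` output above; the function names (`snap_keys`, `leftover_snapshots`) and the regular expression are my own illustration and are not part of Cinder or Ceph. It assumes the trailing `.snap.<timestamp>` suffix is what ties a volume snapshot to its backup.

```python
import re

# Matches the snapshot names as they appear in the rbd du listings above:
#   volume-<uuid>.snap.<timestamp>                  (volumes pool)
#   volume-<uuid>.backup.<uuid>.snap.<timestamp>    (backups pool)
# Assumption: the timestamp suffix links a volume snapshot to its backup.
SNAP_RE = re.compile(
    r"^(?P<image>volume-[0-9a-f-]+)"
    r"(?:\.backup\.[0-9a-f-]+)?"
    r"\.snap\.(?P<stamp>\d+\.\d+)$"
)


def snap_keys(names):
    """Return the set of (image, timestamp) pairs for snapshot names."""
    keys = set()
    for name in names:
        match = SNAP_RE.match(name)
        if match:
            keys.add((match.group("image"), match.group("stamp")))
    return keys


def leftover_snapshots(volumes_pool_names, backups_pool_names):
    """List volume-pool snapshots with no backup snapshot of the same stamp."""
    backed_up = snap_keys(backups_pool_names)
    leftovers = []
    for name in volumes_pool_names:
        match = SNAP_RE.match(name)
        if match and (match.group("image"), match.group("stamp")) not in backed_up:
            leftovers.append(name)
    return sorted(leftovers)


if __name__ == "__main__":
    vol = "volume-03ccce5c-0572-4855-9bab-0bb706d2793f"
    bak = vol + ".backup.23a44964-e1b2-4e23-8730-b82e338d0086"
    volumes_pool = [
        vol + ".snap.1661271680.6115808",
        vol + ".snap.1661271809.5704668",
        vol,
    ]
    backups_pool = [bak + ".snap.1661271809.5704668", bak]
    for name in leftover_snapshots(volumes_pool, backups_pool):
        print(name)
```

Fed the names from the final state in the reproduction steps, this flags the snapshot left behind by the deleted first backup. It does not touch the cluster; the name lists have to be collected separately.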
The root complaint here seems to be that backup snapshots are causing extra RBD storage to be used, and backup snapshots are not always deleted when backups are deleted.

https://review.opendev.org/c/openstack/cinder/+/810457 is an upstream attempt to address the storage usage by limiting the number of backup snapshots that are stored, deleting them when they are no longer necessary. This means that less space is held, but backups take longer to restore. This patch is still in review and needs to be assessed for whether these snapshots can always be successfully deleted. (Currently, the Ceph backup driver attempts to delete the snapshots when backups are deleted, but it isn't clear whether this will always succeed.)
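To make the idea of limiting the number of stored backup snapshots concrete, here is a small Python sketch of a keep-the-newest selection. This is not the logic of the patch under review, which I have not reproduced here; the function name `snapshots_to_prune`, the `keep` parameter, and the assumption that only the most recent snapshots are still needed are all hypothetical. It orders names by the `.snap.<timestamp>` suffix seen in the description.

```python
def snapshots_to_prune(snapshot_names, keep=1):
    """Return the snapshot names that fall outside the newest `keep`.

    Assumption: names end in `.snap.<timestamp>` as in the rbd du output,
    and older snapshots are safe to drop. Whether that holds in every
    case is exactly what the upstream review has to establish.
    """
    def stamp(name):
        # The text after the last ".snap." is a float-like timestamp.
        return float(name.rsplit(".snap.", 1)[1])

    ordered = sorted(snapshot_names, key=stamp)
    if keep <= 0:
        return ordered
    return ordered[:-keep]


if __name__ == "__main__":
    vol = "volume-03ccce5c-0572-4855-9bab-0bb706d2793f"
    names = [
        vol + ".snap.1661271809.5704668",
        vol + ".snap.1661271680.6115808",
    ]
    for name in snapshots_to_prune(names, keep=1):
        print(name)
```

The function only selects candidates; actually removing them would still need a separate, verified step.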
(In reply to Eric Harney from comment #2)
> The root complaint here seems to be that backup snapshots are causing extra
> RBD storage to be used, and backup snapshots are not always deleted when
> backups are deleted.
>
> https://review.opendev.org/c/openstack/cinder/+/810457 is an upstream
> attempt to address the storage usage by limiting the number of backup
> snapshots that are stored, deleting them when they are no longer necessary.
> This means that less space is held, but backups take longer to restore.
> This patch is still in review and needs to be assessed for whether these
> snapshots can always be successfully deleted. (Currently, the Ceph backup
> driver attempts to delete the snapshots when backups are deleted, but it
> isn't clear whether this will always succeed.)

Eric, what's the current status of this BZ? It's been more than a whole year without any interaction here.

Thanks.