Bug 2123341 - ODF CSI RBD VolumeSnapshots are taking more space than they're using
Summary: ODF CSI RBD VolumeSnapshots are taking more space than they're using
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Niels de Vos
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-09-01 12:35 UTC by Pablo Rodriguez Guillamon
Modified: 2023-08-09 16:37 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-13 13:58:05 UTC
Embargoed:




Links
Red Hat Issue Tracker: RHSTOR-3799 (last updated 2022-09-13 13:58:04 UTC)

Comment 6 Niels de Vos 2022-09-09 15:32:14 UTC
Let me try to explain things a little differently; maybe that makes it clearer.

Copy-On-Write (COW) works like this:
1. a snapshot is taken at a certain point in time; it refers to the blocks that the rbd-image consumes at that moment (this takes very little space)
2. the rbd-image changes; the snapshot is now the sole owner of the original versions of the changed blocks (this takes up more space)

So, as the rbd-image modifies its blocks, the space consumption of the snapshot grows.
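
A quick way to observe this on the Ceph side is "rbd du", which reports per-snapshot space usage. A minimal sketch, assuming a pool "replicapool" and an image "csi-vol-0001" (both names, and all the sizes, are made up for illustration):

  $ rbd du replicapool/csi-vol-0001
  NAME          PROVISIONED  USED
  csi-vol-0001       10 GiB  2 GiB

  $ rbd snap create replicapool/csi-vol-0001@snap1
  $ rbd du replicapool/csi-vol-0001
  NAME                PROVISIONED  USED
  csi-vol-0001@snap1       10 GiB    0 B    <- fresh snapshot, almost no space
  csi-vol-0001             10 GiB  2 GiB

  # after the image overwrites about 1 GiB of its existing blocks:
  $ rbd du replicapool/csi-vol-0001
  NAME                PROVISIONED  USED
  csi-vol-0001@snap1       10 GiB  1 GiB    <- the snapshot now owns the old blocks
  csi-vol-0001             10 GiB  2 GiB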


This initial snapshot requires almost no space, even if the rbd-image already contained lots of data. The snapshot is not a self-contained full copy of the rbd-image (it only becomes one when it is flattened). Instead, a snapshot tracks which changes were made to the original rbd-image and stores the original data for those changes. As the number of changes increases over time, the snapshot grows. You can think of a snapshot as a large diff between the state of the rbd-image when the snapshot was taken and the latest version of the rbd-image with all its modifications.
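
For completeness, flattening is what turns a dependent clone into a self-contained copy. A sketch with the same hypothetical names (depending on the cluster's clone format, the snapshot may first need "rbd snap protect"):

  $ rbd clone replicapool/csi-vol-0001@snap1 replicapool/restored-vol
  $ rbd flatten replicapool/restored-vol
  # flatten copies all parent blocks into the clone; afterwards it no longer
  # depends on the snapshot (and consumes a full copy's worth of space)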

Comment 10 Niels de Vos 2022-09-13 13:58:05 UTC
The space consumption is expected behaviour. It comes from the Kubernetes/CSI requirement that a VolumeSnapshot must be an independent object: it must be possible to delete the original volume and still restore from the snapshot.
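
From the Kubernetes side that requirement looks like this; all the object names below are assumptions, and the snapshot/storage class names are just the usual ODF defaults, so check your cluster:

  # snapshot.yaml - take a snapshot of a hypothetical PVC "data-pvc"
  apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  metadata:
    name: data-snap
  spec:
    volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
    source:
      persistentVolumeClaimName: data-pvc

  # restore.yaml - a new PVC restored from that snapshot
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: data-restored
  spec:
    storageClassName: ocs-storagecluster-ceph-rbd
    accessModes: [ReadWriteOnce]
    resources:
      requests:
        storage: 10Gi
    dataSource:
      name: data-snap
      kind: VolumeSnapshot
      apiGroup: snapshot.storage.k8s.io

  $ oc apply -f snapshot.yaml
  $ oc delete pvc data-pvc      # the snapshot has to survive this...
  $ oc apply -f restore.yaml    # ...and still be restorable afterwards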

The current design and implementation in Ceph-CSI is documented at https://github.com/ceph/ceph-csi/blob/39b1f2b4d3b79237ffa956ff16a3459cf21bf63f/docs/design/proposals/rbd-snap-clone.md#create-a-snapshot-from-pvc
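
Roughly, that design boils down to the following sequence of RBD operations (a sketch with hypothetical names; the driver performs these through librbd rather than the CLI, and relies on clone format v2 so the temporary snapshot can be removed while the clone exists):

  $ rbd snap create replicapool/csi-vol-0001@tmp-snap       # 1. temporary snapshot on the parent image
  $ rbd clone replicapool/csi-vol-0001@tmp-snap \
              replicapool/csi-snap-0002                     # 2. clone it into a new rbd-image
  $ rbd snap rm replicapool/csi-vol-0001@tmp-snap           # 3. remove the temporary snapshot
  $ rbd snap create replicapool/csi-snap-0002@csi-snap-0002 # 4. this snapshot on the clone backs the VolumeSnapshot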


As a feature request, it should be possible (with added complexity) to create dependent RBD-snapshots (storing only the diff between the time of the snapshot and the current RBD-image), without the cloning step. This should reduce space consumption quite a bit. Ceph-CSI would need to be extended to do accounting of RBD-image consumers, so that an RBD-image is only deleted once no PersistentVolume or VolumeSnapshotSource references it (directly or indirectly).
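
At the RBD level such a dependent snapshot would just be the plain snapshot from comment 6, without the clone step, so only the diff is stored; the hard part is knowing when the parent image may finally be deleted. A sketch of the pieces involved (hypothetical names again):

  $ rbd snap create replicapool/csi-vol-0001@snap1   # dependent snapshot: stores only the diff
  $ rbd children replicapool/csi-vol-0001@snap1      # lists clones that still depend on this snapshot
  $ rbd rm replicapool/csi-vol-0001                  # refused while snapshots still exist on the image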

As this is a feature request, and we have it tracked at https://issues.redhat.com/browse/RHSTOR-3799 , I will close this BZ now. For further release planning and progress, check the linked issue.

