Description of problem:
RBD uses the RADOS write-same operation to thick-provision RBD images: it sends a small zeroed buffer with the op and sets the write-same length to the maximum RBD object size (4 MiB by default).
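To illustrate the write-same semantics described above, here is a minimal in-memory sketch. It is not the librados API: `write_same`, `OBJECT_SIZE`, and the 512-byte buffer size are hypothetical names and values chosen for the example, and the rule that the length must be a multiple of the pattern length is a constraint of this model.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # default maximum RBD object size (4 MiB)


def write_same(obj: bytearray, offset: int, pattern: bytes, write_len: int) -> None:
    """Fill obj[offset:offset + write_len] by repeating `pattern` (toy model)."""
    if not pattern or write_len % len(pattern) != 0:
        raise ValueError("pattern must be non-empty and write_len a multiple of its length")
    end = offset + write_len
    if end > len(obj):
        # Grow the toy object so the target range exists.
        obj.extend(bytes(end - len(obj)))
    # A small pattern is expanded on the receiving side to cover the whole range.
    obj[offset:end] = pattern * (write_len // len(pattern))


# Thick-provision one object: only the 512-byte buffer would travel with the op,
# yet the full 4 MiB range ends up written with zeroes.
zero_buf = bytes(512)  # hypothetical buffer size
backing = bytearray()
write_same(backing, 0, zero_buf, OBJECT_SIZE)
```

In this model the network cost is the size of the pattern, while the storage-side cost is still the full write-same length, which is the I/O impact this report is about.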
There is a desire to optimize the Ceph cluster IO impact for the thick-provisioned case by having BlueStore treat a write-same of zeroes as a request to allocate the specified amount of space but avoid the need to actually zero the space (i.e. track that the extent is in-use but flag it as being zeroed/uninitialized).
In the future, CephFS could also use write-same in its "fallocate" handler (which currently seems to support only punch-hole).
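The allocate-without-zeroing idea described above could be modeled roughly as follows. This is a toy sketch under stated assumptions, not BlueStore code: `Extent`, `ToyObject`, the `unwritten` flag, and the `bytes_written` counter are all hypothetical, and the model ignores overlapping extents.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    offset: int
    length: int
    unwritten: bool  # allocated (in use), but reads must return zeroes


@dataclass
class ToyObject:
    extents: list = field(default_factory=list)
    data: dict = field(default_factory=dict)  # extent offset -> bytes actually stored
    bytes_written: int = 0  # device I/O performed by this model

    def write_same(self, offset: int, pattern: bytes, write_len: int) -> None:
        if not pattern or write_len % len(pattern) != 0:
            raise ValueError("pattern must be non-empty and write_len a multiple of its length")
        if pattern.count(0) == len(pattern):
            # Zero pattern: reserve the space and flag it, but write no data.
            self.extents.append(Extent(offset, write_len, unwritten=True))
            return
        # Non-zero pattern: expand it and pay the full write cost.
        self.extents.append(Extent(offset, write_len, unwritten=False))
        self.data[offset] = pattern * (write_len // len(pattern))
        self.bytes_written += write_len

    def allocated(self) -> int:
        return sum(e.length for e in self.extents)

    def read(self, offset: int, length: int) -> bytes:
        for e in self.extents:
            if e.offset <= offset and offset + length <= e.offset + e.length:
                if e.unwritten:
                    return bytes(length)  # synthesize zeroes without touching storage
                start = offset - e.offset
                return self.data[e.offset][start:start + length]
        raise KeyError("range not allocated")
```

With this model, a zeroed write-same over a 4 MiB object increases `allocated()` by 4 MiB while `bytes_written` stays at 0, and reads from the flagged extent still return zeroes.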
Version-Release number of selected component (if applicable):
5.0
Comment 1 RHEL Program Management
2021-02-02 16:42:56 UTC
> There is a desire to optimize the Ceph cluster IO impact for the thick-provisioned case by having BlueStore treat a write-same of zeroes as a request to allocate the specified amount of space but avoid the need to actually zero the space (i.e. track that the extent is in-use but flag it as being zeroed/uninitialized).
I am working on a solution for this in which BlueStore avoids writing bufferlists that consist entirely of zeroes. See this PR, which is still a work in progress, for more details: https://github.com/ceph/ceph/pull/43337
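As a rough illustration of skipping zero-filled buffers on the write path, consider the sketch below. It does not reflect the code in the PR: `is_all_zero`, `submit_write`, and the dict-based `store` are hypothetical, and the model assumes that any range missing from the store reads back as zeroes.

```python
def is_all_zero(buf: bytes) -> bool:
    """Return True if every byte in buf is zero (an empty buffer counts as zero)."""
    return buf.count(0) == len(buf)


def submit_write(store: dict, offset: int, buf: bytes) -> bool:
    """Store buf at offset unless it is all zeroes; return True if data was written."""
    if is_all_zero(buf):
        # Skip the data write. Drop any earlier data at this offset so the
        # range reads back as zeroes (toy model: missing key == zeroes).
        store.pop(offset, None)
        return False
    store[offset] = bytes(buf)
    return True
```

The point of the `store.pop` call is that skipping a zero write is only safe if older non-zero data in the same range stops being visible; a real object store would have to guarantee the same thing in its own metadata.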