Bug 1924129

Summary:	[RFE] write-same operation should efficiently allocate zeroed extents
Product:	[Red Hat Storage] Red Hat Ceph Storage	Reporter:	Jason Dillaman <jdillama>
Component:	RADOS	Assignee:	Neha Ojha <nojha>
Status:	NEW ---	QA Contact:	Manohar Murthy <mmurthy>
Severity:	high	Docs Contact:
Priority:	high
Version:	5.0	CC:	akupczyk, bhubbard, ceph-eng-bugs, danken, idryomov, ndevos, nojha, pdhiran, rzarzyns, sseshasa, vereddy, vumrao
Target Milestone:	---	Keywords:	FutureFeature, Performance
Target Release:	9.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:		Type:	Bug

Description Jason Dillaman 2021-02-02 16:42:49 UTC

Description of problem:
RBD thick-provisions images using the RADOS write-same operation: each op transfers a small zeroed buffer, with the write-same length set to the maximum RBD object size (4 MiB by default).
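
For illustration, the write-same semantics described above can be modelled in a few lines of Python. This is only a sketch of the behaviour, not librados or RBD code: the helper name expand_write_same, the 512-byte pattern size, and the rule that the write length must be a multiple of the pattern length are assumptions made for this example.

```python
def expand_write_same(pattern: bytes, write_len: int) -> bytes:
    """Return the bytes that a write-same of `pattern` over `write_len` bytes stores.

    Hypothetical helper for illustration; it models the semantics only.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    # Assumption for this sketch: the total length is a whole number of patterns.
    if write_len % len(pattern) != 0:
        raise ValueError("write_len must be a multiple of the pattern length")
    return pattern * (write_len // len(pattern))


OBJECT_SIZE = 4 * 1024 * 1024  # default RBD object size (4 MiB)
ZERO_PATTERN = bytes(512)      # assumed small zeroed buffer sent with the op

# Only ZERO_PATTERN travels with the op; the OSD expands it to OBJECT_SIZE bytes.
payload = expand_write_same(ZERO_PATTERN, OBJECT_SIZE)
```

The point of the sketch is the asymmetry: 512 bytes cross the network, but 4 MiB of zeroes are still written to the backing store per object.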

There is a desire to optimize the Ceph cluster IO impact for the thick-provisioned case by having BlueStore treat a write-same of zeroes as a request to allocate the specified amount of space but avoid the need to actually zero the space (i.e. track that the extent is in-use but flag it as being zeroed/uninitialized). 
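
A toy model of the requested behaviour, as a sketch only: the class ToyObject, its fields, and its methods are hypothetical and do not correspond to BlueStore data structures. It assumes non-overlapping writes and treats an extent whose data is None as "allocated, reads back as zeroes".

```python
from dataclasses import dataclass, field


@dataclass
class ToyObject:
    """Hypothetical model of one object's extents; not BlueStore code."""
    # offset -> (length, data). data is None for "allocated but flagged as zeroed".
    extents: dict = field(default_factory=dict)
    bytes_written: int = 0  # payload bytes that actually reached the "disk"

    def write_same(self, offset: int, pattern: bytes, write_len: int) -> None:
        if not pattern or write_len % len(pattern) != 0:
            raise ValueError("write_len must be a non-zero multiple of the pattern")
        if pattern.count(0) == len(pattern):
            # All-zero pattern: reserve the space and flag it, write nothing.
            self.extents[offset] = (write_len, None)
        else:
            data = pattern * (write_len // len(pattern))
            self.extents[offset] = (write_len, data)
            self.bytes_written += write_len

    def allocated(self) -> int:
        return sum(length for length, _ in self.extents.values())

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray(length)  # holes and flagged extents read as zeroes
        for start, (ext_len, data) in self.extents.items():
            if data is None:
                continue
            lo, hi = max(start, offset), min(start + ext_len, offset + length)
            if lo < hi:
                out[lo - offset:hi - offset] = data[lo - start:hi - start]
        return bytes(out)


obj = ToyObject()
obj.write_same(0, bytes(512), 4 * 1024 * 1024)
```

After the call, the object accounts for 4 MiB of allocated space while bytes_written stays at 0, which is the thick-provisioning outcome the request describes.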

In the future, CephFS could also use write-same in its "fallocate" handler (which currently seems to support only punch-hole).

Version-Release number of selected component (if applicable):
5.0

Comment 1 RHEL Program Management 2021-02-02 16:42:56 UTC

Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 7 Laura Flores 2021-10-25 19:30:16 UTC

> There is a desire to optimize the Ceph cluster IO impact for the thick-provisioned case by having BlueStore treat a write-same of zeroes as a request to allocate the specified amount of space but avoid the need to actually zero the space (i.e. track that the extent is in-use but flag it as being zeroed/uninitialized). 

I am working on a solution for this in which BlueStore avoids writing bufferlists that contain zeroes. See this PR, which is still a work in progress, for more details: https://github.com/ceph/ceph/pull/43337
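
As a rough sketch of the general idea of skipping zeroes on the write path (not taken from the PR, whose actual approach may differ), a buffer can be split into runs of blocks that contain at least one non-zero byte, so that only those runs need to be written. The function name nonzero_runs and the 4096-byte block size are assumptions for this example.

```python
def nonzero_runs(buf: bytes, block_size: int = 4096) -> list:
    """Return [(offset, data)] for maximal runs of blocks holding a non-zero byte.

    Hypothetical helper: all-zero blocks are dropped from the result.
    """
    runs = []
    run_start = None
    for off in range(0, len(buf), block_size):
        block = buf[off:off + block_size]
        if any(block):
            # Block has data: start a run if one is not already open.
            if run_start is None:
                run_start = off
        elif run_start is not None:
            # Zero block closes the current run.
            runs.append((run_start, buf[run_start:off]))
            run_start = None
    if run_start is not None:
        runs.append((run_start, buf[run_start:]))
    return runs


sample = bytes(4096) + b"\x01" * 4096 + bytes(8192) + b"\x02" * 100
to_write = nonzero_runs(sample)
```

For a fully zeroed write-same payload the result is an empty list, so nothing would need to be written at all.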