Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
Description of problem:
The new concurrent dedupe handling (VDOSTORY-190) doesn't have separate VDO statistics for deduplication of concurrent writes, so it's no longer possible to correctly determine the total cumulative dedupe VDO has found in non-zero data.
Version-Release number of selected component (if applicable):
kmod-kvdo-6.1.0.124-11.el7.x86_64
How reproducible:
Somewhat timing-dependent, since it relies on concurrent writes of the same data, but with the right dataset it's easy to reproduce.
Steps to Reproduce:
1. Create a VDO device (/dev/dm-1 below)
2. Write many copies of the same block to the device. This is the incantation used
in one of the VDO automated tests:

   fio --minimal --bs=4096 --rw=write \
       --name=generic_job_name --filename=/dev/dm-1 --numjobs=4 --size=10737418240 \
       --thread --norandommap --randrepeat=1 --group_reporting \
       --buffer_pattern=0xDeadBeef --unlink=0 --direct=1 --iodepth=128 \
       --ioengine=libaio --offset=0 --offset_increment=10737418240
3. vdostats /dev/dm-1 --all
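As a sanity check on the reproducer, the expected number of 4 KiB write bios can be derived from the fio parameters above. This is just a sketch of the arithmetic, assuming each job writes --size bytes in --bs-sized chunks:

```shell
#!/bin/sh
# Expected write-bio count from the fio parameters in step 2.
BS=4096                 # --bs
SIZE=10737418240        # --size (10 GiB per job)
NUMJOBS=4               # --numjobs

BLOCKS_PER_JOB=$((SIZE / BS))          # writes issued by one job
TOTAL=$((BLOCKS_PER_JOB * NUMJOBS))    # writes issued by all jobs
echo "$TOTAL"
```

This should agree with the 'bios in' counter reported by vdostats after the run.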
Actual results:
"dedupe advice valid" in the stats will be far smaller than the number of blocks written, even though (on a new volume) "data blocks used" is much smaller than the dataset size.
Expected results:
We need to add a new stat (probably "concurrent hash matches" along with "concurrent hash collisions" for complete coverage of all the cases) that will count the bios that deduped against others in memory.
Additional info:
I tested with kmod-kvdo-6.1.1.99 and vdo-6.1.1.99. I noticed two things using the reproducer:
There is no mention of the new fields in the vdostats manpage.
Testing with kmod-kvdo-6.1.0.176 showed 'dedupe advice valid' values of 405 and 76 across two runs. Doing the same with kmod-kvdo-6.1.1.99 showed 1 in both runs.
With 6.1.1.99 I got 10485760 for 'bios in/out' and 10485759 (10485758 in the second run) for 'concurrent data matches'. 'saving percent' is 99 in all cases.
I missed that these were documented in the manpage. I'll remedy that. Please re-open this so it can be merged back to 7.6.
6.1.0.176 has the changes to concurrent dedupe, but not the stats fix, so I think what you're seeing is exactly the problem the new stats are addressing: if you write many copies of the same block concurrently, they aren't counted in 'dedupe advice valid', nor anywhere else. In that version, if I write the same block over and over, but very slowly (not concurrently), all the dedupe should be accounted for in 'dedupe advice valid'. If I write them very quickly, some might be counted there, but most will not, which is why you're seeing '405' and '76'.
In 6.1.1.84+, if I write the same block over and over, non-concurrently, the dedupe will still be counted in 'dedupe advice valid' (because when each write arrives, we have to go to the UDS index to find the dedupe candidate and validate it). But if another write arrives with the same data while a write for that data is still pending, we don't have to use index advice at all, and it's counted in 'concurrent data matches'. It sounds like that's what you saw in your test.
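With both counters in place, the total dedupe found in non-zero data is the sum of the index-advised and in-memory cases. A hedged sketch of that accounting follows; the sample values are made up, and it assumes vdostats --all prints one 'name : value' pair per line (the real field layout may differ):

```shell
#!/bin/sh
# Sum 'dedupe advice valid' and 'concurrent data matches' from
# vdostats-style output. The sample here-string uses hypothetical
# values; in practice you would feed it: vdostats /dev/dm-1 --all
STATS='dedupe advice valid : 405
concurrent data matches : 10485759
concurrent hash collisions : 0'

TOTAL=$(printf '%s\n' "$STATS" | awk -F' : ' '
    /dedupe advice valid/     { sum += $2 }   # dedupe via UDS index advice
    /concurrent data matches/ { sum += $2 }   # dedupe against in-flight writes
    END { print sum }')
echo "total dedupe found: $TOTAL"
```

'concurrent hash collisions' is deliberately left out of the sum, since those writes did not actually dedupe.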
Tested with vdo-6.1.1.120-3.el7, the new values are explained in manpage now:
concurrent data matches
The number of writes with the same data as another in-flight write.
concurrent hash collisions
The number of writes whose hash collided with an in-flight write.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:3094