Bug 1990135

Summary: Fix data corruption in nbdkit-cow-filter and nbdkit-cache-filter
Product: Red Hat Enterprise Linux 8 Reporter: Richard W.M. Jones <rjones>
Component: nbdkit Assignee: Richard W.M. Jones <rjones>
Status: CLOSED ERRATA QA Contact: mxie <mxie>
Severity: medium Docs Contact:
Priority: high    
Version: 8.5 CC: eblake, juzhou, kkiwi, mxie, rjones, tyan, tzheng, virt-bugs, virt-maint, vwu, xiaodwan, ymankad
Target Milestone: rc Keywords: Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nbdkit-1.24.0-3.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1990134
Clones: 2040775 Environment:
Last Closed: 2022-05-10 13:20:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1990134    
Bug Blocks: 2040775, 2040781    

Description Richard W.M. Jones 2021-08-04 19:57:45 UTC
+++ This bug was initially created as a clone of Bug #1990134 +++

Description of problem:

A data corruption bug has been identified in the following
nbdkit filters:

https://libguestfs.org/nbdkit-cow-filter.1.html
https://libguestfs.org/nbdkit-cache-filter.1.html

It's quite subtle, but it affects modular virt-v2v and was
picked up by Ming Xie during testing.

The link below describes the bug in full, along with a simple
way to reproduce it and the fix:

https://listman.redhat.com/archives/libguestfs/2021-August/msg00044.html

Version-Release number of selected component (if applicable):

All versions of nbdkit since 1.14 up to 1.27.4

How reproducible:

100%

Steps to Reproduce:
1. See link above.

Comment 3 Richard W.M. Jones 2021-08-05 09:31:06 UTC
Upstream fix is:
https://gitlab.com/nbdkit/nbdkit/-/commit/c0b15574647672cb5c48178333acdd07424692ef

Comment 4 Richard W.M. Jones 2021-08-05 11:36:59 UTC
Waiting on RHEL AV 8.6 branch to open before I can push this.

Comment 7 mxie@redhat.com 2021-09-16 10:47:11 UTC
Tested the bug with the builds below:
nbdkit-1.24.0-3.module+el8.6.0+12512+6129c95d.x86_64
libnbd-1.6.0-4.module+el8.6.0+12490+ec3e565c.x86_64
python3-libnbd-1.6.0-4.module+el8.6.0+12490+ec3e565c.x86_64

Steps:
1. # nbdkit --filter=cow data "33 * 100000" --run 'nbdsh -u $uri -c "h.trim(100000, 0)" ; nbdcopy $uri - | hexdump -C'
nbdsh: command line script failed: nbd_trim: server does not support trim operations: Invalid argument
00000000  21 21 21 21 21 21 21 21  21 21 21 21 21 21 21 21  |!!!!!!!!!!!!!!!!|
*
00018000


Result:
   Got unexpected result, bug is not fixed.

Comment 9 Richard W.M. Jones 2021-09-16 11:44:12 UTC
Actually the problem is with the test itself:

  nbdsh: command line script failed: nbd_trim: server does not support trim operations: Invalid argument

Replacing ";" with "&&" shows that the script failed:

  $ nbdkit --filter=cow data "33 * 100000" --run 'nbdsh -u $uri -c "h.trim(100000, 0)" && nbdcopy $uri - | hexdump -C'
  nbdsh: command line script failed: nbd_trim: server does not support trim operations: Invalid argument
  $ echo $?
  1

I believe it's not possible to hit this code path with nbdkit 1.24.
This is because trim is not supported in the cache and cow filters
in 1.24.  zero is supported, but is not reachable because:

  https://gitlab.com/nbdkit/nbdkit/-/blob/7f4f80fd418452ffcc29e3f4d99e7ae90880dbde/filters/cow/cow.c#L415

I still want to keep this patch in 1.24 because it is possible
that future changes on this branch (eg. enabling trim support)
could trigger this bug.
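
The difference between ";" and "&&" in the --run script can be seen with a minimal sketch. It assumes a POSIX "sh" is available; "false" is a stand-in for the failing nbdsh call and "echo copied" is a stand-in for the nbdcopy | hexdump pipeline. Neither stand-in is part of the original test.

```python
import subprocess


def run(script):
    """Run a script with sh and return (exit status, captured stdout)."""
    proc = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return proc.returncode, proc.stdout


# With ";" the second command runs regardless, and its exit status
# replaces the failure of the first command.
status_seq, out_seq = run("false ; echo copied")

# With "&&" the second command is skipped and the failure is reported.
status_and, out_and = run("false && echo copied")

print(status_seq, repr(out_seq))  # status 0, output 'copied\n'
print(status_and, repr(out_and))  # non-zero status, output ''
```

With ";" the hexdump output still appears and the overall status is 0, so a failed trim is easy to miss.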

Comment 10 Richard W.M. Jones 2021-09-16 11:45:23 UTC
It may be best to check in build.log that the patch is applied?
In other words, a simple sanity check only.

Comment 11 John Ferlan 2021-09-29 11:55:07 UTC
Rich - assigning to you directly since we have an ITM/release+ and you've made the change. Feel free to move to eblake if he's doing the downstream work for nbdkit.

Comment 14 mxie@redhat.com 2021-10-12 08:58:29 UTC
Verify the bug with nbdkit-1.24.0-3.module+el8.6.0+12861+13975d62.x86_64

Steps:
1. According to comment 10, check whether the patch 'https://gitlab.com/nbdkit/nbdkit/-/commit/c0b15574647672cb5c48178333acdd07424692ef' is applied in nbdkit-1.24.0-3

# rpmbuild -rp nbdkit-1.24.0-3.module+el8.6.0+12861+13975d62.src.rpm

# cat /root/rpmbuild/BUILD/nbdkit-1.24.0/filters/cache/cache.c |grep memset
      memset (&block[blkoffs], 0, n);
    memset (block, 0, blksize);
      memset (block, 0, count);

# cat /root/rpmbuild/BUILD/nbdkit-1.24.0/filters/cow/cow.c |grep memset
      memset (&block[blkoffs], 0, n);
    memset (block, 0, BLKSIZE);
      memset (block, 0, count);


Hi Richard,

    I found that the code changes in https://gitlab.com/nbdkit/nbdkit/-/commit/c0b15574647672cb5c48178333acdd07424692ef differ from those in https://gitlab.com/nbdkit/nbdkit/-/commit/a0ae7b2158598ce48ac31706319007f716d01c87. Which commit is the expected fix for this bug?

Checking the code changes for the bug in nbdkit-1.28.0-1.el9.src.rpm, the result for cow.c differs from nbdkit-1.24.0-3:

# cat /root/rpmbuild/BUILD/nbdkit-1.28.0/filters/cow/cow.c |grep memset
      memset (&block[blkoffs], 0, n);
    memset (block, 0, blksize);
      memset (block, 0, count);
      memset (&block[blkoffs], 0, n);
      memset (block, 0, count);

Comment 15 Richard W.M. Jones 2021-10-12 12:07:21 UTC
The reason for the difference is that the nbdkit 1.24 cow/cache
filters did not support the TRIM operation.  So the whole function
(cow_trim, cache_trim) is missing in 1.24.  This leads to fewer
calls to memset (only those in cow_zero/cache_zero).  In addition
the *_zero functions call memset three times because they actually
zero the head, main body of data, and tail.  But the *_trim
functions call memset twice because they only zero the head and
tail (there's a trim-specific operation for the main body of data).

Anyway, these match the patched files, so the verification is
correct, thanks.
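
The head / main body / tail split described above can be sketched as follows. This is an illustration only, not the nbdkit code: split_range is a hypothetical helper, and the 4096-byte block size is an assumption chosen for the example.

```python
def split_range(offset, count, blksize):
    """Split [offset, offset + count) into an unaligned head, a run of
    whole blocks, and an unaligned tail.

    Returns (head, body, tail); each is an (offset, length) tuple, or
    None when that part is empty.
    """
    head = body = tail = None

    # Unaligned head: from offset up to the next block boundary.
    blkoffs = offset % blksize
    if blkoffs:
        n = min(blksize - blkoffs, count)
        head = (offset, n)
        offset += n
        count -= n

    # Main body: as many whole, aligned blocks as fit.
    nblocks = count // blksize
    if nblocks:
        body = (offset, nblocks * blksize)
        offset += nblocks * blksize
        count -= nblocks * blksize

    # Unaligned tail: whatever is left, shorter than one block.
    if count:
        tail = (offset, count)

    return head, body, tail


print(split_range(1000, 10000, 4096))
# ((1000, 3096), (4096, 4096), (8192, 2808))
```

In terms of the grep output in comment 14, the head would correspond to memset (&block[blkoffs], 0, n), the main body to the whole-block memset, and the tail to memset (block, 0, count). A trim only needs the first and last of these, because the whole blocks are handled by the trim-specific operation.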

Comment 16 mxie@redhat.com 2021-10-12 15:07:46 UTC
Thanks Richard, moving the bug from ON_QA to VERIFIED according to comment 14 and comment 15.

Comment 17 Yash Mankad 2022-01-14 17:24:08 UTC
*** Bug 2040775 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2022-05-10 13:20:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759