Bug 2029751

Summary: nbdkit-cow-filter does not allow cow-block-size=4096
Product: Red Hat Enterprise Linux 9 Reporter: Richard W.M. Jones <rjones>
Component: nbdkit Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED ERRATA QA Contact: mxie <mxie>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.0 CC: eblake, lersek, mxie, rjones, tyan, tzheng, virt-maint, vwu, xiaodwan
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nbdkit-1.28.3-1.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-05-17 12:50:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2011709    
Bug Blocks: 2011713    
Deadline: 2022-02-14   

Description Richard W.M. Jones 2021-12-07 08:40:22 UTC
Description of problem:

Because of a bug in a boundary check, nbdkit-cow-filter does not
allow the valid setting cow-block-size=4096.  This affects virt-v2v
because to get better performance we'd like to use this block size.

Version-Release number of selected component (if applicable):

nbdkit-1.28.2-2.el9.x86_64

How reproducible:

100%

Steps to Reproduce:

The cow-block-size parameter is available:

$ nbdkit --filter=cow null --help | grep block-size
cow-block-size=<N>       Set COW block size.

If you try to set it to 4096, it gives an error:

$ nbdkit --filter=cow null cow-block-size=4096 --run true
nbdkit: error: cow-block-size is out of range (4096..2G) or not a power of 2

If you set it to another power of 2 value, it's OK:

$ nbdkit --filter=cow null cow-block-size=8192 --run true
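
The error message implies the intended rule: the value must be a power of 2 in the range 4096..2G. The following is a minimal shell sketch of that rule, for illustration only. It is not the filter's actual C code, the function name valid_cow_block_size is hypothetical, and treating both ends of the range as inclusive is an assumption based on the message text. The point is that the lower bound has to be inclusive for 4096 to pass.

```shell
# Hypothetical helper, not part of nbdkit: returns 0 if $1 is a power
# of 2 in the range 4096..2G (both ends assumed inclusive).
valid_cow_block_size() {
    n=$1
    min=4096
    max=2147483648          # 2G
    # Inclusive lower bound: 4096 itself must be accepted.
    [ "$n" -ge "$min" ] && [ "$n" -le "$max" ] || return 1
    # A power of 2 has exactly one bit set, so n & (n-1) is 0.
    [ $(( n & (n - 1) )) -eq 0 ]
}

for size in 0 4096 8192 8196; do
    if valid_cow_block_size "$size"; then
        echo "$size: ok"
    else
        echo "$size: rejected"
    fi
done
```

With an exclusive lower bound (-gt instead of -ge) the same helper would reject 4096 while still accepting 8192, which matches the behaviour reported here.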

Comment 2 Richard W.M. Jones 2021-12-07 08:44:52 UTC
Add bug 2011709 because we'll probably pick this fix up in a rebase.

Comment 3 mxie@redhat.com 2021-12-17 04:49:10 UTC
Test the bug with nbdkit-1.28.3-2.el9.x86_64

Steps:
1. Check the man page of nbdkit-cow-filter
# man nbdkit-cow-filter  |grep cow-block-size -A 3
                                   [cow-block-size=N]
                                   [cow-on-cache=false|true]
                                   [cow-on-read=false|true|/PATH]

--
       cow-block-size=N
           Set the block size used by the filter.  This has to be a power of two and the minimum block size is 4K.
           The maximum block size depends on the plugin, but a block size larger than a few megabytes is not
           usually a good idea.

2. # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=0
....
nbdkit: error: cow-block-size is out of range (4096..2G) or not a power of 2
....

3. Convert a guest from VMware with nbdkit, adding cow-block-size=4096 or 4k to the command line

3.1 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=4096 
.....
real	6m53.898s
user	0m12.463s
sys	1m40.819s


3.2 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=4k
....
real	7m34.770s
user	0m10.748s
sys	1m26.901s


4. Convert a guest from VMware with nbdkit, adding cow-block-size=8192 or 8K to the command line
4.1 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=8192
.....
real	7m37.623s
user	0m11.539s
sys	1m32.833s

4.2 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=8K
....
real	7m20.561s
user	0m10.992s
sys	1m28.543s


Hi rjones,
   
  It's strange that converting with cow-block-size=4k takes slightly longer than with cow-block-size=4096. I get the same result after trying many times. Is this normal?

Comment 6 Richard W.M. Jones 2021-12-17 10:09:28 UTC
It's pretty strange and I don't understand it.  However the block size is
definitely set to 4096 in both cases.  In the logs you can see the cow
filter getting a 512 byte request from the client, and making a 4096 byte request
to the underlying plugin (in both logs):

nbdkit: vddk.2: debug: cow: pread count=512 offset=0
nbdkit: vddk.2: debug: vddk: pread count=4096 offset=0

If this is consistently reproducible, I can take a closer look when I get back from holiday.
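
The two log lines above are consistent with the filter widening each client request to whole blocks. A rough sketch of that arithmetic in shell, assuming plain round-down / round-up alignment (the function name align_request is hypothetical, and the real filter's logic may differ in detail):

```shell
# Hypothetical illustration: widen a request (offset, count) to whole
# blocks of size $3. Prints "aligned_offset aligned_count".
align_request() {
    offset=$1 count=$2 blksize=$3
    start=$(( offset / blksize * blksize ))                         # round down
    end=$(( (offset + count + blksize - 1) / blksize * blksize ))   # round up
    echo "$start $(( end - start ))"
}

align_request 0 512 4096     # the 512 byte request seen in the logs
```

For the request in the logs (count=512 offset=0) this prints "0 4096", i.e. a single 4096 byte read from the plugin, whether the block size was given as 4096 or 4k.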

Comment 9 mxie@redhat.com 2022-01-20 03:52:30 UTC
Verify the bug with nbdkit-1.28.4-2.el9.x86_64

Steps:
1. Check the man page of nbdkit-cow-filter
# man nbdkit-cow-filter  |grep cow-block-size -A 3
                                   [cow-block-size=N]
                                   [cow-on-cache=false|true]
                                   [cow-on-read=false|true|/PATH]

--
       cow-block-size=N
           Set the block size used by the filter.  This has to be a power of two and the minimum block size is 4K.
           The maximum block size depends on the plugin, but a block size larger than a few megabytes is not
           usually a good idea.

2. # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=0
....
nbdkit: error: cow-block-size is out of range (4096..2G) or not a power of 2
....

3. Convert a guest from VMware with nbdkit, adding cow-block-size=4096, 4K or 4k to the command line

3.1 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=4096 
.....
real	6m44.493s
user	0m12.028s
sys	1m33.559s


3.2 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=4K
....
real	7m19.641s
user	0m12.294s
sys	1m33.509s

3.3 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=4k
....
real	7m21.094s
user	0m12.076s
sys	1m32.004s


4. Convert a guest from VMware with nbdkit, adding cow-block-size=8192 or 8K to the command line
4.1 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=8192
.....
real	7m37.161s
user	0m11.950s
sys	1m32.963s

4.2 # time nbdkit -rfv -U - --exportname --filter=retry  vddk server=10.73.198.169 user=root password=+/home/passwd vm=moref=vm-6300  file='[esx7.0-matrix] esx7.0-win11-x86_64/esx7.0-win11-x86_64.vmdk' libdir=/home/vddk7.0.2 thumbprint='B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78' transports=nbdssl  compression=skipz --run 'qemu-img convert $nbd /home/esx7.0-win11-x86_64' --filter=cacheextents --filter=delay rdelay=20ms --filter=cow cow-on-read=true cow-block-size=8K
....
real	7m38.320s
user	0m11.771s
sys	1m33.122s


Hi rjones,
   
  I can still reproduce the problem: cow-block-size=4k or 4K takes slightly longer to convert than cow-block-size=4096. Do you think this deserves our attention?

Comment 10 Richard W.M. Jones 2022-01-20 08:47:30 UTC
I think just to verify this bug on its own, all that is needed is:

  $ nbdkit --filter=cow null cow-block-size=4096 --run true

(which should exit with no error).  Previously trying to set this gave
an error.
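
If a slightly broader check is wanted, the same command can be wrapped in a small loop over a few values. This is only a sketch: the function name accepts_block_size and the NBDKIT override variable are assumptions introduced here, not part of nbdkit.

```shell
# Hypothetical wrapper: exits 0 if nbdkit accepts the given cow-block-size.
# NBDKIT may be set to use a different nbdkit binary (assumption).
accepts_block_size() {
    "${NBDKIT:-nbdkit}" --filter=cow null cow-block-size="$1" --run true
}

for size in 4096 4k 8192; do
    if accepts_block_size "$size"; then
        echo "cow-block-size=$size accepted"
    else
        echo "cow-block-size=$size REJECTED"
    fi
done
```

On a fixed build all three values should be accepted; on the affected build 4096 and 4k would be expected to fail with the out-of-range error.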

The performance benefits of cow-block-size=4096 are difficult to measure
in the way you're doing for a few reasons:

 - The actual performance benefit was with fstrim (not copying).  It's very
   difficult to test fstrim on its own; it can only really be measured as
   part of the whole virt-v2v process.  See the commit message for more details:
   https://github.com/libguestfs/virt-v2v/commit/351d61f768c51287917a04b9fbedf24d79f5deb4

 - Virt-v2v now uses nbdcopy, not qemu-img convert.

I would leave performance testing in the bug / email thread that we already
have and not try to do it here.

Comment 11 mxie@redhat.com 2022-01-20 09:36:24 UTC
# nbdkit --filter=cow null cow-block-size=4096 --run true
Result: exits with no error

Moving the bug from ON_QA to VERIFIED according to comment 9 and comment 10.

Comment 13 errata-xmlrpc 2022-05-17 12:50:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: nbdkit), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:2408