Bug 2041480

Summary: [incremental_backup] Inconsistent block status reply in qemu-nbd
Product: Red Hat Enterprise Linux 8 Reporter: Nir Soffer <nsoffer>
Component: qemu-kvmAssignee: Hanna Czenczek <hreitz>
qemu-kvm sub component: Storage QA Contact: Tingting Mao <timao>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: coli, jinzhao, juzhang, kkiwi, virt-maint
Version: 8.6Keywords: Regression, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-6.2.0-7.module+el8.6.0+14250+cd2c04c3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-05-10 13:25:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2041461    
Bug Blocks:    
Attachments:
Description Flags
Reproducer script none

Description Nir Soffer 2022-01-17 13:12:19 UTC
Created attachment 1851351 [details]
Reproducer script

Description of problem:

When using qemu-nbd --allocation-depth option, and registering "qemu:allocation-depth"
during handshake, and using raw image, NBD_CMD_BLOCK_STATUS returns correct reply in 
the first call, but in the next call for the same byte range is reported as data
(flags=0).

This may break clients (e.g. backup applications) if they issue multiple
NBD_CMD_BLOCK_STATUS commands for the same range. If the client use the second
reply, it may read the byte range and copy it over the newtwork for no reason.
This can cause a severe performance regression for users.

Version-Release number of selected component (if applicable):
qemu-img-6.2.0-2.module+el8.6.0+13738+17338784.x86_64

How reproducible:
Always

Steps to Reproduce:

1. Start qemu-nbd with --allocation-depth option and raw image

$ truncate -s1g empty.raw\

$ qemu-nbd --read-only \
    --persistent \
    --allocation-depth \
    --socket=/tmp/nbd.sock \
    --format=raw \
    empty.raw &

2. Connect to qemu-nbd with nbd client.

I'm using ovirt-imageio nbd client, but this should be reproducible
with libnbd. To install the client use:

# dnf copr enable nsoffer/ovirt-imageio-preview
# dnf install ovirt-imageio-client

$ python3
Python 3.6.8 (default, Oct 15 2021, 10:57:33) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.

Enable debug logs so we can see all nbd commands and replies...

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)

Connect to qemu-nbd:

>>> from ovirt_imageio._internal import nbd
>>> c = nbd.Client(nbd.UnixAddress("/tmp/nbd.sock"))
DEBUG:nbd:Connecting address='/tmp/nbd.sock' export_name='' dirty=False
DEBUG:nbd:Received server flags: 3
DEBUG:nbd:Sending client flags: 1
DEBUG:nbd:Sending option: 8 data: b''
DEBUG:nbd:Received reply magic=3e889045565a9 option=8 type=1 len=0
DEBUG:nbd:Structured reply enabled
DEBUG:nbd:Sending option: 10 data: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x0fbase:allocation\x00\x00\x00\x15qemu:allocation-depth')
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=4 len=19
DEBUG:nbd:Meta context base:allocation is available id=0
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=4 len=25
DEBUG:nbd:Meta context qemu:allocation-depth is available id=1
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=1 len=0
DEBUG:nbd:Sending option: 7 data: bytearray(b'\x00\x00\x00\x00\x00\x00')
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=3 len=14
DEBUG:nbd:Received block size info minimum=1 preferred=4096 maximum=33554432
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=3 len=12
DEBUG:nbd:Received export info size=1073741824 flags=1423
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=1 len=0
DEBUG:nbd:Ready for transmission

Get extents (first call):

>>> from pprint import pprint
>>> pprint(c.extents(0, c.export_size))
DEBUG:nbd:Sending NBD_CMD_BLOCK_STATUS handle=0 offset=0 length=1073741824 flags=0
DEBUG:nbd:Extent length=1073741824 flags=3 context=0
DEBUG:nbd:Extent length=1073741824 flags=1 context=1
{'base:allocation': [Extent(length=1073741824, flags=3)],
 'qemu:allocation-depth': [Extent(length=1073741824, flags=0)]}

We got the flags = 3  (NBD_STATE_HOLE | NBD_STATE_ZERO)

Get extents (second call):

>>> pprint(c.extents(0, c.export_size))
DEBUG:nbd:Sending NBD_CMD_BLOCK_STATUS handle=1 offset=0 length=1073741824 flags=0
DEBUG:nbd:Extent length=1073741824 flags=0 context=0
DEBUG:nbd:Extent length=1073741824 flags=1 context=1
{'base:allocation': [Extent(length=1073741824, flags=0)],
 'qemu:allocation-depth': [Extent(length=1073741824, flags=0)]}

We got the flags = 0  (data)


Actual results:
NBD_CMD_BLOCK_STATUS reply is inconsistent, second call does not match
first call.

Expected results:
Getting same reply in all calls.

Additional info:

This does not happen if we disable --allocation-depth:

$ qemu-nbd --read-only \
    --persistent \
    --socket=/tmp/nbd.sock \
    --format=raw \
    empty.raw &

$ python
>>> from ovirt_imageio._internal import nbd
>>> from pprint import pprint
>>> c = nbd.Client(nbd.UnixAddress("/tmp/nbd.sock"))
Failed to activate qemu:allocation-depth
>>> pprint(c.extents(0, c.export_size))
{'base:allocation': [Extent(length=1073741824, flags=3)]}
>>> pprint(c.extents(0, c.export_size))
{'base:allocation': [Extent(length=1073741824, flags=3)]}

Got the expected reply in both calls.


Using git bisect with the attached reproducer script show this commit:

commit 0bc329fbb009f8601cec23bf2bc48ead0c5a5fa2
Author: Hanna Reitz <hreitz>
Date:   Thu Aug 12 10:41:44 2021 +0200

    block: block-status cache for data regions

Upstream discussion:
https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00292.html

Comment 1 Nir Soffer 2022-01-17 13:13:46 UTC
This is mostly a clone of bug 2041461 (rhel 9).

Comment 2 Nir Soffer 2022-01-17 16:13:03 UTC
I tried to reproduce this with RHV API but I could not reproduce.

In RHV we do not support yet getting partial extents, only getting
all extents for entire image. Practically in RHV we always do
something like this (assuming 6g image):

    extents = []
    for offset in range(0, 6*1024**3, 2*1024**3):
        r = c1.extents(offset, 2*1024**3)
        extents.extend(r["base:allocation"])

Repeating this again return the same results on every call.

Example usage with RHV API:

$ ./image_transfer.py -c engine-dev download eaaf6789-90d9-4ad0-ab54-04e16fa7570a
[   0.0 ] Connecting to engine...
[   0.0 ] Looking up disk eaaf6789-90d9-4ad0-ab54-04e16fa7570a
[   0.1 ] Creating image transfer for download
[   1.4 ] Transfer ID: 30ec7761-64d7-4db7-8073-30cb5474de03
[   1.4 ] Transfer host name: host4
[   1.4 ] Transfer URL: https://host4:54322/images/bd64768a-167d-4172-b37d-bef9ce075f77
[   1.4 ] Proxy URL: https://engine-dev:54323/images/bd64768a-167d-4172-b37d-bef9ce075f77
[   1.4 ] Conneted to imageio server

Getting extents for entire image:

>>> import requests
>>> s = requests.Session()
>>> s.verify = False
>>> first = s.get("https://host4:54322/images/bd64768a-167d-4172-b37d-bef9ce075f77/extents").json()
>>> second = s.get("https://host4:54322/images/bd64768a-167d-4172-b37d-bef9ce075f77/extents").json()
>>> first == second
True

Comment 7 Tingting Mao 2022-02-16 07:42:29 UTC
Reproduced in qemu-kvm-6.2.0-6.module+el8.6.0+14165+5e5e76ac:

Steps:
# qemu-img create -f raw empty.raw 1g

# qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  allocated
      4096  1073737728    3  hole,zero

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  allocated

Results:
As above, after latest step, the dumped info is changed to all allocated data which is different from the results of the 3rd command. 



Verified in qemu-kvm-6.2.0-7.module+el8.6.0+14250+cd2c04c3:

Steps:
# qemu-img create -f raw empty.raw 1g

# qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  allocated
      4096  1073737728    3  hole,zero

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  allocated
      4096  1073737728    3  hole,zero

Results:
As above, after latest step, the dumped info is still the same as the result of 3rd command.

Comment 8 Yanan Fu 2022-02-22 09:12:55 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 10 errata-xmlrpc 2022-05-10 13:25:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759