RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2041480 - [incremental_backup] Inconsistent block status reply in qemu-nbd
Summary: [incremental_backup] Inconsistent block status reply in qemu-nbd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.6
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: ---
Assignee: Hanna Czenczek
QA Contact: Tingting Mao
URL:
Whiteboard:
Depends On: 2041461
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-01-17 13:12 UTC by Nir Soffer
Modified: 2022-05-10 13:41 UTC (History)
5 users (show)

Fixed In Version: qemu-kvm-6.2.0-7.module+el8.6.0+14250+cd2c04c3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-10 13:25:20 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Reproducer script (628 bytes, text/plain)
2022-01-17 13:12 UTC, Nir Soffer
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Gitlab redhat/rhel/src/qemu-kvm qemu-kvm merge_requests 112 0 None None None 2022-02-08 13:39:01 UTC
Red Hat Issue Tracker RHELPLAN-108317 0 None None None 2022-01-17 13:15:12 UTC
Red Hat Product Errata RHSA-2022:1759 0 None None None 2022-05-10 13:25:59 UTC

Description Nir Soffer 2022-01-17 13:12:19 UTC
Created attachment 1851351 [details]
Reproducer script

Description of problem:

When using qemu-nbd --allocation-depth option, and registering "qemu:allocation-depth"
during handshake, and using raw image, NBD_CMD_BLOCK_STATUS returns correct reply in 
the first call, but in the next call for the same byte range is reported as data
(flags=0).

This may break clients (e.g. backup applications) if they issue multiple
NBD_CMD_BLOCK_STATUS commands for the same range. If the client use the second
reply, it may read the byte range and copy it over the newtwork for no reason.
This can cause a severe performance regression for users.

Version-Release number of selected component (if applicable):
qemu-img-6.2.0-2.module+el8.6.0+13738+17338784.x86_64

How reproducible:
Always

Steps to Reproduce:

1. Start qemu-nbd with --allocation-depth option and raw image

$ truncate -s1g empty.raw\

$ qemu-nbd --read-only \
    --persistent \
    --allocation-depth \
    --socket=/tmp/nbd.sock \
    --format=raw \
    empty.raw &

2. Connect to qemu-nbd with nbd client.

I'm using ovirt-imageio nbd client, but this should be reproducible
with libnbd. To install the client use:

# dnf copr enable nsoffer/ovirt-imageio-preview
# dnf install ovirt-imageio-client

$ python3
Python 3.6.8 (default, Oct 15 2021, 10:57:33) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.

Enable debug logs so we can see all nbd commands and replies...

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)

Connect to qemu-nbd:

>>> from ovirt_imageio._internal import nbd
>>> c = nbd.Client(nbd.UnixAddress("/tmp/nbd.sock"))
DEBUG:nbd:Connecting address='/tmp/nbd.sock' export_name='' dirty=False
DEBUG:nbd:Received server flags: 3
DEBUG:nbd:Sending client flags: 1
DEBUG:nbd:Sending option: 8 data: b''
DEBUG:nbd:Received reply magic=3e889045565a9 option=8 type=1 len=0
DEBUG:nbd:Structured reply enabled
DEBUG:nbd:Sending option: 10 data: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x0fbase:allocation\x00\x00\x00\x15qemu:allocation-depth')
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=4 len=19
DEBUG:nbd:Meta context base:allocation is available id=0
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=4 len=25
DEBUG:nbd:Meta context qemu:allocation-depth is available id=1
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=1 len=0
DEBUG:nbd:Sending option: 7 data: bytearray(b'\x00\x00\x00\x00\x00\x00')
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=3 len=14
DEBUG:nbd:Received block size info minimum=1 preferred=4096 maximum=33554432
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=3 len=12
DEBUG:nbd:Received export info size=1073741824 flags=1423
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=1 len=0
DEBUG:nbd:Ready for transmission

Get extents (first call):

>>> from pprint import pprint
>>> pprint(c.extents(0, c.export_size))
DEBUG:nbd:Sending NBD_CMD_BLOCK_STATUS handle=0 offset=0 length=1073741824 flags=0
DEBUG:nbd:Extent length=1073741824 flags=3 context=0
DEBUG:nbd:Extent length=1073741824 flags=1 context=1
{'base:allocation': [Extent(length=1073741824, flags=3)],
 'qemu:allocation-depth': [Extent(length=1073741824, flags=0)]}

We got the flags = 3  (NBD_STATE_HOLE | NBD_STATE_ZERO)

Get extents (second call):

>>> pprint(c.extents(0, c.export_size))
DEBUG:nbd:Sending NBD_CMD_BLOCK_STATUS handle=1 offset=0 length=1073741824 flags=0
DEBUG:nbd:Extent length=1073741824 flags=0 context=0
DEBUG:nbd:Extent length=1073741824 flags=1 context=1
{'base:allocation': [Extent(length=1073741824, flags=0)],
 'qemu:allocation-depth': [Extent(length=1073741824, flags=0)]}

We got the flags = 0  (data)


Actual results:
NBD_CMD_BLOCK_STATUS reply is inconsistent, second call does not match
first call.

Expected results:
Getting same reply in all calls.

Additional info:

This does not happen if we disable --allocation-depth:

$ qemu-nbd --read-only \
    --persistent \
    --socket=/tmp/nbd.sock \
    --format=raw \
    empty.raw &

$ python
>>> from ovirt_imageio._internal import nbd
>>> from pprint import pprint
>>> c = nbd.Client(nbd.UnixAddress("/tmp/nbd.sock"))
Failed to activate qemu:allocation-depth
>>> pprint(c.extents(0, c.export_size))
{'base:allocation': [Extent(length=1073741824, flags=3)]}
>>> pprint(c.extents(0, c.export_size))
{'base:allocation': [Extent(length=1073741824, flags=3)]}

Got the expected reply in both calls.


Using git bisect with the attached reproducer script show this commit:

commit 0bc329fbb009f8601cec23bf2bc48ead0c5a5fa2
Author: Hanna Reitz <hreitz>
Date:   Thu Aug 12 10:41:44 2021 +0200

    block: block-status cache for data regions

Upstream discussion:
https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00292.html

Comment 1 Nir Soffer 2022-01-17 13:13:46 UTC
This is mostly a clone of bug 2041461 (rhel 9).

Comment 2 Nir Soffer 2022-01-17 16:13:03 UTC
I tried to reproduce this with RHV API but I could not reproduce.

In RHV we do not support yet getting partial extents, only getting
all extents for entire image. Practically in RHV we always do
something like this (assuming 6g image):

    extents = []
    for offset in range(0, 6*1024**3, 2*1024**3):
        r = c1.extents(offset, 2*1024**3)
        extents.extend(r["base:allocation"])

Repeating this again return the same results on every call.

Example usage with RHV API:

$ ./image_transfer.py -c engine-dev download eaaf6789-90d9-4ad0-ab54-04e16fa7570a
[   0.0 ] Connecting to engine...
[   0.0 ] Looking up disk eaaf6789-90d9-4ad0-ab54-04e16fa7570a
[   0.1 ] Creating image transfer for download
[   1.4 ] Transfer ID: 30ec7761-64d7-4db7-8073-30cb5474de03
[   1.4 ] Transfer host name: host4
[   1.4 ] Transfer URL: https://host4:54322/images/bd64768a-167d-4172-b37d-bef9ce075f77
[   1.4 ] Proxy URL: https://engine-dev:54323/images/bd64768a-167d-4172-b37d-bef9ce075f77
[   1.4 ] Conneted to imageio server

Getting extents for entire image:

>>> import requests
>>> s = requests.Session()
>>> s.verify = False
>>> first = s.get("https://host4:54322/images/bd64768a-167d-4172-b37d-bef9ce075f77/extents").json()
>>> second = s.get("https://host4:54322/images/bd64768a-167d-4172-b37d-bef9ce075f77/extents").json()
>>> first == second
True

Comment 7 Tingting Mao 2022-02-16 07:42:29 UTC
Reproduced in qemu-kvm-6.2.0-6.module+el8.6.0+14165+5e5e76ac:

Steps:
# qemu-img create -f raw empty.raw 1g

# qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  allocated
      4096  1073737728    3  hole,zero

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  allocated

Results:
As above, after latest step, the dumped info is changed to all allocated data which is different from the results of the 3rd command. 



Verified in qemu-kvm-6.2.0-7.module+el8.6.0+14250+cd2c04c3:

Steps:
# qemu-img create -f raw empty.raw 1g

# qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  allocated
      4096  1073737728    3  hole,zero

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  allocated
      4096  1073737728    3  hole,zero

Results:
As above, after latest step, the dumped info is still the same as the result of 3rd command.

Comment 8 Yanan Fu 2022-02-22 09:12:55 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 10 errata-xmlrpc 2022-05-10 13:25:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759


Note You need to log in before you can comment on or make changes to this bug.