Bug 2041461

Summary: Inconsistent block status reply in qemu-nbd
Product: Red Hat Enterprise Linux 9
Reporter: Nir Soffer <nsoffer>
Component: qemu-kvm
Assignee: Hanna Czenczek <hreitz>
qemu-kvm sub component: Storage
QA Contact: Tingting Mao <timao>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: bstinson, coli, hreitz, jinzhao, juzhang, jwboyer, kkiwi, timao, virt-maint, ymankad
Version: CentOS Stream
Keywords: Regression, Triaged
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-6.2.0-8.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-05-17 12:25:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2041480    
Attachments: Reproducer script

Description Nir Soffer 2022-01-17 11:57:45 UTC
Created attachment 1851337 [details]
Reproducer script

Description of problem:

When qemu-nbd is started with the --allocation-depth option, the client registers
"qemu:allocation-depth" during the handshake, and the export is a raw image,
NBD_CMD_BLOCK_STATUS returns a correct reply on the first call, but on the next
call the same byte range is reported as data (flags=0).

This may break clients that issue multiple NBD_CMD_BLOCK_STATUS commands for the same
range. If the client uses the second reply, it may read the byte range and copy it over
the network for no reason. This can cause a severe performance regression for users.
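To make the cost concrete, here is a minimal, hypothetical sketch (not the ovirt-imageio API) of a copying client that skips reads for extents flagged as zero; a zero range misreported as data (flags=0) gets read and copied in full:

```python
# Hypothetical copying client: counts the bytes a client would transfer
# for a list of (length, flags) extents, assuming it skips reads for
# extents the server flags as reading zero.
NBD_STATE_HOLE = 1 << 0  # extent is a hole
NBD_STATE_ZERO = 1 << 1  # extent reads as zeros

def bytes_copied(extents):
    """Bytes transferred over the network for the given extents."""
    return sum(length for length, flags in extents
               if not flags & NBD_STATE_ZERO)

size = 1024 ** 3  # 1 GiB image from the reproducer
print(bytes_copied([(size, 3)]))  # first reply (hole,zero): 0
print(bytes_copied([(size, 0)]))  # second reply (data): 1073741824
```

With the correct first reply the client transfers nothing; trusting the second reply, it copies the entire 1 GiB of an image that is all zeroes.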

Version-Release number of selected component (if applicable):
qemu-img-17:6.2.0-3.el9.x86_64

How reproducible:
Always

Steps to Reproduce:

1. Start qemu-nbd with --allocation-depth option and raw image

$ truncate -s1g empty.raw

$ qemu-nbd --read-only \
    --persistent \
    --allocation-depth \
    --socket=/tmp/nbd.sock \
    --format=raw \
    empty.raw &

2. Connect to qemu-nbd with nbd client.

I'm using the ovirt-imageio nbd client, but this should be reproducible
with libnbd. To install the client, use:

# dnf copr enable nsoffer/ovirt-imageio-preview
# dnf install ovirt-imageio-client

$ python
Python 3.9.9 (main, Nov 22 2021, 00:00:00) 
[GCC 11.2.1 20211019 (Red Hat 11.2.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.

Enable debug logs so we can see all nbd commands and replies...

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)

Connect to qemu-nbd:

>>> from ovirt_imageio._internal import nbd
>>> c = nbd.Client(nbd.UnixAddress("/tmp/nbd.sock"))
DEBUG:nbd:Connecting address='/tmp/nbd.sock' export_name='' dirty=False
DEBUG:nbd:Received server flags: 3
DEBUG:nbd:Sending client flags: 1
DEBUG:nbd:Sending option: 8 data: b''
DEBUG:nbd:Received reply magic=3e889045565a9 option=8 type=1 len=0
DEBUG:nbd:Structured reply enabled
DEBUG:nbd:Sending option: 10 data: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x0fbase:allocation\x00\x00\x00\x15qemu:allocation-depth')
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=4 len=19
DEBUG:nbd:Meta context base:allocation is available id=0
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=4 len=25
DEBUG:nbd:Meta context qemu:allocation-depth is available id=1
DEBUG:nbd:Received reply magic=3e889045565a9 option=10 type=1 len=0
DEBUG:nbd:Sending option: 7 data: bytearray(b'\x00\x00\x00\x00\x00\x00')
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=3 len=14
DEBUG:nbd:Received block size info minimum=1 preferred=4096 maximum=33554432
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=3 len=12
DEBUG:nbd:Received export info size=1073741824 flags=1423
DEBUG:nbd:Received reply magic=3e889045565a9 option=7 type=1 len=0
DEBUG:nbd:Ready for transmission
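For reference, the option 10 payload logged above is NBD_OPT_SET_META_CONTEXT. A short sketch of how those bytes are laid out, assuming the standard NBD wire format (big-endian integers: export-name length, export name, query count, then a length-prefixed string per query):

```python
import struct

def set_meta_context_payload(export_name, queries):
    """Build an NBD_OPT_SET_META_CONTEXT option payload:
    u32 export-name length, export name, u32 query count,
    then u32 length + bytes for each query string."""
    name = export_name.encode()
    data = struct.pack(">I", len(name)) + name
    data += struct.pack(">I", len(queries))
    for q in queries:
        qb = q.encode()
        data += struct.pack(">I", len(qb)) + qb
    return data

# Reproduces the bytearray logged above: empty export name, 2 queries.
payload = set_meta_context_payload(
    "", ["base:allocation", "qemu:allocation-depth"])
```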

Get extents (first call):

>>> from pprint import pprint
>>> pprint(c.extents(0, c.export_size))
DEBUG:nbd:Sending NBD_CMD_BLOCK_STATUS handle=0 offset=0 length=1073741824 flags=0
DEBUG:nbd:Extent length=1073741824 flags=3 context=0
DEBUG:nbd:Extent length=1073741824 flags=1 context=1
{'base:allocation': [Extent(length=1073741824, flags=3)],
 'qemu:allocation-depth': [Extent(length=1073741824, flags=0)]}

We got flags = 3 (NBD_STATE_HOLE | NBD_STATE_ZERO)

Get extents (second call):

>>> pprint(c.extents(0, c.export_size))
DEBUG:nbd:Sending NBD_CMD_BLOCK_STATUS handle=1 offset=0 length=1073741824 flags=0
DEBUG:nbd:Extent length=1073741824 flags=0 context=0
DEBUG:nbd:Extent length=1073741824 flags=1 context=1
{'base:allocation': [Extent(length=1073741824, flags=0)],
 'qemu:allocation-depth': [Extent(length=1073741824, flags=0)]}

We got flags = 0 (data)
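The base:allocation flag bits can be decoded with a small illustrative helper that mirrors nbdinfo's labels:

```python
# Decode base:allocation extent flags into nbdinfo-style labels.
NBD_STATE_HOLE = 1 << 0  # 0x1: extent is not allocated
NBD_STATE_ZERO = 1 << 1  # 0x2: extent reads as zeros

def describe(flags):
    names = []
    if flags & NBD_STATE_HOLE:
        names.append("hole")
    if flags & NBD_STATE_ZERO:
        names.append("zero")
    return ",".join(names) or "data"

print(describe(3))  # first reply  -> hole,zero
print(describe(0))  # second reply -> data
```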


Actual results:
NBD_CMD_BLOCK_STATUS reply is inconsistent, second call does not match
first call.

Expected results:
Getting same reply in all calls.

Additional info:

This does not happen if we disable --allocation-depth:

$ qemu-nbd --read-only \
    --persistent \
    --socket=/tmp/nbd.sock \
    --format=raw \
    empty.raw &

$ python
>>> from ovirt_imageio._internal import nbd
>>> from pprint import pprint
>>> c = nbd.Client(nbd.UnixAddress("/tmp/nbd.sock"))
Failed to activate qemu:allocation-depth
>>> pprint(c.extents(0, c.export_size))
{'base:allocation': [Extent(length=1073741824, flags=3)]}
>>> pprint(c.extents(0, c.export_size))
{'base:allocation': [Extent(length=1073741824, flags=3)]}

Got the expected reply in both calls.


Using git bisect with the attached reproducer script points to this commit:

commit 0bc329fbb009f8601cec23bf2bc48ead0c5a5fa2
Author: Hanna Reitz <hreitz>
Date:   Thu Aug 12 10:41:44 2021 +0200

    block: block-status cache for data regions

Upstream discussion:
https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00292.html
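Based on the linked discussion, the suspected mechanism can be sketched as follows (names and structure are illustrative, not QEMU's code): the new block-status cache is keyed only on the range, so an entry populated by a want_zero=False query (the cheap allocation check backing qemu:allocation-depth, which never detects zeroes) can later be served to a want_zero=True base:allocation query:

```python
# Illustrative model (not QEMU's code) of a block-status cache shared by
# two query modes. want_zero=False queries only ask "is this allocated?"
# and report plain data (flags=0) without zero detection.
class StatusCache:
    def __init__(self):
        self.entry = None  # cached (offset, length, flags)

    def query(self, offset, length, want_zero, backend):
        if self.entry and self.entry[0] == offset:
            return self.entry[2]  # cache hit: query mode is ignored
        flags = backend(offset, length, want_zero)
        # Bug: the result is cached even when want_zero=False, so it
        # lacks zero/hole detail yet can satisfy want_zero=True calls.
        self.entry = (offset, length, flags)
        return flags

def backend(offset, length, want_zero):
    # A fully sparse raw image: reads as zeros, but only a want_zero
    # query detects that (3 = NBD_STATE_HOLE | NBD_STATE_ZERO).
    return 3 if want_zero else 0

cache = StatusCache()
cache.query(0, 1024, want_zero=False, backend=backend)        # allocation-depth path
print(cache.query(0, 1024, want_zero=True, backend=backend))  # 0 instead of 3
```

Per the discussion above, the upstream fix restricts cache updates to want_zero queries, so the zero-unaware result is never stored.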

Comment 2 Tingting Mao 2022-01-26 08:37:42 UTC
Tried to reproduce this issue, but failed. Could anyone help check what is wrong? Thanks.


Tested with:
qemu-kvm-6.2.0-4.el9
kernel-5.14.0-39.el9.x86_64


Steps:
1. Prepare an image and expose it via nbd
# truncate -s1g empty.raw
# qemu-nbd --read-only     --persistent     --allocation-depth     --socket=/tmp/nbd.sock     --format=raw     empty.raw &

2. Access the image to get info via qemu-img map
# qemu-img map --output=json --image-opts driver=nbd,server.path=/tmp/nbd.sock,server.type=unix,x-dirty-bitmap=qemu:allocation-depth
[{ "start": 0, "length": 1073741824, "depth": 0, "present": false, "zero": false, "data": false}]

# qemu-img map --output=json --image-opts driver=nbd,server.path=/tmp/nbd.sock,server.type=unix,x-dirty-bitmap=qemu:allocation-depth
[{ "start": 0, "length": 1073741824, "depth": 0, "present": false, "zero": false, "data": false}]


3. Access the image via nbdinfo
# nbdinfo --map=qemu:allocation-depth --json nbd+unix:///?socket=/tmp/nbd.sock
[
{ "offset": 0, "length": 1073741824, "type": 1, "description": "local"}
]
# nbdinfo --map=qemu:allocation-depth --json nbd+unix:///?socket=/tmp/nbd.sock
[
{ "offset": 0, "length": 1073741824, "type": 1, "description": "local"} 
]


Results:
As above, the dumped info is always the same. And from step 3, the type is always '1', not '0' or '3' as in comment 0.

Comment 3 Klaus Heinrich Kiwi 2022-02-07 14:43:40 UTC
(In reply to Tingting Mao from comment #2)
> Tried to reproduce this issue, but failed. Could anyone help check what
> is wrong? Thanks.
Nir, Can you help?

Hanna, I think this one was submitted/merged upstream? Can you update us on the bug status?

Thanks.

Comment 5 Nir Soffer 2022-02-07 16:08:14 UTC
Reproducing this is tricky; here is an example flow:

$ qemu-img create -f raw empty.raw 1g

$ qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

$ nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  data
      4096  1073737728    3  hole,zero

Correct results ^^^^

Now run nbdinfo twice...

$ nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local

$ nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local

Also correct - but now the internal cache is corrupted:

$ nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data

Incorrect results ^^^

All future nbdinfo --map commands will return incorrect info:

$ nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data

$ nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data

Comment 6 Nir Soffer 2022-02-07 16:12:17 UTC
The patch adding the tests shows how to reproduce it with qemu-img map, and adds more
info on why it needs 2 calls with qemu:allocation-depth to trigger the issue.
https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00371.html

Comment 7 Hanna Czenczek 2022-02-07 16:27:28 UTC
The patch(es) were merged upstream a week ago (Jan 30), I’m working on a backport.

Comment 8 Tingting Mao 2022-02-08 01:55:40 UTC
Reproduced this bug on the latest RHEL 9; thanks Nir.


Tested with:
qemu-kvm-6.2.0-6.el9
kernel-5.14.0-39.el9.x86_64


Steps:
1. Create test image and export it over nbd
# qemu-img create -f raw empty.raw 1g
# qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

2. Map the image via base:allocation
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  data
      4096  1073737728    3  hole,zero

3. Map the image via qemu:allocation-depth twice
# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local
# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local

4. Map the image via base:allocation
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data

5. Map more times
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data
# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    0  data


Results:
As above, the results for step 4 and step 5 are incorrect. The correct result should be like step 2.

Comment 10 Yanan Fu 2022-02-15 06:19:16 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 13 Tingting Mao 2022-02-16 12:08:09 UTC
Verified this bug as below:


Tested with:
qemu-kvm-6.2.0-8.el9
kernel-5.14.0-54.kpq0.el9.x86_64


Steps:
1. Create test image and export it over nbd
# qemu-img create -f raw empty.raw 1g
# qemu-nbd --read-only --persistent --allocation-depth --socket=/tmp/nbd.sock --format=raw empty.raw &

2. Map the image via base:allocation
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  data
      4096  1073737728    3  hole,zero

3. Map the image via qemu:allocation-depth twice
# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local
# nbdinfo --map=qemu:allocation-depth nbd+unix:///?socket=/tmp/nbd.sock
         0  1073741824    1  local

4. Map the image via base:allocation
# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  data
      4096  1073737728    3  hole,zero

5. Map more times
#  nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  data
      4096  1073737728    3  hole,zero

# nbdinfo --map=base:allocation nbd+unix:///?socket=/tmp/nbd.sock
         0        4096    0  data
      4096  1073737728    3  hole,zero


Results:
As above, the dumped info of step 5 does not change; it stays the same as the results of step 2.

Comment 15 errata-xmlrpc 2022-05-17 12:25:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: qemu-kvm), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2307