Bug 1629056
Summary: | qemu NBD server failure with block status of dirty bitmaps | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Eric Blake <eblake> |
Component: | qemu-kvm-rhev | Assignee: | Eric Blake <eblake> |
Status: | CLOSED ERRATA | QA Contact: | aihua liang <aliang> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.6 | CC: | chayang, coli, eblake, jsnow, juzhang, ngu, rjones, virt-maint |
Target Milestone: | rc | Keywords: | TestOnly |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.12.0-27.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-08-22 09:18:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eric Blake
2018-09-14 18:15:34 UTC
An easy workaround for anyone writing a custom NBD client to experiment with incremental block backup via pull over NBD: clamp your NBD_CMD_BLOCK_STATUS requests to something like 2 or 3G per request, rather than trying to get a maximum 4G-1 bytes at a time. Have not get the feedback before the deadline's coming. Based on my test result, set this bug's status to "Verified". If any feedback come later and new verification needed, i'll append the result then. (In reply to Eric Blake from comment #0) > Description of problem: > Vladimir discovered that qemu as NBD server for the qemu:dirty-bitmap:FOO > context can send invalid extents of length 0, when it splits large (4G) > regions of the bitmap to a client that permits the server to send more than > one extent: > https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg01804.html > > Version-Release number of selected component (if applicable): > qemu-kvm-rhev-2.12.0-15.el7 > > How reproducible: > Difficult. qemu as client can't trigger it; it requires a separate client > that knows how to request qemu:dirty-bitmap:FOO context (and I don't know of > any open source clients other than qemu that currently support that). Found > by code inspection. Rich's libnbd (so far unreleased) can now do that: https://github.com/libguestfs/libnbd Typical test program from Python 3: ----------------- #!/usr/bin/python3 import nbd # Change this to the name of the dirty bitmap: name = "qemu:dirty-bitmap:bitmap0" h = nbd.NBD () h.request_meta_context (name) # Change these parameters to point to the qemu NBD server: h.connect_tcp ("localhost", "nbd") entries = [] def f (data, metacontext, offset, e): global entries if metacontext != name: return entries = e size = h.get_size () h.block_status (size, 0, 0, 0, f) h.shutdown () print ("%r" % entries) ----------------- We don't have a release of libnbd yet, but if you compile from source you can run this by saving that as "bitmap.py" and running: /path/to/libnbd/run ./bitmap.py (In reply to Eric Blake from comment #8) > > How reproducible: > > Difficult. qemu as client can't trigger it; it requires a separate client > > that knows how to request qemu:dirty-bitmap:FOO context (and I don't know of > > any open source clients other than qemu that currently support that). Found > > by code inspection. > > Rich's libnbd (so far unreleased) can now do that: > https://github.com/libguestfs/libnbd $ # Create an image and give it an empty persistent dirty bitmap $ qemu-img create -f qcow2 scratch.qcow2 5g $ cat <<\EOF | qemu-kvm -nodefaults -nographic -qmp stdio {'execute':'qmp_capabilities'} {'execute':'blockdev-add','arguments':{'node-name':'n','driver':'qcow2','file':{'driver':'file','filename':'scratch.qcow2'}}} {'execute':'block-dirty-bitmap-add','arguments':{'node':'n','name':'bitmap0','persistent':true}} {'execute':'quit'} EOF $ # Serve that image $ qemu-nbd -t -f qcow2 -B bitmap0 scratch.qcow2 & $ # And use libnbd to provoke the problem $ ./run sh/nbdsh nbd> h.request_meta_context("qemu:dirty-bitmap:bitmap0") nbd> h.connect_tcp ("localhost", "10809") nbd> def f (_, m, o, e): ... print (m, o, e) ... nbd> h.block_status(4 * 1024 * 1024 * 1024 - 1, 512, 0, None, f) qemu:dirty-bitmap:bitmap0 512 [4294901760, 0, 1073806848, 0] With broken qemu, the result is different. If you are testing upstream qemu 3.0, rather than downstream where 'qemu-nbd -B' was backported, the sequence for exposing an empty bitmap changes to this: {'execute':'qmp_capabilities'} {'execute':'blockdev-add','arguments':{'node-name':'n','driver':'qcow2','file':{'driver':'file','filename':'scratch.qcow2'}}} {'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'}}}} {'execute':'nbd-server-add','arguments':{'device':'n','name':''}} {'execute':'block-dirty-bitmap-add','arguments':{'node':'n','name':'bitmap0'}} {'execute':'x-block-dirty-bitmap-disable','arguments':{'node':'n','name':'bitmap0'}} {'execute':'x-nbd-server-add-bitmap','arguments':{'name':'','bitmap':'bitmap0'}} and the nbdsh output is: nbd> h.block_status(4 * 1024 * 1024 * 1024 - 1, 512, 0, None, f) qemu:dirty-bitmap:b0 512 [4294901760, 0, 0, 1, 1073806848, 0] where that 0,1 pair in the middle is bogus. libnbd 0.2 may change the signature of h.block_status() to rearrange flags; see https://www.redhat.com/archives/libguestfs/2019-May/msg00207.html ... and it does change request_meta_context -> add_meta_context: https://github.com/libguestfs/libnbd/commit/b32b8172079a6626560836142149154e3be5bdab Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:2553 |