RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1631052 - x-block-dirty-bitmap-merge into a disabled bitmap dest causes assert
Summary: x-block-dirty-bitmap-merge into a disabled bitmap dest causes assert
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.7
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: ---
Assignee: John Snow
QA Contact: aihua liang
URL:
Whiteboard:
Depends On: 1658343
Blocks: 1636224 1638690 1651787
TreeView+ depends on / blocked
 
Reported: 2018-09-19 19:59 UTC by John Snow
Modified: 2019-08-22 09:19 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1638690 (view as bug list)
Environment:
Last Closed: 2019-08-22 09:18:53 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
gdb_debug_info_for_coredump-76-bz1631052-10122018 (19.54 KB, text/plain)
2018-10-12 09:15 UTC, Gu Nini
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:2553 0 None None None 2019-08-22 09:19:46 UTC

Description John Snow 2018-09-19 19:59:19 UTC
Description of problem:

Merging into a disabled bitmap (i.e., one that has been turned 'off') causes an assertion.

Version-Release number of selected component (if applicable):

qemu 3.0,
qemu-kvm-rhev-2.12.0-17.el7

How reproducible:

100%


Steps to Reproduce:
1. {"execute":"x-block-dirty-bitmap-merge","arguments":{"node":"n","src_name":"existing_map","dst_name":"disabled_map"}}
2.
3.

Actual results:

{"execute":"x-block-dirty-bitmap-merge","arguments":{"node":"n","src_name":"existing_map","dst_name":"disabled_map"}}


Expected results:

The command must either be rejected, OR it must complete successfully. 

Additional info:

Reported-by: Eric Blake

Comment 2 Eric Blake 2018-09-19 20:15:56 UTC
Preferably complete successfully. Merging into a frozen bitmap (one in use by a block job) should probably be forbidden, but merging into a disabled one (where it is merely not tracking new writes) makes sense in at least the following scenario:

If the image is keeping a linear sequence of bitmaps in order to track various points in time which can each serve as the reference point at the start of a differential backup, where all but the most recent bitmap is disabled, then creating the differential backup is a matter of creating a temporary live bitmap, then live-merging all of the earlier bitmaps in. In that context, merges can always be performed into a live bitmap, immediately prior to the transaction that kicks off the blockdev-backup job acting on the temporary bitmap.

But once the user decides that a point in time is no longer important (that is, that no further differential backups will be made from a given point in time), the two bitmaps on either side of that point in time need to be merged into a single bitmap. If both of those bitmaps are disabled, and this bug is not fixed, then the only workaround I can think of to be able to perform the merge without triggering the assertion would be to temporarily pause the guest (to ensure no further I/O), then re-enable the destination bitmap, perform the merge, disable the destination bitmap, resume guest I/O, and finally delete the unused bitmap.  But pausing the guest I/O is contrary to the goal of bitmap management being non-invasive to a running guest, compared to just allowing the merge of one disabled bitmap into another.

I'm also trying to figure if merging a live bitmap into a disabled one ever makes sense.  I suppose it does, if the merge is atomic, by starting with bitmap1 tracking the sectors touched between times T1-T2 and bitmap2 from T2-present. Then we perform a transaction that does both a merge of bitmap2 into bitmap1, followed by a clear of bitmap2.  The point in time of our transaction is T3, and serves to reshuffle things so that now bitmap1 tracks sectors touched between T1-T3, and bitmap2 from T3-present.

Comment 3 Eric Blake 2018-09-19 21:44:12 UTC
(In reply to John Snow from comment #1)
> https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg02358.html

In addition to the relaxation of the assertion for disabled bitmaps (in that linked thread), we probably should not declare this bug complete without also including a second patch that freezes any bitmap tied to an NBD export with x-nbd-server-add-bitmap (freeze goes away at subsequent nbd-server-remove), and double-check that merging to a frozen bitmap is a graceful error. Perhaps it also warrants iotests enhancements.

Comment 4 Eric Blake 2018-09-21 19:17:44 UTC
We also need to permit clearing of a disabled bitmap. And just as merge should not be permitted on a locked bitmap, neither should enable/disable be allowed on a locked bitmap, and it would not hurt if enable on an enabled bitmap, or disable on a disabled bitmap, reports an error about a redundant operation instead of being a no-op.

Comment 5 John Snow 2018-10-09 20:58:29 UTC
(In reply to Eric Blake from comment #4)
> We also need to permit clearing of a disabled bitmap. And just as merge
> should not be permitted on a locked bitmap, neither should enable/disable be
> allowed on a locked bitmap, and it would not hurt if enable on an enabled
> bitmap, or disable on a disabled bitmap, reports an error about a redundant
> operation instead of being a no-op.

https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00406.html
Awaiting merge upstream.

- Clearing is permitted on disabled bitmaps after this series
- Merge/Enable/Disable are all prohibited on locked/frozen bitmaps
- Redundant operations are not prohibited (new BZ if needed.)

"we probably should not declare this bug complete without also including a second patch that freezes any bitmap tied to an NBD export with x-nbd-server-add-bitmap (freeze goes away at subsequent nbd-server-remove), and double-check that merging to a frozen bitmap is a graceful error. Perhaps it also warrants iotests enhancements."

-NBD exports "lock" the bitmap, instead of "freezing" it
-merging to a locked/frozen bitmap is a graceful error.

Comment 6 Gu Nini 2018-10-12 09:15:28 UTC
Created attachment 1493178 [details]
gdb_debug_info_for_coredump-76-bz1631052-10122018

I reproduced the bug with following steps:

{"execute":"block-dirty-bitmap-add","arguments":{"node":"disk0","name":"bitmap0"}} 
{"return": {}}
{"execute":"block-dirty-bitmap-add","arguments":{"node":"disk0","name":"bitmap1"}}
{"return": {}}
{"execute":"x-block-dirty-bitmap-disable","arguments":{"node":"disk0","name":"bitmap1"}}
{"return": {}}
{"execute":"x-block-dirty-bitmap-merge","arguments":{"node":"disk0","src_name":"bitmap0","dst_name":"bitmap1"}}
Ncat: Connection reset by peer.


(qemu) 
(qemu) qemu-kvm: block/dirty-bitmap.c:775: bdrv_merge_dirty_bitmap: Assertion `bdrv_dirty_bitmap_enabled(dest)' failed.
./vm1.sh: line 30:  9204 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name

Comment 7 John Snow 2018-12-12 00:27:19 UTC
This is addressed by the patchset posted against BZ #1658343. Setting depends-on and TestOnly flag.

Eric, if the upstream behavior needs to be addressed in some way WRT your Libvirt API, can you open new bug(s) as needed?

Comment 9 aihua liang 2019-02-14 09:21:54 UTC
Test on qemu-kvm-rhev-2.12.0-23.el7.x86_64, don't this issue any more.
 
  Test steps:
    1.start guest with qemu cmds:
      -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 \
      -blockdev driver=file,node-name=file_base,filename=/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2,auto-read-only=on \
      -blockdev driver=qcow2,file=file_base,node-name=drive_image1,auto-read-only=on \
      -device scsi-hd,id=image1,drive=drive_image1 \

    2.add bitmap0 on drive_image1
       {"execute": "block-dirty-bitmap-add","arguments":{ "node": "drive_image1", "name": "bitmap0"}}

    3.add a disabled bitmap bitmap_tmp
       {"execute": "block-dirty-bitmap-add","arguments":{ "node": "drive_image1", "name": "bitmap_tmp","x-disabled": true }}

    4.check bitmap info:
       {"execute":"query-block"}
{"return": [{"device": "", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 21474836480, "filename": "/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": 4326883328, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "node-name": "drive_image1", "backing_file_depth": 0, "drv": "qcow2", "iops": 0, "bps_wr": 0, "write_threshold": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {"no-flush": false, "direct": false, "writeback": true}, "file": "/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2", "encryption_key_missing": false}, "qdev": "image1", "dirty-bitmaps": [{"name": "bitmap_tmp", "status": "disabled", "granularity": 65536, "count": 0}, {"name": "bitmap0", "status": "active", "granularity": 65536, "count": 9961472}], "type": "unknown"}]}

     5.merge bitmap0 to bitmap_tmp
       {"execute": "x-block-dirty-bitmap-merge","arguments":{"node": "drive_image1", "src_name": "bitmap0", "dst_name":"bitmap_tmp" } }

     6.check bitmap info:
       {"execute":"query-block"}
{"return": [{"device": "", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 21474836480, "filename": "/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": 4326883328, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "node-name": "drive_image1", "backing_file_depth": 0, "drv": "qcow2", "iops": 0, "bps_wr": 0, "write_threshold": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {"no-flush": false, "direct": false, "writeback": true}, "file": "/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2", "encryption_key_missing": false}, "qdev": "image1", "dirty-bitmaps": [{"name": "bitmap_tmp", "status": "disabled", "granularity": 65536, "count": 10027008}, {"name": "bitmap0", "status": "active", "granularity": 65536, "count": 10223616}], "type": "unknown"}]}

Additional info:
  kernel version:3.10.0-993.el7.x86_64

Comment 10 aihua liang 2019-02-15 04:08:26 UTC
I'll set bug's status to "Verified" after it changes to "ON_QA".

Comment 11 aihua liang 2019-03-19 06:36:35 UTC
Set bug's status to "Verified", for it has keywords "TestOnly", thanks.

Comment 13 errata-xmlrpc 2019-08-22 09:18:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553


Note You need to log in before you can comment on or make changes to this bug.