Bug 1452148 - Op blockers don't work after postcopy migration
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev   
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: xianwang
URL:
Whiteboard:
Keywords:
Duplicates: 1455986
Depends On:
Blocks: 1440030 1446211 1441684
Reported: 2017-05-18 12:36 UTC by Kevin Wolf
Modified: 2017-08-02 04:41 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.9.0-6.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1441684
Environment:
Last Closed: 2017-08-02 04:41:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
qemu log (1.86 KB, text/plain)
2017-05-31 10:20 UTC, IBM Bug Proxy


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2017:2392
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm-rhev security, bug fix, and enhancement update
Last Updated: 2017-08-01 20:04:36 UTC

Description Kevin Wolf 2017-05-18 12:36:28 UTC
This clone of the originally reported bug deals only with the part concerning
postcopy migration, for which patches have been merged upstream. The original
bug is left open so that the rest can be fixed later.


+++ This bug was initially created as a clone of Bug #1441684 +++

In commit e3e0003a, upstream qemu disabled the op blocker assertions for the
2.9 release because some bugs could not be fixed in time. After rebasing to
2.9, we'll want to revert the commit and include proper fixes for the bugs.
Without the bugs fixed, op blockers can't keep the promises they are making.

Known problems with op blockers so far that need to be fixed before the commit
can be safely reverted:

* Old style block migration (migrate -b) triggers an assertion because it
  reuses the guest device's BlockBackend. During migration, this BlockBackend
  is not ready to be used yet (its real permissions are only enabled in
  blk_resume_after_migration() immediately before the guest starts to run).
  Block migration needs to use its own BlockBackend here.

* Postcopy migration. Commit d35ff5e6 added blk_resume_after_migration() in two
  places, but postcopy migration uses loadvm_postcopy_handle_run_bh(), which is
  the third one. In order to avoid assertion failures, the call needs to be
  added there as well. Without this fix, the guest device's op blockers are
  ineffective after postcopy migration.
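
The postcopy problem described above can be illustrated with a small toy model. This is only a sketch and not QEMU code: the class `ToyBackend`, the function `finish_incoming`, and the path labels `"path_a"` and `"path_b"` are hypothetical names invented for illustration. It assumes, as the description says, that a device's real permissions only take effect once a resume step equivalent to blk_resume_after_migration() has run, and that the postcopy path was the one place where that call was missing.

```python
from dataclasses import dataclass


@dataclass
class ToyBackend:
    """Hypothetical stand-in for a guest device's BlockBackend."""
    wanted_perms: frozenset                # what the device asks for
    active_perms: frozenset = frozenset()  # what is enforced right now

    def resume_after_migration(self):
        # Models the role of blk_resume_after_migration(): only at this
        # point do the device's real permissions take effect.
        self.active_perms = self.wanted_perms


def finish_incoming(backend, path, patched):
    """Complete an incoming migration through one of three toy code paths.

    "path_a" and "path_b" stand for the two places that already had the
    resume call; "postcopy" stands for loadvm_postcopy_handle_run_bh().
    """
    paths_with_resume = {"path_a", "path_b"}
    if patched:
        paths_with_resume.add("postcopy")
    if path in paths_with_resume:
        backend.resume_after_migration()
    return backend


def blockers_effective(backend):
    # If the device never took its permissions, nothing stops a
    # conflicting user (for example a block job target) from attaching.
    return backend.active_perms == backend.wanted_perms
```

In this model, `finish_incoming(ToyBackend(frozenset({"write"})), "postcopy", patched=False)` leaves the backend with no active permissions, while the same call with `patched=True` activates them, which mirrors the effect the fix is meant to have.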

Comment 1 xianwang 2017-05-22 03:41:20 UTC
Hi Kevin,
After reviewing the bug description, I am not clear on how to reproduce this bug. Could you give the steps for reproducing or verifying it? Thanks.

Comment 2 Kevin Wolf 2017-05-22 08:49:11 UTC
Essentially just do the same op blocker tests as in bug 1293975, just after
postcopy live migration.

Comment 3 Miroslav Rezanina 2017-05-23 08:16:22 UTC
Fix included in qemu-kvm-rhev-2.9.0-6.el7

Comment 5 xianwang 2017-05-26 09:15:01 UTC
Verification passed for the two scenarios below, but I am not sure whether other scenarios are expected to be tested.

Host:
3.10.0-671.el7.x86_64
qemu-kvm-rhev-2.9.0-6.el7.x86_64
seabios-1.10.2-3.el7.x86_64

scenario I:
# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive node-name=disk1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/rhel73-64-virtio.qcow2 -device virtio-blk-pci,drive=disk1,id=virtio-blk-0 -device virtio-blk-pci,drive=disk1,id=virtio-blk-1 -monitor stdio
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -device virtio-blk-pci,drive=disk1,id=virtio-blk-1: Conflicts with use by /machine/peripheral/virtio-blk-0/virtio-backend as 'root', which does not allow 'write' on disk1

# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive node-name=disk1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/rhel73-64-virtio.qcow2 -device virtio-blk-pci,drive=disk1,id=virtio-blk-0,share-rw=on -device virtio-blk-pci,drive=disk1,id=virtio-blk-1,share-rw=on -monitor stdio
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) VNC server running on ::1:5900
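
The two outcomes in scenario I can be read as a permission conflict check between users of the same node. The following is a simplified sketch, not QEMU's implementation: the `User` class, `find_conflict`, and `virtio_blk` are hypothetical names, and the assumption that a guest disk needs read and write access and shares write access only with share-rw=on is a simplification made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    perm: frozenset    # permissions this user needs on the node
    shared: frozenset  # permissions it tolerates other users holding


def find_conflict(existing, new):
    """Return a description of the first conflict, or None if `new` may attach."""
    for old in existing:
        # The new user must not need anything the old user refuses to share.
        denied = new.perm - old.shared
        if denied:
            return f"{old.name} does not allow {sorted(denied)}"
        # The old user must not hold anything the new user refuses to share.
        denied = old.perm - new.shared
        if denied:
            return f"{new.name} does not allow {sorted(denied)}"
    return None


def virtio_blk(name, share_rw):
    # Assumption for this sketch: a guest disk needs read and write access
    # and shares write access with other users only when share-rw=on.
    perm = frozenset({"read", "write"})
    shared = frozenset({"read", "write"}) if share_rw else frozenset({"read"})
    return User(name, perm, shared)
```

With `share_rw=False` on both devices, `find_conflict` reports that the first device does not allow write, which corresponds to the startup error in the first command line. With `share_rw=True` on both, it returns `None`, which corresponds to the second command line starting normally.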

scenario II:
src host:
# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive id=drive0,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/rhel74-64-virtio.qcow2 -device virtio-blk-pci,drive=drive0,id=disk0 -drive id=drive1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/r1.qcow2 -device virtio-blk-pci,drive=drive1,id=disk1 -monitor stdio -qmp tcp:0:8881,server,nowait -vnc :1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:15

dst host:
# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive id=drive0,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/rhel74-64-virtio.qcow2 -device virtio-blk-pci,drive=drive0,id=disk0 -drive id=drive1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/r1.qcow2 -device virtio-blk-pci,drive=drive1,id=disk1 -monitor stdio -qmp tcp:0:8881,server,nowait -vnc :1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:15 -incoming tcp:0:5801

1. In the guest:
(1) Execute a program to generate dirty pages.
(2) stress --cpu 8 --io 8 --vm 5 --vm-bytes 256M
2. On the src host:
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d tcp:10.66.10.208:5801
After some dirty pages have been generated, switch to postcopy mode:
(qemu) migrate_start_postcopy
At the same time, on the src host, execute the following operations regardless of whether migration has completed:
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute": "blockdev-backup", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}}
{"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block333"}}
{"execute": "blockdev-mirror", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}}
{"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block333"}}
{"execute": "blockdev-snapshot-sync","arguments":{"device":"drive1","snapshot-file":"sn1","mode":"absolute-paths","format":"qcow2"}}
{"return": {}}
{"execute":"drive-mirror","arguments":{"device":"drive1","target":"m_top","format":"qcow2","mode":"absolute-paths","sync":"top"}}
{"timestamp": {"seconds": 1495788291, "microseconds": 250747}, "event": "BLOCK_JOB_READY", "data": {"device": "drive1", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
{"return": {}}
{"execute": "block-job-complete", "arguments":{"device": "drive1"}}
{"return": {}}
{"timestamp": {"seconds": 1495788308, "microseconds": 4628}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive1", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}

{"execute": "blockdev-backup", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}}
{"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block1885"}}
{"execute": "blockdev-mirror", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}}
{"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block1885"}}
{"timestamp": {"seconds": 1495788418, "microseconds": 574931}, "event": "SHUTDOWN"}

For scenario II, the "blockdev-backup" and "blockdev-mirror" operations failed, but "blockdev-snapshot-sync" and "drive-mirror" succeeded.
Kevin, is this test result expected? And is this bug fixed?
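
For repeated runs, a QMP transcript like the one in scenario II can be summarized automatically. The sketch below is one possible helper, not part of any QEMU tooling: the function name `summarize` is made up here, and it assumes one JSON message per line, with each command followed by exactly one `return` or `error` reply and with asynchronous events possibly interleaved. The sample lines are copied from the transcript above.

```python
import json

SAMPLE = """
{"execute": "blockdev-backup", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}}
{"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block333"}}
{"execute": "blockdev-snapshot-sync","arguments":{"device":"drive1","snapshot-file":"sn1","mode":"absolute-paths","format":"qcow2"}}
{"return": {}}
{"execute":"drive-mirror","arguments":{"device":"drive1","target":"m_top","format":"qcow2","mode":"absolute-paths","sync":"top"}}
{"timestamp": {"seconds": 1495788291, "microseconds": 250747}, "event": "BLOCK_JOB_READY", "data": {"device": "drive1", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
{"return": {}}
"""


def summarize(transcript):
    """Pair each QMP command with its reply and label it rejected or accepted."""
    results = []
    pending = None  # name of the command still waiting for its reply
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if "execute" in msg:
            pending = msg["execute"]
        elif "event" in msg:
            continue  # asynchronous events are not command replies
        elif pending is not None:
            if "error" in msg:
                results.append((pending, "rejected"))
            elif "return" in msg:
                results.append((pending, "accepted"))
            pending = None
    return results


if __name__ == "__main__":
    for command, outcome in summarize(SAMPLE):
        print(f"{command}: {outcome}")
```

On the sample, this reports blockdev-backup as rejected and blockdev-snapshot-sync and drive-mirror as accepted, matching the result described in this comment.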

Comment 6 Kevin Wolf 2017-05-26 09:26:30 UTC
Yes, this is correct behaviour. There is no reason for blockdev-snapshot-sync or
drive-mirror to be blocked because both commands do not change the disk content
that the guest sees.

Comment 7 xianwang 2017-05-26 09:40:47 UTC
(In reply to Kevin Wolf from comment #6)
> Yes, this is correct behaviour. There is no reason for blockdev-snapshot-sync or
> drive-mirror to be blocked because both commands do not change the disk content
> that the guest sees.

OK, thanks for the quick reply, Kevin. I will change this bug's status to VERIFIED.

Comment 8 Kevin Wolf 2017-05-31 09:59:47 UTC
*** Bug 1455986 has been marked as a duplicate of this bug. ***

Comment 9 IBM Bug Proxy 2017-05-31 10:20:20 UTC
Created attachment 1283692 [details]
qemu log

Comment 11 errata-xmlrpc 2017-08-02 04:41:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

