Red Hat Bugzilla – Bug 1452148
Op blockers don't work after postcopy migration
Last modified: 2017-08-02 00:41:00 EDT
This clone of the originally reported bug deals only with the part concerning postcopy migration, for which patches have been merged upstream. The original bug is left open to fix the rest later.

+++ This bug was initially created as a clone of Bug #1441684 +++

In commit e3e0003a, upstream qemu disabled the op blocker assertions for the 2.9 release because some bugs could not be fixed in time. After rebasing to 2.9, we'll want to revert the commit and include proper fixes for the bugs. Without the bugs fixed, op blockers can't keep the promises they are making.

Known problems with op blockers so far that need to be fixed before the commit can be safely reverted:

* Old style block migration (migrate -b) triggers an assertion because it reuses the guest device's BlockBackend. During migration, this BlockBackend is not ready to be used yet (its real permissions are only enabled in blk_resume_after_migration() immediately before the guest starts to run). Block migration needs to use its own BlockBackend here.

* Postcopy migration: Commit d35ff5e6 added blk_resume_after_migration() in two places, but postcopy migration uses loadvm_postcopy_handle_run_bh(), which is the third one. In order to avoid assertion failures, the call needs to be added there as well. Without this fix, the guest device's op blockers are ineffective after postcopy migration.
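For reference, the shape of the postcopy fix described above is roughly the following. This is a sketch based on the commit description, not the verbatim merged patch; the elided context in migration/savevm.c differs in detail:

static void loadvm_postcopy_handle_run_bh(void *opaque)
{
    Error *local_err = NULL;

    /* ... CPU state synchronization and tracing elided ... */

    /* Make sure all file formats flush their mutable metadata */
    bdrv_invalidate_cache_all(&local_err);
    if (local_err) {
        error_report_err(local_err);
        local_err = NULL;
    }

    /* The missing third call site: re-enable the guest devices' real
     * permissions (and thereby their op blockers) before the guest
     * starts running, matching the two call sites that commit d35ff5e6
     * added for the precopy paths. */
    blk_resume_after_migration(&local_err);
    if (local_err) {
        error_report_err(local_err);
    }

    /* ... autostart / vm_start() handling elided ... */
}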
Hi Kevin, after reviewing the bug description, I am not clear on how to reproduce this bug. Could you give the steps for reproducing or verifying it? Thanks.
Essentially, run the same op blocker tests as in bug 1293975, only after postcopy live migration.
Fix included in qemu-kvm-rhev-2.9.0-6.el7
This bug is verified as passing for two scenarios, but I am not sure whether other scenarios behave as expected.

Host:
3.10.0-671.el7.x86_64
qemu-kvm-rhev-2.9.0-6.el7.x86_64
seabios-1.10.2-3.el7.x86_64

Scenario I:

# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive node-name=disk1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/rhel73-64-virtio.qcow2 -device virtio-blk-pci,drive=disk1,id=virtio-blk-0 -device virtio-blk-pci,drive=disk1,id=virtio-blk-1 -monitor stdio
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -device virtio-blk-pci,drive=disk1,id=virtio-blk-1: Conflicts with use by /machine/peripheral/virtio-blk-0/virtio-backend as 'root', which does not allow 'write' on disk1

# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive node-name=disk1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/rhel73-64-virtio.qcow2 -device virtio-blk-pci,drive=disk1,id=virtio-blk-0,share-rw=on -device virtio-blk-pci,drive=disk1,id=virtio-blk-1,share-rw=on -monitor stdio
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) VNC server running on ::1:5900

Scenario II:

src host:
# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive id=drive0,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/rhel74-64-virtio.qcow2 -device virtio-blk-pci,drive=drive0,id=disk0 -drive id=drive1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/r1.qcow2 -device virtio-blk-pci,drive=drive1,id=disk1 -monitor stdio -qmp tcp:0:8881,server,nowait -vnc :1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:15

dst host:
# /usr/libexec/qemu-kvm -name vm1 -m 4096 -smp 2 -drive id=drive0,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/rhel74-64-virtio.qcow2 -device virtio-blk-pci,drive=drive0,id=disk0 -drive id=drive1,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,file=/root/mount_point/r1.qcow2 -device virtio-blk-pci,drive=drive1,id=disk1 -monitor stdio -qmp tcp:0:8881,server,nowait -vnc :1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:15 -incoming tcp:0:5801

1. In the guest:
   (1) execute a program to generate dirty pages
   (2) stress --cpu 8 --io 8 --vm 5 --vm-bytes 256M
2. On the src host:
   (qemu) migrate_set_capability postcopy-ram on
   (qemu) migrate -d tcp:10.66.10.208:5801
   After generating some dirty pages, switch to postcopy mode:
   (qemu) migrate_start_postcopy
3. At the same time, on the src host execute the following operations, regardless of whether migration has completed:
{"execute":"qmp_capabilities"} {"return": {}} {"execute": "blockdev-backup", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}} {"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block333"}} {"execute": "blockdev-mirror", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}} {"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block333"}} {"execute": "blockdev-snapshot-sync","arguments":{"device":"drive1","snapshot-file":"sn1","mode":"absolute-paths","format":"qcow2"}} {"return": {}} {"execute":"drive-mirror","arguments":{"device":"drive1","target":"m_top","format":"qcow2","mode":"absolute-paths","sync":"top"}} {"timestamp": {"seconds": 1495788291, "microseconds": 250747}, "event": "BLOCK_JOB_READY", "data": {"device": "drive1", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}} {"return": {}} {"execute": "block-job-complete", "arguments":{"device": "drive1"}} {"return": {}} {"timestamp": {"seconds": 1495788308, "microseconds": 4628}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive1", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}} {"execute": "blockdev-backup", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}} {"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block1885"}} {"execute": "blockdev-mirror", "arguments": {"device": "drive0", "target": "drive1","sync": "full"}} {"error": {"class": "GenericError", "desc": "Conflicts with use by drive1 as 'root', which does not allow 'write' on #block1885"}} {"timestamp": {"seconds": 1495788418, "microseconds": 574931}, "event": "SHUTDOWN"} for scenario II, "blockdev-backup" and "blockdev-mirror" operation are failed, but the "blockdev-snapshot-sync" and "drive-mirror" are successfully. Kevin, does this test result are expected ? and, is this bug fixed?
Yes, this is correct behaviour. There is no reason for blockdev-snapshot-sync or drive-mirror to be blocked, because neither command changes the disk content that the guest sees.
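For context on why only some of these commands are blocked: in QEMU 2.9's permission system, every user of a block node declares which permissions it needs and which it is willing to share with other users. The following standalone toy model (illustrative only; the names and two-bit permission mask are simplified assumptions, not QEMU's actual BLK_PERM_* code) demonstrates the conflict rule behind the "does not allow 'write'" errors:

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* Simplified permission bits, loosely modeled on QEMU's BLK_PERM_* flags. */
enum { PERM_READ = 1u << 0, PERM_WRITE = 1u << 1 };

struct user {
    const char *name;
    uint64_t perm;   /* permissions this user needs */
    uint64_t shared; /* permissions this user allows others to have */
};

/* Two users conflict if either requests something the other does not share. */
static bool conflicts(const struct user *a, const struct user *b)
{
    return (b->perm & ~a->shared) || (a->perm & ~b->shared);
}

int main(void)
{
    /* A guest device needs read+write but only shares read by default;
     * share-rw=on would add PERM_WRITE to its shared mask. */
    struct user guest  = { "drive1 as 'root'", PERM_READ | PERM_WRITE, PERM_READ };
    /* blockdev-backup wants to write to the same node. */
    struct user backup = { "backup target", PERM_WRITE, PERM_READ };

    if (conflicts(&guest, &backup)) {
        printf("Conflicts with use by %s, which does not allow 'write'\n",
               guest.name);
    }
    return 0;
}

In this model, blockdev-snapshot-sync and drive-mirror pass because they do not request an unshared write permission on a node another user owns, which matches the behaviour Kevin describes above.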
(In reply to Kevin Wolf from comment #6)
> Yes, this is correct behaviour. There is no reason for blockdev-snapshot-sync
> or drive-mirror to be blocked, because neither command changes the disk
> content that the guest sees.

OK, thanks for the quick reply, Kevin. I will move this bug to VERIFIED.
*** Bug 1455986 has been marked as a duplicate of this bug. ***
Created attachment 1283692: qemu log
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392