Description of problem: 'block-job-cancel' can not cancel a "drive-mirror" job. The job will keep running and get "BLOCK_JOB_READY" and "BLOCK_JOB_COMPLETED" at last. And this is a regression issue for RHEL7.6 Version-Release number of selected component (if applicable): qemu: qemu-kvm-rhev-2.12.0-1.el7.x86_64 kernel: kernel-3.10.0-877.el7.x86_64 How reproducible: 100% Steps to Reproduce: 1. Boot vm. 2. Execute "driver-mirror" with qmp. {"execute": "drive-mirror", "arguments": {"mode": "absolute-paths", "format": "qcow2", "device": "drive_image1", "speed": 0, "sync": "full", "target": "/home/kvm_autotest_root/images/target1.qcow2"}} 3.Execute "block-job-cancel" with qmp. {"execute": "block-job-cancel", "arguments": {"device": "drive_image1"}} qmp command log: {"execute": "drive-mirror", "arguments": {"mode": "absolute-paths", "format": ^^^^^^^^^^^^^^^^^ "qcow2", "device": "drive_image1", "speed": 0, "sync": "full", "target": "/home/kvm_autotest_root/images/target1.qcow2"}} {"return": {}} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 4331208704, "offset": 417923072, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]} {"execute": "block-job-cancel", "arguments": {"device": "drive_image1"}} ^^^^^^^^^^^^^^^^^^^^ {"return": {}} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 4331208704, "offset": 1353318400, ^^^^^^^^^^^^ "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 4331208704, "offset": 1566965760, ^^^^^^^^^^^^^ "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 4331208704, "offset": 3257335808, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]} {"timestamp": {"seconds": 1524901760, "microseconds": 253202}, "event": "BLOCK_JOB_READY", "data": {"device": "drive_image1", "len": 4331208704, ^^^^^^^^^^^^^^^^^^ "offset": 4331208704, "speed": 0, "type": "mirror"}} {"timestamp": {"seconds": 1524901760, "microseconds": 253509}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive_image1", "len": 4331208704, ^^^^^^^^^^^^^^^^^^^^^ "offset": 4331208704, "speed": 0, "type": "mirror"}} Actual results: can not cancel the block job. Expected results: job should be cancelled, and get "BLOCK_JOB_CANCELLED" event. Additional info: qemu command line: /usr/libexec/qemu-kvm \ -S \ -name 'avocado-vt-vm1' \ -sandbox off \ -machine pc \ -nodefaults \ -vga qxl \ -device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x3,chassis_nr=1 \ -device intel-hda,bus=pci.0,addr=0x4 \ -device hda-duplex \ -chardev socket,id=qmp_id_qmp1,path=/var/tmp/avocado_azbh2D/monitor-qmp1-20180427-082517-cs9RLzLn,server,nowait \ -mon chardev=qmp_id_qmp1,mode=control \ -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_azbh2D/monitor-catch_monitor-20180427-082517-cs9RLzLn,server,nowait \ -mon chardev=qmp_id_catch_monitor,mode=control \ -device pvpanic,ioport=0x505,id=idJw3ilG \ -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_azbh2D/serial-serial0-20180427-082517-cs9RLzLn,server,nowait \ -device isa-serial,chardev=serial_id_serial0 \ -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=0x5 \ -chardev socket,path=/var/tmp/avocado_azbh2D/virtio_port-vs-20180427-082517-cs9RLzLn,nowait,id=idyoBE6X,server \ -device virtserialport,id=idgzmqQo,name=vs,bus=virtio_serial_pci0.0,chardev=idyoBE6X \ -object rng-random,filename=/dev/random,id=passthrough-9bzWyTnW \ -device virtio-rng-pci,id=virtio-rng-pci-MUWns1SF,rng=passthrough-9bzWyTnW,bus=pci.0,addr=0x6 \ -chardev socket,id=seabioslog_id_20180427-082517-cs9RLzLn,path=/var/tmp/avocado_azbh2D/seabios-20180427-082517-cs9RLzLn,server,nowait \ -device isa-debugcon,chardev=seabioslog_id_20180427-082517-cs9RLzLn,iobase=0x402 \ -device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \ -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \ -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \ -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \ -device nec-usb-xhci,id=usb2,bus=pci.0,addr=0x7 \ -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x8 \ -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel75-64-virtio-scsi.qcow2 \ -device scsi-hd,id=image1,drive=drive_image1 \ -device virtio-net-pci,mac=9a:8b:8c:8d:8e:8f,id=idcutDVH,vectors=4,netdev=idS6OpFS,bus=pci.0,addr=0x9 \ -netdev tap,id=idS6OpFS,vhost=on \ -m 4096 \ -smp 8,cores=4,threads=1,sockets=2 \ -cpu 'SandyBridge',+kvm_pv_unhalt \ -device usb-tablet,id=usb-tablet1,bus=usb2.0,port=1 \ -vnc :0 \ -rtc base=utc,clock=host,driftfix=slew \ -boot order=cdn,once=c,menu=off,strict=off \ -no-hpet \ -enable-kvm \ -watchdog i6300esb \ -watchdog-action reset \ -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa \ -monitor stdio \
One simpler bz during RHEL7.3 bz1404137, and didn't hit this problem during rhel7.5.
Broken (?) since commit b76e4458b1eb3c32e9824fe6aa51f67d2b251748. I’m not sure yet whether that was intentional, but in any case this behavior is not reflected by the description of block-job-cancel. The description now states that you have to set the @force flag post-READY so that the job is cancelled – without, the job will complete normally. (Before that patch, setting the flag did not have any effect in this case.) But now you have to always set @force to cancel the job, independently of whether that was before READY or not. I think that is not intended, not least because it is an incompatible change. (And it is not documented, so that’s even more reason to believe it wasn’t intentional.) So I will send a patch to revert the pre-READY behavior and we’ll see. Max
Sent an upstream series: http://lists.nongnu.org/archive/html/qemu-devel/2018-05/msg00143.html
Same issue happened in kvm acceptance test on Power: kernel-3.10.0-887.el7.ppc64le qemu-kvm-rhev-2.12.0-2.el7.ppc64le
Merged upstream, pending backport: commit eb36639f7bbc16055e551593b81365e8ae3b0b05 Author: Max Reitz <mreitz> Date: Wed May 2 00:05:08 2018 +0200 block/mirror: Make cancel always cancel pre-READY Commit b76e4458b1eb3c32e9824fe6aa51f67d2b251748 made the mirror block job respect block-job-cancel's @force flag: With that flag set, it would now always really cancel, even post-READY. Unfortunately, it had a side effect: Without that flag set, it would now never cancel, not even before READY. Considering that is an incompatible change and not noted anywhere in the commit or the description of block-job-cancel's @force parameter, this seems unintentional and we should revert to the previous behavior, which is to immediately cancel the job when block-job-cancel is called before source and target are in sync (i.e. before the READY event). Cc: qemu-stable Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1572856 Reported-by: Yanan Fu <yfu> Signed-off-by: Max Reitz <mreitz> Reviewed-by: Eric Blake <eblake> Message-id: 20180501220509.14152-2-mreitz Reviewed-by: Jeff Cody <jcody> Signed-off-by: Jeff Cody <jcody>
Fix included in qemu-kvm-rhev-2.12.0-5.el7
The bug didn't appear in the qemu-kvm-rhev-2.12.0-5.el7 acceptance testing on ppc.
qemu-kvm-rhev-2.12.0-5.el7 acceptance test on x86_64 didn't hit this issue too.
Verify the bug on following sw versions: Host kernel: 3.10.0-915.el7.x86_64 qemu-kvm-rhev: qemu-kvm-rhev-2.12.0-5.el7.x86_64 # nc -U /var/tmp/avocado1 {"QMP": {"version": {"qemu": {"micro": 0, "minor": 12, "major": 2}, "package": "qemu-kvm-rhev-2.12.0-5.el7"}, "capabilities": []}} {"execute":"qmp_capabilities"} {"return": {}} { "execute": "drive-mirror", "arguments": { "device": "drive_image1", "target": "mirror1", "format": "qcow2", "mode": "absolute-paths", "sync": "full" } } {"return": {}} {"execute": "block-job-cancel", "arguments": {"device": "drive_image1"}} {"return": {}} {"timestamp": {"seconds": 1530158982, "microseconds": 498950}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "drive_image1", "len": 15066529792, "offset": 1113980928, "speed": 0, "type": "mirror"}} ...... { "execute": "drive-mirror", "arguments": { "device": "drive_image1", "target": "mirror1", "format": "qcow2", "mode": "absolute-paths", "sync": "full" } } {"return": {}} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 15275786240, "offset": 2719809536, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]} {"timestamp": {"seconds": 1530168785, "microseconds": 973943}, "event": "BLOCK_JOB_READY", "data": {"device": "drive_image1", "len": 15277293568, "offset": 15277293568, "speed": 0, "type": "mirror"}} {"execute": "block-job-complete", "arguments": { "device": "drive_image1"} } {"return": {}} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 18223529984, "offset": 16170024960, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 18840616960, "offset": 16245391360, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]} {"execute":"block-job-cancel","arguments":{"device":"drive_image1"}} {"return": {}} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 22204579840, "offset": 16752508928, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]} {"execute": "query-block-jobs"} {"return": [{"auto-finalize": true, "io-status": "ok", "device": "drive_image1", "auto-dismiss": true, "busy": true, "len": 22683320320, "offset": 16811229184, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]} {"execute":"block-job-cancel","arguments":{"device":"drive_image1","force":true}} {"return": {}} {"timestamp": {"seconds": 1530169235, "microseconds": 839978}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "drive_image1", "len": 24414584832, "offset": 17069113344, "speed": 0, "type": "mirror"}}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3443