Bug 1372763
| Summary: | RHSA-2016-1756 breaks migration of instances | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Karen Noel <knoel> | |
| Component: | qemu-kvm-rhev | Assignee: | Stefan Hajnoczi <stefanha> | |
| Status: | CLOSED ERRATA | QA Contact: | huiqingding <huding> | |
| Severity: | high | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 7.3 | CC: | amedeo.salvati, aperotti, areis, artem, berrange, blake.c.anderson, chayang, c.hendrickson09, cww, dasmith, eglynn, furlongm, huding, jherrman, jmelvin, juzhang, kamfonik, kchamart, knoel, lmiksik, moshele, qizhu, rbryant, sbauza, sferdjao, sgordon, srevivo, stefanha, virt-maint, vromanso | |
| Target Milestone: | rc | Keywords: | Regression, ZStream | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | qemu-kvm-rhev-2.6.0-25.el7 | Doc Type: | Bug Fix | |
| Doc Text: | The fix for CVE-2016-5403 caused migrating guest instances to fail with a "Virtqueue size exceeded" error message. With this update, the value of the virtualization queue is recalculated after the migration, and the described problem no longer occurs. | Story Points: | --- | |
| Clone Of: | 1371943 | |||
| : | 1374623 1376542 (view as bug list) | Environment: | ||
| Last Closed: | 2016-11-07 21:33:27 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1371943, 1374364, 1374365, 1374366, 1374367, 1374368, 1374369, 1374623, 1376542 | |||
|
Description
Karen Noel
2016-09-02 15:23:40 UTC
Here's a brew build:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11696939

backporting two patches from 2.7:

    virtio: decrement vq->inuse in virtqueue_discard()
    virtio: recalculate vq->inuse after migration

Can we get a confirmation on whether this fixes the issues?

I have posted a backport for RHEL 7.2.z similar to Michael's:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11706790

Additional test scenarios:

1. virtio-balloon stats virtqueue test

    $ qemu-img create -f qcow2 -b test.img test.qcow2
    $ qemu-system-x86_64 -enable-kvm -m 1024 -cpu host -drive if=virtio,cache=none,format=qcow2,file=test.qcow2 -device virtio-balloon-pci,id=virtio-balloon0 -S
    (qemu) qom-set virtio-balloon guest-stats-polling-interval 5
    (qemu) c
    ...let it boot and log in on the console...
    (qemu) savevm
    (qemu) quit
    $ qemu-system-x86_64 -enable-kvm -m 1024 -cpu host -drive if=virtio,cache=none,format=qcow2,file=test.qcow2 -device virtio-balloon-pci,id=virtio-balloon0 -S
    (qemu) qom-set virtio-balloon guest-stats-polling-interval 5
    (qemu) loadvm 1
    (qemu) c
    $ rm test.qcow2

Expected behavior: Guest state is loaded and resumes successfully.

Actual behavior: "Virtqueue size exceeded" error from QEMU and the guest is terminated after the 'c' monitor command is issued.

2. virtio-blk s->rq test

    $ sudo qemu-img create -f qcow2 /dev/testvg/testlv 10G
    shell1$ sudo qemu-system-x86_64 -enable-kvm -m 1024 -cpu host -drive if=virtio,cache=none,format=raw,file=rhel72.img -drive if=virtio,cache=none,format=qcow2,file=/dev/testvg/testlv,werror=stop
    guest# dd if=/dev/zero of=/dev/vdb oflag=direct bs=4k
    shell2$ sudo qemu-system-x86_64 -enable-kvm -m 1024 -cpu host -drive if=virtio,cache=none,format=raw,file=rhel72.img -drive if=virtio,cache=none,format=qcow2,file=/dev/testvg/testlv,werror=stop -incoming tcp::1234
    (qemu1) migrate tcp:127.0.0.1:1234
    $ sudo lvresize -L +4M /dev/testvg/testlv
    (qemu2) c

Expected behavior: Guest resumes successfully after 'c' monitor command is issued on destination QEMU.

Actual behavior: "Virtqueue size exceeded" error from destination QEMU and guest is terminated after the 'c' monitor command is issued.

Fix included in qemu-kvm-rhev-2.6.0-25.el7

Thanks, Stefan.

Reproduce this bug using the following version:
kernel-3.10.0-505.el7.x86_64
qemu-kvm-rhev-2.6.0-24.el7.x86_64

Reproduce steps:

1. create a 4M lv

    # pvcreate /dev/sdg
    # vgcreate testvg /dev/sdg
    # lvcreate -L 4M -T testvg/testlv
    # lvs
      LV     VG                  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
      home   rhel_hp-dl380pg8-09 -wi-ao---- 212.61g
      root   rhel_hp-dl380pg8-09 -wi-ao----  50.00g
      swap   rhel_hp-dl380pg8-09 -wi-ao----  15.75g
      testlv testvg              twi-a-tz--   4.00m             0.00   0.88

2. create a data disk image based on the above lv

    # qemu-img create -f qcow2 /dev/testvg/testlv 10G

3. boot a rhel7.3 guest with the above data disk image

    # /usr/libexec/qemu-kvm \
        -S \
        -name 'rhel7.3' \
        -machine pc-i440fx-rhel7.3.0 \
        -m 4096 \
        -smp 4,maxcpus=4,sockets=1,cores=4,threads=1 \
        -cpu SandyBridge \
        -rtc base=localtime,clock=host,driftfix=slew \
        -nodefaults \
        -boot menu=on \
        -enable-kvm \
        -monitor stdio \
        -spice port=5900,disable-ticketing \
        -drive file=/home/rhel7.3.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
        -device scsi-hd,drive=drive_sysdisk,bus=scsi_pci_bus0.0,id=device_sysdisk,bootindex=1 \
        -drive if=none,cache=none,format=qcow2,file=/dev/testvg/testlv,werror=stop,id=drive-virtio-disk0 \
        -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk0,id=virtio-disk0

4. on the same host, use the same command line with "-incoming tcp:0:5800", boot the rhel7.3 guest

5. inside guest

    # dd if=/dev/zero of=/dev/vdb oflag=direct bs=4k

6. after guest is paused with io-error, do migration

    (qemu) info status
    VM status: paused (io-error)
    (qemu) migrate -d tcp:0:5800

7. on host, grow the logical volume by 4 MB

    # lvresize -L +4M /dev/testvg/testlv

8. in destination, resume the guest

    (qemu) c

After step 8, "Virtqueue size exceeded" error from destination QEMU and qemu-kvm quits.

Verify this bug using the following version:
kernel-3.10.0-505.el7.x86_64
qemu-kvm-rhev-2.6.0-25.el7.x86_64

Do the above test, after step 8, destination qemu-kvm did not quit and guest can resume normally.

Based on comment #11, set this bug to be verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html
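
Both backported patches named in this report concern vq->inuse, the count of virtqueue elements that the device has popped but not yet returned to the guest. The Python sketch below is a simplified model, not QEMU's code, of how a stale count could trip a "Virtqueue size exceeded" check after migration and how recomputing it from the ring indices avoids that. It rests on several assumptions that this report does not state: that the count is not carried in the migration stream and so starts at 0 on the destination, that it is an unsigned 32-bit value, and that the ring indices are free-running 16-bit counters. The names recalc_inuse, complete_one, and can_pop are hypothetical.

```python
U16 = 0xFFFF        # assumed width of the ring indices
U32 = 0xFFFFFFFF    # assumed width of the in-use counter


def recalc_inuse(last_avail_idx, used_idx, vring_num):
    """Rebuild the in-flight count from two ring indices.

    The subtraction is done modulo 2**16 so it stays correct when the
    indices have wrapped around.
    """
    inuse = (last_avail_idx - used_idx) & U16
    if inuse > vring_num:
        raise ValueError("inconsistent indices: %d in flight, ring size %d"
                         % (inuse, vring_num))
    return inuse


def complete_one(inuse):
    """Model an unsigned decrement when one in-flight request completes."""
    return (inuse - 1) & U32


def can_pop(inuse, vring_num):
    """Model the size check: popping is refused once inuse reaches ring size."""
    return inuse < vring_num


if __name__ == "__main__":
    ring_size = 128
    # One request was in flight when the guest state was saved.
    last_avail_idx, used_idx = 10, 9

    # Without recalculation: the counter starts at 0, the in-flight request
    # completes, and the unsigned counter wraps to a huge value.
    stale = complete_one(0)
    print("stale counter:", stale, "can pop:", can_pop(stale, ring_size))

    # With recalculation: the counter is rebuilt first, so the completion
    # brings it back to 0 and popping keeps working.
    fixed = complete_one(recalc_inuse(last_avail_idx, used_idx, ring_size))
    print("recalculated counter:", fixed, "can pop:", can_pop(fixed, ring_size))
```

In this model both test scenarios above have the same shape: a request (a balloon stats buffer, or a virtio-blk request stopped by werror=stop) is still outstanding when the guest state is saved, and it completes only after the guest resumes on the destination.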