Bug 1374623
Summary: | RHSA-2016-1756 breaks migration of instances | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Marcel Kolaja <mkolaja> |
Component: | qemu-kvm-rhev | Assignee: | Stefan Hajnoczi <stefanha> |
Status: | CLOSED ERRATA | QA Contact: | huiqingding <huding> |
Severity: | high | Docs Contact: | |
Priority: | urgent | ||
Version: | 7.3 | CC: | amedeo.salvati, aperotti, areis, berrange, blake.c.anderson, chayang, c.hendrickson09, cww, dasmith, eglynn, furlongm, huding, ipetrova, jboggs, jen, jherrman, jmelvin, juzhang, kamfonik, kchamart, knoel, lhh, lijin, lmiksik, moshele, qizhu, rbryant, sbauza, sferdjao, sgordon, sknauss, srevivo, stefanha, virt-maint, vromanso |
Target Milestone: | rc | Keywords: | Regression, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.3.0-31.el7_2.22 | Doc Type: | Bug Fix |
Doc Text: |
The fix for CVE-2016-5403 caused migration of guest instances to fail with a "Virtqueue size exceeded" error message. With this update, the number of in-flight requests in the virtqueue is recalculated after migration, and the described problem no longer occurs.
|
Story Points: | --- |
Clone Of: | 1372763 | Environment: | |
Last Closed: | 2016-11-17 15:01:23 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1372763, 1376542 | ||
Bug Blocks: |
Description
Marcel Kolaja
2016-09-09 08:50:04 UTC
Fix included in qemu-kvm-rhev-2.6.0-25.el7

Fix included in qemu-kvm-rhev-2.3.0-31.el7_2.22

With qemu-kvm-rhev-2.3.0-31.el7_2.22.x86_64, I still hit the "Virtqueue size exceeded" error when migrating during disk (blk) I/O stress.

Steps:
1. Boot a win8-32 guest with a virtio-blk-pci device:
   -object iothread,id=thread0 -drive file=win8-32-rhel7u2.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device virtio-blk-pci,iothread=thread0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
2. Run CrystalDiskMark in the guest.
3. Do migration.

Actual result:
Migration failed with:
src: (qemu) qemu-kvm: Virtqueue size exceeded
dst: (qemu) qemu-kvm: error while loading state section id 2(ram)
     qemu-kvm: load of migration failed: Input/output error

(In reply to lijin from comment #7)
I was able to reproduce this problem with a Linux guest running fio. It is a different bug, since the error happens on the source QEMU while the patch for this BZ fixes the destination QEMU.

(In reply to Stefan Hajnoczi from comment #8)
Thanks, I will report a new bug to track it.
This issue is reproduced on rhel7.2-z (qemu-kvm-rhev-2.3.0-31.el7_2.22.x86_64) and can NOT be reproduced with the latest rhel7.3 version (qemu-kvm-rhev-2.6.0-26.el7.x86_64).

Since a Z stream bug should be cloned from a Y stream bug, I'm a little confused about how to handle it.

(In reply to lijin from comment #10)
Right, code inspection shows that RHEL 7.3 and upstream QEMU do not suffer from this race condition. So we need a 7.2.z-only BZ.

I don't know the process either, but I've asked on IRC. Will update the BZ when I receive an answer.

(In reply to Stefan Hajnoczi from comment #11)
09:46 < mrezanin> stefanha: As usual...create normal bz (for both z-stream and y-stream). Then we negotiate clone and after it y-stream is closed with proper marking and z-stream is solved
09:48 < stefanha> mrezanin: Does this mean: create a BZ with both y-stream and z-stream flags, then leave a comment saying it's relevant for z-stream only, close it as NOTABUG?
09:48 < mrezanin> stefanha: Yes...but close after cloning

I will create the BZ and CC you.

Reproduce this bug using the following versions:
kernel-3.10.0-327.37.1.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64

Reproduce steps:

1. Create a 4M thin LV:
# pvcreate /dev/sdg
# vgcreate testvg /dev/sdg
# lvcreate -L 4M -T testvg/testlv
# lvs
  LV     VG                  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home   rhel_hp-dl380pg8-09 -wi-ao---- 212.61g
  root   rhel_hp-dl380pg8-09 -wi-ao----  50.00g
  swap   rhel_hp-dl380pg8-09 -wi-ao----  15.75g
  testlv testvg              twi-a-tz--   4.00m      0.00   0.88

2. Create a data disk image based on the above LV:
# qemu-img create -f qcow2 /dev/testvg/testlv 10G

3. Boot a rhel7.3 guest with the above data disk image:
# /usr/libexec/qemu-kvm \
  -S \
  -name 'rhel7.3' \
  -machine pc \
  -m 4096 \
  -smp 4,maxcpus=4,sockets=1,cores=4,threads=1 \
  -cpu SandyBridge \
  -rtc base=localtime,clock=host,driftfix=slew \
  -nodefaults \
  -boot menu=on \
  -enable-kvm \
  -monitor stdio \
  -drive file=/mnt/rhel7.3.raw,format=raw,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive_sysdisk,bootindex=1 \
  -drive if=none,cache=none,format=qcow2,file=/dev/testvg/testlv,werror=stop,id=drive-virtio-disk0 \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk0,id=virtio-disk0 \
  -vga qxl \
  -spice port=5900,disable-ticketing

4. On the same host, boot the rhel7.3 guest with the same command line plus "-incoming tcp:0:5800".

5. Inside the guest:
# dd if=/dev/zero of=/dev/vdb oflag=direct bs=4k

6. After the guest is paused with an I/O error, do migration:
(qemu) info status
VM status: paused (io-error)
(qemu) migrate -d tcp:0:5800

7. On the host, grow the logical volume by 4 MB:
# lvresize -L +4M /dev/testvg/testlv

8. On the destination, resume the guest:
(qemu) c

After step 8, the destination QEMU prints "Virtqueue size exceeded" and qemu-kvm quits.

Verify this bug using the following versions:
kernel-3.10.0-327.37.1.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.22.x86_64

With the same test, after step 8, the destination qemu-kvm did not quit and the guest resumed normally.

Based on comment #13, set this bug to be verified.
Hey guys,

Since qemu-kvm-rhev-2.3.0-31.el7_2.22.x86_64 obviously passed QA (c#13, c#14), when can we expect it? I have another customer asking for it.

Regards,
Irina

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2803.html