Bug 1380320

Summary: Race condition during virtio-blk dataplane stop triggers "Virtqueue size exceeded"
Product: Red Hat Enterprise Linux 7 Reporter: Marcel Kolaja <mkolaja>
Component: qemu-kvm-rhev Assignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA QA Contact: huiqingding <huding>
Severity: high Docs Contact:
Priority: high    
Version: 7.2 CC: areis, chayang, huding, jen, jherrman, juzhang, knoel, lijin, mas-hatada, mrezanin, mst, stefanha, taosawa, virt-maint
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.3.0-31.el7_2.23 Doc Type: Bug Fix
Doc Text:
Due to a race condition in the virtio-blk dataplane, live migration of a guest in some cases failed with a "Virtqueue size exceeded" error message. This update prevents the race condition from occurring, and thus allows live migration to work more reliably.
Story Points: ---
Clone Of: 1378788 Environment:
Last Closed: 2016-11-17 15:02:17 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1378788    
Bug Blocks:    

Description Marcel Kolaja 2016-09-29 09:45:20 UTC
This bug has been copied from bug #1378788 and has been proposed
to be backported to 7.2 z-stream (EUS).

Comment 3 Miroslav Rezanina 2016-10-14 06:34:05 UTC
Fix included in qemu-kvm-rhev-2.3.0-31.el7_2.23

Comment 5 huiqingding 2016-10-19 08:37:58 UTC
I can reproduce this bug with a 32-bit Windows 8 guest: migrating the guest while CrystalDiskMark generates I/O stress fails with "Virtqueue size exceeded". However, I cannot reproduce it with a RHEL 7.3 guest running fio as described in bz1378788 comment 0.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-31.el7_2.22

Steps to reproduce:
1. Boot the guest with a virtio-blk-pci device on the source host:
-object iothread,id=thread0 -drive file=win8-32.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device virtio-blk-pci,iothread=thread0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1

2. Boot the guest on the destination host:
-object iothread,id=thread0 -drive file=win8-32.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device virtio-blk-pci,iothread=thread0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -incoming tcp:0:5800

3. Start CrystalDiskMark inside the guest

4. Migrate the guest
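
The report does not show the command used for step 4. A minimal sketch of how the migration could be started from the source host's QEMU monitor, assuming the destination host is reachable under the placeholder name DEST_HOST (not given in the report) and listens on port 5800 as configured by -incoming in step 2:

```
(qemu) migrate -d tcp:DEST_HOST:5800
(qemu) info migrate
```

With -d the migrate command returns to the monitor prompt immediately, and "info migrate" can then be used to check the migration status.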

Results:
After step 4, migration failed:
source host: (qemu) qemu-kvm: Virtqueue size exceeded
destination host: (qemu) qemu-kvm: Unknown combination of migration flags: 0
qemu-kvm: error while loading state section id 2(ram)
qemu-kvm: load of migration failed: Invalid argument

Comment 6 huiqingding 2016-10-20 06:49:39 UTC
Tested this bug using qemu-kvm-rhev-2.3.0-31.el7_2.23.

Repeated the test from comment #5: after step 4, migration finishes normally and the guest works well on the destination side.

Repeated the test from bz1378788 comment 0: migrated more than 10 times. Every migration succeeded and the fio benchmark kept running in the RHEL 7.2.z guest.

Comment 11 errata-xmlrpc 2016-11-17 15:02:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2803.html