Bug 1520824

Summary: Migration with dataplane, qemu processor hang, vm hang and migration can't finish
Product: Red Hat Enterprise Linux 7 Reporter: xianwang <xianwang>
Component: qemu-kvm-rhev Assignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED ERRATA QA Contact: xianwang <xianwang>
Severity: high Docs Contact:
Priority: high    
Version: 7.5 CC: chayang, coli, dgilbert, gveitmic, hhuang, jasowang, juzhang, knoel, lmiksik, lvivier, michen, mkalinin, mrezanin, ngu, peterx, quintela, qzhang, stefanha, toneata, virt-maint, yilzhang
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.10.0-14.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1544738 Environment:
Last Closed: 2018-04-11 00:52:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1477664, 1544738    

Description xianwang 2017-12-05 09:16:11 UTC
Description of problem:
This bug was found while reproducing https://bugzilla.redhat.com/show_bug.cgi?id=1506151#c7. Boot a VM with two SCSI disks that use dataplane and start a migration: the qemu process hangs on the src host, the VM hangs, the migration cannot finish, and the dst host reports "VM status: paused (inmigrate)".

Version-Release number of selected component (if applicable):
x86:
3.10.0-792.el7.x86_64
qemu-kvm-rhev-2.10.0-10.el7.x86_64
seabios-bin-1.11.0-1.el7.noarch

How reproducible:
3/3

Steps to Reproduce:
1. Boot a guest on the src host with two SCSI disks that use dataplane:
/usr/libexec/qemu-kvm -nodefaults \
    -object iothread,id=iothread0 \
    -device virtio-scsi-pci,bus=pci.0,addr=0x1f,id=scsi0,iothread=iothread0 \
    -drive file=/home/xianwang/rhel75.qcow2,media=disk,if=none,cache=none,id=drive_sysdisk,aio=native,format=qcow2,werror=stop,rerror=stop \
    -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \
    -drive file=/home/xianwang/r1.qcow2,if=none,cache=none,id=drive_ddisk_2,aio=native,format=qcow2,werror=stop,rerror=stop \
    -device scsi-hd,drive=drive_ddisk_2,bus=scsi0.0,id=ddisk_2 \
    -monitor stdio -vga std -vnc :1 -m 4096

2. On the dst host, boot a guest with the same qemu command line as on src, appending -incoming tcp:0:5801

3. On the src host, start the migration:
(qemu) migrate -d tcp:10.66.10.208:5801
(qemu) info status 
VM status: running
(qemu) info migrate
Migration status: active
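
When repeating the steps above, the monitor replies can be checked by a script rather than by eye. The following is a minimal sketch, assuming the monitor text has already been captured as a string. The function names are hypothetical; only the "VM status: ..." and "Migration status: ..." line formats are taken from the output quoted in this report.

```python
import re

# Matches the two status lines quoted in this report, with or without a
# leading "(qemu) " prompt in front of them.
_STATUS_LINE = re.compile(
    r"^(?:\(qemu\)\s*)?(VM status|Migration status):\s*(.+?)\s*$",
    re.MULTILINE,
)


def parse_monitor_status(text):
    """Return a dict of the status lines found in captured monitor text."""
    return {key: value for key, value in _STATUS_LINE.findall(text)}


def dst_waiting_for_migration(text):
    """True if the destination still reports "paused (inmigrate)"."""
    return parse_monitor_status(text).get("VM status") == "paused (inmigrate)"
```

A destination that keeps returning True here is consistent with the hang described in this bug, but it is not proof of one: a slow migration that is still making progress looks the same.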


Actual results:
On src, qemu hangs. On dst, the VM hangs in VNC and the status is "VM status: paused (inmigrate)".
After killing the qemu process on src, the following messages appear on the dst host.

on src:
(qemu) Killed

on dst:
(qemu) qemu-kvm: Failed to load virtio_pci/modern_state:modern_state
qemu-kvm: Failed to load virtio/extra_state:extra_state
qemu-kvm: Failed to load virtio-scsi:virtio
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.0/virtio-scsi'
qemu-kvm: warning: TSC frequency mismatch between VM (3392293 kHz) and host (3392167 kHz), and TSC scaling unavailable
qemu-kvm: load of migration failed: Input/output error
 

Expected results:
Migration finishes and the VM works well on the dst host.

Additional info:
This issue occurs on both x86 and ppc.
On ppc (p8), after killing the qemu process on src, the error messages on the dst host are:
(qemu) qemu-kvm: Failed to load virtio_pci/modern_state:modern_state
qemu-kvm: Failed to load virtio/extra_state:extra_state
qemu-kvm: Failed to load virtio-scsi:virtio
qemu-kvm: error while loading state for instance 0x0 of device 'pci@800000020000000:1f.0/virtio-scsi'
qemu-kvm: load of migration failed: Input/output error
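
The destination-side messages quoted in this report follow the same pattern on x86 and ppc, so the failing device can be extracted mechanically when comparing runs. This is a minimal sketch, assuming the log has been captured as text. The function name and the returned keys are hypothetical; the regular expressions are based only on the message wording shown above.

```python
import re

# Patterns derived from the error lines quoted in this report.
_DEVICE = re.compile(
    r"error while loading state for instance (0x[0-9a-fA-F]+) of device '([^']+)'"
)
_FAILED_SECTION = re.compile(r"Failed to load (\S+)")


def summarize_load_failure(log_text):
    """Collect the failing device and sections from destination qemu output."""
    device = _DEVICE.search(log_text)
    return {
        "instance": device.group(1) if device else None,
        "device": device.group(2) if device else None,
        "failed_sections": _FAILED_SECTION.findall(log_text),
        "migration_failed": "load of migration failed" in log_text,
    }
```

For both logs in this report the reported device is the virtio-scsi controller at slot 1f.0, which is the controller attached to iothread0 in the reproduction command line.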

Comment 2 Stefan Hajnoczi 2017-12-06 18:02:10 UTC
Paolo's patch has been posted upstream:
https://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg00964.html

Comment 3 Dr. David Alan Gilbert 2017-12-13 11:31:04 UTC
I see you've got a v2 at:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg01239.html

Comment 6 Miroslav Rezanina 2018-01-02 14:19:53 UTC
Fix included in qemu-kvm-rhev-2.10.0-14.el7

Comment 8 yilzhang 2018-01-03 07:24:47 UTC
Verified against the following version of qemu-kvm-rhev:
qemu-kvm-rhev-2.10.0-14.el7


Src host and dst host both run:
Kernel:  3.10.0-823.el7.x86_64
seabios: seabios-bin-1.11.0-1.el7.noarch

Verification steps: the same as the reproduction steps in this bug report

Actual results:
Migration succeeds, and VM works well on destination side after migration

So this bug is fixed.

Comment 12 Germano Veit Michel 2018-02-12 00:06:52 UTC
*** Bug 1542305 has been marked as a duplicate of this bug. ***

Comment 18 errata-xmlrpc 2018-04-11 00:52:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104