Bug 869981 - Cross version migration between different host with spice is broken
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Alon Levy
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On:
Blocks: 890050
Reported: 2012-10-25 05:44 EDT by Qunfang Zhang
Modified: 2014-08-04 18:09 EDT

See Also:
Fixed In Version: qemu-kvm-0.12.1.2-2.353.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-21 02:43:57 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Qunfang Zhang 2012-10-25 05:44:31 EDT
Description of problem:
Cross-version migration (for example from a 6.3 host to a 6.4 host) with spice is broken. I'm using -M rhel6.3.0 on both sides, but loading the migration state fails on the destination host. This has actually been broken for a long time (e.g. 6.0<->6.1), but it would be good to support it, as it is a very common scenario.

Version-Release number of selected component (if applicable):
RHEL6.3 host:
kernel-2.6.32-279.15.1.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6_3.5.x86_64
seabios-0.6.1.2-19.el6.x86_64
spice-server-0.10.1-10.el6.x86_64

RHEL6.4 host:
kernel-2.6.32-336.el6.x86_64
qemu-kvm-0.12.1.2-2.331.el6.x86_64
seabios-0.6.1.2-25.el6.x86_64
spice-server-0.12.0-1.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Boot guest on rhel6.3 host with spice:

 /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu Conroe -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name win2k8-r2 -uuid b6d6f013-2940-4124-946a-13da591a3fba -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=b6d6f013-2940-4124-946a-13da591a3fba -k en-us -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0 -drive file=/var/lib/libvirt/migrate/win2k8-r2-6.3.qcow2,if=none,id=disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=disk0,id=disk0,scsi=off,bus=pci.0,addr=0x3 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1A:1A:4A:25:28,bus=pci.0,addr=0x4 -netdev tap,id=hostnet1 -device e1000,netdev=hostnet1,id=net1,mac=00:1A:1A:4A:25:20,bus=pci.0,addr=0x5 -netdev tap,id=hostnet2 -device rtl8139,netdev=hostnet2,id=net2,mac=00:1A:1A:4A:25:00,bus=pci.0,addr=0x6  -monitor stdio -qmp tcp:0:6666,server,nowait -chardev socket,path=/tmp/isa-serial,server,nowait,id=isa1 -device isa-serial,chardev=isa1,id=isa-serial1 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -chardev socket,id=charchannel0,path=/tmp/serial-socket,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -chardev socket,path=/tmp/foo,server,nowait,id=foo -device virtconsole,chardev=foo,id=console0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -spice port=5930,disable-ticketing -global qxl-vga.vram_size=33554432 -k en-us -vga qxl -boot c

2. Boot the guest with the same command line on the rhel6.4 host, adding -incoming tcp:0:5800 (-M rhel6.3.0 as well).

3. Migrate the guest to the destination host.
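
Step 3 can also be driven over the QMP socket that the command line in step 1 opens (-qmp tcp:0:6666). The following is a minimal sketch, assuming the standard QMP commands qmp_capabilities, migrate and query-migrate are available in this qemu-kvm build. The host name "dst-host" and both helper names are hypothetical and not part of the original report:

```python
import json
import socket


def build_migrate_messages(dst_host, dst_port=5800):
    """Return the QMP messages that start a migration and query its status.

    dst_port defaults to 5800, the port used with -incoming in step 2.
    """
    uri = "tcp:%s:%d" % (dst_host, dst_port)
    return [
        {"execute": "qmp_capabilities"},  # leave capability negotiation mode
        {"execute": "migrate", "arguments": {"uri": uri}},
        {"execute": "query-migrate"},
    ]


def send_messages(qmp_host, qmp_port, messages):
    """Send each message over a QMP TCP socket and collect one line per reply.

    Assumption: every command is answered by exactly one JSON line. QMP can
    also emit asynchronous events, which this sketch does not filter out.
    """
    replies = []
    with socket.create_connection((qmp_host, qmp_port)) as sock:
        stream = sock.makefile("rw")
        stream.readline()  # discard the QMP greeting banner
        for message in messages:
            stream.write(json.dumps(message) + "\n")
            stream.flush()
            replies.append(json.loads(stream.readline()))
    return replies
```

On the source host this could be called as send_messages("localhost", 6666, build_migrate_messages("dst-host")). The failure itself still shows up on the destination monitor, as in the actual results below.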

  
Actual results:
(qemu) [Thread 0x7ffff0094700 (LWP 23602) exited]
qemu: warning: error while loading state for instance 0x0 of device 'ram'

Expected results:
Migration should succeed.

Additional info:
Comment 2 Orit Wasserman 2012-10-25 06:15:24 EDT
Live migration from 6.3 to 6.4 with spice and qxl (excluding seamless migration) should be part of the test plan for RHEL 6.4.
Comment 3 Orit Wasserman 2012-10-25 06:21:18 EDT
Could you try without -global qxl-vga.vram_size=33554432?
Comment 4 Qunfang Zhang 2012-10-25 06:28:22 EDT
(In reply to comment #3)
> Could you try without global qxl-vga.vram_size=33554432 ?

Orit
It has the same result without "-global qxl-vga.vram_size=33554432".

#/usr/libexec/qemu-kvm ..... -spice port=5930,disable-ticketing -vga qxl -incoming tcp:0:5800
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) qemu: warning: error while loading state for instance 0x0 of device 'ram'
Comment 5 Orit Wasserman 2012-10-25 06:50:38 EDT
By the way, did you test with vnc?
Comment 6 Qunfang Zhang 2012-10-25 22:23:23 EDT
(In reply to comment #5)
> by the way did you test with vnc ?

Cross migration with vnc finished successfully.
Comment 7 Alon Levy 2012-12-04 10:42:44 EST
I couldn't reproduce this when running the source VM on a qemu-kvm built from source (0.12.0-295), and when using the package qemu-kvm-0.12.1.2-2.295.el6_3.5 I cannot use the Conroe cpu:

 Unknown cpu model: Conroe

So are you sure that's the source qemu version & command line used?

Alon
Comment 8 Alon Levy 2012-12-04 12:19:05 EST
OK, I can reproduce it after changing the cpu model to qemu64.

QEMU 0.12.1 monitor - type 'help' for more information
(qemu) qemu: warning: error while loading state for instance 0x0 of device 'ram'
load of migration failed
Comment 9 Alon Levy 2012-12-05 03:47:41 EST
Forgot to remove the needinfo.
Comment 10 Qunfang Zhang 2012-12-05 03:53:36 EST
(In reply to comment #9)
> Forgot to remove the needinfo.

Alon, hah, it seems you added the needinfo again. I guess you have already reproduced this issue, so I'm clearing it now. Please feel free to add a comment if there's anything I can do.
Comment 11 Qunfang Zhang 2012-12-06 03:31:48 EST
Hi, Alon
Cross migration works between rhel6.1<->rhel6.3 and rhel6.2<->rhel6.3, so this is a regression.
See bug 698936, which was fixed in rhel6.3.
Comment 12 Alon Levy 2012-12-06 05:08:21 EST
Thanks for the clue. I spent yesterday writing yet another script to automate the checks for a bisection, but stepping through the code turned out to be faster (I tried that after you mentioned bug 698936). What remains is to find exactly where we set that size (and to add this message to upstream/rhel for the future):

DEBUG: ram
length mismatch: 0000:00:02.0/qxl.vrom: 8192 in != 16384
qemu: warning: error while loading state for instance 0x0 of device 'ram'
load of migration failed
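
The debug line above suggests that the destination rejects a RAM block whose incoming length differs from its own. The sketch below is an illustrative model of such a check plus a small parser for the message. It is not QEMU's code, and the function names are hypothetical. It assumes that the first number (8192, "in") is the length sent by the 6.3 source and the second (16384) is the length on the 6.4 destination:

```python
import re


def check_ram_blocks(incoming, local):
    """Compare incoming RAM block lengths with the destination's blocks.

    Both arguments map a block id to its length in bytes. Returns one
    message per mismatch, shaped like the debug line quoted above.
    """
    errors = []
    for block_id, length_in in incoming.items():
        length_local = local.get(block_id)
        if length_local is not None and length_local != length_in:
            errors.append(
                "length mismatch: %s: %d in != %d"
                % (block_id, length_in, length_local)
            )
    return errors


MISMATCH_RE = re.compile(
    r"length mismatch: (?P<id>\S+): (?P<incoming>\d+) in != (?P<local>\d+)"
)


def parse_mismatch(line):
    """Extract (block id, incoming length, local length), or None."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return match.group("id"), int(match.group("incoming")), int(match.group("local"))
```

With the values from this bug, check_ram_blocks({"0000:00:02.0/qxl.vrom": 8192}, {"0000:00:02.0/qxl.vrom": 16384}) reports the qxl.vrom block, and parse_mismatch can pull the same fields back out of a destination log for automated checks.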
Comment 19 Qunfang Zhang 2013-01-03 21:13:10 EST
Hi, Alon
Do you know when the build will be ready? rhel6.4 is at a late stage, and QE needs to verify this issue and run some additional tests for it.


Thanks,
Qunfang
Comment 21 Ademar Reis 2013-01-17 15:27:45 EST
Please test the brew-build provided by Alon for bug 876982:
https://bugzilla.redhat.com/show_bug.cgi?id=876982#c26

Alon, I don't see a downstream version of the patch, so it shouldn't be in POST. Please submit a patch to rhvirt-patches and ask for the 3 acks.
Comment 22 Qunfang Zhang 2013-01-17 21:41:36 EST
(In reply to comment #21)
> Please test the brew-build provided by Alon for bug 876982:
> https://bugzilla.redhat.com/show_bug.cgi?id=876982#c26

Ok, will dive into it soon and update the result here. Thanks. 
> 
> Alon, I don't see a downstream version of the patch, so it shouldn't be in
> POST. Please submit a patch to rhvirt-patches and ask for the 3 acks.
Comment 23 Qunfang Zhang 2013-01-18 03:21:28 EST
Hi, Ademar and Alon
I just tested this bug with the build from bug 876982#c26, and it can NOT be reproduced: cross migration between rhel6.3 and rhel6.4 hosts works well with spice+qxl.
As Alon said in bug 876982, that build has no direct relation to this bug, so is there another patch for this bug itself?
When can the patch get into the official build?


Thanks,
Qunfang
Comment 24 Ademar Reis 2013-01-22 15:54:21 EST
Just to clarify and make sure we're in sync, the fix for this bug is:

"""
Date: Tue, 22 Jan 2013 19:33:21 +0200
From: Alon Levy <alevy@redhat.com>
Subject: [PATCHv2 RHEL-6.4 qemu-kvm 0/2] fix qxl migration revision 4 bug
To: rhvirt-patches@redhat.com
Cc: hdegoede@redhat.com, mlureau@redhat.com, armbru@redhat.com, kraxel@redhat.com

v1->v2: (Markus +)
 remove unwanted whitespace fix
 disable added trace events by default like all the rest
 fill in commit message for "stop using non revision 4 rom fields"
 remove a non related trace event that got in there (there is no
   interface_client_monitors_config in RHEL 6.4)

Alon Levy (2):
  qxl: stop using non revision 4 rom fields for revision < 4
  qxl: change rom size to 8192

 hw/qxl.c     | 22 +++++++++++++++-------
 trace-events |  1 +
 2 files changed, 16 insertions(+), 7 deletions(-)
"""

v2 got an ACK from Markus, but is missing the ACKs from v1 (Gerd and Hans).
Comment 25 Qunfang Zhang 2013-01-22 21:38:27 EST
Hi, Gerd and Hans
Per Ademar's comment above, could you please ack v2, as you did for v1? The fix is planned to be included in snapshot 5. It is urgent for QE, as we need to verify this bug and also bug 733302, which is in the qemu-kvm errata. QE also needs to arrange functional testing for "stable guest abi" and "compatibility", which will take several days to finish once we get the official build.

Thanks!
Qunfang
Comment 26 Qunfang Zhang 2013-01-22 21:48:44 EST
If this misses the rhel6.4 train, cross-version migration with spice from a rhel6.4 host to an older rhel host will be broken. That is a very common scenario. We now have the patch after a lot of effort, so please help give it a push and let it proceed. Thanks a lot.
Comment 27 Hans de Goede 2013-01-23 03:31:56 EST
Acked v2.
Comment 35 errata-xmlrpc 2013-02-21 02:43:57 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0527.html
