Bug 1113009

Summary: Migration failed with virtio-blk from RHEL6.5.0 host to RHEL7.0 host
Product: Red Hat Enterprise Linux 7
Reporter: huiqingding <huding>
Component: qemu-kvm
Assignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 7.1
CC: coli, dgilbert, hhuang, huding, juzhang, michen, mrezanin, qzhang, rbalakri, scui, virt-maint, xfu, xuhan
Target Milestone: rc
Keywords: Regression, TestBlocker
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-1.5.3-65.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-05 08:10:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description huiqingding 2014-06-25 09:10:36 UTC
Description of problem:
When migrating a guest with a virtio-blk device from a RHEL6.5.0 host to a RHEL7.0 host, the dst qemu-kvm quits with an error.

Version-Release number of selected component (if applicable):
RHEL6.5.0 host:
qemu-kvm-0.12.1.2-2.415.el6_5.11.x86_64
kernel-2.6.32-431.22.1.el6.x86_64
RHEL7.0 host:
qemu-img-1.5.3-63.el7.x86_64
kernel-3.10.0-123.4.2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. boot a guest on RHEL6.5.0 host as src host
# /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/win7-64.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 

2. boot the guest on RHEL7.0 host as dst host
# /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/win7-64.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -incoming tcp:0:5800

3. do migration from RHEL6.5.0 host to RHEL7.0 host
(qemu) migrate -d tcp:10.66.8.248:5800
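
Note on steps 1 and 2: the two command lines are identical except for the trailing -incoming option on the dst host. A minimal sketch of that relationship follows, using a shortened copy of the step 1 command. The helper name make_dst_argv and the abbreviated SRC_CMD are illustrative only and are not part of the original report.

```python
import shlex

# Shortened copy of the step 1 command line (most options omitted for brevity).
SRC_CMD = (
    "/usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -enable-kvm -m 2001 "
    "-drive file=/var/lib/libvirt/images/win7-64.raw,if=none,"
    "id=drive-virtio-disk0,format=raw,cache=none "
    "-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,"
    "drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1"
)


def make_dst_argv(src_cmd, port=5800):
    """Return the dst argv: the src argv plus '-incoming tcp:0:<port>'."""
    return shlex.split(src_cmd) + ["-incoming", "tcp:0:%d" % port]


dst_argv = make_dst_argv(SRC_CMD)
print(" ".join(shlex.quote(arg) for arg in dst_argv))
```

Keeping the machine type (-M rhel6.5.0) and device layout identical on both sides is what makes the two guests migration-compatible; only the listening option differs.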

Actual results:
the dst qemu-kvm quits with the following error:
(qemu) qemu-kvm: Unexpected config length 0x20. Expected 0x21
qemu: warning: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-blk'
load of migration failed

on the src qemu-kvm, the migration status is nevertheless reported as completed:
(qemu) info migrate 
Migration status: completed
total time: 984 milliseconds
downtime: 64 milliseconds
transferred ram: 652 kbytes
remaining ram: 0 kbytes
total ram: 2065728 kbytes
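
Because the src side reports "completed" even though the dst side failed to load the state, a test script would need to check both ends. The following is a minimal sketch, assuming the monitor output and the dst error text have already been captured as strings; parse_info_migrate and migration_succeeded are hypothetical helper names, not QEMU or libvirt APIs.

```python
def parse_info_migrate(text):
    """Parse 'key: value' lines of HMP 'info migrate' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def migration_succeeded(src_fields, dst_log):
    """Succeed only if src says completed AND dst did not fail to load."""
    src_ok = src_fields.get("Migration status") == "completed"
    dst_ok = "load of migration failed" not in dst_log
    return src_ok and dst_ok


# Output copied from this report.
SRC_OUTPUT = """\
Migration status: completed
total time: 984 milliseconds
downtime: 64 milliseconds
transferred ram: 652 kbytes
remaining ram: 0 kbytes
total ram: 2065728 kbytes
"""
DST_LOG = """\
qemu-kvm: Unexpected config length 0x20. Expected 0x21
load of migration failed
"""

src_fields = parse_info_migrate(SRC_OUTPUT)
print(migration_succeeded(src_fields, DST_LOG))
```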

Expected results:
The migration finishes normally.

Additional info:

Comment 1 huiqingding 2014-06-25 09:12:01 UTC
Using the same steps as in comment 0 with qemu-kvm-1.5.3-62.el7.x86_64, the migration finishes normally.

Comment 2 huiqingding 2014-06-25 09:13:38 UTC
I used libvirt to test qemu-img-1.5.3-63.el7.x86_64, doing the migration as in comment 0, and hit the same issue.

Comment 3 huiqingding 2014-06-25 09:16:00 UTC
(In reply to huiqingding from comment #2)
> I used libvirt to test qemu-img-1.5.3-63.el7.x86_64, doing the migration as
> in comment 0, and hit the same issue.

Correction:
The package should be qemu-kvm-1.5.3-63.el7.x86_64.

Comment 4 juzhang 2014-06-25 09:18:39 UTC
(In reply to huiqingding from comment #1)
> Using the same steps as in comment 0 with qemu-kvm-1.5.3-62.el7.x86_64,
> the migration finishes normally.

Based on this comment, adding the Regression keyword.

Comment 7 huiqingding 2014-06-25 09:28:06 UTC
Please note:

I also tried the rhel7.0.z qemu-kvm builds. qemu-kvm-1.5.3-60.el7_0.3 hits the issue; qemu-kvm-1.5.3-60.el7_0.2 does not.

Conclusion: this regression was introduced in qemu-kvm-1.5.3-60.el7_0.3. From the KVM QE point of view, we suggest fixing this bz in rhel7.0.z as well.

Best regards
Huiqing

Comment 19 Dr. David Alan Gilbert 2014-06-27 08:39:37 UTC
Fix posted upstream - 'Allow mismatched virtio config-len'
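
The patch itself is not quoted in this report. As a rough illustration only of what the title 'Allow mismatched virtio config-len' suggests, the toy model below contrasts a strict loader, which rejects a length mismatch with the message seen in comment 0, with a tolerant one that copies as many config bytes as fit and skips any excess. The 4-byte big-endian length prefix, the function names, and the zero padding are assumptions made for this sketch and are not taken from the QEMU source.

```python
import io


def load_config_strict(stream, local_len):
    """Reject the stream unless the incoming length matches exactly."""
    incoming_len = int.from_bytes(stream.read(4), "big")  # assumed framing
    if incoming_len != local_len:
        raise ValueError(
            "Unexpected config length 0x%x. Expected 0x%x"
            % (incoming_len, local_len)
        )
    return stream.read(local_len)


def load_config_tolerant(stream, local_len):
    """Load what fits into the local config and skip any excess bytes."""
    incoming_len = int.from_bytes(stream.read(4), "big")  # assumed framing
    config = bytearray(local_len)          # missing bytes stay zero (assumption)
    usable = min(incoming_len, local_len)
    config[:usable] = stream.read(usable)
    stream.read(incoming_len - usable)     # discard bytes we have no room for
    return bytes(config)


# The src sends 0x20 config bytes, while the dst expects 0x21, as in this bug.
payload = (0x20).to_bytes(4, "big") + bytes(range(0x20))

try:
    load_config_strict(io.BytesIO(payload), 0x21)
except ValueError as err:
    print(err)

config = load_config_tolerant(io.BytesIO(payload), 0x21)
print(len(config))
```

In this model the tolerant loader also leaves the stream positioned after the config bytes when the incoming config is longer than the local one, so whatever follows in the stream can still be read.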

Comment 20 Dr. David Alan Gilbert 2014-06-27 08:41:31 UTC
Junyi:

> Thanks, David. Could you elaborate on what "virtio setups" means? Does virtio include all of virtio-blk, virtio-scsi and virtio-net, or just virtio-blk?

The fix is in code used in migration by all virtio devices, so I think it's best to test them all.

Comment 21 juzhang 2014-06-27 08:45:10 UTC
(In reply to Dr. David Alan Gilbert from comment #20)
> Junyi:
> 
> > Thanks, David. Could you elaborate on what "virtio setups" means? Does virtio include all of virtio-blk, virtio-scsi and virtio-net, or just virtio-blk?
> 
> The fix is in code used in migration by all virtio devices, so I think it's
> best to test them all.

Thanks again for confirming. May I ask one more question? Fixing this bz does not affect bz1095782, right? I mean, your patch is not simply a revert but a real fix, so bz1095782 should stay in the ON_QA state, right?

Best Regards,
Junyi

Comment 22 Dr. David Alan Gilbert 2014-06-27 09:04:47 UTC
My fix doesn't revert bz1095782's fix; it does change the way that its fix works.
So I'm not sure how you want to change the state of 1095782 - the fix for it is the cause of this bug.

Comment 23 juzhang 2014-06-27 09:08:32 UTC
(In reply to Dr. David Alan Gilbert from comment #22)
> My fix doesn't revert bz1095782's fix; it does change the way that its fix
> works.

Thanks, David. QE will run the tests according to your suggestions once this bug is in ON_QA status.

Best Regards,
Junyi

> So I'm not sure how you want to change the state of 1095782 - the fix for it
> is the cause of this bug.

Comment 26 Dr. David Alan Gilbert 2014-06-27 17:00:22 UTC
Huiqing:
  Please also test https://brewweb.devel.redhat.com/taskinfo?taskID=7633754

Comment 29 huiqingding 2014-06-30 02:47:43 UTC
Hi, Hai

Sorry for removing the needinfo flag; please see comment 24 and comment 25.

Thanks.

Best regards
Huiqing

Comment 39 Miroslav Rezanina 2014-07-02 09:11:48 UTC
Fix included in qemu-kvm-1.5.3-65.el7

Comment 41 huiqingding 2014-07-21 05:52:47 UTC
Reproduced this bug using the following versions:
RHEL6.5.0 host:
qemu-kvm-0.12.1.2-2.415.el6_5.11.x86_64
kernel-2.6.32-431.25.1.el6.x86_64
RHEL7.1 host:
qemu-kvm-1.5.3-63.el7.x86_64
kernel-3.10.0-138.el7.x86_64

Steps to Reproduce:
1. boot a guest on src RHEL6.5.0 host
# /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/mnt/rhel7-test.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

2. boot the guest on RHEL7.1 host as dst host
# /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/mnt/rhel7-test.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -incoming tcp:0:5800

3. do migration
(qemu)  migrate -d tcp:10.66.8.248:5800

Actual result:
after step 3, the dst qemu-kvm quits with the following error:
(qemu) qemu-kvm: Unexpected config length 0x20. Expected 0x21
qemu: warning: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-blk'
load of migration failed

on the src qemu-kvm, the migration status is nevertheless reported as completed:
(qemu) info migrate 
Migration status: completed
total time: 984 milliseconds
downtime: 64 milliseconds
transferred ram: 652 kbytes
remaining ram: 0 kbytes
total ram: 2065728 kbytes

Comment 42 huiqingding 2014-07-21 06:25:45 UTC
Tested this bug on two Intel hosts, using the following versions:
RHEL6.5.0 host:
qemu-kvm-0.12.1.2-2.415.el6_5.11.x86_64
kernel-2.6.32-431.25.1.el6.x86_64
RHEL7.1 host:
qemu-kvm-1.5.3-66.el7.x86_64
kernel-3.10.0-138.el7.x86_64

Steps to Test:
1. boot a guest on src RHEL6.5.0 host
#  /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=gluster://10.66.9.152/gv0/rhel7-test2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc :1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

2. boot the guest on RHEL7.1 host as dst host
#  /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=gluster://10.66.9.152/gv0/rhel7-test2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc :1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -incoming tcp:0:5800

3. do migration
(qemu)  migrate -d tcp:10.66.8.248:5800

Actual result:
after step 3, the migration finishes normally.

Comment 43 huiqingding 2014-07-21 06:40:37 UTC
Tested this bug on two Intel hosts, using the following versions:
RHEL6.5.0 host:
qemu-kvm-0.12.1.2-2.415.el6_5.11.x86_64
kernel-2.6.32-431.25.1.el6.x86_64
RHEL7.1 host:
qemu-kvm-rhev-2.1.0-3.el7ev.preview.x86_64
kernel-3.10.0-138.el7.x86_64

Steps to Test:
1. boot a guest on src RHEL6.5.0 host
#  /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=gluster://10.66.9.152/gv0/rhel7-test2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc :1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

2. boot the guest on RHEL7.1 host as dst host
#  /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=gluster://10.66.9.152/gv0/rhel7-test2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc :1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -incoming tcp:0:5800

3. do migration
(qemu)  migrate -d tcp:10.66.8.248:5800

Actual result:
after step 3, the migration finishes normally.

Comment 44 huiqingding 2014-07-23 03:44:50 UTC
Tested this bug on two AMD hosts with a win7-32 guest, using the following versions:
RHEL6.5.0 host:
qemu-kvm-0.12.1.2-2.415.el6_5.11.x86_64
kernel-2.6.32-431.25.1.el6.x86_64
RHEL7.1 host:
qemu-kvm-1.5.3-66.el7.x86_64
kernel-3.10.0-138.el7.x86_64

Steps to Test:
1. boot a guest on src RHEL6.5.0 host
#  /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu Opteron_G2 -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/mnt/win7sp1-32.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc :1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

2. boot the guest on RHEL7.1 host as dst host
# /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu Opteron_G2 -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/mnt/win7sp1-32.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc :1 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -incoming tcp:0:5800

3. do migration
(qemu)  migrate -d tcp:10.66.106.9:5800

Actual result:
after step 3, the migration finishes normally.

Comment 45 huiqingding 2014-07-23 04:34:54 UTC
Tested this bug on two AMD hosts with a win7-32 guest, using the following versions:
RHEL6.5.0 host:
qemu-kvm-0.12.1.2-2.415.el6_5.11.x86_64
kernel-2.6.32-431.25.1.el6.x86_64
RHEL7.1 host:
qemu-kvm-rhev-2.1.0-3.el7ev.preview.x86_64
kernel-3.10.0-138.el7.x86_64

Steps to Test:
1. boot a guest on src RHEL6.5.0 host
#   /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu Opteron_G2 -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown  -drive file=/mnt/virtio-blk-disk,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2  -vnc :1 -k en-us -vga cirrus -cdrom driver.iso -drive file=/mnt/en_windows_7_ultimate_with_sp1_x86_dvd_u_677460.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=1,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/mnt/win7sp1-32-cp1.raw,if=none,id=drive-scsi-disk,format=raw,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x13 -device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,id=data-disk2,bootindex=1

2. boot the guest on RHEL7.1 host as dst host
#  /usr/libexec/qemu-kvm -name migration1 -S -M rhel6.5.0 -cpu Opteron_G2 -enable-kvm -m 2001 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid f3b6e666-e7ac-1afd-100d-991565935a66 -nodefconfig -nodefaults -monitor stdio -rtc base=utc -no-shutdown  -drive file=/mnt/virtio-blk-disk,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2  -vnc :1 -k en-us -vga cirrus -cdrom driver.iso -drive file=/mnt/en_windows_7_ultimate_with_sp1_x86_dvd_u_677460.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=1,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/mnt/win7sp1-32-cp1.raw,if=none,id=drive-scsi-disk,format=raw,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x13 -device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,id=data-disk2,bootindex=1 -incoming tcp:0:5800

3. do migration
(qemu)  migrate -d tcp:10.66.106.9:5800

Actual result:
after step 3, the migration finishes normally.

Comment 49 errata-xmlrpc 2015-03-05 08:10:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0349.html