Bug 800710

Summary: migration crashes on the source after hot remove of virtio-scsi controller
Product: Red Hat Enterprise Linux 6 Reporter: Sibiao Luo <sluo>
Component: qemu-kvm Assignee: Paolo Bonzini <pbonzini>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 6.3 CC: acathrow, bcao, bsarathy, chayang, dawu, flang, juzhang, mdeng, michen, minovotn, mkenneth, pbonzini, qzhang, shu, sluo, tburke, virt-maint, wdai, wquan, xfu, xigao
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.264.el6 Doc Type: Bug Fix
Doc Text:
No documentation needed
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-20 11:44:26 UTC Type: ---
Bug Blocks: 769712, 857935, 889075    

Description Sibiao Luo 2012-03-07 02:11:38 UTC
Description of problem:
Boot a guest with a virtio-scsi disk on the src host, and start qemu-kvm in listening mode on the dest host without that virtio-scsi disk. Then hot-unplug the virtio-scsi disk on the src host and migrate from src to dest. The migration fails.

Version-Release number of selected component (if applicable):
host info:
# uname -r & rpm -q qemu-kvm
2.6.32-246.el6.x86_64
qemu-kvm-0.12.1.2-2.236.el6.x86_64
# rpm -q seabios
seabios-0.6.1.2-8.el6.scsitest.x86_64
guest info:
# uname -r
2.6.32-246.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with a virtio-scsi disk on the src host, and start qemu-kvm in listening mode on the dest host without the virtio-scsi disk.
src host CLI:
# /usr/libexec/qemu-kvm -enable-kvm -M rhel6.3.0 -smp 2 -m 2G -usb -device usb-tablet,id=input0 -name RHEL-Server-6.3-64 -uuid 9ff50ce8-5831-4556-b43f-84e9d5145e0b -drive file=/mnt/RHEL6.3_20120304.n.0_x86_64.qcow2,if=none,id=hd,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0 -device scsi-disk,drive=hd,scsi-id=0,lun=0,id=scsi_image,bootindex=1 -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0 -spice port=5910,disable-ticketing -vga qxl -monitor stdio -drive file=/mnt/RHEL6.0-20100922.1-Server-x86_64-DVD1.iso,if=none,id=cd -device virtio-scsi-pci,id=scsi1 -device scsi-cd,drive=cd,id=scsi_cd
dest host CLI:
# /usr/libexec/qemu-kvm -enable-kvm -M rhel6.3.0 -smp 2 -m 2G -usb -device usb-tablet,id=input0 -name RHEL-Server-6.3-64 -uuid 9ff50ce8-5831-4556-b43f-84e9d5145e0b -drive file=/mnt/RHEL6.3_20120304.n.0_x86_64.qcow2,if=none,id=hd,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0 -device scsi-disk,drive=hd,scsi-id=0,lun=0,id=scsi_image,bootindex=1 -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0 -spice port=5910,disable-ticketing -vga qxl -monitor stdio -incoming tcp:0.0.0.0:5888
2. Check the block info and hot-unplug the virtio-scsi disk on the src host.
(qemu) info block
hd: removable=0 io-status=ok file=/mnt/RHEL6.3_20120304.n.0_x86_64.qcow2 ro=0 drv=qcow2 encrypted=0
cd: removable=1 locked=0 tray-open=0 io-status=ok file=/mnt/RHEL6.0-20100922.1-Server-x86_64-DVD1.iso ro=0 drv=raw encrypted=0
ide1-cd0: removable=1 locked=0 tray-open=0 io-status=ok [not inserted]
floppy0: removable=1 locked=0 tray-open=0 [not inserted]
sd0: removable=1 locked=0 tray-open=0 [not inserted]
(qemu) device_del scsi_cd
(qemu) device_del scsi1
(qemu) info block
hd: removable=0 io-status=ok file=/mnt/RHEL6.3_20120304.n.0_x86_64.qcow2 ro=0 drv=qcow2 encrypted=0
ide1-cd0: removable=1 locked=0 tray-open=0 io-status=ok [not inserted]
floppy0: removable=1 locked=0 tray-open=0 [not inserted]
sd0: removable=1 locked=0 tray-open=0 [not inserted]
3. Migrate from src to dest.
(qemu) migrate -d tcp:10.66.11.229:5888
spice_server_migrate_start:
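
For readers comparing the two command lines in step 1, the following is a minimal sketch that lists which `-device` ids exist only on the src side. The helper name `device_ids` and the abbreviated `SRC`/`DEST` strings (only the `-device` options are kept) are illustrative assumptions, not part of the original report.

```python
import shlex


def device_ids(cmdline):
    """Return the set of id= values found in every -device option."""
    args = shlex.split(cmdline)
    ids = set()
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        # Option strings look like "driver,key=value,key=value".
        for prop in value.split(",")[1:]:
            key, _, val = prop.partition("=")
            if key == "id":
                ids.add(val)
    return ids


# Abbreviated from step 1: only the -device options are reproduced here.
SRC = ("-device usb-tablet,id=input0 -device virtio-scsi-pci,id=scsi0 "
       "-device scsi-disk,drive=hd,scsi-id=0,lun=0,id=scsi_image,bootindex=1 "
       "-device virtio-net-pci,netdev=netdev0,id=device-net0 "
       "-device virtio-scsi-pci,id=scsi1 -device scsi-cd,drive=cd,id=scsi_cd")
DEST = ("-device usb-tablet,id=input0 -device virtio-scsi-pci,id=scsi0 "
        "-device scsi-disk,drive=hd,scsi-id=0,lun=0,id=scsi_image,bootindex=1 "
        "-device virtio-net-pci,netdev=netdev0,id=device-net0")

only_on_src = sorted(device_ids(SRC) - device_ids(DEST))
print(only_on_src)
```

The ids reported as src-only are exactly the two that step 2 removes with `device_del`, so after the hot unplug both sides are expected to describe the same set of devices.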

Actual results:
The result (error output) of the first attempt at step 3 differs from that of the second and later attempts.
1. First attempt:
Migration completes on the src host,
(qemu) info migrate 
Migration status: completed
but loading the migration stream fails on the dest,
(qemu) 
(qemu) red_dispatcher_loadvm_commands: 
handle_dev_loadvm_commands: loadvm_commands
spice_server_add_interface: SPICE_INTERFACE_TABLET
Unknown savevm section or instance '0000:00:05.0/virtio-scsi' 0
load of migration failed
2. Second and later attempts:
Segmentation fault on the src,
(qemu) handle_dev_stop: stop
Segmentation fault
and loading the migration stream fails on the dest,
(qemu) qemu: warning: error while loading state section id 3
load of migration failed
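
When triaging logs like the dest output above, it can help to pull the section name and instance out of the "Unknown savevm section" message. The sketch below is one way to do that; the function name `parse_unknown_section` is hypothetical, and treating the part before the slash as a device path (it looks like a PCI address) is an assumption based only on the message shown here.

```python
import re

UNKNOWN_SECTION = re.compile(
    r"Unknown savevm section or instance '(?P<idstr>[^']+)' (?P<instance>\d+)")


def parse_unknown_section(line):
    """Split the quoted id string into device path, section name and instance."""
    m = UNKNOWN_SECTION.search(line)
    if m is None:
        return None
    # Assumption: the id string has the form "<device path>/<section name>".
    path, _, name = m.group("idstr").rpartition("/")
    return {
        "device_path": path,
        "section": name,
        "instance_id": int(m.group("instance")),
    }


line = "Unknown savevm section or instance '0000:00:05.0/virtio-scsi' 0"
print(parse_unknown_section(line))
```

For the first-attempt log this reports a `virtio-scsi` section, which suggests the src still sent state for a virtio-scsi device that the dest does not have.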

Expected results:
Migration should succeed and the guest should keep working normally.

Additional info:
If the guest is booted with multiple virtio-scsi devices on the same target (virtio-scsi controller), migration succeeds with the same steps as above.

Comment 2 Sibiao Luo 2012-03-07 08:40:07 UTC
> 2. test for the second or other time:
> Segmentation fault on the src,
> (qemu) handle_dev_stop: stop
> Segmentation fault
> load of migration failed on the dest,
> (qemu) qemu: warning: error while loading state section id 3
> load of migration failed
> 

I have got the backtrace log for the segmentation fault on the src:

(qemu) handle_dev_stop: stop
[New Thread 0x7fff48ffa700 (LWP 8201)]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e088a5 in ?? ()
(gdb) bt
#0  0x00007ffff7e088a5 in ?? ()
#1  0x00007ffff7e0fb99 in ?? ()
#2  0x00007ffff7f33c83 in ?? ()
#3  0x00007ffff7e75a60 in ?? ()
#4  0x00007ffff7e6d2de in ?? ()
#5  0x00007ffff7df9a70 in ?? ()
#6  0x00007ffff7e19ffa in ?? ()
#7  0x00007ffff7dfb59c in main ()
(gdb) q

Comment 3 Sibiao Luo 2012-03-07 09:20:49 UTC
(In reply to comment #2)

Sorry for my carelessness; I forgot to install the qemu-kvm-debuginfo package on my host. I have retested and obtained the backtrace for the segmentation fault on the src:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7f33c74 in virtio_save (vdev=0x7ffff88dbd00, f=0x7ffff9327fb0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.c:735
735	    if (vdev->binding->save_config)
(gdb) bt
#0  0x00007ffff7f33c74 in virtio_save (vdev=0x7ffff88dbd00, f=0x7ffff9327fb0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.c:735
#1  0x00007ffff7e75a60 in vmstate_save (mon=<value optimized out>, f=0x7ffff9327fb0) at savevm.c:1459
#2  qemu_savevm_state_complete (mon=<value optimized out>, f=0x7ffff9327fb0) at savevm.c:1621
#3  0x00007ffff7e6d2de in migrate_fd_put_ready (opaque=0x7ffff92d6e40) at migration.c:406
#4  0x00007ffff7df9a70 in qemu_run_timers (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:1315
#5  main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4058
#6  0x00007ffff7e19ffa in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2225
#7  0x00007ffff7dfb59c in main_loop (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4234
#8  main (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6495
(gdb)
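
The backtrace shows the crash at `vdev->binding->save_config` in `virtio_save`, reached from `vmstate_save` during `qemu_savevm_state_complete`. One plausible reading, which this report does not confirm, is that the save handler of the hot-removed controller was still registered and was visited after the device state was torn down. The toy model below illustrates that pattern only; every class and function name in it is hypothetical and none of it is QEMU code.

```python
class VirtIODevice:
    """Toy stand-in for a virtio device; 'binding' mimics vdev->binding."""

    def __init__(self, name):
        self.name = name
        self.binding = {"save_config": lambda: name + "-config"}


class SaveVMRegistry:
    """Toy stand-in for the list of registered savevm handlers."""

    def __init__(self):
        self.devices = []

    def register(self, dev):
        self.devices.append(dev)

    def unregister(self, dev):
        self.devices.remove(dev)

    def save_all(self):
        # Mirrors the dereference at hw/virtio.c:735 in the backtrace.
        return [dev.binding["save_config"]() for dev in self.devices]


def hot_remove(registry, dev, unregister):
    """Tear a device down, optionally dropping its save handler first."""
    if unregister:
        registry.unregister(dev)
    dev.binding = None  # stands in for state that is gone after device_del


def migrate(unregister):
    registry = SaveVMRegistry()
    scsi0, scsi1 = VirtIODevice("scsi0"), VirtIODevice("scsi1")
    registry.register(scsi0)
    registry.register(scsi1)
    hot_remove(registry, scsi1, unregister)
    try:
        return registry.save_all()
    except TypeError:
        return "crash"  # Python's analogue of the SIGSEGV on the src


print(migrate(unregister=False))  # the stale handler is still visited
print(migrate(unregister=True))   # only scsi0 is saved
```

In this model the crash disappears once the handler is dropped as part of removal. Whether the actual fix takes that form is not stated in this report.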

Comment 9 Paolo Bonzini 2012-03-28 06:50:42 UTC
*** Bug 800274 has been marked as a duplicate of this bug. ***

Comment 10 langfang 2012-04-06 10:29:11 UTC
Reproduced this issue with the following steps and environment:
host version:
# uname -r
2.6.32-257.el6.x86_64
# rpm -qa |grep qemu-kvm
qemu-kvm-0.12.1.2-2.262.el6.x86_64

steps:
1) Boot the guest with a virtio-scsi disk:

/usr/libexec/qemu-kvm -m 2G -smp 1 -cpu Penryn,+x2apic, -usbdevice tablet -drive file=/mnt/RHEL-Server-6.3-64-virtio.qcow2-newinstall5,format=qcow2,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,cache=none -device virtio-blk-pci,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,mac=00:10:20:2d:31:21,bus=pci.0,addr=0x4,id=net0 -boot order=cdn,once=n,menu=on -uuid 3290efd3-7c9e-44f9-b5f7-af0f3a1b3066 -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -monitor stdio -name rhel6.1 -spice port=1000,disable-ticketing -vga qxl -device virtio-balloon-pci,bus=pci.0,id=balloon0 -drive file=/mnt/RHEL6.3-20120313.2-Server-x86_64-DVD1.iso,if=none,id=cdrom1 -device virtio-scsi-pci,id=cdrom -device scsi-cd,drive=cdrom1,scsi-id=0,lun=0 -nodefconfig
2) Boot another VM in listening mode, without the virtio-scsi disk, as the migration target:
...-incoming tcp:0:5800
3) Hot-unplug virtio-scsi on the src VM:
(qemu)device_del cdrom
4) Migrate and check the results:
(qemu)migrate -d tcp:10.66.65.153:5800
Results:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7f30244 in virtio_save (vdev=0x7ffff88cbe50, f=0x7ffff9571a10) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.c:735
735	    if (vdev->binding->save_config)
..
(gdb) bt
#0  0x00007ffff7f30244 in virtio_save (vdev=0x7ffff88cbe50, f=0x7ffff9571a10) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.c:735
#1  0x00007ffff7e6cdc0 in vmstate_save (mon=<value optimized out>, f=0x7ffff9571a10) at savevm.c:1459
#2  qemu_savevm_state_complete (mon=<value optimized out>, f=0x7ffff9571a10) at savevm.c:1621
#3  0x00007ffff7e64695 in migrate_fd_put_ready (opaque=0x7ffff89e8640) at migration.c:405
#4  0x00007ffff7deed70 in qemu_run_timers (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:1323
#5  main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4024
#6  0x00007ffff7e1035a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2244
#7  0x00007ffff7df17ac in main_loop (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4202
#8  main (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6427
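
To check that this backtrace matches the one in comment 3 despite the different addresses and line numbers, the function names can be extracted from each frame and compared. The sketch below does this for the first four frames of each backtrace; the regular expression and the helper name `frame_functions` are illustrative assumptions about the gdb output format shown in this bug.

```python
import re

# Matches "#N  [0xADDR in ]function (" at the start of a gdb frame line.
FRAME = re.compile(r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\S+) \(")


def frame_functions(backtrace):
    """Return the function name of every frame, innermost first."""
    funcs = []
    for line in backtrace.splitlines():
        m = FRAME.match(line.strip())
        if m:
            funcs.append(m.group("func"))
    return funcs


BT_COMMENT_3 = """\
#0  0x00007ffff7f33c74 in virtio_save (vdev=0x7ffff88dbd00, f=0x7ffff9327fb0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.c:735
#1  0x00007ffff7e75a60 in vmstate_save (mon=<value optimized out>, f=0x7ffff9327fb0) at savevm.c:1459
#2  qemu_savevm_state_complete (mon=<value optimized out>, f=0x7ffff9327fb0) at savevm.c:1621
#3  0x00007ffff7e6d2de in migrate_fd_put_ready (opaque=0x7ffff92d6e40) at migration.c:406
"""

BT_COMMENT_10 = """\
#0  0x00007ffff7f30244 in virtio_save (vdev=0x7ffff88cbe50, f=0x7ffff9571a10) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.c:735
#1  0x00007ffff7e6cdc0 in vmstate_save (mon=<value optimized out>, f=0x7ffff9571a10) at savevm.c:1459
#2  qemu_savevm_state_complete (mon=<value optimized out>, f=0x7ffff9571a10) at savevm.c:1621
#3  0x00007ffff7e64695 in migrate_fd_put_ready (opaque=0x7ffff89e8640) at migration.c:405
"""

same_chain = frame_functions(BT_COMMENT_3) == frame_functions(BT_COMMENT_10)
print(frame_functions(BT_COMMENT_10), same_chain)
```

Both builds fail in `virtio_save` on the same call chain, which is consistent with this being the same defect as in the original description.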

Verified this issue with the following steps and environment:

host version:
# uname -r
2.6.32-257.el6.x86_64
# rpm -qa |grep qemu-kvm
qemu-kvm-0.12.1.2-2.265.el6.x86_64

The steps are the same as steps 1) to 4) of the reproduction above.
Result after step 4:
The guest migrates successfully, so this issue has been fixed.
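
The header of this bug lists qemu-kvm-0.12.1.2-2.264.el6 as the fixed version. A rough way to check whether a given `rpm -q qemu-kvm` string is at or past that build is sketched below. It compares only the dotted numeric version and release fields and is not a full implementation of RPM version comparison; the names `build_key` and `has_fix` are hypothetical.

```python
import re

NVR = re.compile(r"^qemu-kvm-(?P<version>[\d.]+)-(?P<release>[\d.]+)\.el6")
FIXED_IN = "qemu-kvm-0.12.1.2-2.264.el6"  # "Fixed In Version" from this bug


def _to_ints(dotted):
    return tuple(int(part) for part in dotted.split("."))


def build_key(nvr):
    """Turn a package string into a (version, release) tuple of integers."""
    m = NVR.match(nvr)
    if m is None:
        raise ValueError("unrecognized package string: %r" % nvr)
    return _to_ints(m.group("version")), _to_ints(m.group("release"))


def has_fix(nvr):
    """True if the build is the fixed build or a later one (numeric fields only)."""
    return build_key(nvr) >= build_key(FIXED_IN)


for pkg in ("qemu-kvm-0.12.1.2-2.262.el6.x86_64",
            "qemu-kvm-0.12.1.2-2.265.el6.x86_64"):
    print(pkg, has_fix(pkg))
```

This agrees with the comment above: the crash was reproduced on the 2.262 build and verified as fixed on the 2.265 build.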

Comment 12 Dor Laor 2012-04-22 11:31:52 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation needed

Comment 13 errata-xmlrpc 2012-06-20 11:44:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0746.html