Bug 744518 - qemu-kvm core dumps when qxl-linux guest migrate with reboot
Summary: qemu-kvm core dumps when qxl-linux guest migrate with reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Yonit Halperin
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-10-09 07:20 UTC by Xiaoqing Wei
Modified: 2013-01-10 00:25 UTC (History)

Fixed In Version: qemu-kvm-0.12.1.2-2.205.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 16:05:24 UTC


Attachments
gdb bt full (253.72 KB, text/plain)
2011-10-09 07:20 UTC, Xiaoqing Wei
Not reloading an empty cursor after migration + emptying it when needed (2.33 KB, patch)
2011-10-26 14:31 UTC, Yonit Halperin


Links
System ID: Red Hat Product Errata RHSA-2011:1531
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: qemu-kvm security, bug fix, and enhancement update
Last Updated: 2011-12-06 01:23:30 UTC

Description Xiaoqing Wei 2011-10-09 07:20:30 UTC
Created attachment 527075 [details]
gdb bt full

Description of problem:
qemu-kvm dumps core when a qxl Linux guest is migrated with a reboot.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.195.el6.x86_64

How reproducible:
1 / 1

Steps to Reproduce:
1. Boot a RHEL 6.2 guest using -spice and '-vga qxl'.
2. Reboot the guest.
3. Migrate the guest over TCP: "migrate -d tcp:xxxx:yyyy"
  
Actual results:
qemu-kvm core dumps

Expected results:
Migration succeeds and the guest works well.

Additional info:
Both host and guest are RHEL 6.2 snapshot 1.

Possibly relevant packages, host:
qemu-kvm-0.12.1.2-2.195.el6.x86_64
spice-server-0.8.2-4.el6.x86_64
spice-client-0.8.2-7.el6.x86_64
vgabios-0.6b-3.6.el6.noarch

Possibly relevant packages, guest:
xorg-x11-drv-qxl-0.0.14-9.el6.x86_64

gdb "bt" output follows; the detailed "bt full" output is attached.

(gdb) bt
#0  0x0000003e08c32885 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003e08c34065 in abort () at abort.c:92
#2  0x0000003cb4012287 in validate_virt (info=<value optimized out>, virt=<value optimized out>, 
    slot_id=<value optimized out>, add_size=<value optimized out>, group_id=<value optimized out>)
    at red_memslots.c:83
#3  0x0000003cb401232c in get_virt (info=<value optimized out>, addr=<value optimized out>, 
    add_size=<value optimized out>, group_id=1) at red_memslots.c:122
#4  0x0000003cb4012ad6 in red_get_cursor_cmd (slots=0x7ff510ef5ab0, group_id=1, red=0x7ff47408cd60, 
    addr=<value optimized out>) at red_parse_qxl.c:1084
#5  0x0000003cb403157f in handle_dev_input (listener=0x7ff510d25c00, events=<value optimized out>)
    at red_worker.c:10149
#6  0x0000003cb40305d5 in red_worker_main (arg=<value optimized out>) at red_worker.c:10304
#7  0x0000003e090077f1 in start_thread (arg=0x7ff510efa700) at pthread_create.c:301
#8  0x0000003e08ce570d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Comment 2 Xiaoqing Wei 2011-10-09 09:34:34 UTC
Full command line:
/root/staf-kvm-devel/autotest-devel/client/tests/kvm/qemu -name 'vm1' -chardev socket,id=human_monitor_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20111010-003633-Z66o,server,nowait -mon chardev=human_monitor_id_humanmonitor1,mode=readline -chardev socket,id=serial_id_20111010-003633-Z66o,path=/tmp/serial-20111010-003633-Z66o,server,nowait -device isa-serial,chardev=serial_id_20111010-003633-Z66o -drive file='/root/staf-kvm-devel/autotest-devel/client/tests/kvm/images/RHEL-Server-6.2-64-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idLTg2e5,mac=9a:58:05:cf:56:d2,id=ndev00idLTg2e5,bus=pci.0,addr=0x3 -netdev tap,id=idLTg2e5,ifname='t0-003633-Z66o',script='/root/staf-kvm-devel/autotest-devel/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 4096 -smp 4,cores=2,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic
\
\
 -spice port=8000,disable-ticketing -vga qxl \
\
\
-rtc base=utc,clock=host,driftfix=none -boot order=cdn,once=c,menu=off    -no-kvm-pit-reinjection  -M rhel6.2.0  -device intel-hda -device hda-duplex  -usb -device usb-tablet -enable-kvm

Comment 3 Yonit Halperin 2011-10-10 07:33:22 UTC
Can you please add the qemu log? It would be easier to determine if this is a clone of 740547 with the log.

Comment 4 Xiaoqing Wei 2011-10-10 09:37:41 UTC
(In reply to comment #3)
> Can you please add the qemu log? It would be easier to determine if this is a
> clone of 740547 with the log.

Hi Yonit,

I tried starting qemu-kvm with "-d out_asm,in_asm,op,op_opt,int",
but after multiple migrations /tmp/qemu.log is still empty. Could you please tell me which log you need and how to collect it?


Thanks

Comment 5 Yonit Halperin 2011-10-10 10:03:42 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Can you please add the qemu log? It would be easier to determine if this is a
> > clone of 740547 with the log.
> 
> Hi Yonit,
> 
> I tried to start qemu-kvm with "-d out_asm,in_asm,op,op_opt,int"
>  but after multiple migration, /tmp/qemu.log is still empty, could you pls tell
> me which log you need , and how to collect ?
> 
> 
> Thanks

Hi,
if you are using libvirt, the log should be at /var/log/libvirt/qemu.
If you are running qemu by yourself, just redirect the standard output (e.g., 
"qemu...|& tee /tmp/qemu.log").

BTW, we need both the log of migration src and target.
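
For harnesses that launch qemu from a script rather than a shell, the same capture can be sketched in Python. This is only a minimal sketch: the function name run_and_tee and the log path are made up for illustration, and the command list is whatever qemu invocation is already in use.

```python
import subprocess
import sys


def run_and_tee(cmd, log_path):
    """Run cmd, copying its merged stdout/stderr to the console and to log_path.

    Rough equivalent of "cmd |& tee log_path" in the shell.
    Returns the exit status of the command.
    """
    with open(log_path, "w") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like "|&"
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still visible on the console
            log.write(line)         # and kept in the log file
        return proc.wait()
```

Called once on the migration source and once on the target, each with its own log path, this would give the two logs asked for above.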

Comment 7 Yonit Halperin 2011-10-11 06:29:28 UTC
It looks like it is a new bug and not a clone of bug 740547.
According to the log and stack, the crash occurs during loadvm, when processing the replayed cursor command. Unlike in bug 740547, the qxl device was already in native mode and all the memslots had been added when the crash happened.

The log messages related to the crash:
10/10 02:49:26 INFO |   aexpect:0783| (qemu) id 0, group 0, virt start 0, virt end ffffffffffffffff, generation 0, delta 0
10/10 02:49:26 INFO |   aexpect:0783| (qemu) id 1, group 1, virt start 7f05dfc00000, virt end 7f05e2bfe000, generation 0, delta 7f05dfc00000
10/10 02:49:26 INFO |   aexpect:0783| (qemu) id 2, group 1, virt start 7f05dba00000, virt end 7f05dfa00000, generation 0, delta 7f05dba00000
10/10 02:49:26 INFO |   aexpect:0783| (qemu) validate_virt: panic: virtual address out of range
10/10 02:49:26 INFO |   aexpect:0783| (qemu)     virt=0x0+0x96 slot_id=0 group_id=1
10/10 02:49:26 INFO |   aexpect:0783| (qemu)     slot=0x0-0x0 delta=0x0
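
One way to read the panic above: the replayed command address was 0, which resolves to slot 0 of group 1, and that slot has no registered range (0x0-0x0). Below is a simplified Python model of such a range check. It is a sketch under assumptions, not the spice-server code: MemSlot, EMPTY_SLOT and virt_in_range are illustrative names, and unregistered slots are assumed to have an empty 0-0 range, as the "slot=0x0-0x0" line suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemSlot:
    virt_start: int
    virt_end: int


# Assumption: a slot that was never added behaves like an empty 0x0-0x0 range.
EMPTY_SLOT = MemSlot(0x0, 0x0)

# Slots as printed in the log above, keyed by (group_id, slot_id).
SLOTS = {
    (0, 0): MemSlot(0x0, 0xFFFFFFFFFFFFFFFF),
    (1, 1): MemSlot(0x7F05DFC00000, 0x7F05E2BFE000),
    (1, 2): MemSlot(0x7F05DBA00000, 0x7F05DFA00000),
}


def virt_in_range(slots, group_id, slot_id, virt, add_size):
    """Return True if [virt, virt + add_size] lies inside the slot's range."""
    slot = slots.get((group_id, slot_id), EMPTY_SLOT)
    return slot.virt_start <= virt and virt + add_size <= slot.virt_end


# The failing access from the log: virt=0x0+0x96 slot_id=0 group_id=1
print(virt_in_range(SLOTS, 1, 0, 0x0, 0x96))
# An access at the start of slot 1 in group 1, for comparison
print(virt_in_range(SLOTS, 1, 1, 0x7F05DFC00000, 0x96))
```

In this model the first access fails the check and the second passes, which matches the idea that the replayed cursor command pointed at nothing rather than at a stale but mapped address.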

Comment 8 Yonit Halperin 2011-10-18 08:50:15 UTC
Hi Xiaoqing,
I haven't managed to reproduce the bug so far. However, from looking at the code I identified one possible path that can lead to this crash.
(1) Can you run qemu (both src and target) with -global qxl.debug=1 and -global.cmdlog=1 , reproduce the crash and attach the logs?
(2) Did X start before you performed the reboot?
Thanks,
Yonit.

Comment 9 David Blechter 2011-10-18 14:08:08 UTC
Reset devel_ack to (?) because the developer can't reproduce the problem and needs more info from the reporter.

Comment 10 Yonit Halperin 2011-10-18 15:15:10 UTC
Hi Xiaoqing,
could it be that (1) the specific crash documented here happened
when migrating during the first guest boot, not after rebooting, and
(2) the crashes that occurred during reboot are the ones described in bug 740547?

Comment 11 Yonit Halperin 2011-10-18 16:42:25 UTC
Hi Xiaoqing,
a scratch build with the fix for bug 740547, plus a fix for the specific scenario that I think caused the call stack you received, can be found at
brewweb.devel.redhat.com/taskinfo?taskID=3717659

Can you please try to reproduce the crash with this build? Thanks.

Comment 12 Xiaoqing Wei 2011-10-19 08:49:04 UTC
(In reply to comment #11)
> Hi Xiaoqing,
> a scratch build with the fix for bug 740547, and a fix to the specific scenario
> that I think that caused the call stack you received can be found in
> brewweb.devel.redhat.com/taskinfo?taskID=3717659
> 
> Can you please try to reproduce the crash with this build? Thanks.

Testing; will update the bz later.

Comment 13 Xiaoqing Wei 2011-10-20 13:18:25 UTC
(In reply to comment #8)
Hi Yonit,
> Hi Xiaoqing,
> I didn't manage to reproduce the bug so far. However, from looking at the code
> I identify one possible path that can lead to this crash.
> (1) Can you run qemu (both src and target) with -global qxl.debug=1 and
> -global.cmdlog=1 , reproduce the crash and attach the logs?
  I think you mean -global qxl.cmdlog=1 -global qxl.debug=1 ? 

> (2) Did X start, before you preform the reboot?
  Yes, the GNOME desktop was running.

I let autotest run 200 rounds to completion, and it did not appear to core dump.
But the host was mistakenly reinstalled before I could copy the test log.

I need to retest and will update the bz.

Thanks,
Xiaoqing.

Comment 17 Yonit Halperin 2011-10-26 14:31:29 UTC
Created attachment 530296 [details]
Not reloading an empty cursor after migration + emptying it when needed
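
The patch itself is only attached, not quoted here, so the following is a toy Python model of what the attachment title describes, not the actual change. Every name in it (QxlCursorState, guest_cursor, on_cursor_set, on_reset, post_load) is hypothetical, and treating a device reset as the point where the cursor must be emptied is an assumption.

```python
class QxlCursorState:
    """Toy model of the cursor bookkeeping; not the actual qemu-kvm code."""

    def __init__(self):
        # Guest address of the last cursor command; 0 means "no cursor saved".
        self.guest_cursor = 0

    def on_cursor_set(self, cmd_addr):
        # Remember the command so it can be replayed on the migration target.
        self.guest_cursor = cmd_addr

    def on_reset(self):
        # "Emptying it when needed": after a reset the old address is stale.
        self.guest_cursor = 0

    def post_load(self, replay):
        # "Not reloading an empty cursor": skip the replay if nothing is saved.
        if self.guest_cursor == 0:
            return False
        replay(self.guest_cursor)
        return True


replayed = []
state = QxlCursorState()
state.on_cursor_set(0x1000)
state.on_reset()                  # e.g. the guest reboots before migration
state.post_load(replayed.append)  # nothing is replayed on the target
print(replayed)
```

Without the guard in post_load, the target would replay a command at address 0, which is the kind of access the validate_virt panic in comment 7 reports.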

Comment 20 Xiaoqing Wei 2011-10-28 06:46:07 UTC
Hi,

I am using
qemu-kvm-0.12.1.2-2.205.el6.x86_64
spice-server-0.8.2-5.el6.x86_64

Running the same test as in comment 0 (200 rounds),
qemu-kvm still crashes, but the gdb output is the same as in bz 698537.

(gdb) #0  0x0000003e8e832885 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003e8e834065 in abort () at abort.c:92
#2  0x0000003b584318c9 in handle_dev_update (listener=0x7f0a2a1edc00, events=<value optimized out>) at red_worker.c:9725
#3  handle_dev_input (listener=0x7f0a2a1edc00, events=<value optimized out>) at red_worker.c:9982
#4  0x0000003b58430865 in red_worker_main (arg=<value optimized out>) at red_worker.c:10304
#5  0x0000003e8f0077f1 in start_thread (arg=0x7f0a2a3c2700) at pthread_create.c:301
#6  0x0000003e8e8e570d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thanks and Best Regards,
Xiaoqing.

Comment 21 Xiaoqing Wei 2011-10-28 07:38:55 UTC
(In reply to comment #20)

> qemu-kvm crashes but, gdb output is same  as bz 698537.
>
Sorry, that should be Bug 736631 - qemu crashes if screen dump is called when the vm is stopped

Comment 22 Xiaoqing Wei 2011-10-28 07:45:27 UTC
Also, this can be triggered during migration:

Bug 669581 - Migration Never end while Use firewall reject migration tcp port

Thanks and Best Regards,
Xiaoqing.

Comment 24 Eduardo Habkost 2011-10-28 18:01:12 UTC
Moving to ON_QA because Errata Tool did not do it

Comment 30 errata-xmlrpc 2011-12-06 16:05:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1531.html

