Bug 867405
Summary: | core dump when starting qemu with spice and -S | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Sibiao Luo <sluo> |
Component: | spice-server | Assignee: | Alon Levy <alevy> |
Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.4 | CC: | acathrow, alevy, areis, bsarathy, cfergeau, chayang, dblechte, dyasny, flang, juzhang, kraxel, marcandre.lureau, mbarta, michen, mkenneth, mkrcmari, pbonzini, qzhang, sandmann, sluo, uril, virt-maint, xufango, xwei |
Target Milestone: | rc | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | spice-server-0.12.0-3 | Doc Type: | Bug Fix |
Doc Text: |
No documentation is needed.
This bug is related to a new feature that enables guest-client capabilities negotiation/sharing.
It was found and fixed during RHEL-6.4 development phase.
Cause:
Consequence:
Fix:
Result:
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2013-02-21 10:03:36 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 869958, 881827 |
Description
Sibiao Luo
2012-10-17 13:00:06 UTC
(In reply to comment #0)
> Description of problem:
> boot a guest with a data disk on iSCSI storage, and do I/O to the data disk,
> then firewall the iSCSI port with iptables and generate EIO with wrong I/O
> error prompts. then stop the iptables and resume the VM via 'cont' in QEMU
> monitor, it fail to resume the VM, after a while, it will core dump that
> relate to the qxl.
>
> if fail to resume the VM, and after a while will core dump, i will paste
> later.
>

    (qemu) __spice_char_device_write_buffer_get: token violation: dev 0x7fffd4000960 client 0x7ffff8ae2f40
    spice_char_device_handle_client_overflow: dev 0x7fffd4000960 client 0x7ffff8a10ed0
    red_client_destroy: destroy client with #channels 6
    red_channel_client_disconnect: 0x7ffff9ed34c0 (channel 0x7ffff89c6ab0 type 3 id 0)
    red_channel_client_disconnect: 0x7ffff9ed34c0 (channel 0x7ffff89c6ab0 type 3 id 0)
    red_channel_client_disconnect: 0x7ffff9ecf2d0 (channel 0x7ffff8a5b640 type 5 id 0)
    snd_channel_put: sound channel freed
    red_channel_client_disconnect: 0x7ffff9ecf2d0 (channel 0x7ffff8a5b640 type 5 id 0)
    red_dispatcher_disconnect_display_peer:
    qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/qxl.c:1658: qxl_send_events: Assertion `qemu_spice_display_is_running(&d->ssd)' failed.

    Program received signal SIGABRT, Aborted.
    [Switching to Thread 0x7fffe63cb700 (LWP 3810)]
    0x00007ffff574c8a5 in raise () from /lib64/libc.so.6
    (gdb) bt
    #0 0x00007ffff574c8a5 in raise () from /lib64/libc.so.6
    #1 0x00007ffff574e085 in abort () from /lib64/libc.so.6
    #2 0x00007ffff5745a1e in __assert_fail_base () from /lib64/libc.so.6
    #3 0x00007ffff5745ae0 in __assert_fail () from /lib64/libc.so.6
    #4 0x00007ffff7f73d3d in qxl_send_events (d=0x7ffff9e1a320, events=16) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:1658
    #5 0x00007ffff5f6b6a7 in handle_dev_display_disconnect (opaque=<value optimized out>, payload=<value optimized out>) at red_worker.c:11236
    #6 0x00007ffff5f63cc7 in dispatcher_handle_single_read (dispatcher=0x7ffff9e2d9e8) at dispatcher.c:139
    #7 dispatcher_handle_recv_read (dispatcher=0x7ffff9e2d9e8) at dispatcher.c:162
    #8 0x00007ffff5f8488e in red_worker_main (arg=<value optimized out>) at red_worker.c:11782
    #9 0x00007ffff7740851 in start_thread () from /lib64/libpthread.so.0
    #10 0x00007ffff580167d in clone () from /lib64/libc.so.6
    (gdb)

cpuinfo as following:

    processor : 7
    vendor_id : GenuineIntel
    cpu family : 6
    model : 42
    model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
    stepping : 7
    cpu MHz : 1600.000
    cache size : 8192 KB
    physical id : 0
    siblings : 8
    core id : 3
    cpu cores : 4
    apicid : 7
    initial apicid : 7
    fpu : yes
    fpu_exception : yes
    cpuid level : 13
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
    bogomips : 6784.19
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:

(In reply to comment #0)
>
> Version-Release number of selected component
> (if applicable):
> host info:
> # uname -r && rpm -q qemu-kvm
> 2.6.32-331.el6.x86_64
> qemu-kvm-0.12.1.2-2.327.el6.x86_64

    # rpm -qa | grep qxl
    xorg-x11-drv-qxl-0.0.14-13.el6_2.x86_64

Hit this problem when trying to connect to a guest desktop with the 'spicec' command, when the guest is booted with spice+qxl and -S.
Tested with the 6.3 released version, which does not have this issue, so this is a regression.

Steps:
1. Boot a guest with "-S -spice port=5930,disable-ticketing -global qxl-vga.vram_size=33554432= -vga qxl".
2. spicec -h ** -p 5930

Result:
Aborted, with the same log as in the bug description.

(In reply to comment #6)
> Meet this problem when try to connect a guest desktop with 'spicec' command,
> when the guest boot up with spice+qxl and -S.
> Test with the 6.3 released version, have not this issue. So this is a
> regression.
>
> Steps:
> 1. Boot a guest with "-S -spice port=5930,disable-ticketing -global
> qxl-vga.vram_size=33554432= -vga qxl".
> 2. spicec -h ** -p 5930
>

Good, that's an easy method to reproduce. I used 'remote-viewer spice://$ip_addr:$port' to connect to the guest and also hit this issue.
bonzini suggested trying to reproduce it without I/O (just stop/cont), but I did not reproduce it that way.

eg: ...-S -spice port=5931,disable-ticketing,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864

    # remote-viewer spice://10.66.9.242:5931

    (qemu) main_channel_link: add main channel client
    main_channel_handle_parsed: net test: latency 0.102000 ms, bitrate 10666666666 bps (10172.526041 Mbps)
    inputs_connect: inputs channel client create
    red_dispatcher_set_cursor_peer:
    qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/qxl.c:1658: qxl_send_events: Assertion `qemu_spice_display_is_running(&d->ssd)' failed.

    Program received signal SIGABRT, Aborted.
    0x00007ffff574c8a5 in raise () from /lib64/libc.so.6
    (gdb)
    (gdb) bt
    #0 0x00007ffff574c8a5 in raise () from /lib64/libc.so.6
    #1 0x00007ffff574e085 in abort () from /lib64/libc.so.6
    #2 0x00007ffff5745a1e in __assert_fail_base () from /lib64/libc.so.6
    #3 0x00007ffff5745ae0 in __assert_fail () from /lib64/libc.so.6
    #4 0x00007ffff7f73d3d in qxl_send_events (d=0x7ffff9e1a320, events=16) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:1658
    #5 0x00007ffff5f837a2 in handle_new_display_channel (opaque=0x7ffec40008c0, payload=0x7ffec41d80a0) at red_worker.c:10370
    #6 handle_dev_display_connect (opaque=0x7ffec40008c0, payload=0x7ffec41d80a0) at red_worker.c:11216
    #7 0x00007ffff5f63cc7 in dispatcher_handle_single_read (dispatcher=0x7ffff8a24ca8) at dispatcher.c:139
    #8 dispatcher_handle_recv_read (dispatcher=0x7ffff8a24ca8) at dispatcher.c:162
    #9 0x00007ffff5f8488e in red_worker_main (arg=<value optimized out>) at red_worker.c:11782
    #10 0x00007ffff7740851 in start_thread () from /lib64/libpthread.so.0
    #11 0x00007ffff580167d in clone () from /lib64/libc.so.6
    (gdb)

Bisected down:

    commit f11ac1cace1097d0ed8778472028b6f9292d2766
    Author: Soren Sandmann <ssp>
    Date: Wed Oct 10 19:15:44 2012 +0200

        qxl: Add set_client_capabilities() interface to QXLInterface

IMO this is a spice-server bug. Spice server must not call QXLInterface callbacks if the guest isn't running (i.e. before spice_server_vm_start is called by qemu).

Hi all,

I have tried the rhel6.3GA host (kernel-2.6.32-279.el6.x86_64 and qemu-kvm-0.12.1.2-2.295.el6.x86_64) many times, and QEMU did not core dump when booting a guest with spice + qxl and -S.
eg: ...-spice port=5932,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864

host info:

    # rpm -qa | grep qxl
    xorg-x11-drv-qxl-0.0.14-13.el6_2.x86_64
    # uname -r && rpm -q qemu-kvm
    2.6.32-279.el6.x86_64
    qemu-kvm-0.12.1.2-2.295.el6.x86_64

guest info:

    # uname -r
    2.6.32-279.el6.x86_64

So this issue is a regression, and we'd better fix it in 6.4.

Best Regards.
sluo

Fixed upstream in:

    commit 4ca54e596f81ae8ea914b83d3a6bd9df55cd777f
    Author: Alon Levy <alevy>
    Date: Thu Nov 1 12:30:25 2012 +0200

        server/red_worker: don't call set_client_capabilities if vm is stopped

        We try to inject an interrupt to the vm in this case, which we cannot do if it is stopped. Instead log this and update when vm restarts.

        RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=870972
        (that bz is on qemu, it will be cloned or just changed, not sure yet)

Waiting for flags to push.

*** Bug 869945 has been marked as a duplicate of this bug. ***

(moving to POST since patch is already in rhel package)

scratch build here.

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=5089166

I'll go ahead and push this, meanwhile some testing would be good. (tested locally).

Built: spice-server-0.12.0-3.el6

(In reply to comment #14)
> scratch build here.
>
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=5089166
>
> I'll go ahead and push this, meanwhile some testing would be good. (tested
> locally).

patch tested successfully by me (not the scratchbuild)

The 0008-server-red_worker-don-t-call-set_client_capabilities.patch patch causes qemu -spice port=5900 to segfault at startup. spice-server-0.12.0-1 and a scratch build based on spice-server-0.12.0-4 with just this patch disabled work as expected ( scratch build at https://brewweb.devel.redhat.com/taskinfo?taskID=5123484 ).
Should I move this bug back to ASSIGNED or open a new one?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-0529.html

    __spice_char_device_write_buffer_get: token violation: dev 0x7fffd4000960 client 0x7ffff8ae2f40

This happens because of:

    if (!migrated_data_tokens && dev_client->do_flow_control) {
        dev_client->num_client_tokens--;
    }

which lets num_client_tokens decrease to 0, and then:

    if (!migrated_data_tokens &&
        dev_client->do_flow_control && !dev_client->num_client_tokens) {
        spice_printerr("token violation: dev %p client %p", dev, client);
        spice_char_device_handle_client_overflow(dev_client);
        goto error;
    }

It should be fixed like this:

    if (dev->running && !migrated_data_tokens && dev_client->do_flow_control) {
        dev_client->num_client_tokens--;
    }
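
To make the token reasoning in the last comment easier to follow, here is a minimal toy model of the two checks it quotes. It is only a sketch: `ToyDev`, `ToyDevClient`, `toy_write_buffer_get`, `toy_writes_ok_while_stopped` and the `guard_on_running` flag are hypothetical names invented for illustration, not spice-server code, and the model assumes the device stays stopped and no tokens are returned in the meantime. Whether the proposed `dev->running` guard is the right fix for spice-server is not confirmed by anything in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real spice-server structs. */
typedef struct {
    bool running;               /* false while the VM is stopped */
} ToyDev;

typedef struct {
    bool do_flow_control;
    unsigned num_client_tokens;
} ToyDevClient;

/*
 * Toy version of the two checks quoted in the comment above.
 * Returns false where the real code would report a token violation.
 * guard_on_running is an illustrative switch (assumption): when true,
 * the decrement is skipped while the device is not running, as in the
 * proposed change.
 */
bool toy_write_buffer_get(const ToyDev *dev, ToyDevClient *dc,
                          bool migrated_data_tokens, bool guard_on_running)
{
    assert(dev != NULL && dc != NULL);

    /* Violation check: flow control is on and no tokens are left. */
    if (!migrated_data_tokens && dc->do_flow_control && !dc->num_client_tokens) {
        return false;
    }

    /* Token accounting, optionally guarded on the device running. */
    if ((!guard_on_running || dev->running) &&
        !migrated_data_tokens && dc->do_flow_control) {
        dc->num_client_tokens--;
    }
    return true;
}

/* Counts how many of `attempts` writes succeed against a stopped device. */
int toy_writes_ok_while_stopped(bool guard_on_running, unsigned tokens,
                                int attempts)
{
    ToyDev dev = { .running = false };
    ToyDevClient dc = { .do_flow_control = true, .num_client_tokens = tokens };
    int ok = 0;

    for (int i = 0; i < attempts; i++) {
        if (!toy_write_buffer_get(&dev, &dc, false, guard_on_running)) {
            break;
        }
        ok++;
    }
    return ok;
}
```

In this toy model, a stopped device with 3 tokens accepts 3 writes and rejects the 4th when the decrement is unguarded, while the guarded variant never consumes tokens and accepts all attempts.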