Description of problem:
spice-server abort at red_worker.c:handle_dev_destroy_primary_surface

Reporting because I think the problem is real, but I could not reproduce it so far.

The actual problem: we are accessing red_dispatcher from two threads, and it was never designed for that. This can happen as seen in the stack traces below: the first user is a vga timer callback from the main thread, and the second is a qxl io handler from a vcpu thread. They both call red_dispatcher, which writes to the same pipe.

So either protect all calls to red_dispatcher with a mutex (independent from qemu_iothread_lock), or serialize them using a per-vcpu pipe to the main thread, which will then be the only user of red_dispatcher.

Version-Release number of selected component (if applicable):

How reproducible:
0% so far

Steps to Reproduce:
1. Boot up a winxp guest, crash during driver initialization.

Actual results:
crash

Expected results:
no crash

Additional info:
I got a panic in handle_dev_destroy_primary_surface, red_dispatcher.c

At the moment of the panic the state was inconsistent: qxl thought it was in NATIVE mode in one thread and in VGA mode in another. The main_loop had a timer-triggered vga refresh leading to a call to qemu_spice_destroy_host_primary (because of vga_draw_text calling qemu_console_resize), which is only possible if qxl0->mode==QXL_MODE_VGA. On the other hand, an io triggered qxl_create_guest_primary, which is called after qxl0->mode is set to QXL_MODE_NATIVE, and after ensuring we exit the vga state with qxl_exit_vga_mode.

This is running a F14 guest. I haven't been able to recreate it since, despite running several times (it did happen again once, which was enough to run it under a debugger).

more complete stack traces:

kvm_main_loop_cpu:
  ...
  kvm_handle_io
  ...
  qxl_create_guest_primary

kvm_main_loop:
  ...(timer)...
  gui_update
  dpy_refresh
  display_refresh
  qemu_spice_display_refresh
  vga_hw_update
  qxl_hw_update
  vga_update_display
  vga_draw_text
  qemu_console_resize
  dpy_resize
  display_resize
  qemu_spice_display_resize
  qemu_spice_destroy_host_primary

red_worker:
  handle_dev_destroy_primary_surface
  PANIC_ON(!worker->surfaces.surfaces[surface_id].context.canvas)
Alon, please try to reproduce with a smp (4 or 8 vcpus) guest.
*** Bug 680114 has been marked as a duplicate of this bug. ***
This situation is prevented by the latest locking fixes to bug 678208. I'm marking as duplicate because this is fixed by the same solution, but the bug is actually a different case - here it's an assert caused by dropping the global qemu mutex in the vcpu thread, and in 678208 it's a hang caused by taking the global qemu mutex from the spice server thread.

*** This bug has been marked as a duplicate of bug 678208 ***
I hit the same condition with:

centos
kernel: 3.6.2
qemu-kvm: 1.2.0
spice qxl video driver

My XP VM aborted, and I found these error messages in /var/log/libvirt/qemu/xxx.log:

validate_surface: failed on 9
validate_surface: panic !worker->surfaces[surface_id].context.canvas

I have hit this twice in two days. Are any further details needed?