Hide Forgot
The Xinerama bug https://bugzilla.redhat.com/show_bug.cgi?id=730729 has a way to reproduce with a Linux guest containing a qxl driver with some fixes. I don't know if it is possible to reproduce with a Windows guest.
I started to work on this, to reproduce you need the package from bug #730729 or newer, the xorg.conf file from bug #730729, to start qemu with several qxl devices (-vga qxl -device qxl -device qxl -device qxl if you want 4). Start spicec while the VM OS is still booting, and spicec will eventually crash (probably when the agent starts) #0 RedScreen::set_update_interrupt_trigger (this=0x0, trigger=0x0) at ../../client/screen.cpp:910 #1 0x0000000000448788 in DisplayChannel::create_canvas ( this=0x7fffec008cc0, surface_id=1023, canvas_types=std::vector of length 1, capacity 1 = {...}, width=1024, height=1024, format=96) at ../../client/display_channel.cpp:1171 #2 0x0000000000470307 in SyncEvent::response (this=0x7fffdc000960, events_loop=<optimized out>) at ../../client/process_loop.cpp:37 #3 0x0000000000470760 in EventsQueue::process_events (this=0x745c10) at ../../client/process_loop.cpp:125 #4 0x000000000047096d in ProcessLoop::process_events_queue (this=0x745bd0) at ../../client/process_loop.cpp:314 #5 0x0000000000471010 in ProcessLoop::run (this=0x745bd0) at ../../client/process_loop.cpp:286 #6 0x000000000041ca79 in Application::run (this=0x745bd0) at ../../client/application.cpp:591 #7 0x000000000040a6b6 in Application::main (argc=5, argv=0x7fffffffe068, version_str=<optimized out>) at ../../client/application.cpp:2627 #8 0x000000000040a386 in main (argc=5, argv=0x7fffffffe068) at main.cpp:34
This is looking like a server bug, the primary surface for the second xinerama screen is never created, leading to this attempt to dereference a NULL pointer. It's create_primary_surface which assigns a non-NULL RedScreen to a DisplayChannel(which inherits from ScreenLayer), and it never gets called for the 2nd xinerama screen. At this point, this looks like a server bug.
I added more debugging code in spice-server, and I can see a request to create a non-primary surface on the 2nd xinerama head arriving before the request to create the corresponding primary surface. This explains the crash, the primary surface is expected to exist before trying to create non-primary surfaces red_create_surface_item: create surface 1023: 0x7fff78048410 red_create_surface_item: create primary surface: 0x7fff78048410 (the adress is the adress of the channel that is used, it's the 1st time it appears in the logs)
Command line to reproduce: qemu-system-x86_64 -bios /usr/share/seabios/bios.bin -net nic,model=ne2k_pci -net user -soundhw ac97 -spice port=5924,disable-ticketing,streaming-video=all -enable-kvm -m 1G -nographic -usb -usbdevice tablet -usb -vga qxl -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,nr=1,bus=virtio-serial0.0,chardev=vdagent,name=com.redhat.spice.0 -drive file=/home/teuf/vm-pool/rhel6-qxl.qcow2 -device qxl -device qxl -device qxl
Created attachment 523163 [details] The xorg.conf that can reproduce
(In reply to comment #5) > I added more debugging code in spice-server, and I can see a request to create > a non-primary surface on the 2nd xinerama head arriving before the request to > create the corresponding primary surface. This explains the crash, the primary > surface is expected to exist before trying to create non-primary surfaces It's definitely true that the driver creates non-primary surfaces before the primary ones, specifically, it creates a glyph pixmap before it sets a mode. This is something it has always done, and also does in the one-screen case (which raises the question: why doesn't it crash in that case?) Changing the driver to not do that would likely be hairy, since the assumption that you can allocate video memory regardless of the mode, is valid for normal video cards and therefore one that the X server is making. I'd tend to think that the client should be robust against this - is there a reason it can't be? Wrt. the case that the second screen is ignored; I haven't seen this, but I will try to reproduce with the given command lines and xorg.conf.
(In reply to comment #8) > It's definitely true that the driver creates non-primary surfaces before the > primary ones, specifically, it creates a glyph pixmap before it sets a mode. > This is something it has always done, and also does in the one-screen case > (which raises the question: why doesn't it crash in that case?) In the single screen case, I don't see a create surface coming before the primary surface is created, which explains why we don't hit this bug. Is this glyph pixmap recreated every time the primary surface is destroyed? > I'd tend to think that the client should be robust against this - is there a > reason it can't be? I haven't really looked at this yet, but I have nothing against making the client robust against this, I just wanted to figure out where is the right place to fix it first. > > Wrt. the case that the second screen is ignored; I haven't seen this, but I > will try to reproduce with the given command lines and xorg.conf. I get warnings about BARs overlapping when qemu starts up, and then (EE) qxl(1): Bad RAM signature dd2a68 at 0x7faeade68000 in xorg log. Let's say it's just an issue on my setup with stuff compiled by hand, and not spend more time on it for now.
Created attachment 523509 [details] proposed fix I'm still not sure why for the primary screen, we seem to be getting events "in order" (ie in a way that doesn't make spicec crash, and which is different from the order for the other screens)
So guys how about windows? Just got crash with xorg.conf from Comment 7 on 5.0.0-10019.
Could you attach /var/log/Xorg.0.log? When did it crash exactly? The xorg.conf file from this bug is meant to be used in the guest,
I mean the spicec.exe crashed not the Xorg. I'll need to reproduce this as it happened only once. As the same version of client worked with the same xorg.conf
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Cause On a Linux guest (RHEL-6.2 or newer) that uses Xinerama, when X starts up and creates one of the secondary screens, first a non-primary surface is created on the secondary screen, and then the primary surface for this screen is created. When it tries to handle the first non-primary surface creation, Spice-Client (DisplayChannel::create_canvas) assumes a screen has already been set for the DisplayChannel while this only happens upon primary surface creation. Consequence A spice client crashes when connected to a Linux guest (RHEL-6.2 or newer) using Xinerama setup. Fix Spice Client now checks that the screen was created, before doing operations on it. Result Spice Client no longer crashes when connecting to a Linux guest running Xinerama.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1518.html