Bug 732423 - Spice Client crashes when guest is running Xinerama
Summary: Spice Client crashes when guest is running Xinerama
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: spice-client
Version: 6.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Christophe Fergeau
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On: 730729 732467
Blocks: 750568
TreeView+ depends on / blocked
 
Reported: 2011-08-22 12:28 UTC by Søren Sandmann Pedersen
Modified: 2014-06-18 09:14 UTC (History)
8 users (show)

Fixed In Version: spice-client-0.8.2-6
Doc Type: Bug Fix
Doc Text:
Cause On a Linux guest (RHEL-6.2 or newer) that uses Xinerama, when X starts up and creates one of the secondary screens, first a non-primary surface is created on the secondary screen, and then the primary surface for this screen is created. When it tries to handle the first non-primary surface creation, Spice-Client (DisplayChannel::create_canvas) assumes a screen has already been set for the DisplayChannel while this only happens upon primary surface creation. Consequence A spice client crashes when connected to a Linux guest (RHEL-6.2 or newer) using Xinerama setup. Fix Spice Client now checks that the screen was created, before doing operations on it. Result Spice Client no longer crashes when connecting to a Linux guest running Xinerama.
Clone Of: 730729
: 750568 (view as bug list)
Environment:
Last Closed: 2011-12-06 15:28:27 UTC


Attachments (Terms of Use)
The xorg.conf that can reproduce (1.25 KB, application/octet-stream)
2011-09-14 14:06 UTC, Søren Sandmann Pedersen
no flags Details
proposed fix (1.54 KB, patch)
2011-09-16 07:08 UTC, Christophe Fergeau
no flags Details | Diff


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1518 normal SHIPPED_LIVE libcacard and spice-client bug fix and enhancement update 2011-12-06 00:50:43 UTC

Comment 2 Søren Sandmann Pedersen 2011-08-22 12:35:27 UTC
The Xinerama bug  https://bugzilla.redhat.com/show_bug.cgi?id=730729  has a way to reproduce with a Linux guest containing a qxl driver with some fixes. I don't know if it is possible to reproduce with a Windows guest.

Comment 3 Christophe Fergeau 2011-09-12 13:13:11 UTC
I started to work on this, to reproduce you need the package from bug #730729 or newer, the xorg.conf file from bug #730729, to start qemu with several qxl devices (-vga qxl -device qxl -device qxl -device qxl if you want 4). Start spicec while the VM OS is still booting, and spicec will eventually crash (probably when the agent starts)

#0  RedScreen::set_update_interrupt_trigger (this=0x0, trigger=0x0)
    at ../../client/screen.cpp:910
#1  0x0000000000448788 in DisplayChannel::create_canvas (
    this=0x7fffec008cc0, surface_id=1023, 
    canvas_types=std::vector of length 1, capacity 1 = {...}, width=1024, 
    height=1024, format=96) at ../../client/display_channel.cpp:1171
#2  0x0000000000470307 in SyncEvent::response (this=0x7fffdc000960, 
    events_loop=<optimized out>) at ../../client/process_loop.cpp:37
#3  0x0000000000470760 in EventsQueue::process_events (this=0x745c10)
    at ../../client/process_loop.cpp:125
#4  0x000000000047096d in ProcessLoop::process_events_queue (this=0x745bd0)
    at ../../client/process_loop.cpp:314
#5  0x0000000000471010 in ProcessLoop::run (this=0x745bd0)
    at ../../client/process_loop.cpp:286
#6  0x000000000041ca79 in Application::run (this=0x745bd0)
    at ../../client/application.cpp:591
#7  0x000000000040a6b6 in Application::main (argc=5, argv=0x7fffffffe068, 
    version_str=<optimized out>) at ../../client/application.cpp:2627
#8  0x000000000040a386 in main (argc=5, argv=0x7fffffffe068) at main.cpp:34

Comment 4 Christophe Fergeau 2011-09-13 16:14:04 UTC
This is looking like a server bug, the primary surface for the second xinerama screen is never created, leading to this attempt to dereference a NULL pointer.
It's create_primary_surface which assigns a non-NULL RedScreen to a DisplayChannel(which inherits from ScreenLayer), and it never gets called for the 2nd xinerama screen.
At this point, this looks like a server bug.

Comment 5 Christophe Fergeau 2011-09-14 13:04:30 UTC
I added more debugging code in spice-server, and I can see a request to create a non-primary surface on the 2nd xinerama head arriving before the request to create the corresponding primary surface. This explains the crash, the primary surface is expected to exist before trying to create non-primary surfaces

red_create_surface_item: create surface 1023: 0x7fff78048410
red_create_surface_item: create primary surface: 0x7fff78048410

(the adress is the adress of the channel that is used, it's the 1st time it appears in the logs)

Comment 6 Søren Sandmann Pedersen 2011-09-14 14:05:51 UTC
Command line to reproduce:

qemu-system-x86_64  -bios /usr/share/seabios/bios.bin  -net nic,model=ne2k_pci -net user -soundhw ac97  
-spice port=5924,disable-ticketing,streaming-video=all -enable-kvm -m 1G -nographic   
-usb -usbdevice tablet   -usb
-vga qxl -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 
-chardev spicevmc,name=vdagent,id=vdagent
-device virtserialport,nr=1,bus=virtio-serial0.0,chardev=vdagent,name=com.redhat.spice.0 
-drive file=/home/teuf/vm-pool/rhel6-qxl.qcow2  
-device qxl -device qxl -device qxl

Comment 7 Søren Sandmann Pedersen 2011-09-14 14:06:37 UTC
Created attachment 523163 [details]
The xorg.conf that can reproduce

Comment 8 Søren Sandmann Pedersen 2011-09-15 12:16:47 UTC
(In reply to comment #5)
> I added more debugging code in spice-server, and I can see a request to create
> a non-primary surface on the 2nd xinerama head arriving before the request to
> create the corresponding primary surface. This explains the crash, the primary
> surface is expected to exist before trying to create non-primary surfaces

It's definitely true that the driver creates non-primary surfaces before the primary ones, specifically, it creates a glyph pixmap before it sets a mode. This is something it has always done, and also does in the one-screen case (which raises the question: why doesn't it crash in that case?)

Changing the driver to not do that would likely be hairy, since the assumption that you can allocate video memory regardless of the mode, is valid for normal video cards and therefore one that the X server is making.

I'd tend to think that the client should be robust against this - is there a reason it can't be?

Wrt. the case that the second screen is ignored; I haven't seen this, but I will try to reproduce with the given command lines and xorg.conf.

Comment 9 Christophe Fergeau 2011-09-15 13:45:21 UTC
(In reply to comment #8)
> It's definitely true that the driver creates non-primary surfaces before the
> primary ones, specifically, it creates a glyph pixmap before it sets a mode.
> This is something it has always done, and also does in the one-screen case
> (which raises the question: why doesn't it crash in that case?)

In the single screen case, I don't see a create surface coming before the primary surface is created, which explains why we don't hit this bug. Is this glyph pixmap recreated every time the primary surface is destroyed?

> I'd tend to think that the client should be robust against this - is there a
> reason it can't be?

I haven't really looked at this yet, but I have nothing against making the client robust against this, I just wanted to figure out where is the right place to fix it first.

> 
> Wrt. the case that the second screen is ignored; I haven't seen this, but I
> will try to reproduce with the given command lines and xorg.conf.

I get warnings about BARs overlapping when qemu starts up, and then (EE) qxl(1): Bad RAM signature dd2a68 at 0x7faeade68000 in xorg log. Let's say it's just an issue on my setup with stuff compiled by hand, and not spend more time on it for now.

Comment 10 Christophe Fergeau 2011-09-16 07:08:22 UTC
Created attachment 523509 [details]
proposed fix

I'm still not sure why for the primary screen, we seem to be getting events "in order" (ie in a way that doesn't make spicec crash, and which is different from the order for the other screens)

Comment 14 Lubos Kocman 2011-10-18 10:35:10 UTC
So guys how about windows? Just got crash with xorg.conf from Comment 7 on 5.0.0-10019.

Comment 15 Christophe Fergeau 2011-10-18 11:00:06 UTC
Could you attach /var/log/Xorg.0.log? When did it crash exactly? The xorg.conf file from this bug is meant to be used in the guest,

Comment 16 Lubos Kocman 2011-10-19 12:29:53 UTC
I mean the spicec.exe crashed not the Xorg. I'll need to reproduce this as it happened only once. As the same version of client worked with the same xorg.conf

Comment 18 Uri Lublin 2011-11-21 16:30:37 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
  On a Linux guest (RHEL-6.2 or newer) that uses Xinerama, when X starts up and creates one of the secondary screens, first a non-primary surface is created on the secondary screen, and then the primary surface for this screen is created.
When it tries to handle the first non-primary surface creation, Spice-Client (DisplayChannel::create_canvas) assumes a screen has already been set for the DisplayChannel while this only happens upon primary surface creation.

Consequence
  A spice client crashes when connected to a Linux guest (RHEL-6.2 or newer) using Xinerama setup.

Fix
  Spice Client now checks that the screen was created, before doing operations on it.

Result
  Spice Client no longer crashes when connecting to a Linux guest running Xinerama.

Comment 19 errata-xmlrpc 2011-12-06 15:28:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1518.html


Note You need to log in before you can comment on or make changes to this bug.