Bug 731245

Summary: KDE fails to start inside a VM, large amount of memory [@ miCopyRegion]
Product: [Fedora] Fedora Reporter: Adam Williamson <awilliam>
Component: xorg-x11-drv-qxl Assignee: Søren Sandmann Pedersen <sandmann>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 16 CC: airlied, ajax, cfergeau, hdegoede, jreznik, kem, kevin, ltinkl, mcepl, mike.hinz, mjg, oliver.henshaw, rdieter, rh-bugzilla, rnovacek, rtguille, ry, sandmann, smparrish, sven, than, xgl-maint
Target Milestone: --- Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: [cat:crash] AcceptedBlocker
Fixed In Version: xorg-x11-drv-qxl-0.0.21-8.fc16 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-10-29 06:31:46 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 719957    
Attachments:
Description Flags
crash log
none
dmesg.cirrus
none
kdm.log.cirrus
none
/var/log/messages.cirrus
none
Xorg.0.log.cirrus
none
dmesg.qxl
none
kdm.log.qxl
none
/var/log/messages.qxl
none
Xorg.0.log.qxl
none
Xorg.0.log.X
none
screenshot of the borked panel rendering
none
.xsession-errors from a session with the messed-up graphics after the initial fix
none

Description Adam Williamson 2011-08-17 06:58:46 UTC
Booting the F16 Alpha RC5 KDE x86_64 live image inside a virt-manager VM - host F15, using Spice and the qxl video adapter - results in KDE crashing right at the end of startup (after it displays the KDE logo in the bootsplash sequence) and dumping you back to KDM, every time you try to log in.

Booting in 'basic graphics mode' - i.e. using vesa instead of the native qxl driver - works.

Debugging is complicated by the fact that the VCs of the VM are corrupted (confirmed by two testers), but virsh should make it possible. I'll try to get around to that soon; for now, I just wanted to have the failure logged.

Don't think this is an Alpha blocker as real hardware is apparently okay (I didn't test yet) and there's an easy workaround, but it's a bit of a pain.

Comment 1 Adam Williamson 2011-08-17 07:31:45 UTC
CommonBugs - definitely worth documenting.

Comment 2 Kevin Kofler 2011-08-17 10:44:29 UTC
This looks a lot like a driver bug to me. (I assume X is crashing, probably somewhere inside driver code.)

Comment 3 Adam Williamson 2011-08-17 17:51:52 UTC
I'll try to get the X log out with virsh later.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 4 Matěj Cepl 2011-08-22 12:37:55 UTC
Just to set needinfo.

Please add drm.debug=0x04 to the kernel command line, restart computer, and attach

* your X server config file (/etc/X11/xorg.conf, if available),
* X server log file (/var/log/Xorg.*.log*; check with grep Backtrace /var/log/Xorg* which logs might be the most interesting ones, send us at least Xorg.0.log)
* output of the dmesg command, and
* system log (/var/log/messages)

to the bug report as individual uncompressed file attachments using the bugzilla file attachment link above.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.

Comment 5 Adam Williamson 2011-08-23 01:24:32 UTC
Created attachment 519378 [details]
crash log

Seems like it has a problem with setting the display size, gets some huge size (2560 width) and falls over...

Comment 6 Adam Williamson 2011-08-23 03:26:23 UTC
Note that I believe the other person who hit this bug confirmed it happened to cirrus too, so this really isn't in the qxl driver. cirrus probably has the same issue with the unusually large display size.

Comment 7 Matěj Cepl 2011-08-23 12:25:04 UTC
Since you have this nice backtrace, I would need the exact version and arch of xorg-x11-drv-qxl and xorg-x11-server-Xorg, please.

[    35.396] 0: /usr/bin/X (xorg_backtrace+0x2f) [0x46207f]
[    35.396] 1: /usr/bin/X (0x400000+0x66dc6) [0x466dc6]
[    35.396] 2: /lib64/libpthread.so.0 (0x7fb2b5d0f000+0xf470) [0x7fb2b5d1e470]
[    35.396] 3: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0xb1d8) [0x7fb2b28bc1d8]
[    35.396] 4: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0x6b7b) [0x7fb2b28b7b7b]
[    35.396] 5: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0x7264) [0x7fb2b28b8264]
[    35.396] 6: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0x82c4) [0x7fb2b28b92c4]
[    35.396] 7: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0xeb7b) [0x7fb2b28bfb7b]
[    35.396] 8: /usr/bin/X (miCopyRegion+0x16a) [0x5449fa]
[    35.396] 9: /usr/bin/X (miDoCopy+0x34d) [0x544e9d]
[    35.396] 10: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0xe65e) [0x7fb2b28bf65e]
[    35.396] 11: /usr/bin/X (0x400000+0xdd5a7) [0x4dd5a7]
[    35.396] 12: /usr/bin/X (0x400000+0x2f932) [0x42f932]
[    35.396] 13: /usr/bin/X (0x400000+0x33861) [0x433861]
[    35.396] 14: /usr/bin/X (0x400000+0x22bf5) [0x422bf5]
[    35.397] 15: /lib64/libc.so.6 (__libc_start_main+0xed) [0x7fb2b449250d]
[    35.397] 16: /usr/bin/X (0x400000+0x22ee1) [0x422ee1]
[    35.397] Segmentation fault at address 0x1306984
[    35.397]
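
Each frame in the backtrace above has the form "module (base+offset) [address]". For frames without a symbol name, the offset relative to the module's load base is what a symbolizer needs in order to map the frame to a source line. A minimal Python sketch of pulling those fields out of the log lines follows; the names parse_frame and is_consistent and the regular expression are illustrative assumptions, not part of any Xorg tooling.

```python
import re

# Matches lines such as:
# [    35.396] 3: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fb2b28b1000+0xb1d8) [0x7fb2b28bc1d8]
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\d+):\s+(\S+)\s+"
    r"\(([^+()]+)\+(0x[0-9a-fA-F]+)\)\s+\[(0x[0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if no match."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    index, module, base, offset, address = m.groups()
    return {
        "index": int(index),
        "module": module,
        # base is either a symbol name (xorg_backtrace) or a hex load address
        "base": base,
        "offset": int(offset, 16),
        "address": int(address, 16),
    }


def is_consistent(frame):
    """For hex bases, check that base + offset equals the reported address."""
    if not frame["base"].startswith("0x"):
        return True  # symbol-relative frame, nothing to verify
    return int(frame["base"], 16) + frame["offset"] == frame["address"]
```

For frame 3 this gives offset 0xb1d8 inside qxl_drv.so, which is the kind of value one would hand to a tool such as addr2line together with the matching debuginfo package.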

Comment 8 Kevin Kofler 2011-08-23 14:55:57 UTC
In any case, the crash is clearly inside X and not in plasma-desktop or any other Plasma workspace executable.

Comment 9 Adam Williamson 2011-08-23 16:28:43 UTC
Whatever's in F16 Alpha, Matej, I don't have the numbers right to hand. Again, I think the fundamental problem is whatever causes the driver to think the display is so big, so it may well still boil down to KDE, especially as other desktops don't have problems. The same crash happens to cirrus, it's only vesa that works without it.

I'll investigate further today, anyway.




Comment 10 Kevin Kofler 2011-08-23 17:14:46 UTC
These are the live CD's compose logs:
http://koji.fedoraproject.org/koji/getfile?taskID=3278345&name=livecd.log
http://koji.fedoraproject.org/koji/getfile?taskID=3278345&name=root.log

The relevant package versions Matěj asked for:
xorg-x11-server-Xorg-1.10.99.1-8.20110510.fc16
xorg-x11-drv-qxl-0.0.21-3.fc16
xorg-x11-drv-cirrus-1.3.2-9.fc16

Comment 11 Kevin Kofler 2011-08-23 17:16:18 UTC
As for arch, Adam wrote in his original report that he tested the x86_64 live image, which means that all the packages are x86_64.

Comment 12 Kevin Kofler 2011-08-23 17:18:52 UTC
For the record, these are the x86_64 compose logs:
http://koji.fedoraproject.org/koji/getfile?taskID=3278346&name=livecd.log
http://koji.fedoraproject.org/koji/getfile?taskID=3278346&name=root.log
(The previous ones were i686.) But of course, the package versions are the same.

Comment 13 Matěj Cepl 2011-08-23 19:04:24 UTC
Yes, it looks like some kind of memory-related issue.

---------------------------------
In function miCopyRegion:
(from frame 8: /usr/bin/X (miCopyRegion+0x16a) [0x5449fa])
135:     }
136: 
137:     (*copyProc) (pSrcDrawable,
138: 		 pDstDrawable,
139: 		 pGC,
140: 		 pbox,
141: 		 nbox,
142: 		 dx, dy,
143: 		 reverse, upsidedown, bitPlane, closure);
144:     
145: >>>>>>>     free(pboxNew1);
146:     free(pboxNew2);
147: }
148: 
149: RegionPtr
150: miDoCopy (DrawablePtr	pSrcDrawable,
151: 	  DrawablePtr	pDstDrawable,
152: 	  GCPtr		pGC,
153: 	  int		xIn, 
154: 	  int		yIn,
155: 	  int		widthSrc, 


Frame 9: /usr/bin/X (miDoCopy+0x34d) [0x544e9d]
	/usr/src/debug/xorg-server-20110510/mi/micopy.c:334
	miDoCopy
Frame 11: /usr/bin/X (0x400000+0xdd5a7) [0x4dd5a7]
	/usr/src/debug/xorg-server-20110510/miext/damage/damage.c:866
	damageCopyArea
Frame 12: /usr/bin/X (0x400000+0x2f932) [0x42f932]
	/usr/src/debug/xorg-server-20110510/dix/dispatch.c:1648
	ProcCopyArea
Frame 13: /usr/bin/X (0x400000+0x33861) [0x433861]
	/usr/src/debug/xorg-server-20110510/dix/dispatch.c:432
	Dispatch
Frame 14: /usr/bin/X (0x400000+0x22bf5) [0x422bf5]
	/usr/src/debug/xorg-server-20110510/dix/main.c:291
	main

Comment 14 Adam Williamson 2011-08-23 19:16:40 UTC
the video adapter in a current virt-manager default VM emulates 64MB of RAM. This doesn't appear to be configurable, at least via virt-manager.

Comment 15 Dave Airlie 2011-08-24 09:01:31 UTC
move it back to QXL driver and ssp.

Comment 16 Kevin Kofler 2011-08-24 11:08:51 UTC
I guess the problem is that the driver needs too much video memory to handle the pixmaps from Qt's raster paint engine. Nokia changed the default paint engine from native X11 (XRender) to raster (software rendering) in 4.8, because they found XRender to hurt performance instead of improving it in their tests (and also because they don't want to maintain the X11/XRender paint engine, it has several regressions in 4.8 which they seem to have little to no interest in fixing). Basically, how raster works is that they paint everything into a double buffer in CPU memory using pure software and then blit the whole thing to the screen in one X11/XRender operation.
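
To put rough numbers on the video-memory hypothesis, the arithmetic can be sketched in Python. The 64MB figure comes from comment 14 and the 2560-pixel width from comment 5; the 1600-pixel height and the 4 bytes per pixel (32-bit colour) are assumptions made here, and the helper names are hypothetical.

```python
MIB = 1024 * 1024


def surface_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed full-screen pixmap in bytes."""
    return width * height * bytes_per_pixel


def surfaces_that_fit(vram_mib, width, height, bytes_per_pixel=4):
    """How many whole surfaces of this size fit in the given video RAM."""
    return (vram_mib * MIB) // surface_bytes(width, height, bytes_per_pixel)
```

Under these assumptions one 2560x1600 surface takes 16,384,000 bytes (about 15.6 MiB), so only four fit in 64 MiB, against 21 at 1024x768. This is only arithmetic; it does not show that memory exhaustion is what crashes the server.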

Comment 17 Kevin Kofler 2011-09-05 23:16:41 UTC
So we really need to figure out what's going wrong here:
* Is the detected display size the problem, as hinted in comment #5? Has this changed since F15? (It would be nice to have a log from F15 to compare with.)
* If it's not the display size, what is the problem? Is it really Qt 4.8's raster engine?
* What we really need to know is: What changed? This clearly worked before. Is it a Qt or KDE change triggering this driver crash? Or has the driver itself changed?

Comment 18 Adam Williamson 2011-09-08 00:23:52 UTC
wanted to investigate a bit further but can't get beta tc1 KDE live to install in a VM even in vesa mode, it seems to peg the CPU out and stop responding after you launch the installer :/ anyone else made it fly?

Comment 19 Adam Williamson 2011-09-16 02:46:08 UTC
Quick note, the original issue is still happening in 16 Beta RC1, will check if I can install it.

Comment 20 Kevin Kofler 2011-09-19 17:12:18 UTC
We really need to investigate further what's going on. Is it Qt 4.8 which is triggering this? Did the driver change?

Another question: Is this a blocker for the release? Not being able to start the live image in a VM sucks, though using basic video (vesa) is a workaround.

Finally, if installing still fails in vesa mode, we'll need a separate bug filed about that (and there too it might be a blocker).

Comment 21 Martin Kho 2011-09-23 10:50:26 UTC
Hi,

I was able to install Fedora 16 KDE RC1 in virt-manager using the vmvga driver. After the install I installed all updates. Next I tried the cirrus driver. This caused kdm to crash (see attachments .cirrus). Then I used the qxl driver. The login screen appears, but after logging in plasma-desktop crashed and the login screen appeared again (see attachments .qxl). Especially .xsession-errors.xxx looks interesting.

Hope this helps.

Martin Kho

Attachments: dmesg, Xorg.0.log, kdm.log, .xsession-errors.

Comment 22 Martin Kho 2011-09-23 10:51:12 UTC
Created attachment 524579 [details]
dmesg.cirrus

Comment 23 Martin Kho 2011-09-23 10:51:52 UTC
Created attachment 524580 [details]
kdm.log.cirrus

Comment 24 Martin Kho 2011-09-23 10:52:40 UTC
Created attachment 524581 [details]
/var/log/messages.cirrus

Comment 25 Martin Kho 2011-09-23 10:53:21 UTC
Created attachment 524583 [details]
Xorg.0.log.cirrus

Comment 26 Martin Kho 2011-09-23 10:54:05 UTC
Created attachment 524584 [details]
dmesg.qxl

Comment 27 Martin Kho 2011-09-23 10:54:44 UTC
Created attachment 524586 [details]
kdm.log.qxl

Comment 28 Martin Kho 2011-09-23 10:55:28 UTC
Created attachment 524587 [details]
/var/log/messages.qxl

Comment 29 Martin Kho 2011-09-23 10:56:12 UTC
Created attachment 524588 [details]
Xorg.0.log.qxl

Comment 30 Mike Hinz 2011-09-28 02:17:12 UTC
FYI.  I'm hitting exactly this same bug.  My host machine is Fedora 15-64bit with virt-preview repo enabled.  Everything is fully updated on the host.  My VM is F16-64bit-rc3 also fully updated.  

I can launch the GNOME desktop perfectly well with all the video drivers, using VNC or Spice. However, when I attempt to launch the KDE desktop on the VM from GDM, I get exactly the failure above if I use the QXL or cirrus driver. If I use the vmvga driver, the KDE desktop launches successfully.

If more details are needed, I can supply them, but everything is identical to the above reports and logs.

Mike

Comment 31 Mike Hinz 2011-09-28 12:58:12 UTC
FYI.  I'm hitting exactly this same bug.  My host machine is Fedora 15-64bit with virt-preview repo enabled.  Everything is fully updated on the host.  My VM is F16-64bit-rc3 also fully updated.  

I can launch the GNOME desktop perfectly well with all the video drivers, using VNC or Spice. However, when I attempt to launch the KDE desktop on the VM from GDM, I get exactly the failure above if I use the QXL or cirrus driver. If I use the vmvga driver, the KDE desktop launches successfully.

If more details are needed, I can supply them, but everything is identical to the above reports and logs.

Mike

Comment 32 Oliver Henshaw 2011-09-30 14:22:43 UTC
The crash isn't in xorg when using the cirrus driver - see attachment 524583 [details] for an example. When booting F16 Beta RC3 plasma-desktop is the crashee (presumably because kdm is bypassed by the autologin).

Additionally I can confirm that the raster setting is complicit in the crash, with the following steps.

1. boot liveimage in kvm to runlevel 3
2. add "export QT_GRAPHICSSYSTEM=native" to /etc/kde/env/env.sh
3. 'systemctl start graphical.target'

and kde starts with no crashes (whereas it crashes as expected with a normal boot or just steps 1 + 3).
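
Step 2 above adds an export line to /etc/kde/env/env.sh. When repeating the test with different values (native versus raster), it helps to replace any earlier setting instead of stacking several exports. A small Python sketch that operates on the file's text; the helper name set_graphicssystem is hypothetical, and reading and writing the file as root is left to the caller.

```python
def set_graphicssystem(env_text, value="native"):
    """Return env.sh contents with exactly one QT_GRAPHICSSYSTEM export."""
    wanted = "export QT_GRAPHICSSYSTEM=" + value
    # Drop any previous setting so the new line is the only one in effect.
    kept = [
        line for line in env_text.splitlines()
        if not line.strip().startswith("export QT_GRAPHICSSYSTEM=")
    ]
    kept.append(wanted)
    return "\n".join(kept) + "\n"
```

Applying the function twice with the same value leaves the text unchanged, so the step can be rerun safely between boots.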


Packages:

xorg-x11-server-Xorg-1.11.0-1.fc16.x86_64
xorg-x11-drv-cirrus-1.3.2-10.fc16.x86_64
kdebase-workspace-4.7.0-9.fc16.x86_64
qt-4.8.0-0.9.beta1.fc16.x86_64

Comment 33 Oliver Henshaw 2011-09-30 14:50:54 UTC
Using the native graphicssystem also works for spice, at least as far as getting kde to log in. That is, the desktop appears (on a normal boot the screen is corrupted, and even after switching to another virtual terminal there is graphical corruption obscuring the console).

However, the desktop is unresponsive and even switching virtual terminal from virt-manager doesn't work. I don't know if this proves anything because spice with a F15 host has never worked well for me.

xorg-x11-drv-qxl-0.0.21-4.fc16.x86_64

Comment 34 Adam Williamson 2011-09-30 19:43:45 UTC
Discussed at 2011-09-30 blocker review meeting. Accepted as a final blocker under criterion "The release must boot successfully as a virtual guest in a situation where the virtual host is running the previous stable Fedora release (using Fedora's current preferred virtualization technology)" (and the same criterion for 'same release', as this fails with F16 host as well).

We felt 'use vesa' was an okay workaround for Beta, but it really isn't for Final, we should fix this.

Comment 35 Hans de Goede 2011-09-30 20:28:01 UTC
Some input from the spice side:

1) Apparently we have an issue where KDE and the qxl driver do not play nice together on a 2560x1600 desktop

2) For some reason after logon KDE decides to make an Xrandr call to resize the
display to 2560x1600, hello KDE there is a reason why the driver chooses a certain default, don't override it without user interaction!!!

I think fixing 1 may be hard to do in the time window we've got left. Also, 2 is the bigger issue anyway, because even if it didn't crash, resizing the VM screen to 2560x1600 is not a good thing to do for people who have a display with a resolution smaller than that (iow 99.9% of all people).

I'm not sure what KDE is doing here exactly, but the QXL driver will default to a resolution of 1024x768 when there is no xorg.conf. KDE is overriding this without any user interaction, for reasons which are beyond me.

Comment 36 Kevin Kofler 2011-09-30 20:39:40 UTC
Well, I assume the virtual display claims it can do a maximum of 2560×1600, so KDE thinks that this is the native resolution of the LCD and resizes to it. The code was written for real screens, not fake ones. Real LCDs normally don't allow a higher resolution than native, so setting the resolution to the highest available makes sense.

Comment 37 Hans de Goede 2011-09-30 20:53:43 UTC
(In reply to comment #36)
> Well, I assume the virtual display claims it can do a maximum of 2560×1600, so
> KDE thinks that this is the native resolution of the LCD and resizes to it.

X will pick the best mode for the monitor before KDE (or any X client) ever comes into play. X has a *ton* of code to deal with picking the best mode; KDE should *never* need to override this.

Did you know, for example, that there are beamers which have a native resolution of 800x600 or 1024x768 yet can handle quarter or sometimes even full HD resolutions, so that they will work when you plug in a Blu-ray player? The fact that such a beamer claims it can handle 1920x1080 does not make that the best resolution to send to it, only to have it then downscale it. X knows these things and picks the resolution which is advertised in the DDC info as being the advised mode to use with the device.

KDE blindly overriding X here is just stupid, and is not needed in any real world scenario. KDE should not overrule X's resolution (and refresh rate) choice unless specifically told to do so by the user.

Comment 38 Kevin Kofler 2011-09-30 20:58:45 UTC
Could you please try to explain that to upstream (i.e. bugs.kde.org)?

Comment 39 Rex Dieter 2011-09-30 21:00:27 UTC
if no one beats me to it, I'll take it up with the kwin folks.

Comment 40 Hans de Goede 2011-09-30 21:21:53 UTC
(In reply to comment #39)
> if no one beats me to it, I'll take it up with the kwin folks.

Thanks! Please add me to the CC, my bugs.kde.org account is jwrdegoede .

2 remarks:

1) I'm not claiming that the crash at 2560x1600 is not (possibly) a qxl bug. I'm merely saying that, independent of that, KDE automatically switching to 2560x1600 when running inside a virtual machine is not a good thing to do, since that will very likely make the VM's display larger than the actual screen through which the VM is viewed.

2) xrandr has the notion of a preferred mode, currently the qxl driver is not marking any mode as the preferred mode, since well there really isn't one.

I hope we can come to an agreement with KDE upstream on not changing the mode from the initial mode chosen by Xorg at all (unless the user explicitly chose a different mode in a previous session and the monitor config has not changed).

If we cannot agree on that then the KDE code which calls xrandr should at least honor the preferred mode flag (maybe it already does I don't know), so it will do the right thing in the case of the beamer example I gave. If / when it does that then I can add the preferred flag to the 1024x768 mode in the qxl driver fixing things that way.

Thanks,

Hans
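
The policy proposed in comment 40 can be sketched in Python as follows. This is a hypothetical illustration of the proposal only: the Mode type, the choose_startup_mode function and the may_override switch are invented names, and nothing here is taken from KDE, Xorg or the qxl driver.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mode:
    width: int
    height: int
    preferred: bool = False


def choose_startup_mode(current: Mode, available: List[Mode],
                        saved: Optional[Mode] = None,
                        may_override: bool = False) -> Mode:
    """Pick the mode a session should start in (hypothetical policy)."""
    # 1. An explicit user choice from a previous session wins, if the
    #    same dimensions are still offered by the output.
    if saved is not None:
        for mode in available:
            if (mode.width, mode.height) == (saved.width, saved.height):
                return mode
    # 2. Fallback position: if the desktop insists on overriding, honor
    #    the driver's preferred flag rather than taking the largest mode.
    if may_override:
        for mode in available:
            if mode.preferred:
                return mode
    # 3. Otherwise leave the mode chosen by Xorg untouched.
    return current
```

With no saved configuration and no preferred flag, as with the qxl driver described above, the function returns the current mode, so a session that starts at 1024x768 stays there.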

Comment 41 Kevin Kofler 2011-09-30 22:27:20 UTC
Doesn't it check the preferred mode flag already? I haven't checked the code yet, but it's quite possible that it uses the preferred mode if set, and the largest one if there's no preferred mode.

Comment 42 Kevin Kofler 2011-09-30 22:34:44 UTC
The preferred mode for a VM should probably be the next lower resolution with the same or similar aspect ratio to the host's resolution, e.g. 1024×768 if the host is 1280×1024, 1024×640 if the host is 1280×800 (1024×768 is unlikely to fit if the panel is huge or if there are 2 panels).

Comment 43 Hans de Goede 2011-10-01 07:40:58 UTC
(In reply to comment #41)
> Doesn't it check the preferred mode flag already?

I don't know.

> I haven't checked the code
> yet, but it's quite possible that it uses the preferred mode if set, and the
> largest one if there's no preferred mode.

And where is the logic in just choosing the largest mode? Any modern display will indicate a preferred mode through its DDC info. If there is no preferred mode, perhaps Xorg has a good reason to choose a different mode as default (such as a different mode being specified in xorg.conf). KDE is overriding both Xorg's own default, and even any user configuration done through xorg.conf here...

For that matter even changing the mode to the preferred mode seems a weird thing to do, Xorg *knows* this is the preferred mode, and thus if it is not already running in this mode, it probably has good reasons for that...

Anyways we need to discuss this with upstream. As said if they insist on overruling Xorg's chosen resolution on session startup, even though the user never asked for that, I'm willing to set a preferred mode flag in the QXL driver.

(In reply to comment #42)
> The preferred mode for a VM should probably be the next lower resolution with
> the same or similar aspect ratio to the host's resolution, e.g. 1024×768 if the
> host is 1280×1024, 1024×640 if the host is 1280×800 (1024×768 is unlikely to
> fit if the panel is huge or if there are 2 panels).

VMs nowadays usually use either VNC or Spice, which are network transparent. The VM itself thus does not talk directly through X and does not know the client resolution, especially since different clients with different resolutions can connect. It is quite possible that no client is connected at all at the moment the qxl (or cirrus) driver initializes.

Comment 44 Rex Dieter 2011-10-03 17:29:27 UTC
OK, feedback from kwin upstream is that it includes no xrandr calls that should change anything.  I've confirmed 

* options.cpp only contains code to fetch refresh rate:
 rate = XRRConfigCurrentRate(config);

* events.cpp only calls
 XRRUpdateConfiguration(e);
when any Extensions::randrNotifyEvent() (i.e., RRScreenChangeNotify) is triggered.


I'll continue to look elsewhere, like kde-workspace/kcontrol/randr, but I'm pretty sure that only comes to play when 
1.  restoring a session from saved randr settings
2.  running krandrtray systray applet
3.  running systemsettings->display&monitor->size&orientation

I'm pretty sure none of these happen with a fresh/live user on auto-login.

Comment 45 Hans de Goede 2011-10-03 19:50:28 UTC
Rex,

Thanks for looking into this. I just double-checked with a fully up-to-date F-16 virtual machine, using qxl and gnome-3 as desktop. I created a fresh user, and both in gdm and after logging in Xorg's resolution stays at the driver default of 1024x768. Starting display preferences (which reads xrandr display info) also does not cause any change of the mode. So to me it still seems as if something in KDE is causing this ...

Regards,

Hans

Comment 46 Rex Dieter 2011-10-03 19:56:54 UTC
I've poked the solid/powersave/krandr folks for comment/feedback now, I'm out of ideas where to look.

Comment 47 Martin Kho 2011-10-03 20:50:11 UTC
Hi,

I've booted into text mode (runlevel 3) and started X (no KDE stuff) with the qxl driver enabled. In Xorg.0.log there is the entry:

[    87.324] (**) qxl(0): Virtual size is 2560x1600 (pitch 2560)

Maybe this can help?

Martin Kho

Note: I'll attach Xorg.0.log.X

Comment 48 Martin Kho 2011-10-03 20:51:11 UTC
Created attachment 526141 [details]
Xorg.0.log.X

Comment 49 Hans de Goede 2011-10-04 07:20:32 UTC
Hi,

(In reply to comment #46)
> I've poked the solid/powersave/krandr folks for comment/feedback now, I'm out
> of ideas where to look.

I guess that KDE has code to restore the resolution last configured by the user, just like GNOME 3 has. I wonder what that code does if there is no saved config to restore? That would be my first guess of where to look. Note that this is just a hunch; I could of course be completely wrong.

Regards,

Hans

Comment 50 Radek Novacek 2011-10-04 14:12:15 UTC
I don't think this is caused by changing resolution. The same crash happens when I click on Session Type or Menu buttons on the login screen.

Comment 51 Rex Dieter 2011-10-04 14:47:42 UTC
Re: comment #49

To be clear, stuff I've tested by disabling/removing functionality:

1.  randr session restoration
2.  powerdevil (kded_powerdevil.so)
3.  systemsettings display/monitor kcm
4.  kded_randrmonitor.so

and, still, this crash remains.  I've now run out of places to look (pending feedback per comment #46 )

Interestingly (and possibly re-iterating), this only happens when using QT_GRAPHICSSYSTEM=raster, if one sets QT_GRAPHICSSYSTEM=native (X11), the crash does not occur and those 
[    87.324] (**) qxl(0): Virtual size is 2560x1600 (pitch 2560)
lines from the driver are replaced with 1024x768 which is more expected.

Hans (or others?), are you *sure* there's an randr call happening somewhere on login?  If so, what's the evidence again?

Comment 52 Rex Dieter 2011-10-04 14:49:14 UTC
Oh, and Re: comment #50 and login screen crashes

I saw those too in various testing, but didn't look closely... are you sure it's the *same* crash as here (same backtrace?) or could it be something else? (do we need another bug filed?)

Comment 53 Hans de Goede 2011-10-04 16:46:10 UTC
(In reply to comment #51)
> Re: comment #49
> 
> To be clear, stuff I've tested by disabling/removing functionality:
> 
> 1.  randr session restoration
> 2.  powerdevil (kded_powerdevil.so)
> 3.  systemsettings display/monitor kcm
> 4.  kded_randrmonitor.so
> 
> and, still, this crash remains.  I've now run out of places to look (pending
> feedback per comment #46 )
> 
> Interestingly (and possibly re-iterating), this only happens when using
> QT_GRAPHICSSYSTEM=raster, if one sets QT_GRAPHICSSYSTEM=native (X11), the crash
> does not occur and those 
> [    87.324] (**) qxl(0): Virtual size is 2560x1600 (pitch 2560)
> lines from the driver are replaced with 1024x768 which is more expected.
> 
> Hans (or others?), are you *sure* there's an randr call happening somewhere on
> login?  If so, what's the evidence again?

Hmm, I think we may have been miscommunicating here. I thought that the actual resolution was being changed to 2560x1600? Do you mean that kdm / the session (for as long as it shows) is running at 1024x768? The virtual resolution being 2560x1600 is expected behavior, as the virtual res gets set to the largest size the setup can handle. I thought that the problem is / was that the actual resolution xorg is running at is getting set to 2560x1600.
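
Since the "Virtual size" line reports the largest framebuffer the setup can handle and not the mode actually in use, it can help to extract it separately when comparing logs from different runs. A minimal Python sketch; the function name virtual_size is hypothetical, and the pattern is based only on the log line quoted in this bug.

```python
import re

# Pattern taken from the Xorg.0.log line quoted in comment 47.
VIRTUAL_RE = re.compile(r"Virtual size is (\d+)x(\d+) \(pitch (\d+)\)")


def virtual_size(log_line):
    """Return (width, height, pitch) from an Xorg log line, or None."""
    m = VIRTUAL_RE.search(log_line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())
```

The running resolution has to be checked by other means, for example by running xrandr inside the session, which marks the current mode with an asterisk.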

Comment 54 Rex Dieter 2011-10-04 17:25:58 UTC
OK, I don't think anything is actually setting 2560x1600 then.

My dirty test was this:
I set QT_GRAPHICSSYSTEM=native in /etc/profile.d/ to at least get kdm to run (modulo the crashes there mentioned in comment #50 and comment #52), and in my user session I set QT_GRAPHICSSYSTEM=raster in ~/.kde/env/.

I see kdm start @ 1024x768, and start to login, ksplash proceeds along nicely until it's done, then... crash (presumably somewhere midstream during the kde startup process, I would guess either kwin or plasma perhaps).

Comment 55 Radek Novacek 2011-10-05 11:41:46 UTC
I tried installing the older qxl driver (xorg-x11-drv-qxl-0.0.21-3.fc16 from http://koji.fedoraproject.org/koji/buildinfo?buildID=239996, ignoring the potential ABI incompatibility) and it does NOT crash. But it crashes with xorg-x11-drv-qxl-0.0.21-4.fc16.

Changelog says that change between -3 and -4 is:
* Thu Aug 18 2011 Adam Jackson <ajax> - 0.0.21-4
- Rebuild for xserver 1.11 ABI

So it looks like some API changed, not only the ABI.

Comment 56 Radek Novacek 2011-10-06 09:54:11 UTC
Please, ignore my last comment. The qxl driver didn't load and it used mesa driver, so the crash didn't happen.

Comment 57 Kevin Kofler 2011-10-15 23:30:54 UTC
*** Bug 746444 has been marked as a duplicate of this bug. ***

Comment 58 Adam Williamson 2011-10-24 22:58:24 UTC
I've asked ajax to take a look at this, but he's also trying to work on the gnome/fallback/clutter/video playback bug.

Comment 59 Søren Sandmann Pedersen 2011-10-26 12:36:43 UTC
Full backtrace below. The "dest_stride=-1" looks really suspicious.


#0  hashlittle (key=<optimized out>, length=2048, initval=<optimized out>) at lookup3.c:300
#1  0x00007fb3b4367d2b in hash_and_copy (height=27, width=512, bytes_per_pixel=4, dest_stride=-1, 
    dest=0x0, src_stride=4096, src=0x29b4100 <Address 0x29b4100 out of bounds>) at qxl_image.c:62
#2  qxl_image_create (qxl=0x12e83e0, data=0x29b4100 <Address 0x29b4100 out of bounds>, 
    x=<optimized out>, y=<optimized out>, width=512, height=27, stride=4096, Bpp=4)
    at qxl_image.c:136
#3  0x00007fb3b4368414 in real_upload_box (y2=768, x2=512, y1=741, x1=0, surface=0x1357150)
    at qxl_surface.c:1054
#4  upload_box (surface=0x1357150, x1=0, y1=741, x2=1024, y2=768) at qxl_surface.c:1083
#5  0x00007fb3b4369474 in qxl_surface_finish_access (surface=0x1357150, pixmap=0x2054c70)
    at qxl_surface.c:1106
#6  0x00007fb3b436fd2b in uxa_copy_n_to_n (pSrcDrawable=0x28022a0, pDstDrawable=0x207ae20, 
    pGC=0x207b440, pbox=0x7fff7badbb40, nbox=1, dx=0, dy=-741, reverse=0, upsidedown=0, bitplane=0, 
    closure=0x0) at uxa-accel.c:625
#7  0x000000000055587a in miCopyRegion ()
#8  0x0000000000555d1d in miDoCopy ()
#9  0x00007fb3b436f80e in uxa_copy_area (dsty=0, dstx=0, height=27, width=<optimized out>, 
    srcy=<optimized out>, srcx=<optimized out>, pGC=<optimized out>, pDstDrawable=<optimized out>, 
    pSrcDrawable=<optimized out>) at uxa-accel.c:644
#10 uxa_copy_area (pSrcDrawable=<optimized out>, pDstDrawable=<optimized out>, pGC=<optimized out>, 
    srcx=<optimized out>, srcy=<optimized out>, width=<optimized out>, height=27, dstx=0, dsty=0)
    at uxa-accel.c:633
#11 0x000000000050a487 in ?? ()
#12 0x000000000042fb02 in ?? ()
#13 0x0000000000433a31 in ?? ()
#14 0x0000000000422dc5 in ?? ()
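
The source-side numbers in frames #0 to #4 agree with each other: hashlittle is handed length=2048, which equals width * bytes_per_pixel (512 * 4); height=27 equals y2 - y1 (768 - 741); and src_stride=4096 corresponds to a 1024-pixel-wide surface at 4 bytes per pixel, of which real_upload_box handles a 512-pixel-wide part. The Python sketch below walks such a strided image row by row. It assumes a conventional row-major layout and a hypothetical helper named row_spans; it is not the driver's hash_and_copy.

```python
def row_spans(width, height, stride, bytes_per_pixel):
    """Yield (byte_offset, byte_length) for each row of a strided image."""
    row_bytes = width * bytes_per_pixel
    for row in range(height):
        # Rows start one stride apart; only row_bytes of each are used.
        yield row * stride, row_bytes


# Arguments taken from frames #1 to #3 of the backtrace.
spans = list(row_spans(width=512, height=27, stride=4096, bytes_per_pixel=4))
```

Seen this way the source side looks like an ordinary 512x27 tile of a 1024-wide surface. The values that stand out in the backtrace are dest=0x0, dest_stride=-1 and the src pointer that gdb reports as out of bounds.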

Comment 60 Søren Sandmann Pedersen 2011-10-26 17:00:00 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=3461838

has a fix. I believe the real bug is elsewhere, but for F16, this will have to do.

Comment 61 Fedora Update System 2011-10-26 17:05:18 UTC
xorg-x11-drv-qxl-0.0.21-6.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/xorg-x11-drv-qxl-0.0.21-6.fc16

Comment 62 Adam Williamson 2011-10-26 18:22:38 UTC
The fix works insofar as KDE now starts up, but KDE panel rendering is completely broken. The panel shows up either black, black and blue, or looking like the desktop wallpaper. Clicking where the 'K' button should be brings up about half of the 'start menu', the top half doesn't look like it's there but seems to work. repainting inside the 'start menu' doesn't work right at all. hard to describe but easy to reproduce, just build a live image with the fixed qxl and try using it. I can upload such an image if necessary. (or you can do an install into a VM and update the qxl package, whatever).

Comment 63 Adam Williamson 2011-10-26 18:23:24 UTC
Created attachment 530351 [details]
screenshot of the borked panel rendering

Comment 64 Fedora Update System 2011-10-26 19:04:22 UTC
Package xorg-x11-drv-qxl-0.0.21-6.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing xorg-x11-drv-qxl-0.0.21-6.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-14943
then log in and leave karma (feedback).

Comment 65 Adam Williamson 2011-10-26 22:12:58 UTC
Investigating the issue more - it seems like when I, say, click the 'K' button, an entry "Zero width or height" gets written to Xorg.0.log .

.xsession-errors is full of suspicious errors, which I'll attach.

Comment 66 Adam Williamson 2011-10-26 22:14:08 UTC
Created attachment 530395 [details]
.xsession-errors from a session with the messed-up graphics after the initial fix

Comment 67 Adam Williamson 2011-10-26 22:31:55 UTC
Ajax, airlied, do you have any ideas on this second issue?

Comment 68 Adam Williamson 2011-10-26 23:25:21 UTC
This live image reproduces the issue:

http://adamwill.fedorapeople.org/adamwkde-20111025-x86_64.iso

just boot it inside a standard f15/f16 KVM and you should see the issue.

Comment 69 Martin Kho 2011-10-27 10:03:58 UTC
Hi,

Just a little question. I've tried the iso that Adam created, with all default virt-manager settings. Booting the iso results in a black screen, because the cirrus driver is used rather than qxl - at least on Fedora 15. So my question is: is the qxl driver really the blocking factor?

Thanks,

Martin Kho

Comment 70 Adam Williamson 2011-10-27 15:53:10 UTC
ah, I thought F15 was Spice-by-default but maybe not. Or is this an old VM you're re-using? Have you tried creating a new one?

Cirrus is the driver you get when using an old-style VM which uses VNC for graphics; qxl is the driver you get when using a new-style VM which uses Spice for graphics.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 71 Martin Kho 2011-10-27 16:38:33 UTC
Uh, no I created a new one for your ISO.

Martin Kho

Btw. I can retry it with Fedora 16 as host.

Comment 72 Adam Williamson 2011-10-27 16:41:59 UTC
sure. but if there's another bug with f15's default VM config that'd be unfortunate :(

Comment 73 Martin Kho 2011-10-27 16:50:04 UTC
I'm not sure, but I'm now trying your ISO in Fedora 16. Here I also get the Cirrus driver.

Comment 74 Kevin Kofler 2011-10-27 17:05:21 UTC
That cirrus is also broken has been known since at least comment #6. Unfortunately, none of us had the reflex to clone the bug.

Comment 75 Adam Williamson 2011-10-27 17:14:16 UTC
yeah, we should probably do that before it gets too confusing. I'm just a bit worried about why martin is getting cirrus in f15 and f16 when i'd expect spice/qxl, but maybe it's an upgraded system and doesn't have the necessary packages for spice support or something.

anyway, let's file a new bug for cirrus (please not a clone, clones are horrible) and stick to qxl here.

Comment 76 Martin Kho 2011-10-27 17:53:48 UTC
Hi Adam,

I was running virt-manager on an upgraded f15. Now I've installed virtualization on f16 final TC2 ... and again I get the cirrus-driver ;-(

Sorry.

Martin Kho
Do I have to open a separate report for the cirrus issue?

Comment 77 Adam Williamson 2011-10-27 18:02:21 UTC
Yes, we need a new report for that, to avoid things getting confusing. With all the usual X logs and so on.

Comment 78 Martin Kho 2011-10-27 19:19:03 UTC
Hi,

Opened Bug 749647


Martin Kho

Comment 79 Adam Williamson 2011-10-27 21:38:35 UTC
So, comment #32 has some interesting info I managed to miss before: the initial bug only happens with the 'raster' Qt graphics system, and you can work around it by switching to the 'native' one (which I believe is xrender/x11).

And here's another interesting thing: if I do this with the original qxl, before soren's partial fix for this bug, KDE works fine: menu rendering is good.

If I do this with the qxl that fixed the crash but caused the 'messed up rendering' issue, the rendering is _still_ messed up: the K menu is cut off halfway and doesn't redraw correctly.

So it seems like soren's fix actually caused some kind of regression which persists even in the 'native' Qt graphics system.

So, as an alternative fix for this, could we back out soren's 'fix' and patch Qt to default to 'native' if it sees the qxl or cirrus X driver?

Comment 80 Adam Williamson 2011-10-27 21:39:51 UTC
actually, false alarm, somewhat: turns out the corruption does still happen with the 'old' qxl + native rendering, I just got lucky for a couple of tries. :(

Comment 81 Adam Williamson 2011-10-27 21:40:59 UTC
when using 'native' rendering, the error in Xorg.0.log that seems to correspond to misrenders is "Out of surfaces".
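
To make it easier to correlate misrenders with log entries, here is a rough sketch that counts how many lines of an X log contain each of the two messages quoted in this bug ("Zero width or height" and "Out of surfaces"). The function name is made up for illustration, and nothing is assumed about how the X server prefixes these messages, so it only does a plain substring match.

```python
# The two messages quoted in this bug report. Only a substring match is used,
# because the exact prefix the X server puts in front of them is not assumed.
MESSAGES = ("Zero width or height", "Out of surfaces")


def count_messages(log_text, messages=MESSAGES):
    """Return a dict mapping each message to the number of lines containing it."""
    counts = {message: 0 for message in messages}
    for line in log_text.splitlines():
        for message in messages:
            if message in line:
                counts[message] += 1
    return counts
```

Feeding it the contents of Xorg.0.log before and after clicking the 'K' button should show whether the counters move with each misrender.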

Comment 82 Adam Williamson 2011-10-27 21:48:41 UTC
well, actually, native does seem significantly better. it's still clearly a bit screwed - the kde menu comes up broken occasionally and things take a bit of time to redraw - but it's usable.

Comment 83 Adam Williamson 2011-10-27 22:05:24 UTC
For background, Qt was switched from 'native' to 'raster' by default in August, in response to:

https://bugzilla.redhat.com/show_bug.cgi?id=712617

but I'm not actually seeing any of the rendering problems discussed in that bug when using 'native' inside a KVM. It actually works okay, really. I've managed to get through a complete install, run kmail, run konqueror...

Comment 84 Kevin Kofler 2011-10-27 22:17:17 UTC
Actually, the default changed in (pre-F16-branching) Rawhide when we imported Qt 4.8. We explicitly set raster in the configure options, but we found out that Qt 4.8 actually defaults to raster if nothing is specified on the configure line (and specifying nothing is what we had done until August).

Bug #712617 and related issues were reported by users:
* using the few odd apps such as KSnapshot which explicitly force the native graphicssystem because they depend on its internals or
* explicitly forcing the native graphicssystem themselves.

That said, be warned that there are also the few odd apps forcing the raster graphicssystem (because the developers found performance issues with native), so even forcing native by default won't necessarily stop all crashes.

Comment 85 Adam Williamson 2011-10-27 22:42:54 UTC
so, to sum up where we're at with this:

It seems 'native' (i.e. Xrender-based) Qt rendering more or less works on qxl/cirrus, while 'raster' is broken. 'raster' with qxl, after the initial fix from Soren, still has the problem with rendering of some elements of KDE: the panel, the right-click menu on the 'Install to Hard Drive' icon, the alt-tab display and menus in Konqueror, at least, are all broken.

The regressions which caused our KDE team to switch from 'native' to 'raster' by default are fixed, but it's believed that 'raster' is likely faster in many scenarios, and upstream seems to favour 'raster' over 'native' now: Qt 4.8 defaults to 'raster', while 4.7 and earlier defaulted to 'native'. qxl + raster is definitely a lot faster than qxl+native if you ignore the rendering errors.

So, we can patch Qt back to 'native' by default, but it's very late for that, is likely to be less well-supported upstream, and probably costs in terms of performance.

We can try to patch Qt to use 'native' only when the qxl or cirrus drivers are in use, but that'd be new code as Qt has nothing to detect X drivers currently.

Or we can get both qxl and cirrus patched to work with 'raster', quickly.

The last is likely the best option, if it's feasible.
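
For reference, the second option (using 'native' only when qxl or cirrus is in use) could be approximated outside Qt, in a session wrapper, rather than as new Qt code. The sketch below is only that: it assumes the X driver can be recognised from `LoadModule: "<name>"` lines in Xorg.0.log, and that Qt honours the QT_GRAPHICSSYSTEM environment variable (to my knowledge Qt 4.7 and later do, but please verify). All function names are hypothetical.

```python
import os
import re

# Drivers reported as misbehaving with the 'raster' graphics system in this bug.
AFFECTED_DRIVERS = {"qxl", "cirrus"}

# Assumption: the X server logs each module it loads as: LoadModule: "<name>"
_LOADMODULE = re.compile(r'LoadModule: "([^"]+)"')


def loaded_modules(xorg_log_text):
    """Return the set of module names mentioned in LoadModule lines."""
    return set(_LOADMODULE.findall(xorg_log_text))


def pick_graphicssystem(xorg_log_text, default="raster"):
    """Choose 'native' if an affected driver was loaded, else the default."""
    if loaded_modules(xorg_log_text) & AFFECTED_DRIVERS:
        return "native"
    return default


def session_environment(xorg_log_text, base_env=None):
    """Return a copy of the environment with QT_GRAPHICSSYSTEM set accordingly."""
    env = dict(os.environ if base_env is None else base_env)
    env["QT_GRAPHICSSYSTEM"] = pick_graphicssystem(xorg_log_text)
    return env
```

Two caveats: a module that was loaded and later unloaded would still match, and apps that explicitly force a graphics system (see comment 84) would presumably ignore the variable, so this would not stop every crash.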

Comment 86 Hans de Goede 2011-10-28 08:25:35 UTC
(In reply to comment #75)
> yeah, we should probably do that before it gets too confusing. I'm just a bit
> worried about why martin is getting cirrus in f15 and f16 when i'd expect
> spice/qxl, but maybe it's an upgraded system and doesn't have the necessary
> packages for spice support or something.

I'm not sure whether this is a dependency issue or not. If you've upgraded from one Fedora release to the next, then the default for new VMs will be cirrus, not qxl / spice. Only fresh F-15 installs get the new qxl default. Upgraded installs can change the default under virt-manager -> edit -> preferences -> vm details, by changing "Install graphics" to spice.

Comment 87 Oliver Henshaw 2011-10-28 10:16:49 UTC
Another data point: I upgraded from F13 straight to F15 and I'm fairly sure virt-manager did indeed default to qxl for new VMs.

But I still tend to opt for cirrus because of bug #707758 - which seems to be fixed upstream but not in F15.

Comment 88 Oliver Henshaw 2011-10-28 12:44:51 UTC
(In reply to comment #63)
> Created attachment 530351 [details]
> screenshot of the borked panel rendering

This looks reminiscent of https://bugs.freedesktop.org/show_bug.cgi?id=22566 - plasma popups cause holes to be bitten out of windows in certain circumstances(*). This only happened when using kwin in non-compositing mode with the composite/render xorg extension enabled(**). So on a hunch I tried adamwkde-20111025-x86_64.iso with this snippet in xorg.conf.d/

Section "Extensions"
        Option  "Composite" "Disable"
EndSection

and it works - there's no kde menu, right-click menu or panel corruption.

You can test this by writing the xorg.conf.d snippet in runlevel 3 and starting graphical.target (and clicking through the firstboot dialog, for some reason), or by booting to the corrupted desktop and using krunner (Alt+F2) to launch "konsole" and then again to "logout".
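
If you want to script the test, here is a small sketch that writes the snippet above into the xorg.conf.d directory. The file name 90-no-composite.conf and the function name are hypothetical choices for illustration; the snippet text is the one quoted in this comment. It needs to run as root to write under /etc/X11/xorg.conf.d.

```python
import os

# The Extensions section quoted above, verbatim.
SNIPPET = (
    'Section "Extensions"\n'
    '        Option  "Composite" "Disable"\n'
    'EndSection\n'
)


def write_composite_snippet(conf_dir="/etc/X11/xorg.conf.d",
                            name="90-no-composite.conf"):
    """Write the snippet to conf_dir/name and return the resulting path.

    The default file name is an arbitrary, hypothetical choice.
    """
    path = os.path.join(conf_dir, name)
    with open(path, "w") as fh:
        fh.write(SNIPPET)
    return path
```

Removing the file and restarting X afterwards should bring the corruption back, which makes for a quick A/B check.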



* although if it is the same bug it's a new manifestation. I can see what looks like a rounded corner on the grey border on the top-right of the menu. So it's more like the whole menu is hidden but you can see a bite-sized part of it?

** I can also see https://bugs.freedesktop.org/show_bug.cgi?id=38711 in my firefox history. Maybe that, or one of the chain of bugs it's duplicated to, is related too?

Comment 89 Oliver Henshaw 2011-10-28 13:39:49 UTC
Disabling the Composite extension also appears to work with the unpatched qxl driver, i.e.:

xorg-x11-drv-qxl-0.0.21-5.fc16

from Fedora-16-Nightly-20111024.10-x86_64-Live-kde.iso


But this doesn't seem to make any difference with the cirrus driver.

Comment 90 Oliver Henshaw 2011-10-28 14:56:18 UTC
(In reply to comment #88)
> in runlevel 3 and starting
> graphical.target (and click through the firstboot dialog for some reason)
bug 749815

Comment 91 Kevin Kofler 2011-10-28 15:25:48 UTC
> Disabling the Composite extension also appears to work with the unpatched qxl
> driver, i.e.:
[snip]
> But this doesn't seem to make any difference with the cirrus driver.

More evidence that this isn't the same bug: qxl and cirrus happen to both be broken, but differently. :-( See bug 749647 for cirrus.

Comment 92 Adam Williamson 2011-10-28 15:54:19 UTC
kev: does it mean what remains of the problem in qxl could be a kwin bug rather than a qxl bug, though?

Comment 93 Kevin Kofler 2011-10-28 16:00:26 UTC
Doubtful. This happens with compositing in KWin DISABLED, so normally enabling or disabling Composite in X11 should have no effect whatsoever.

Now a good question would be whether it's possible to enable compositing in KWin. But OpenGL compositing (which is what is normally used when you enable "Desktop Effects" in KWin) is very likely to not work at all here (it's known not to work with software OpenGL), at best we could have XRender compositing with (slow) software XRender here.

Comment 94 Oliver Henshaw 2011-10-28 16:27:07 UTC
There's a difference between the Composite extension and kwin Desktop Effects, I think. According to https://bugs.kde.org/show_bug.cgi?id=173686 (linked from https://bugs.freedesktop.org/show_bug.cgi?id=22566 ), the Composite extension allows plasma to use ARGB visuals even when kwin is not in compositing mode.

There seems to have been a class of X bugs only triggered by ARGB visuals without a compositing window manager (if I've got my terminology right). This may be one of them.

Comment 95 Fedora Update System 2011-10-28 17:24:02 UTC
xorg-x11-drv-qxl-0.0.21-8.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/xorg-x11-drv-qxl-0.0.21-8.fc16

Comment 96 Adam Williamson 2011-10-28 17:49:15 UTC
the new fix is looking very good for me, can't cause any misrendering in KDE at all, got through a live boot, poke at various apps, install, and boot of installed system with no issues. yay.

Comment 97 Adam Williamson 2011-10-28 17:55:37 UTC
one smallish bug with the new fix: the K menu sometimes still doesn't paint correctly when you click on it and the space where it should paint only contains desktop background (i.e. it's not painting over an app). it comes up entirely invisible. as soon as an element needs redrawing it shows up - so first the tiny bit of the text entry widget where the cursor is sitting shows up, as the cursor blinks. then as you mouse over the menu, each element you mouse over paints in.

if you close the menu and then re-open it it paints fine.

seems 100% reproducible by running konqueror, maximizing it, minimizing it, then clicking the K menu. but the *second* time you open the K menu, it paints fine.

that's probably not serious enough to be a blocker.

Comment 98 Martin Kho 2011-10-28 18:25:31 UTC
Hi,

21-8 works for me too. I don't see the little bug Adam experiences. Really excellent work!

Thanks,

Martin Kho

Comment 99 Adam Williamson 2011-10-28 22:59:46 UTC
two fix confirmations -> VERIFIED.

There's a live image with the fix available at:

http://adamwill.fedorapeople.org/adamwkde-20111028-x86_64.iso (sha256sum
97a4d9429ec4e06cbb9d3ebeb010fd044ae66d49abeb58915714bb8f4cb49594 )

for testing.
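
In case it helps anyone checking their download, here is a minimal sketch that computes the SHA-256 of a file in chunks and compares it with the digest given above. The function names are made up for illustration; the digest constant is the one quoted in this comment.

```python
import hashlib

# Digest published above for adamwkde-20111028-x86_64.iso.
EXPECTED_SHA256 = "97a4d9429ec4e06cbb9d3ebeb010fd044ae66d49abeb58915714bb8f4cb49594"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected one (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use flat for an ISO-sized file; running sha256sum on the command line gives the same digest.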

Comment 100 Fedora Update System 2011-10-29 06:31:46 UTC
xorg-x11-drv-qxl-0.0.21-8.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 101 Jaroslav Reznik 2011-10-31 13:37:15 UTC
(In reply to comment #99)
> two fix confirmations -> VERIFIED.
> 
> There's a live image with the fix available at:
> 
> http://adamwill.fedorapeople.org/adamwkde-20111028-x86_64.iso (sha256sum
> 97a4d9429ec4e06cbb9d3ebeb010fd044ae66d49abeb58915714bb8f4cb49594 )
> 
> for testing.

Adam,
thanks for the ISO - I tried it and it's working: no visible glitches found, reasonable speed. I did experience a few virt-manager freezes, but they seem unrelated to the KDE issue, as the virtual machine keeps running and reconnecting fixes the issue (and it's F15...).

Comment 102 Adam Williamson 2011-11-08 04:35:54 UTC