Bug 500086 - X segfault on XO-1 system booted with rawhide-xo build 20090510
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: rawhide
Hardware: i586 Linux
Priority: low
Severity: medium
Assigned To: X/OpenGL Maintenance List
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: F11Blocker/F11FinalBlocker FedoraOnXO
Reported: 2009-05-10 16:16 EDT by Mikus Grinbergs
Modified: 2013-01-10 00:12 EST (History)
Doc Type: Bug Fix
Last Closed: 2009-05-12 18:13:34 EDT


Attachments
Output from most recent (automatic) attempt to start X. (15.58 KB, text/plain)
2009-05-10 16:16 EDT, Mikus Grinbergs
/var/log/messages (46.87 KB, text/plain)
2009-05-10 16:20 EDT, Mikus Grinbergs

Description Mikus Grinbergs 2009-05-10 16:16:57 EDT
Created attachment 343299 [details]
Output from most recent (automatic) attempt to start X.

Description of problem:  When I booted rawhide-xo 20090510 on an XO-1 system, the boot process stalled because it could not start X.  Prior to the stall, the text console output "flashed" repeatedly as the system attempted multiple times to start X.


Version-Release number of selected component (if applicable): 1.6.1


How reproducible:  Did not try with multiple XO-1s.


Steps to Reproduce:
1.  Copy "installation image" of ~cjb/rawhide-xo 20090510.img to NAND on XO-1 system (using 'copy-nand' at ok prompt).
2.  Boot XO (with 'check' button on front panel pressed).

  
Actual results:  Never saw any of the screen contents that use X to display.

Expected results:  Would see "user logon screen".


Additional info:  Am primarily creating this bug ticket in order to have a place to store log output from the system that did not successfully start X.
Comment 1 Mikus Grinbergs 2009-05-10 16:20:02 EDT
Created attachment 343300 [details]
/var/log/messages

For what it's worth - copy of 'messages' output of system on which X did not start.
Comment 2 Peter Robinson 2009-05-10 16:49:16 EDT
I'm seeing exactly the same thing on a build I did today as well. This is an X regression on the XO that has occurred in the last couple of days. I'm going to add this as a blocker for F11.
Comment 3 Chris Ball 2009-05-10 19:22:55 EDT
Next step is probably to bisect the last few xorg-x11-server-Xorg.i586 RPMs here:

http://koji.fedoraproject.org/koji/packageinfo?packageID=63

in order to know whether it's a server change, and which particular RPM introduced the crash.
Comment 4 Chris Ball 2009-05-10 21:18:46 EDT
> (II) Cannot locate a core pointer device.

This is surprising.  Could it be related to the segfault?

I tried disabling GLX by moving /usr/lib/dri/swrast_dri.so out of the way; we get the same segfault with miCreateScreenResources() in the trace.
Comment 5 Chris Ball 2009-05-11 00:59:37 EDT
Mikus points out that "Cannot locate a core pointer device." was present in previous working builds; we're using an xorg.conf entry instead.
Comment 6 Chris Ball 2009-05-11 20:37:16 EDT
Here's a proper gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x00458d54 in exaCreatePixmap (pScreen=0xa055a68, w=0, h=0, depth=16, 
    usage_hint=0) at exa.c:323
323			pExaPixmap->driverPriv = pExaScr->info->CreatePixmap2(pScreen, w, h, depth, usage_hint, bpp);
(gdb) bt
#0  0x00458d54 in exaCreatePixmap (pScreen=0xa055a68, w=0, h=0, depth=16, 
    usage_hint=0) at exa.c:323
#1  0x0811ba79 in miCreateScreenResources (pScreen=0xa055a68)
    at miscrinit.c:153
#2  0x00458177 in exaCreateScreenResources (pScreen=0xa055a68) at exa.c:716
#3  0x080e7221 in xf86CrtcCreateScreenResources (screen=0xa055a68)
    at xf86Crtc.c:698
#4  0x0806b983 in main (argc=1, argv=0xbfb79b44, envp=0xbfb79b4c) at main.c:326
Comment 7 Chris Ball 2009-05-11 21:03:44 EDT
Bisected.  xorg-x11-server-1.6.1-7.fc11 works, xorg-x11-server-1.6.1-8.fc11 (built by airlied) doesn't.

The changelog entry is:

* Thu Apr 23 2009 Dave Airlie <airlied@redhat.com> 1.6.1-8
- xserver-1.6.1-exa-create-pixmap2.patch - add support for tiling create pixmap hook - need to fix firefox on ati rs690 crashes

X maintainers, please revert or -- if there's time -- let me know how Geode can avoid this new code path, and I can make a quick Geode release.
Comment 8 Kyle McMartin 2009-05-11 22:17:28 EDT
As I've pointed out to cjb on irc, this function pointer should never be set on geode and should therefore be zero-filled (NULL), so this codepath shouldn't be executed. Looks like a bug in the geode driver. (Although, possibly the server is doing it, but that seems unlikely.)
Comment 9 Chris Ball 2009-05-12 01:17:31 EDT
Kyle worked it out: we were callocing the EXA struct using its sizeof() as of driver compile-time. When airlied made the struct larger, the new function pointer fell past the end of our allocation, so the server read garbage there and the pointer tested as valid. He has a patch to use exaDriverAlloc() for allocation instead. I'll try to merge and test it, release upstream geode, release a new geode driver RPM, and point people at it here for tagging into F11 final in a couple of hours.

Thanks for the excellent help, all.
Comment 10 Chris Ball 2009-05-12 02:31:08 EDT
I've made a new geode 2.11.2 release and confirmed that it works, but I have lost my ACL to xorg-x11-drv-geode CVS, I think due to losing provenpackager.  I've already put the tarball into new-sources; could someone else apply the following CVS patch and tag into F11, please?  Thanks!

cvs diff: Diffing .
Index: .cvsignore
===================================================================
RCS file: /cvs/pkgs/rpms/xorg-x11-drv-geode/devel/.cvsignore,v
retrieving revision 1.5
diff -u -r1.5 .cvsignore
--- .cvsignore	16 Feb 2009 21:23:50 -0000	1.5
+++ .cvsignore	12 May 2009 06:28:21 -0000
@@ -1 +1 @@
-xf86-video-geode-2.11.1.tar.bz2
+xf86-video-geode-2.11.2.tar.bz2
Index: sources
===================================================================
RCS file: /cvs/pkgs/rpms/xorg-x11-drv-geode/devel/sources,v
retrieving revision 1.7
diff -u -r1.7 sources
--- sources	16 Feb 2009 21:23:50 -0000	1.7
+++ sources	12 May 2009 06:28:21 -0000
@@ -1 +1 @@
-6e00dd248ac5de89ab4764954ea74a96  xf86-video-geode-2.11.1.tar.bz2
+4c652ecba772f705296b8e52d746857c  xf86-video-geode-2.11.2.tar.bz2
Index: xorg-x11-drv-geode.spec
===================================================================
RCS file: /cvs/pkgs/rpms/xorg-x11-drv-geode/devel/xorg-x11-drv-geode.spec,v
retrieving revision 1.9
diff -u -r1.9 xorg-x11-drv-geode.spec
--- xorg-x11-drv-geode.spec	26 Feb 2009 10:48:34 -0000	1.9
+++ xorg-x11-drv-geode.spec	12 May 2009 06:28:21 -0000
@@ -4,8 +4,8 @@
 
 Summary:   Xorg X11 AMD Geode video driver
 Name:      xorg-x11-drv-geode
-Version:   2.11.1
-Release:   2%{?dist}
+Version:   2.11.2
+Release:   1%{?dist}
 URL:       http://www.x.org/wiki/AMDGeodeDriver
 Source0:   http://xorg.freedesktop.org/releases/individual/driver/xf86-video-geode-%{version}.tar.bz2
 License:   MIT
@@ -60,6 +60,9 @@
 %{driverdir}/ztv_drv.so
 
 %changelog
+* Tue May 12 2009 Chris Ball <cjb@laptop.org> 2.11.2-1
+- fix crasher bug due to EXA ABI change: RHBZ #500086
+
 * Thu Feb 26 2009 Fedora Release Engineering <rel-eng@lists.fedoraproject.org> - 2.11.1-2
 - Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild
Comment 11 Chris Ball 2009-05-12 14:05:32 EDT
Kyle made the build, and I've filed a rel-eng ticket for inclusion in F11:

https://fedorahosted.org/rel-eng/ticket/1791
Comment 12 Matěj Cepl 2009-05-12 17:58:03 EDT
There is nothing to triage here.

Switching to ASSIGNED so that developers have responsibility to do whatever they want to do with it.
Comment 13 Chris Ball 2009-05-12 18:06:09 EDT
I think this can be closed -- the build with the fix has been tagged into f11-final.  If you'd like me to verify that the correct RPM makes it into the final image, go ahead and leave this open, else we can close it now.

Thanks.
