Bug 500086
Summary: | X segfault on XO-1 system booted with rawhide-xo build 20090510
---|---
Product: | [Fedora] Fedora
Component: | xorg-x11-drv-ati
Version: | rawhide
Hardware: | i586
OS: | Linux
Status: | CLOSED RAWHIDE
Severity: | medium
Priority: | low
Reporter: | Mikus Grinbergs <mikus>
Assignee: | X/OpenGL Maintenance List <xgl-maint>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | chris-rhbugs, dcantrell, kmcmartin, mcepl, pbrobinson, sebastian, xgl-maint
Doc Type: | Bug Fix
Last Closed: | 2009-05-12 22:13:34 UTC
Bug Blocks: | 446452, 461806

Created attachment 343300 [details]
/var/log/messages
For what it's worth - copy of 'messages' output of system on which X did not start.
I'm seeing exactly the same thing on a build I did today as well. This is an X regression on the XO that has occurred in the last couple of days. I'm going to add this as a blocker for F11.

Next step is probably to bisect the last few xorg-x11-server-Xorg.i586 RPMs here:
http://koji.fedoraproject.org/koji/packageinfo?packageID=63
in order to know whether it's a server change, and which particular RPM introduced the crash.

> (II) Cannot locate a core pointer device.

This is surprising. Could it be related to the segfault?

I tried disabling GLX by moving /usr/lib/dri/swrast_dri.so out of the way; we get the same segfault with miCreateScreenResources() in the trace.
Mikus points out that "Cannot locate a core pointer device." was present in previous working builds; we're using an xorg.conf entry instead.

Here's a proper gdb backtrace:

    Program received signal SIGSEGV, Segmentation fault.
    0x00458d54 in exaCreatePixmap (pScreen=0xa055a68, w=0, h=0, depth=16, usage_hint=0) at exa.c:323
    323 pExaPixmap->driverPriv = pExaScr->info->CreatePixmap2(pScreen, w, h, depth, usage_hint, bpp);
    (gdb) bt
    #0 0x00458d54 in exaCreatePixmap (pScreen=0xa055a68, w=0, h=0, depth=16, usage_hint=0) at exa.c:323
    #1 0x0811ba79 in miCreateScreenResources (pScreen=0xa055a68) at miscrinit.c:153
    #2 0x00458177 in exaCreateScreenResources (pScreen=0xa055a68) at exa.c:716
    #3 0x080e7221 in xf86CrtcCreateScreenResources (screen=0xa055a68) at xf86Crtc.c:698
    #4 0x0806b983 in main (argc=1, argv=0xbfb79b44, envp=0xbfb79b4c) at main.c:326

Bisected. xorg-x11-server-1.6.1-7.fc11 works, xorg-x11-server-1.6.1-8.fc11 (built by airlied) doesn't. The changelog entry is:

    * Thu Apr 23 2009 Dave Airlie <airlied> 1.6.1-8
    - xserver-1.6.1-exa-create-pixmap2.patch
    - add support for tiling create pixmap hook
    - need to fix firefox on ati rs690 crashes

X maintainers, please revert or -- if there's time -- let me know how Geode can avoid this new code path, and I can make a quick Geode release.

As I've pointed out to cjb on IRC, this function pointer should be uninitialized on geode, and zero filled, so this codepath shouldn't be executed. Looks like a driver bug in the geode driver. (Although, possibly the server is doing it, but that seems unlikely.)

Kyle worked it out; we were calloc()ing the EXA struct with a sizeof() fixed at driver compile time, so when airlied made the struct larger, the bytes past the old size were never zeroed and contained garbage, which made the new function pointer test as valid. He has a patch to use exaDriverAlloc() for allocation instead. I'll try to merge and test it, make an upstream geode release, build a new geode driver RPM, and point people to it here for tagging into F11 final in a couple of hours.

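The failure mode described above can be sketched in isolation. This is only an illustration under stated assumptions: `exa_rec_old`, `exa_rec_new`, and `new_hook_looks_set()` are hypothetical stand-ins (the real EXA driver record has many more fields), stale heap contents are simulated by filling a buffer with 0xAA, and a null function pointer is assumed to be all-zero bytes, as on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real EXA driver record. */
typedef void *(*create_pixmap_fn)(int w, int h, int depth);

struct exa_rec_old {                  /* layout the driver was compiled against */
    int exa_major, exa_minor;
    create_pixmap_fn CreatePixmap;
};

struct exa_rec_new {                  /* layout used by the newer server */
    int exa_major, exa_minor;
    create_pixmap_fn CreatePixmap;
    create_pixmap_fn CreatePixmap2;   /* hook appended by the server update */
};

enum alloc_style { DRIVER_SIZEOF, SERVER_ALLOC };

/* Returns 1 if the server would see CreatePixmap2 as "set", else 0. */
int new_hook_looks_set(enum alloc_style style)
{
    unsigned char heap[sizeof(struct exa_rec_new)];
    unsigned char zero[sizeof(create_pixmap_fn)] = {0};

    /* The appended member must lie beyond what the old layout covers. */
    assert(offsetof(struct exa_rec_new, CreatePixmap2) >=
           sizeof(struct exa_rec_old));

    memset(heap, 0xAA, sizeof heap);  /* simulate stale heap contents */

    if (style == DRIVER_SIZEOF)
        /* Driver zeroes only the size it knew at its own compile time. */
        memset(heap, 0, sizeof(struct exa_rec_old));
    else
        /* Server-side allocation zeroes the full, current size. */
        memset(heap, 0, sizeof(struct exa_rec_new));

    /* Inspect the bytes where the server expects the new hook. */
    return memcmp(heap + offsetof(struct exa_rec_new, CreatePixmap2),
                  zero, sizeof zero) != 0;
}
```

With `DRIVER_SIZEOF` the bytes of the new hook are left as garbage, so a non-null check passes and the server would call through a bogus pointer; with `SERVER_ALLOC` the hook reads as null and the new code path is skipped. Letting the server allocate the record is what keeps the driver independent of the struct's size.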
Thanks for the excellent help, all. I've made a new geode 2.11.2 release and tested it working, but have lost my ACL to xorg-x11-drv-geode CVS, I think due to losing provenpackager. I've already put the tarball into new-sources; could someone else apply the following CVS patch and tag into F11, please? Thanks!

    cvs diff: Diffing .
    Index: .cvsignore
    ===================================================================
    RCS file: /cvs/pkgs/rpms/xorg-x11-drv-geode/devel/.cvsignore,v
    retrieving revision 1.5
    diff -u -r1.5 .cvsignore
    --- .cvsignore 16 Feb 2009 21:23:50 -0000 1.5
    +++ .cvsignore 12 May 2009 06:28:21 -0000
    @@ -1 +1 @@
    -xf86-video-geode-2.11.1.tar.bz2
    +xf86-video-geode-2.11.2.tar.bz2
    Index: sources
    ===================================================================
    RCS file: /cvs/pkgs/rpms/xorg-x11-drv-geode/devel/sources,v
    retrieving revision 1.7
    diff -u -r1.7 sources
    --- sources 16 Feb 2009 21:23:50 -0000 1.7
    +++ sources 12 May 2009 06:28:21 -0000
    @@ -1 +1 @@
    -6e00dd248ac5de89ab4764954ea74a96 xf86-video-geode-2.11.1.tar.bz2
    +4c652ecba772f705296b8e52d746857c xf86-video-geode-2.11.2.tar.bz2
    Index: xorg-x11-drv-geode.spec
    ===================================================================
    RCS file: /cvs/pkgs/rpms/xorg-x11-drv-geode/devel/xorg-x11-drv-geode.spec,v
    retrieving revision 1.9
    diff -u -r1.9 xorg-x11-drv-geode.spec
    --- xorg-x11-drv-geode.spec 26 Feb 2009 10:48:34 -0000 1.9
    +++ xorg-x11-drv-geode.spec 12 May 2009 06:28:21 -0000
    @@ -4,8 +4,8 @@
     Summary: Xorg X11 AMD Geode video driver
     Name: xorg-x11-drv-geode
    -Version: 2.11.1
    -Release: 2%{?dist}
    +Version: 2.11.2
    +Release: 1%{?dist}
     URL: http://www.x.org/wiki/AMDGeodeDriver
     Source0: http://xorg.freedesktop.org/releases/individual/driver/xf86-video-geode-%{version}.tar.bz2
     License: MIT
    @@ -60,6 +60,9 @@
     %{driverdir}/ztv_drv.so
     %changelog
    +* Tue May 12 2009 Chris Ball <cjb> 2.11.2-1
    +- fix crasher bug due to EXA ABI change: RHBZ #500086
    +
     * Thu Feb 26 2009 Fedora Release Engineering <rel-eng.org> - 2.11.1-2
     - Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild

Kyle made the build, I've filed a rel-eng ticket for inclusion in F11:
https://fedorahosted.org/rel-eng/ticket/1791

There is nothing to triage here. Switching to ASSIGNED so that developers have responsibility to do whatever they want to do with it.

I think this can be closed -- the build with the fix has been tagged into f11-final. If you'd like me to verify that the correct RPM makes it into the final image, go ahead and leave this open, else we can close it now. Thanks.

Created attachment 343299 [details]
Output from most recent (automatic) attempt to start X.

Description of problem:
When I booted rawhide-xo 20090510 on an XO-1 system, the boot process stalled because it could not start X. Prior to the stall, the text console output "flashed" repeatedly as the system attempted multiple times to start X.

Version-Release number of selected component (if applicable):
1.6.1

How reproducible:
Did not try with multiple XO-1s.

Steps to Reproduce:
1. Copy "installation image" of ~cjb/rawhide-xo 20090510.img to NAND on XO-1 system (using 'copy-nand' at ok prompt).
2. Boot XO (with 'check' button on front panel pressed).

Actual results:
Never saw any of the screen contents that use X to display.

Expected results:
Would see "user logon screen".

Additional info:
Am primarily creating this bug ticket in order to have a place to store log output from the system that did not successfully start X.