Bug 230524
Description
Ray Van Dolson
2007-03-01 06:54:40 UTC
Created attachment 148993 [details]
Xorg log with DRI enabled.
Created attachment 148994 [details]
lspci output
Created attachment 148995 [details]
Base xorg.conf
Maybe similar to bug 230025. However, 230025 is for rawhide.
Comment on attachment 148993 [details]
Xorg log with DRI enabled.
This is a log from a "freeze" (note that DRI is enabled)
Created attachment 148996 [details]
Xorg log with DRI disabled
X starts successfully in this case (alias radeon off in modprobe.conf)
So, this is expected behavior? Any reason why this was closed?
This is fairly odd. It looks like fi1236_drv.so is hanging the box; no idea why that would be related to DRM though. Can you start successfully with DRI enabled if you move /usr/lib/xorg/modules/multimedia/fi1236_drv.so to /tmp?
Also, try this. Boot the machine in runlevel 3 with DRI enabled and the fi1236 driver in the normal place (not moved out to /tmp). ssh in from another machine and run 'X -verbose', and see if the last line of the output is different from how the log you gave ends. I suspect the module in question is just buggy and is trying to call a nonexistent function, which is instantly fatal.
Moving fi1236_drv.so does not enable me to start X with DRI enabled. I'll attach the resulting Xorg.log from that effort and also the Xorg.log which resulted from starting X from a remote console with X -verbose and the fi1236_drv.so in place.
Created attachment 149431 [details]
Xorg log; DRI enabled; fi1236_drv loaded; result of X -verbose
Created attachment 149432 [details]
Xorg log; DRI enabled; fi1236_drv NOT loaded
I noticed that when starting X on the two previous attempts (with DRI enabled), I was able to remotely ssh into the box while X was "hung" (screen = blank/black). Could not attach strace to the X process. lsof did not show anything particularly useful for the X process and neither did /proc/pid_of_X/fd. Could not kill the X process with either kill or kill -9.
Could you attach gdb to the hung X process? If not, it would be useful to get the output of 'ps awxl' for the X server process itself. It's probably hung in a DRM ioctl for some reason.
I am not able to attach gdb to the hung X process. I run gdb /usr/bin/X <pid> and it does start up, but cannot attach to the process (just sits there hanging). I have to ctrl-z out and reboot the machine. I'll attach a ps awxl.
In addition, I tried running gdb /usr/bin/X and then starting X from gdb. This didn't appear to give me any additional helpful information, but I'll attach what I captured from the screen. Once the server locks up, there's no way to detach to get a backtrace unless you can tell me some gdb tricks to try.
I should note that this is with the -debug RPMs installed for the Xorg-server and the Xorg-ati-drv.
I also captured the strace -f output of /usr/bin/X (run as strace -f /usr/bin/X &> /tmp/strace.log). I will attach that as well in case it is useful.
Created attachment 149547 [details]
ps awxl
Result of ps awxl after starting X normally and then attempting to attach to it
with gdb /usr/bin/X <pidofX>
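Editorial sketch, not part of the original thread: a process that ignores kill -9 and cannot be attached to is typically in uninterruptible sleep, which shows up as a STAT value beginning with "D" in 'ps awxl' output. That would be consistent with the guess above that X is stuck in a DRM ioctl, though the attachment itself is not reproduced here. The helper below assumes the usual BSD long-format header (F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND); the function name is hypothetical.

```python
from typing import Dict, List


def find_uninterruptible(ps_output: str) -> List[Dict[str, str]]:
    """Return rows of `ps awxl` output whose STAT column starts with 'D'.

    Assumes the first non-empty line is the column header and that
    COMMAND is the last column (so it may contain spaces).
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    if "STAT" not in header:
        raise ValueError("no STAT column found; is this `ps awxl` output?")
    ncols = len(header)
    hung = []
    for line in lines[1:]:
        # Split into at most ncols fields so COMMAND keeps its arguments.
        fields = line.split(None, ncols - 1)
        if len(fields) < ncols:
            continue  # skip malformed or truncated rows
        row = dict(zip(header, fields))
        if row["STAT"].startswith("D"):
            hung.append(row)
    return hung
```

For any row it returns, the WCHAN field names the kernel function the process is sleeping in, which is the piece of information the request for 'ps awxl' was after.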
Created attachment 149548 [details]
Attempt to start X from within gdb.
Ran gdb /usr/bin/X and then 'c' from the gdb> prompt. This is stdout.
Created attachment 149549 [details]
Output of strace -f /usr/bin/X &> /tmp/strace.log
Created attachment 149550 [details]
Output of ps afxwl after attempted startup of X from strace
That seems to show us getting hung up on the xkbcomp fork. Which seems... unrelated? Really weird if so. With DRI enabled, try starting X from the command line as:
X -kb :0
If _that_ comes up then we know we can blame something in the xkbcomp fork going wacky. Otherwise we're back to looking for DRI bugs.
Nope, X still didn't come up. Got some interesting strace output from it though. Will upload both that and the Xorg.log file.
Created attachment 149581 [details]
strace -f /usr/bin/X -kb :0 &> /tmp/strace3.log
Created attachment 149582 [details]
X -kb :0
Comment on attachment 149581 [details]
strace -f /usr/bin/X -kb :0 &> /tmp/strace3.log
I should note that it appeared to be just looping on FD 9 and I couldn't reboot
with 'reboot' so I hard reset. Maybe I should have tried attaching with gdb
first.
Anyone out there? :-)
This actually also happens using the Ubuntu Feisty Fawn Live CD. Guess I could file a bug with them. Wonder if it's the fault of Xorg or the radeon module in the kernel.
ajax, tried to find dups on b.f.o, but there well ... There are two leads: https://bugs.freedesktop.org/show_bug.cgi?id=5341 (bug looks pretty similar, but it is in a rather uncertain state, so I would hesitate to call it upstream of this bug), and then the only other things I found were not 100% the same (IMHO): https://bugs.freedesktop.org/show_bug.cgi?id=2581 https://bugs.freedesktop.org/show_bug.cgi?id=8243 One of the b.f.o bugs pointed (indirectly) to http://www.openoffice.org/issues/show_bug.cgi?id=49902 which says that it is actually https://bugs.freedesktop.org/show_bug.cgi?id=1204 (but that looks very different) and https://bugs.freedesktop.org/show_bug.cgi?id=1360 (even less likely). There is also https://launchpad.net/ubuntu/+source/xfree86/+bug/15219 which points to https://bugs.freedesktop.org/show_bug.cgi?id=3606 (looks different). It is weird, because all of these bugs (except b.f.o 5341) happen WHILE Xorg is running (mostly when glxgears is involved), but here we have a problem at startup of X.
Keep in mind guys that this is an Asus-branded 9800XT. Don't know what differences this would result in from a hardware perspective. Would be interesting to find someone else out there with an identical card to see if they have the same issue. There's one on eBay currently for $200. :-)
Have had an interesting breakthrough here. Due to a discussion on the dri-devel list here: http://marc.info/?t=117634797400002&r=1&w=2 I changed my aperture settings in BIOS from 64MB to 128MB. I can now start Xorg without having alias radeon off in modprobe.conf. I actually came across this suggestion also on a Gentoo ATI page. Still seems like this should work regardless of aperture settings.
Michel Danzer has suggested that the following patch may address the issue: http://gitweb.freedesktop.org/?p=mesa/drm.git;a=commitdiff;h=8ff026723cf170034173052a58c650c8c1f28c0b See this post: http://marc.info/?l=dri-devel&m=117635985514789&w=2 I don't even see a shared-core/radeon_cp.c in libdrm-2.3.0-1.fc6.src.rpm so I'm not sure how easily this could be back-ported.
Created attachment 152460 [details]
Kernel drm patch for AGP aperture size w/ Radeon
http://gitweb.freedesktop.org/?p=mesa/drm.git;a=commit;h=8ff026723cf170034173052a58c650c8c1f28c0b
With the above patch applied to my FC6 kernel (.2933), I can now start Xorg with the old 64MB aperture setting and DRI enabled. The following appears in dmesg:
[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 17
[drm] Initialized radeon 1.25.0 20060524 on minor 0
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm] Setting GART location based on new memory map
[drm] Can't use AGP base @0xf8000000, won't fit
[drm] Loading R300 Microcode
[drm] writeback test succeeded in 1 usecs
Note the "won't fit" error above. This will go away if I raise my aperture setting back to 128MB, but at least the system doesn't hang. I don't know what the performance consequences of this are. Any chance of getting this backported into FC6 or FC7?
Reporter, as a protection against soon-to-come end of support for FC6, could you please confirm (and attach the logs) that you can reproduce this bug on F7 or on Rawhide? Thanks.
*** Bug 200718 has been marked as a duplicate of this bug. ***
I _believe_ the patch I mentioned above is actually in the F7 kernels. I thought I'd made a note of that here, but apparently not.
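Editorial sketch, not part of the original thread: for anyone checking whether a patched kernel is hitting the same condition, the "[drm] Can't use AGP base ... won't fit" line from the dmesg excerpt quoted earlier can be picked out mechanically. The snippet below only matches the message wording exactly as it appears in this report; other kernel versions may word it differently, and the function name is hypothetical.

```python
import re
from typing import List

# Matches the message exactly as quoted in this bug report.
AGP_WONT_FIT = re.compile(
    r"\[drm\] Can't use AGP base @(0x[0-9a-fA-F]+), won't fit"
)


def agp_bases_that_did_not_fit(dmesg_text: str) -> List[int]:
    """Return every AGP base address the DRM driver reported as not fitting."""
    return [int(m.group(1), 16) for m in AGP_WONT_FIT.finditer(dmesg_text)]
```

Feeding it the saved output of dmesg gives an empty list when the message is absent, which per the report is what one would expect after raising the BIOS aperture to 128MB.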
Michel Danzer applied this to the upstream kernel here: http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.22.y.git;a=commitdiff;h=80b2c386f3d8c3367533a8600b599f8686c9d386 back in Feb of this year. So the fix is present in any F7 kernel that includes this commit. You can close this out as far as I am concerned....
Closing per reporter's request.