Bug 523905
Created attachment 361430 [details]
/var/log/messages
Created attachment 361432 [details]
startx output
Just crashed again. This time the stack trace is a little different:

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x3c) [0x80a3c8c]
1: /usr/bin/X (0x8048000+0x5f4b6) [0x80a74b6]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0x23240c]
3: /usr/bin/X (dixLookupPrivate+0x24) [0x8088f14]
4: /usr/bin/X (FreePicture+0x7f) [0x810f14f]
5: /usr/bin/X (FreeResource+0x112) [0x808c592]
6: /usr/bin/X (0x8048000+0xcef93) [0x8116f93]
7: /usr/bin/X (0x8048000+0xc9b44) [0x8111b44]
8: /usr/bin/X (0x8048000+0x26137) [0x806e137]
9: /usr/bin/X (0x8048000+0x1a885) [0x8062885]
10: /lib/libc.so.6 (__libc_start_main+0xe6) [0x3d0b36]
11: /usr/bin/X (0x8048000+0x1a471) [0x8062471]
Segmentation fault at address 0x150

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting

Created attachment 361448 [details]
xorg.conf
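
The backtrace in the comment above can be reduced to its named frames with a short script. This is a minimal Python sketch and not part of the original report: the names BACKTRACE, FRAME_RE, parse_frames and named_symbols are hypothetical, and the regular expression assumes the frame layout seen in that one backtrace. It only restates what the log already contains.

```python
import re

# Frame lines copied from the backtrace in the comment above.
BACKTRACE = """\
0: /usr/bin/X (xorg_backtrace+0x3c) [0x80a3c8c]
1: /usr/bin/X (0x8048000+0x5f4b6) [0x80a74b6]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0x23240c]
3: /usr/bin/X (dixLookupPrivate+0x24) [0x8088f14]
4: /usr/bin/X (FreePicture+0x7f) [0x810f14f]
5: /usr/bin/X (FreeResource+0x112) [0x808c592]
6: /usr/bin/X (0x8048000+0xcef93) [0x8116f93]
7: /usr/bin/X (0x8048000+0xc9b44) [0x8111b44]
8: /usr/bin/X (0x8048000+0x26137) [0x806e137]
9: /usr/bin/X (0x8048000+0x1a885) [0x8062885]
10: /lib/libc.so.6 (__libc_start_main+0xe6) [0x3d0b36]
11: /usr/bin/X (0x8048000+0x1a471) [0x8062471]
"""

# Assumed layout: "<index>: <module> (<symbol-or-base>+0x<offset>) [0x<address>]"
FRAME_RE = re.compile(
    r"^(\d+): (\S+) \((\w+)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]$"
)


def parse_frames(text):
    """Return one dict per line that matches FRAME_RE; other lines are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue
        index, module, base, offset, address = match.groups()
        frames.append({
            "index": int(index),
            "module": module,
            # A base that starts with "0x" is a load address: no symbol resolved.
            "symbol": None if base.startswith("0x") else base,
            "base": base,
            "offset": int(offset, 16),
            "address": int(address, 16),
        })
    return frames


def named_symbols(frames):
    """Return the resolved symbol names in stack order (innermost first)."""
    return [f["symbol"] for f in frames if f["symbol"] is not None]


if __name__ == "__main__":
    for name in named_symbols(parse_frames(BACKTRACE)):
        print(name)
```

Of the resolved frames, dixLookupPrivate, FreePicture and FreeResource sit directly below the signal handler frames, which matches the observation made later in this report that the crashes tend to happen while X is freeing resources.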
Created attachment 361579 [details]
stack trace with symbolic info
Created attachment 361618 [details]
Another stack trace with symbols
After this crash, the o/s was still working, and I could shut down gracefully.
Comment on attachment 361579 [details]
stack trace with symbolic info
After this crash (gdb_log.2214), the o/s had locked up hard, and I had to forcibly power off.
This is still happening. Current versions are:

kernel-2.6.31-33.fc12.i686
xorg-x11-server-Xorg-1.6.99.902-1.fc12.i686
xorg-x11-server-common-1.6.99.902-1.fc12.i686
xorg-x11-drv-nouveau-0.0.15-11.20090921gitdf94ebd.fc12.i686
libXrandr-1.3.0-3.fc12.i686
kdebase-4.3.1-2.fc12.i686
kdelibs-4.3.1-6.fc12.i686
qt-4.5.2-19.fc12.i686

New stack trace attached.

Created attachment 362162 [details]
X stack trace from 23 Sep 09, 1:31pm+10
Ben?

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Mmm, not a real idea. There's been bugfixes to both the X side and nouveau in recent times, though it may be best to wait a bit as there's more fixes to come. I'll update when they're available :)

I can tell you that it is still happening even with the latest released changes (yesterday's updates). I'll be watching out for your comments :) Thanks!

xorg-x11-server-Xorg-1.7.0-1.fc12.i686
xorg-x11-drv-nouveau-0.0.15-13.20090929gitdd8339f.fc12.i686
etc.

you could try this xorg-x11-drv-nouveau build:

http://koji.fedoraproject.org/koji/buildinfo?buildID=135644

and this kernel build:

http://koji.fedoraproject.org/koji/buildinfo?buildID=136674

and see how they behave.

Thanks Adam, I'll give that a try over the next day or two. However, if you look at the stack traces, most of the crashes seem to be caused when X is freeing resources, and not related to kernel or driver. I suspect this is a memory corruption in the user mode part of the X server.

Richard

I should also add, that over time I have come to recognise that this problem most often occurs when I close a window or app.

there's definitely some system-specific element to this, because I'm running nouveau on current Rawhide and it stays up happily for days at a time. Ben did say he wanted you to try the latest changes to kernel and nouveau, so that's why I pointed them out. they are not in Rawhide because of the beta freeze.

Created attachment 365187 [details]
gdb stack trace of X server after crash
I installed the requested kernel and nouveau drivers last Friday, but didn't really get around to using the system much until today.
Unfortunately, the same sort of crash is still occurring.
Please see attached gdb stack trace.
Thanks.

Ben?

btw, note that although this is still assigned to nouveau, adam jackson and dave airlie (two of our X server guys) are CCed on it, so if they think it's in the server and have any bright ideas, they'll be jumping in =)

Is there any chance of a build without optimisations? In case the stack trace reveals more. Also, the next time it happens, is there anything you want me to do in gdb? Anything you want printed out?

ben, those questions are for you :)

There still appear to be numerous issues in some EXA changes that happened recently, I suspect that to be the case here too, and am looking into it.

Thanks Ben. This issue has been happening to me for months, not days or even weeks. So I'm not sure it is something newly introduced. Then again...???

I spent a couple of days running X under valgrind, but unfortunately couldn't trigger the bug. And since it was so horribly slow, I've stopped using it that way. So sorry I couldn't add anything useful there.

Not sure if this is a related bug, but when changing virtual terminals (from text console to X), I got this segfault...

Program received signal SIGSEGV, Segmentation fault.
0x08089034 in privateExists (key=0x64d898, privates=0x18) at privates.c:79
79 return *key && *privates &&
Current language: auto
The current source language is "auto; currently c".
(gdb) bt
#0 0x08089034 in privateExists (key=0x64d898, privates=0x18) at privates.c:79
#1 dixLookupPrivate (key=0x64d898, privates=0x18) at privates.c:162
#2 0x00642298 in exaPolyFillRect (pDrawable=<value optimized out>, pGC=<value optimized out>, nrect=<value optimized out>, prect=<value optimized out>) at exa_accel.c:764
#3 0x0811c1c6 in damagePolyFillRect (pDrawable=<value optimized out>, pGC=<value optimized out>, nRects=<value optimized out>, pRects=<value optimized out>) at damage.c:1404
#4 0x0809be02 in miPaintWindow (pWin=<value optimized out>, prgn=<value optimized out>, what=<value optimized out>) at miexpose.c:670
#5 0x0809c198 in miWindowExposures (pWin=<value optimized out>, prgn=<value optimized out>, other_exposed=<value optimized out>) at miexpose.c:504
#6 0x0817b5ec in xf86XVWindowExposures (pWin=<value optimized out>, reg1=<value optimized out>, reg2=<value optimized out>) at xf86xv.c:1054
#7 0x081ac5e8 in miHandleValidateExposures (pWin=<value optimized out>) at miwindow.c:246
#8 0x08097a54 in MapWindow (pWin=<value optimized out>, client=<value optimized out>) at window.c:2658
#9 0x0806d829 in ProcMapWindow (client=<value optimized out>) at dispatch.c:843
#10 0x0806e187 in Dispatch () at dispatch.c:445
#11 0x08062875 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at main.c:285

Still happening, as of the latest (2009-11-04) published versions of everything, including Xorg 1.7.0-5.fc12.

I have had a couple of variants of the crash, and will attach stack dumps after this. But one interesting thing to note is that I can now more or less get the problem to occur at will.

Using Sun's JRE 1.6.0_16, in particular Java Web Start apps (for example, http://www.playclockwiser.com/clockwiser.jnlp), and running from the console like so:

/usr/java/jre1.6.0_16/bin/javaws ~/Download/clockwiser.jnlp

... is a pretty reliable way to cause X to crash ... particularly when the application is shut down. (Not every time, but very often).

NB: it is not just this app, but rather this is a convenient publicly available app that triggers the bug for me.

Stack traces etc to be attached next.

Created attachment 367388 [details]
X server stack trace
Created attachment 367389 [details]
Text output to tty7 after X stack trace in attachment #367388 [details]
Created attachment 367390 [details]
X server stack trace
This stack trace was taken when X died after shutting down a JWS app.
NB: there seems to be no driver specific involvement in this particular crash.
Created attachment 367391 [details]
startx output associated with X stack trace in attachment #367390 [details]
Created attachment 367392 [details]
X server stack trace
Yet another stack trace from an X crash when closing a JWS app.
NB: this is different from, but similar to, the other crashes involving exa.
Created attachment 367393 [details]
startx output associated with X stack trace in attachment #367392 [details]

I have a couple more stack traces to add. The reason these are notable is that:
a) all previous runs of X were with the xorg.conf attached to this bug. The first new stack trace comes from a run after I deleted the xorg.conf and let X autodetect everything. The desktop displayed correctly, but I could still induce a crash.
b) this run is important because it is with the vesa driver, not nouveau! It goes some way to showing this problem is a server issue, and not (only) a driver one.

Created attachment 367402 [details]
X server stack trace - vesa driver
Created attachment 367403 [details]
startx output associated with X stack trace in attachment #367402 [details]
Created attachment 367404 [details]
X server stack trace - no xorg.conf
Created attachment 367405 [details]
startx output associated with X stack trace in attachment #367404 [details]
Created attachment 367406 [details]
xorg.conf file associated with stack trace in attachment #367402 [details]

I can't believe it!!! I have updated to the latest X server 1.7.1-6 and things are stable! At least so far. I have tried my previous repeatable ways to get the X server to crash, and it hasn't!!!!

After many months of being told it's my graphics card, finally an X server update fixes this. Perhaps it's been a mixture of X server and driver problems. Whatever, it feels good to get some confidence back in my desktop.

I really really really hope that you seriously consider releasing 1.7.1 as part of FC12 final. I'm sure you will regret it otherwise.

I hope these celebrations aren't temporary. But I'll keep running X and gdb for now, just in case.

issues that are fixed in the server can still only manifest on some cards. 1.7.1-5 has already been tagged, that would be enough to fix your issue. -6's change is unrelated. let's close this one, then - re-open if the problems come back.

Then why did the problems occur with the vesa driver too?

presumably because the bug was in the server code. we never said the bug was in the driver code, ben only suggested at first that that might be the case, as a possibility. it's not important, the important thing is it's fixed...

Created attachment 361429 [details]
X server log file

Description of problem:
Every now and then (sometimes frequently, particularly, it seems, while playing audio), the X server will crash. Stack-trace in attached files. I can sometimes go a day without a crash, but typically have 3-10 crashes per day. The symptoms at the time of crash are:
* the screen completely freezes
* mouse freezes
* keyboard freezes (capslock etc don't respond, nor does vt switching)
* 9/10 times, the o/s is still running ... if I hit the power button, the o/s does a graceful shutdown. I haven't tried ssh'ing in. On rare occasions, the power-button won't be recognised either, and I am forced to reset the machine.

Version-Release number of selected component (if applicable):
Many versions (over the last few months of fc12). Current versions are:
kernel-2.6.31-14.fc12.i686
xorg-x11-server-Xorg-1.6.99.901-2.fc12.i686
xorg-x11-server-common-1.6.99.901-2.fc12.i686
xorg-x11-drv-nouveau-0.0.15-10.20090914git1b72020.fc12.i686
libXrandr-1.3.0-3.fc12.i686
kdebase-4.3.1-2.fc12.i686
kdelibs-4.3.1-3.fc12.i686
qt-4.5.2-18.fc12.i686

Current video card is an NV96. Running multi-headed.

How reproducible:
It happens randomly, but I can reasonably readily reproduce by playing music with Amarok, and opening or closing apps.

Steps to Reproduce:
1. n/a
2.
3.

Actual results:
X locks up, and sometimes the whole system freezes.

Expected results:
X works without lockups.

Additional info:
Please see attached log files.