Description of problem:
With default Fedora 10 settings (mode setting enabled, no xorg.conf), Xorg gets stuck after a few minutes of working in Gnome. The machine is not dead; I can ssh into it. top shows the CPUs are idle. I got a backtrace of Xorg with gdb:

...
Loaded symbols for /lib64/libnss_files.so.2
pixman_fill_mmx (bits=<value optimized out>, stride=192, bpp=<value optimized out>, x=<value optimized out>, y=<value optimized out>, width=<value optimized out>, height=17, xor=0) at pixman-mmx.c:1799
1799 __asm__ (
Missing separate debuginfos, use: debuginfo-install expat-2.0.1-5.x86_64 freetype-2.3.7-1.fc10.x86_64 libcap-2.10-2.fc10.x86_64 mesa-dri-drivers-7.2-0.13.fc10.x86_64 xorg-x11-drv-ati-6.9.0-55.fc10.x86_64 xorg-x11-drv-evdev-2.0.7-3.fc10.x86_64 xorg-x11-drv-fbdev-0.3.1-7.fc9.x86_64 xorg-x11-drv-synaptics-0.15.2-1.fc10.x86_64 xorg-x11-drv-vesa-2.0.0-1.fc10.x86_64 zlib-1.2.3-18.fc9.x86_64
(gdb) bt
#0 pixman_fill_mmx (bits=<value optimized out>, stride=192, bpp=<value optimized out>, x=<value optimized out>, y=<value optimized out>, width=<value optimized out>, height=17, xor=0) at pixman-mmx.c:1799
#1 0x000000302322640d in pixman_fill (bits=0x0, stride=0, bpp=-80224256, x=0, y=-80224256, width=0, height=18, xor=0) at pixman-utils.c:175
#2 0x000000000103fcd2 in fbFill (pDrawable=<value optimized out>, pGC=<value optimized out>, x=0, y=0, width=36, height=18) at fbfill.c:48
#3 0x000000000103ff36 in fbPolyFillRect (pDrawable=0x146e3e0, pGC=0x1429590, nrect=0, prect=<value optimized out>) at fbfillrect.c:77
#4 0x00007fcc02320b74 in ExaCheckPolyFillRect (pDrawable=0x146e3e0, pGC=0x1429590, nrect=1, prect=0x151dad0) at exa_unaccel.c:229
#5 0x00007fcc02319954 in exaPolyFillRect (pDrawable=0x146e3e0, pGC=0x1429590, nrect=1, prect=0x151dad0) at exa_accel.c:778
#6 0x0000000000529b06 in damagePolyFillRect (pDrawable=0x146e3e0, pGC=0x1429590, nRects=1, pRects=0x151dad0) at damage.c:1337
#7 0x0000000000443b76 in ProcPolyFillRectangle (client=0x151c000) at dispatch.c:1795
#8 0x00000000004468d4 in Dispatch () at dispatch.c:454
#9 0x000000000042cd1d in main (argc=9, argv=0x7fff0a7eb7a8, envp=<value optimized out>) at main.c:441

Notice the funny values of the arguments to pixman_fill in frame #1 (bits=0x0, stride=0, bpp=-80224256).

The hardware is: ATI Technologies Inc RS690M [Radeon X1200 Series].

Version-Release number of selected component (if applicable):
kernel-2.6.27.5-120.fc10.x86_64
xorg-x11-server-Xorg-1.5.3-5.fc10.x86_64
xorg-x11-drv-ati-6.9.0-55.fc10.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot with kernel modesetting, default X configuration (no xorg.conf).
2. Run gtkperf with lots of iterations (10000).

Actual results:
In less than a minute gtkperf stops drawing. Everything else in X is stuck too. The mouse pointer is stuck for half a minute or so, then starts reacting to mouse movement again. But the rest of Xorg does not resume working.

Expected results:
X should not hang.
Created attachment 324314 [details] Xorg.0.log (includes a backtrace)
Created attachment 324315 [details] dmesg
Lovely!

Backtrace:
0: /usr/bin/Xorg(xorg_backtrace+0x26) [0x4e7a26]
1: /usr/bin/Xorg(mieqEnqueue+0x291) [0x4c8591]
2: /usr/bin/Xorg(xf86PostMotionEventP+0xc4) [0x491494]
3: /usr/bin/Xorg(xf86PostMotionEvent+0xa9) [0x491669]
4: /usr/lib64/xorg/modules/input//synaptics_drv.so [0x7fcbfbc12832]
5: /usr/lib64/xorg/modules/input//synaptics_drv.so [0x7fcbfbc14de2]
6: /usr/bin/Xorg [0x47a765]
7: /usr/bin/Xorg [0x46b307]
8: /lib64/libc.so.6 [0x3554432f60]
9: /usr/lib64/libpixman-1.so.0 [0x3023229d10]
10: /usr/lib64/libpixman-1.so.0(pixman_fill+0x3d) [0x302322640d]
11: /usr/lib64/xorg/modules//libfb.so(fbFill+0x482) [0x103fcd2]
12: /usr/lib64/xorg/modules//libfb.so(fbPolyFillRect+0x1c6) [0x103ff36]
13: /usr/lib64/xorg/modules//libexa.so(ExaCheckPolyFillRect+0x44) [0x7fcc02320b74]
14: /usr/lib64/xorg/modules//libexa.so [0x7fcc02319954]
15: /usr/bin/Xorg [0x529b06]
16: /usr/bin/Xorg(ProcPolyFillRectangle+0xe6) [0x443b76]
17: /usr/bin/Xorg(Dispatch+0x364) [0x4468d4]
18: /usr/bin/Xorg(main+0x45d) [0x42cd1d]
19: /lib64/libc.so.6(__libc_start_main+0xe6) [0x355441e546]
20: /usr/bin/Xorg [0x42c0f9]
I've just kicked off a new kernel build in koji, kernel-2.6.27.5-123.fc10. It'll appear here when finished:
http://kojipkgs.fedoraproject.org/packages/kernel/2.6.27.5/123.fc10/

Can you install it and see if it helps?
kernel-2.6.27.5-123.fc10.x86_64 made no difference. I can still easily reproduce the hang, the backtrace is the same and I see no new messages in dmesg or Xorg.0.log.

Then I upgraded to xorg-x11-drv-ati-6.9.0-56.fc10. The test hangs again, but the backtrace is now different:

(gdb) bt
#0 0x00000035544ddff7 in ioctl () from /lib64/libc.so.6
#1 0x000000356d603023 in drmIoctl (fd=9, request=3222299750, arg=0x7fff6b29c520) at xf86drm.c:186
#2 0x000000356d60326c in drmCommandWriteRead (fd=9, drmCommandIndex=<value optimized out>, data=0x7fff6b29c520, size=18446744073709551615) at xf86drm.c:2342
#3 0x0000000007b23116 in RADEONCSFlushIndirect (pScrn=0x1590780, discard=<value optimized out>) at radeon_accel.c:629
#4 0x0000000007b2300b in RADEONCPFlushIndirect (pScrn=0x1590780, discard=1) at radeon_accel.c:794
#5 0x0000000007b729ea in R300TextureSetupCP (pPict=0x183f0a0, pPix=0x1828810, unit=1) at radeon_exa_render.c:1238
#6 0x0000000007b73226 in R300PrepareCompositeCP (op=3, pSrcPicture=0x1833ec0, pMaskPicture=0x183f0a0, pDstPicture=0x1837620, pSrc=0x1833cf0, pMask=0x1828810, pDst=0x18339e0) at radeon_exa_render.c:1447
#7 0x0000000005490932 in exaTryDriverComposite (op=3 '\003', pSrc=0x1833ec0, pMask=0x183f0a0, pDst=0x1837620, xSrc=3, ySrc=5, xMask=<value optimized out>, yMask=<value optimized out>, xDst=<value optimized out>, yDst=<value optimized out>, width=<value optimized out>, height=<value optimized out>) at exa_render.c:671
#8 0x00000000054912d5 in exaComposite (op=3 '\003', pSrc=0x1833ec0, pMask=0x183f0a0, pDst=0x1837620, xSrc=3, ySrc=5, xMask=0, yMask=0, xDst=3, yDst=5, width=11, height=11) at exa_render.c:936
#9 0x00000000005291b8 in damageComposite (op=9 '\t', pSrc=0x1833ec0, pMask=0x183f0a0, pDst=0x1837620, xSrc=3, ySrc=5, xMask=0, yMask=<value optimized out>, xDst=<value optimized out>, yDst=<value optimized out>, width=<value optimized out>, height=<value optimized out>) at damage.c:576
#10 0x0000000005490554 in exaTrapezoids (op=9 '\t', pSrc=0x1833ec0, pDst=0x1837620, maskFormat=0x159c158, xSrc=3, ySrc=5, ntrap=0, traps=0x17e1530) at exa_render.c:1122
#11 0x000000000051a83d in ProcRenderTrapezoids (client=0x17f0dc0) at render.c:791
#12 0x00000000004468d4 in Dispatch () at dispatch.c:454
#13 0x000000000042cd1d in main (argc=9, argv=0x7fff6b29cd48, envp=<value optimized out>) at main.c:441
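For readers trying to work out which DRM call the server is blocked in, the request=3222299750 value passed to drmIoctl in frame #1 of the backtrace above can be decoded by hand. The following is a minimal sketch, assuming the generic Linux ioctl number layout used on x86-64 (8-bit command number, 8-bit type, 14-bit size, 2-bit direction) and DRM_COMMAND_BASE = 0x40 from drm.h. The helper name decode_ioctl is hypothetical and is not part of libdrm.

```python
# Sketch: decode a Linux ioctl request number as seen in a gdb backtrace.
# Assumes the generic _IOC layout used on x86-64:
#   bits 0-7 command number, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
DRM_COMMAND_BASE = 0x40  # drm.h: driver-private DRM commands start at this number

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(request):
    """Split an ioctl request number into its fields (hypothetical helper)."""
    nr = request & 0xFF
    return {
        "nr": nr,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "direction": DIRECTIONS[(request >> 30) & 0x3],
        # Index relative to DRM_COMMAND_BASE, i.e. the drmCommandIndex that
        # gdb reported as optimized out in frame #2.
        "driver_command": nr - DRM_COMMAND_BASE,
    }


info = decode_ioctl(3222299750)  # value from frame #1 (drmIoctl)
print(hex(3222299750), info)
```

Under these assumptions the request is a read/write DRM ioctl (type 'd') with a 16-byte argument and driver command index 0x26. Later mainline radeon_drm.h names index 0x26 DRM_RADEON_CS, which would be consistent with the RADEONCSFlushIndirect frame, but whether this patched Fedora 10 kernel used the same numbering is an assumption. The size=18446744073709551615 shown in frame #2 equals 2**64 - 1 and is probably a debugger artifact of optimized code rather than the real argument, since the request number itself encodes a 16-byte payload.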
On another try I received a backtrace almost identical to the one in https://bugzilla.redhat.com/show_bug.cgi?id=472314#c13. The two bugs may be duplicates.

(gdb) bt
#0 0x00000035544ddff7 in ioctl () from /lib64/libc.so.6
#1 0x000000356d603023 in drmIoctl (fd=9, request=3222299750, arg=0x7fff7eb30ee0) at xf86drm.c:186
#2 0x000000356d60326c in drmCommandWriteRead (fd=9, drmCommandIndex=<value optimized out>, data=0x7fff7eb30ee0, size=18446744073709551615) at xf86drm.c:2342
#3 0x000000000755c116 in RADEONCSFlushIndirect (pScrn=0x1de9600, discard=<value optimized out>) at radeon_accel.c:629
#4 0x000000000755c31d in RADEONCSReleaseIndirect (pScrn=0x9) at radeon_accel.c:703
#5 0x000000000755c3fd in RADEONCPReleaseIndirect (pScrn=0x9) at radeon_accel.c:833
#6 0x00000000075a20e8 in RADEONLeaveServer () at radeon_dri.c:560
#7 RADEONDRISwapContext (pScreen=<value optimized out>, syncType=<value optimized out>, oldContextType=<value optimized out>, oldContext=0xffffffffffffffff, newContextType=<value optimized out>, newContext=0x1de8660) at radeon_dri.c:585
#8 0x0000000005440809 in DRIDoBlockHandler (screenNum=<value optimized out>, blockData=<value optimized out>, pTimeout=<value optimized out>, pReadmask=<value optimized out>) at dri.c:1655
#9 0x000000000543f8e6 in DRIBlockHandler (blockData=0x0, pTimeout=0x7fff7eb31268, pReadmask=0x7dc7c0) at dri.c:1622
#10 0x000000000044a355 in BlockHandler (pTimeout=0x7fff7eb31268, pReadmask=0x7dc7c0) at dixutils.c:387
#11 0x00000000004e4eb1 in WaitForSomething (pClientsReady=0x1eeee20) at WaitFor.c:223
#12 0x00000000004465ef in Dispatch () at dispatch.c:375
#13 0x000000000042cd1d in main (argc=9, argv=0x7fff7eb31438, envp=<value optimized out>) at main.c:441
Created attachment 324503 [details]
workaround patch

The first version in which the problem appears for me is xorg-x11-drv-ati-6.9.0-41. The attached patch reverts the change made in that version. A scratch build is here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=948177

I don't know what the change does, but I can't reproduce the hang with this patch.
Weird. I was getting a hang in gtkperf here, but the kernel I generated fixed it, and gtkperf completes fine. I'll have a look to see why the workaround helps avoid the weird case.
I upgraded to xorg-x11-drv-ati-6.9.0-57.fc10.x86_64. The hang is still reproducible, but the best way to reproduce it has changed slightly. With previous versions, running the GtkCheckButton test with 10000 iterations was very likely to cause the hang. With -57 I haven't been able to reproduce the hang during this test. Instead, I can now use the GtkEntry test with 10000 iterations to cause the hang. Alternatively, I run all tests with the default 100 iterations; the hang is then most likely to occur during the GtkTextView scroll test.
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
xorg-x11-drv-ati-6.9.0-58.fc10.x86_64 looks good! I can't reproduce this hang anymore.
(In reply to comment #11)
> xorg-x11-drv-ati-6.9.0-58.fc10.x86_64 looks good! I can't reproduce this hang
> anymore.

Same for me: I can't reproduce this freeze with -58, even when:
- scrolling very quickly in Firefox
- playing huge 3D games
- creating some rapidly moving text in a terminal

Good news!
cool, closing per reporter's comment 11