Since the upgrade to FC13, I've been getting hangs now and then. I finally correlated them with Firefox opening very large pages, so it seems like a buffer check is missing or something similar. I never had this problem on FC12. At first it seems like only Xorg is affected: the mouse pointer still works and the machine is reachable over the network. Xorg is eating 100% CPU, and killing it (SIGKILL is needed) results in the entire machine hanging. The nouveau kernel module logs this whenever the machine gets hosed:
Jun 11 08:12:41 mjolnir kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
Hardware is a Quadro NVS 140M. Please have a look at this soon, as it makes the machine very difficult to use: it might hang on you at any moment. :/
Thanks for the bug report. We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue. Please add drm.debug=0x04 to the kernel command line, restart the computer, wait until Xorg freezes, and collect the following via ssh:
* your X server config file (/etc/X11/xorg.conf, if available),
* X server log file (/var/log/Xorg.*.log),
* output of the dmesg command, and
* system log (/var/log/messages)
and attach them to the bug report as individual uncompressed file attachments using the bugzilla file attachment link above. We will review this issue again once you've had a chance to attach this information. Thanks in advance.
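The collection step requested above could be scripted along these lines. This is only a sketch: the function name `collect_diagnostics`, the `dest` directory and the `dmesg.txt` output name are made up for illustration, and reading /var/log/messages will normally require root.

```python
import glob
import shutil
import subprocess
from pathlib import Path

# Files requested in the comment above; the Xorg log is a glob pattern.
DEFAULT_PATTERNS = [
    "/etc/X11/xorg.conf",
    "/var/log/Xorg.*.log",
    "/var/log/messages",
]


def collect_diagnostics(dest, patterns=DEFAULT_PATTERNS, run_dmesg=True):
    """Copy every existing file matching `patterns` into `dest`.

    Missing files (for example an absent xorg.conf) are skipped silently.
    Returns the list of files written, as Path objects.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    collected = []
    for pattern in patterns:
        for src in sorted(glob.glob(pattern)):
            target = dest / Path(src).name
            shutil.copy(src, target)
            collected.append(target)
    if run_dmesg:
        # Save the kernel ring buffer as an uncompressed text file.
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
        target = dest / "dmesg.txt"
        target.write_text(result.stdout)
        collected.append(target)
    return collected
```

Run over ssh while Xorg is frozen, something like `collect_diagnostics("/tmp/nouveau-bug")` would leave the individual files ready to attach one by one.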
Created attachment 423342 [details] xorg.conf
Created attachment 423343 [details] Xorg.0.log
Created attachment 423345 [details] dmesg
Created attachment 423346 [details] messages
Backtrace:
[ 214.084] 0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x49e708]
[ 214.084] 1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x49e0b4]
[ 214.084] 2: /usr/bin/Xorg (xf86PostButtonEventP+0xcf) [0x477a3f]
[ 214.085] 3: /usr/bin/Xorg (xf86PostButtonEvent+0xbe) [0x477b6e]
[ 214.085] 4: /usr/lib64/xorg/modules/input/synaptics_drv.so (0x7ff04ec47000+0x3a12) [0x7ff04ec4aa12]
[ 214.085] 5: /usr/lib64/xorg/modules/input/synaptics_drv.so (0x7ff04ec47000+0x5cc8) [0x7ff04ec4ccc8]
[ 214.085] 6: /usr/bin/Xorg (0x400000+0x6aae7) [0x46aae7]
[ 214.085] 7: /usr/bin/Xorg (0x400000+0x117b43) [0x517b43]
[ 214.085] 8: /lib64/libc.so.6 (0x326d200000+0x32a20) [0x326d232a20]
[ 214.085] 9: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7ff04f07c000+0x1f458) [0x7ff04f09b458]
[ 214.085] 10: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7ff04f07c000+0x2093d) [0x7ff04f09c93d]
[ 214.085] 11: /usr/lib64/xorg/modules/libexa.so (0x7ff04e5e7000+0x9080) [0x7ff04e5f0080]
[ 214.085] 12: /usr/lib64/xorg/modules/libexa.so (0x7ff04e5e7000+0xeac8) [0x7ff04e5f5ac8]
[ 214.085] 13: /usr/bin/Xorg (0x400000+0xd21e0) [0x4d21e0]
[ 214.085] 14: /usr/bin/Xorg (0x400000+0xcb91e) [0x4cb91e]
[ 214.085] 15: /usr/bin/Xorg (0x400000+0x2c32c) [0x42c32c]
[ 214.085] 16: /usr/bin/Xorg (0x400000+0x219ca) [0x4219ca]
[ 214.085] 17: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x326d21ec5d]
[ 214.085] 18: /usr/bin/Xorg (0x400000+0x21579) [0x421579]
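For anyone wanting to work with a backtrace like this programmatically, here is a rough sketch that splits each frame into module path, symbol (when Xorg resolved one), offset and absolute address. The regular expression is inferred only from the frames shown in this comment, and the names `parse_backtrace` and `modules_in_trace` are made up for illustration; other Xorg versions may format frames differently.

```python
import re

# Matches frames such as:
#   "1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x49e0b4]"
#   "9: /usr/lib64/.../nouveau_drv.so (0x7ff04f07c000+0x1f458) [0x7ff04f09b458]"
FRAME_RE = re.compile(
    r"(?P<index>\d+): (?P<path>\S+) "
    r"\((?P<base>[^+()\s]+)\+(?P<offset>0x[0-9a-fA-F]+)\) "
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_backtrace(text):
    """Return one dict per frame found in `text`, in order of appearance."""
    frames = []
    for match in FRAME_RE.finditer(text):
        base = match.group("base")
        frames.append({
            "index": int(match.group("index")),
            "path": match.group("path"),
            # A hex base means the frame was not resolved to a symbol name.
            "symbol": None if base.startswith("0x") else base,
            "offset": int(match.group("offset"), 16),
            "address": int(match.group("address"), 16),
        })
    return frames


def modules_in_trace(frames):
    """List the distinct module paths in the order they first appear."""
    seen = []
    for frame in frames:
        if frame["path"] not in seen:
            seen.append(frame["path"])
    return seen
```

Applied to the trace above, `modules_in_trace` would list Xorg, synaptics_drv.so, libc.so.6, nouveau_drv.so and libexa.so. The unresolved frames keep their module-relative offsets, which is the sort of input a symbol lookup would need once matching debug symbols are installed.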
Anyone had a chance to look at this? It seems to be less frequent with current updates, but it still happens now and then.
Ah, the cause is known and a patch is available. I'll fix it in F13 in the morning.
Can you give this build a try please and see how you go: http://koji.fedoraproject.org/koji/taskinfo?taskID=2383661
This looks suspiciously like bug #609764 or bug #566987. I'm running the new kernel on my Dell T3500 where the older kernels would freeze up with 10 minutes. Works fine up to now, keeping my fingers crossed...
kernel-2.6.34.2-34.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/kernel-2.6.34.2-34.fc13
kernel-2.6.34.2-34.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/kernel-2.6.34.2-34.fc13
(In reply to comment #10)
> This looks suspiciously like bug #609764 or bug #566987.
>
> I'm running the new kernel on my Dell T3500 where the older kernels would
> freeze up with 10 minutes. Works fine up to now, keeping my fingers crossed...
This particular issue should only occur if something's chewed up all your VRAM, if that's what's happening for you, then it's definitely a possible candidate.
(In reply to comment #13)
> (In reply to comment #10)
> > This looks suspiciously like bug #609764 or bug #566987.
> >
> > I'm running the new kernel on my Dell T3500 where the older kernels would
> > freeze up with 10 minutes. Works fine up to now, keeping my fingers crossed...
>
> This particular issue should only occur if something's chewed up all your VRAM,
> if that's what's happening for you, then it's definitely a possible candidate.
I just tested with KDE. It actually appears that KDE does indeed use a massive amount of VRAM compared to Gnome, so it's entirely possible that you're hitting this bug if you're using KDE.
(In reply to comment #14)
> I just tested with KDE. It actually appears that KDE does indeed use a massive
> amount of VRAM compared to Gnome, so it's entirely possible that you're hitting
> this bug if you're using KDE.
I didn't check on the desktop whether the corrupted KDE icons bug #591570 is fixed with this kernel. It is still running on my laptop, so I'll check there. While this new kernel didn't fix the lockup bug #609764, the machine behaved differently. Before, there was no chance of getting the card back to work again without a hard reboot. With this kernel the framebuffer console came back at X server reset and the machine shut down normally. I'm using KDE4 with kwin4 running XRender composite. OpenGL from the nouveau mesa driver is still too incomplete, so kwin4 rejects it. As far as I can tell, all my lockups happened when some popup window appeared (window menus, menus in the task bar).
I've installed kernel-2.6.34.2-34.fc13 and it seems to solve at least one test case I managed to produce.
kernel-2.6.34.3-37.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/kernel-2.6.34.3-37.fc13
kernel-2.6.34.3-37.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/kernel-2.6.34.3-37.fc13
Seems like there is some issue remaining. I got a hang today again. Nothing in Xorg.0.log.old, but I got this in messages:
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_TRAP - Ch 2/5 Class 0x8297 Mthd 0x15e0 Data 0x00000000:0x00000000
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_TRAP_TPDMA - no VM fault?
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_TRAP_TPDMA - TP0: Unhandled ustatus 0x00000008
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0fa4 Data 0x00000000:0x0008ae04
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0fa8 Data 0x00000000:0x0151014d
Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
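To compare hangs across kernel versions, it may help to tally which nouveau error types show up in /var/log/messages. The sketch below assumes the message layout seen in the lines quoted in this comment (`[drm] nouveau <pci-address>: <KIND> - <detail>`); the helper names `parse_nouveau_line` and `summarize` are hypothetical, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# "[drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2"
NOUVEAU_RE = re.compile(
    r"\[drm\] nouveau (?P<device>[0-9a-fA-F:.]+): "
    r"(?P<kind>[A-Z][A-Z0-9_]*) - (?P<detail>.*)"
)


def parse_nouveau_line(line):
    """Return (device, kind, detail) for a nouveau message, else None."""
    match = NOUVEAU_RE.search(line)
    if match is None:
        return None
    return (
        match.group("device"),
        match.group("kind"),
        match.group("detail").strip(),
    )


def summarize(lines):
    """Count nouveau messages per error kind, ignoring unrelated lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_nouveau_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts
```

Feeding it an open log file, for example `summarize(open("/var/log/messages"))`, gives a per-kind count that can be pasted into the report next to the kernel version in use.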
kernel-2.6.34.6-47.fc13 has been submitted as an update for Fedora 13. https://admin.fedoraproject.org/updates/kernel-2.6.34.6-47.fc13
(In reply to comment #19)
> Seems like there is some issue remaining. I got a hang today again. Nothing in
> Xorg.0.log.old, but I got this in messages:
>
> Aug 11 17:05:18 mjolnir kernel: [drm] nouveau 0000:01:00.0: PGRAPH_TRAP - Ch
> 2/5 Class 0x8297 Mthd 0x15e0 Data 0x00000000:0x00000000
Your problem is probably fixed and now you see bug #566987 or bug #609764. IMvvvHO 2.6.34 is a dud (radeon suspend & hibernate broken, worse powertop results than in .33 or .35) and F13 should go to .35 directly.
kernel-2.6.34.6-47.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.