Description of problem:
With the latest nouveau driver in rawhide, X easily freezes at 100% CPU. The most reliable way to make it happen is to simply open a page in Firefox, but it also happened to me while scrolling a terminal, etc.

Version-Release number of selected component (if applicable):
[root@localhost paolo]# rpm -qa | grep xorg-x11-server
xorg-x11-server-Xorg-1.6.0-19.fc11.i586
xorg-x11-server-common-1.6.0-19.fc11.i586
xorg-x11-server-utils-7.4-7.fc11.i586
[root@localhost paolo]# rpm -qa | grep nouveau
xorg-x11-drv-nouveau-0.0.12-26.20090413git7100c06.fc11.i586
xorg-x11-drv-nouveau-debuginfo-0.0.12-26.20090413git7100c06.fc11.i586
[root@localhost paolo]# uname -a
Linux localhost.localdomain 2.6.29.1-68.fc11.i586 #1 SMP Sat Apr 11 02:06:17 EDT 2009 i686 i686 i386 GNU/Linux

How reproducible:
100% reproducible

Additional info:
I attached gdb to X, reproduced the 100% CPU, then pressed Ctrl+C and ran bt. The output was consistently:

Program received signal SIGINT, Interrupt.
0x00a17e87 in nouveau_dma_wait () from /usr/lib/libdrm_nouveau.so.1
(gdb) bt
#0 0x00a17e87 in nouveau_dma_wait () from /usr/lib/libdrm_nouveau.so.1
#1 0x00a1613a in nouveau_pushbuf_flush () from /usr/lib/libdrm_nouveau.so.1
#2 0x001a74bc in FIRE_RING (chan=<value optimized out>) at /usr/include/nouveau/nouveau_pushbuf.h:98
#3 NV04EXASolid (chan=<value optimized out>) at nv04_exa.c:156

Tail of dmesg after it happens:

SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 1
nouveau 0000:01:00.0: nouveau_fifo_free: freeing fifo 1
nouveau 0000:01:00.0: Failed to idle channel 1. Prepare for strangeness..
nouveau 0000:01:00.0: Unhandled PGRAPH_INTR - 0x00000100
nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 1/6 Mthd 0x0184 Data 0xd8000002
nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 1/6 Mthd 0x0188 Data 0xd8000001
nouveau 0000:01:00.0: Unhandled PGRAPH_INTR - 0x00000080
nouveau 0000:01:00.0: nouveau_fifo_free: freeing fifo 0
nouveau 0000:01:00.0: Allocating FIFO number 0
nouveau 0000:01:00.0: nouveau_fifo_alloc: initialised FIFO 0
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18c4 Data 0x00000000:0xc1500000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18c8 Data 0xbf800000:0x00000000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18cc Data 0xbf800000:0xbf800000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x1900 Data 0x00000000:0xfff40001
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18c0 Data 0x41500000:0x00000000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18c4 Data 0x41500000:0x41500000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18c8 Data 0x3f800000:0x00000000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18cc Data 0x3f800000:0x3f800000
nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT
nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x1900 Data 0x00000000:0x000e0001
nouveau 0000:01:00.0: Allocating FIFO number 1
nouveau 0000:01:00.0: nouveau_fifo_alloc: initialised FIFO 1
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
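The PGRAPH_ERROR entries in the dmesg tail are repetitive, so a small script can make them easier to compare. The following is only a sketch: the function names parse_pgraph_errors and as_float32 are made up for illustration, reading "Ch 0/7" as channel/subchannel is an assumption, and so is the idea that some data words are IEEE-754 floats (several of them, such as 0x41500000 and 0x3f800000, happen to decode to round values, but the log itself does not say how the hardware interprets them).

```python
import re
import struct
from collections import Counter

# Matches the "PGRAPH_ERROR - Ch a/b Class X Mthd M Data P:Q" entries
# exactly as they appear in the dmesg output quoted in this report.
PGRAPH_RE = re.compile(
    r"PGRAPH_ERROR - Ch (\d+)/(\d+) Class (0x[0-9a-fA-F]+) "
    r"Mthd (0x[0-9a-fA-F]+) Data (0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)"
)


def parse_pgraph_errors(text):
    """Return one dict per 'PGRAPH_ERROR - Ch ...' entry found in text."""
    entries = []
    for m in PGRAPH_RE.finditer(text):
        entries.append({
            "channel": int(m.group(1)),
            # Assumption: the second number is the subchannel.
            "subchannel": int(m.group(2)),
            "class": int(m.group(3), 16),
            "mthd": int(m.group(4), 16),
            "data": (int(m.group(5), 16), int(m.group(6), 16)),
        })
    return entries


def as_float32(word):
    """Reinterpret a 32-bit word as an IEEE-754 single (speculative decode)."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


# Two entries copied from the dmesg tail in the report.
SAMPLE = (
    "nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT\n"
    "nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18c4 Data 0x41500000:0x41500000\n"
    "nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT\n"
    "nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 0/7 Class 0x0000 Mthd 0x18cc Data 0xbf800000:0xbf800000\n"
)

if __name__ == "__main__":
    entries = parse_pgraph_errors(SAMPLE)
    # Count how often each method offset is rejected.
    per_method = Counter(e["mthd"] for e in entries)
    for mthd, count in sorted(per_method.items()):
        print(f"Mthd {mthd:#06x}: {count} error(s)")
    for e in entries:
        decoded = [as_float32(w) for w in e["data"]]
        print(f"Mthd {e['mthd']:#06x} data as float32 (speculative): {decoded}")
```

To run it on a real log, pass the saved dmesg text to parse_pgraph_errors instead of SAMPLE. Words that decode to NaN or to tiny denormals are probably not floats at all.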
Created attachment 339530
Xorg.0.log

xorg log file in case it has some useful info
Can you downgrade to kernel-2.6.29-0.258.2.3.rc8.git2.fc11 and see if the issue still occurs? There are a couple of other similar bug reports, but I haven't been able to track down the exact cause yet.
same thing happens with kernel-2.6.29-0.258.2.3.rc8.git2.fc11 (if you want me to try older kernels I'll need a brief explanation of how to downgrade, I happened to still have 2.6.29-0.258.2.3 in the grub list)
Ok, interesting.. Can you now downgrade xorg-x11-drv-nouveau to 0.0.12-10.20090310git8f9a580 and try with both your latest kernel and the 258 kernel. You can grab the RPM for -10 from: http://koji.fedoraproject.org/koji/buildinfo?buildID=93599, and install with "rpm -Uvh --force <filename>". Thank you!
downgrading to xorg-x11-drv-nouveau-0.0.12-10 works, I was not able to reproduce the bug
Excellent, second report I've had with bugs like this pointing the finger at the 2D driver, rather than the kernel! I'll look at all the changes since then tomorrow and see what stands out. If you're feeling really keen, narrowing it down a bit more would be very helpful (I can't reproduce on any of my hardware)! There's a list of all the nouveau 2d driver packages at http://koji.fedoraproject.org/koji/packageinfo?packageID=5871 :) Thanks! Ben.
Ok, tracked down when the problem starts:

0.0.12-11 -> OK
0.0.12-15 -> BUG

(Note that 12 is not on koji and 13 and 14 were failed builds)
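For anyone repeating this kind of narrowing across a longer list of builds, the search can be written as a plain bisection. This is a minimal sketch, not a tool used in this report: first_bad and is_bad are hypothetical names, the build list only contains releases mentioned in this thread, and it assumes the builds are ordered oldest to newest, the first one is good, the last one is bad, and the bug never goes away again once it has appeared.

```python
def first_bad(builds, is_bad):
    """Return the earliest build for which is_bad(build) is true.

    Assumes builds[0] is known good, builds[-1] is known bad, and that
    every build after the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Only releases named in this report; intermediate builds are omitted.
BUILDS = ["0.0.12-10", "0.0.12-11", "0.0.12-15", "0.0.12-26"]

# Results as reported in the comments above. In practice is_bad() would
# install the package, restart X and ask the tester whether X froze.
REPORTED = {
    "0.0.12-10": "OK",
    "0.0.12-11": "OK",
    "0.0.12-15": "BUG",
    "0.0.12-26": "BUG",
}

if __name__ == "__main__":
    culprit = first_bad(BUILDS, lambda build: REPORTED[build] == "BUG")
    print("first bad build:", culprit)
```

Each step halves the remaining range, so even a few dozen builds need only a handful of install-and-test rounds.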
Thank you, that's perfect :) I'll try and track this down in the morning!
There's a build of plain upstream nouveau at http://koji.fedoraproject.org/koji/taskinfo?taskID=1301622. Do you see the issue there?
I need an x86 build to test...
I compiled upstream driver from the git repo (see below for the version) and it seems to survive a bit of testing.

git log | head
commit 7100c06be099bacc0f8bb8898bbf7eb34ff1cc6e
Author: Ben Skeggs <skeggsb>
Date: Mon Apr 13 20:21:51 2009 +1000
even more weird: I compiled the following revision from git:

commit 4067ab466fe3aa817e0323959f70c7dd3494de0a
Author: Ben Skeggs <skeggsb>
Date: Mon Mar 23 14:43:22 2009 +1000

and I still cannot reproduce. However reinstalling the rpm "15" (which should correspond to the same version) still triggers the bug easily.
Yeah, I figured as much.. nothing was standing out as I scoured over the diffs as an obvious issue. It'll be one of the fedora-specific patches interacting badly with one of the commits between -11 and -15 then, fun fun! I'll keep looking, as this is something that really should be fixed before the release. Would be much easier if I could reproduce myself!
ok, I tried git HEAD + the patches in the fedora repo and pinpointed it to this patch:

nouveau-multiple-xserver.patch

Applying just that one on top of git triggers the issue.
This is fixed now in libdrm-2.4.6-6.fc11 and xorg-x11-drv-nouveau-0.0.12-29.20090417gitfa2f111.fc11. Thank you again for all your help tracking this down on IRC earlier :)