618143 – Nouveau driver freezes system randomly but regularly

Summary: Nouveau driver freezes system randomly but regularly

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	xorg-x11-drv-nouveau
Sub Component:
Version:	13
Hardware:	All
OS:	Linux
Priority:	low
Severity:	high
Target Milestone:	---
Assignee:	Ben Skeggs
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-07-26 09:19 UTC by Chris Rouch
Modified:	2011-06-29 12:58 UTC
CC List:	26 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2011-06-29 12:58:21 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Hanging X process call trace (2.89 KB, text/plain) 2010-10-20 20:06 UTC, Thomas Jarosch
Kernel backtrace before killing X (47.07 KB, text/plain) 2010-10-20 20:07 UTC, Thomas Jarosch
Kernel backtrace after killing X + boot output (114.78 KB, text/plain) 2010-10-20 20:07 UTC, Thomas Jarosch

Description Chris Rouch 2010-07-26 09:19:30 UTC

Description of problem:

I'm using the Nouveau driver on an HP G70 laptop with an NVIDIA GeForce 9200M GS graphics card. I use KDE with KWin and have compositing effects enabled. From time to time the system freezes, with no response to ctrl-fN or ctrl-alt-del. The only fix is a hard power-off, and even after this the system fails its first boot (it hangs after the udev warnings). Once it is reset again, it boots normally.

It seems to happen when I try to raise a window over a previously opened menu or smooth task preview - certainly it is either where an eye-candy action is needed or has just been invoked.

Version-Release number of selected component (if applicable):

kernel-2.6.33.6-147.fc13.x86_64
xorg-x11-drv-nouveau-0.0.16-7.20100423git13c1043.fc13.x86_64


How reproducible:
This happens regularly (maybe once a day, sometimes more), but is not predictable

Steps to Reproduce:
1. Log in using KDE
2. Enable compositing
3. Use multiple applications
  
Actual results:

The system freezes

Expected results:

The system keeps working

Additional info:

I see this in /var/log/messages at around the time it happens:

Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x0238 Data 0x0002b3ae:0x00042050
Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x023c Data 0x0002b3ae:0x0002b3ae
Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x0240 Data 0x00000000:0x00000020
Jul 26 10:59:46 watson kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
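
For anyone comparing the excerpts posted in this bug, the PGRAPH_DATA_ERROR detail lines can be pulled apart mechanically. The following is only a minimal Python sketch, not part of the driver or of any Fedora tool. The function name parse_pgraph_line and the dictionary keys are made up for illustration, the pattern is derived solely from the line layout seen in this report, and it assumes that wrapped syslog lines have been joined back into one line each. It deliberately does not interpret what the two numbers after "Ch" or the two "Data" words mean.

```python
import re

# Layout assumed from the excerpts in this report only, e.g.
#   ... PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x0238 Data 0x0002b3ae:0x00042050
PGRAPH_RE = re.compile(
    r"PGRAPH_DATA_ERROR - Ch (?P<ch>\d+/\d+) "
    r"Class (?P<cls>0x[0-9a-fA-F]+) "
    r"Mthd (?P<mthd>0x[0-9a-fA-F]+) "
    r"Data (?P<data_a>0x[0-9a-fA-F]+):(?P<data_b>0x[0-9a-fA-F]+)"
)


def parse_pgraph_line(line):
    """Return the fields of a PGRAPH_DATA_ERROR detail line, or None.

    Reason lines such as "PGRAPH_DATA_ERROR - INVALID_VALUE" carry no
    Ch/Class/Mthd/Data fields, so they yield None.
    """
    match = PGRAPH_RE.search(line)
    if match is None:
        return None
    return {
        "ch": match.group("ch"),                # kept as text, e.g. "2/2"
        "class": int(match.group("cls"), 16),
        "mthd": int(match.group("mthd"), 16),
        "data": (int(match.group("data_a"), 16),
                 int(match.group("data_b"), 16)),
    }
```

Feeding it each line of /var/log/messages and dropping the None results gives a list that is easier to compare across machines than the raw text.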

Comment 1 Karsten Roch 2010-07-26 17:18:01 UTC

Similar problem here, with a GeForce 8400 GS card.

2.6.35-0.56.rc6.git1.fc14.i686
xorg-x11-drv-nouveau.i686     1:0.0.16-9.20100615gitdb98ad2.fc14
Desktop: LXDE

Symptoms: Sudden freeze of the whole X system. I can't escape with ctrl-alt-backspace or switch to a tty console, so the only way to recover is a hard reset. The problem happens occasionally while video is playing, while the webcam/Skype is in use, or while graphics/pictures are being shown and modified.

/var/log/messages shows only one line when the error occurs:

Jul 25 18:29:50 krakatoa kernel: [drm] nouveau 0000:03:00.0: PFIFO_DMA_PUSHER - Ch 2

Regards
Karsten

Comment 2 Chuck Ebbert 2010-07-27 15:02:30 UTC

Try adding the kernel boot option "pcie_aspm=off".

And also try kernel-2.6.34.1-29.fc13 from koji.

Comment 3 Karsten Roch 2010-07-27 20:20:39 UTC

The kernel boot option "pcie_aspm=off" does not seem to make any difference here. The system crashed today after I manipulated (resized, cropped...) some pictures in KolourPaint and tried to open 3 pictures at the same time. X freezes completely and the desktop is unusable, but the system is still reacting (e.g. connecting a USB device still creates an entry in /var/log/messages).

Regards
Karsten

Comment 4 Chris Rouch 2010-07-28 09:36:44 UTC

I've rebooted with that option:

% cat /proc/cmdline 
ro root=/dev/mapper/privg-priroot rd_LVM_LV=privg/priroot rd_LVM_LV=datavg/swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.iso-8859-15 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet vga=791 pcie_aspm=off

and I've switched desktop effects back on. So far, no problems with the display.
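
A quick way to confirm that a boot option really made it onto the running kernel's command line is to test for it as a whole word. This is just a small Python sketch; has_boot_option is a hypothetical helper name, and the only thing it relies on is that options on the command line are separated by whitespace, as in the /proc/cmdline output above.

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word in a kernel command line.

    `cmdline` is the text of /proc/cmdline; `option` is the exact token to
    look for, e.g. "pcie_aspm=off". Splitting on whitespace avoids matching
    a substring of some longer option.
    """
    return option in cmdline.split()
```

On a live system the text could come from reading /proc/cmdline, for example has_boot_option(open("/proc/cmdline").read(), "pcie_aspm=off").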

Comment 5 Chris Rouch 2010-07-28 10:40:33 UTC

Eventually it froze again. So this option did not help.

Comment 6 Chris Rouch 2010-08-01 10:38:06 UTC

I've just had another freeze, even with 3d effects switched off, with the mouse moving from firefox to a smooth task icon. /var/log/messages contains this:

Aug  1 12:30:01 watson kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2


This is with koji kernel 2.6.34.1-29.fc13.x86_64

Comment 7 J Edwards 2010-08-02 03:11:15 UTC

I have had multiple freezes with the GeForce 8400M GS using the nouveau driver.

Video card:
01:00.0 VGA compatible controller: nVidia Corporation G86 [GeForce 8400M GS] (rev a1)

/var/log/messages:
Aug  1 22:53:09 fractal kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
Aug  1 22:53:09 fractal kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0fa8 Data 0x00000000:0x00042050
Aug  1 22:53:09 fractal kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE

Same symptoms: I can move the cursor, although buttons no longer respond to hovering. I cannot restart the X server with ctrl+alt+backspace, and I cannot escape to a tty with ctrl+alt+fn.

Same as above, a hard restart was required.  Unlike the above, the system did not hang on the first restart -- it started fine -- however my wireless card did not work.

Upon a second reboot (this time soft) my wireless card (as well as the rest of the system from what I can tell) functioned properly.

uname -a:
Linux fractal 2.6.33.6-147.fc13.x86_64 #1 SMP Tue Jul 6 22:32:17 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

Comment 8 Ben Kevan 2010-08-10 18:01:26 UTC

Same symptom here: 

Errors in /var/log/messages are: 

Aug 10 10:48:23 bkfc05006060 kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
Aug 10 10:48:23 bkfc05006060 kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0fa4 Data 0x00000000:0x00042050
Aug 10 10:48:23 bkfc05006060 kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
Aug 10 10:48:23 bkfc05006060 kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0fa8 Data 0x00000000:0x00036c0f
Aug 10 10:48:23 bkfc05006060 kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE

Running in Fluxbox with xcompmgr 

Lotus Notes, Chromium, urxvt, pidgin were the apps open

I just reboot and all is fine (I'm hardwired, as it happens to me at work). 

This has happened TWICE today, which is quite an annoyance and detrimental to my work :( I'd like to stay on F13, but if it continues I'll have to switch elsewhere.

I haven't yet tried it without xcompmgr and, no, experimental mesa is not installed.

Comment 9 Dan Scholnik 2010-08-11 08:26:54 UTC

Same problem, happens every few days:
kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2

Machine is a Latitude D630 w/ NVIDIA Quadro NVS 135M, kernel is 2.6.33.6-147.2.4.fc13.x86_64 (same problem on every kernel before it also).

I've seen crashes triggered while loading firefox pages, while changing the background image, and various other random events.  I'm not using compositing or 3D or anything else fancy.  The system is still partly alive, the mouse still moves and hitting the power button will shut it down but the rest of the keys are ignored.

Comment 10 Chuck Ebbert 2010-08-11 12:19:33 UTC

Does kernel 2.6.34.3-37 from the updates-testing repository make any difference?

Comment 11 Ben Kevan 2010-08-11 23:12:44 UTC

Chuck, 

I'd imagine it would since the upstream testing kernel change log indicates that they are disabling acceleration by default for nouveau users. You can do this on your current build by using the following kernel param: 

nouveau.noaccel=1

Which is what has given me some sanity and stability for the last 2 days.

Comment 12 Dan Scholnik 2010-08-11 23:51:09 UTC

Testing now, so far no issues but it often takes several days to get a crash.

Comment 13 Chuck Ebbert 2010-08-13 10:35:31 UTC

Is this the same as bug 596330 ?

Comment 14 Ben Skeggs 2010-08-13 11:18:47 UTC

Chuck: No, the NVA3/A5/A8 hang is silent.  The GPU locks up without signalling *any* problem to the driver.

Comment 15 Stephan Dühr 2010-08-13 11:26:40 UTC

Hmm, I was hoping it was the same bug. I also see
kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
when it freezes. I'm now running the testing kernel mentioned there and hope
that it fixes the problem.

I have
01:00.0 VGA compatible controller: nVidia Corporation G84M [Quadro NVS 140M] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Lenovo ThinkPad T61

Comment 16 Stephan Dühr 2010-08-13 22:39:48 UTC

Still freezing with 2.6.34.3-37.fc13.x86_64, but I get two additional messages:

Aug 13 23:31:10 t61 kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
Aug 13 23:31:10 t61 kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0f80 Data 0x00000000:0x0017000d
Aug 13 23:31:10 t61 kernel: [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE

Comment 17 Dan Scholnik 2010-08-18 17:01:28 UTC

Alas, no joy with 2.6.34.3-37.fc13.x86_64 here either, it finally crashed again:

kernel: [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2

Comment 18 Ben Kevan 2010-08-18 17:07:06 UTC

All you have to do is add: 

nouveau.noaccel=1

To your boot kernel params, until the issue is fixed upstream (they are working on it). 

I haven't had a crash since I added that a week ago (prior I was freezing about once a day, sometimes more).

Comment 19 Geoff Wattles 2010-08-21 01:04:21 UTC

Two freezes today, after running Fedora 13 x86_64 on a Dell Precision T550 with striped RAID successfully for one week. The video card is reported as:
 nouveau 0000:03:00.0: Detected an NV50 generation card (0x094c00a1)
Linux:
Linux version 2.6.33.6-147.2.4.fc13.x86_64 (mockbuild.fedoraproject.org) (gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC) ) #1 SMP Fri Jul 23 17:14:44 UTC 2010

Error from /var/log/messages:
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PFIFO_DMA_PUSHER - Ch 2
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x0604 Data 0x0008a3c0:0x0008a3c0
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0608 Data 0x00000000:0x3f040000
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x1358 Data 0x00000000:0x00042050
Aug 20 16:59:23 routerbuilder kernel: [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - INVALID_ENUM

I will try the above workaround (nouveau.noaccel=1).

Comment 20 Dan Scholnik 2010-08-21 06:58:49 UTC

I tried adding nouveau.noaccel=1 to 2.6.34.3-37.fc13.x86_64.  Every time I tried to use xrandr to rotate my second monitor, I got a segfault:

** (gnome-panel:2900): CRITICAL **: panel_applet_frame_change_background: assertion `PANEL_IS_WIDGET (GTK_WIDGET (frame)->parent)' failed
Tracker-Message: Loading database... '/home/scholnik/.cache/tracker/contents.db' (contents)
Tracker-Message: Opened sqlite3 database:'/home/scholnik/.cache/tracker/contents.db'
Tracker-Message:   Setting cache size to 1024
Tracker-Message: Loading database... '/home/scholnik/.cache/tracker/fulltext.db' (unknown)
Tracker-Message: Opened sqlite3 database:'/home/scholnik/.cache/tracker/fulltext.db'
Tracker-Message:   Setting cache size to 512
Tracker-Message: Opened sqlite3 database:'/home/scholnik/.cache/tracker/meta.db'
Tracker-Message:   Setting cache size to 2000
resize called 2464 1280

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x28) [0x45cef8]
1: /usr/bin/X (0x400000+0x5ce59) [0x45ce59]
2: /lib64/libc.so.6 (0x38ce200000+0x32a20) [0x38ce232a20]
3: /usr/lib64/xorg/modules/libexa.so (exaGetPixmapDriverPrivate+0x14) [0x7fb0728b61c4]
4: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7fb07372a000+0x23d63) [0x7fb07374dd63]
5: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7fb07372a000+0x23def) [0x7fb07374ddef]
6: /usr/bin/X (0x400000+0x8f4bb) [0x48f4bb]
7: /usr/bin/X (BlockHandler+0x50) [0x437cc0]
8: /usr/bin/X (WaitForSomething+0x141) [0x45c031]
9: /usr/bin/X (0x400000+0x35eb2) [0x435eb2]
10: /usr/bin/X (0x400000+0x2189a) [0x42189a]
11: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x38ce21ec5d]
12: /usr/bin/X (0x400000+0x21449) [0x421449]
Segmentation fault at address 0x60

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting


Since the reason I use nouveau is that the nvidia driver doesn't support rotating the second monitor, this is a non-fix for me.

Comment 21 Charles Butterfield 2010-09-02 04:54:55 UTC

"Me Too" -- Not sure if this is the best of the several related tickets, just had to choose one.  Can open a new one if Ben thinks that's better, or move these comments to another existing ticket.  Please advise.

I too have been having video lockups related to mouse gestures.  The mouse continues to move the cursor, but nothing else changes the screen (neither mouse clicks, keyboard input, CTL-ALT-F1 through F6, telinit 3 from a remote session, etc.).  Eventually a remote reboot or manual power cycle is required.  I would love it if there were in fact something I could do from an SSH session (other than reboot), but I haven't figured out what that might be.

I have NOT tried the nouveau.noaccel=1, will try that next.

My Gear:
Dell Precision Workstation T5500
nVidia NVS-290
Kernel: 2.6.33.8-149.fc13.x86_64
xorg-x11-drv-nouveau.x86_64 1:0.0.16-7.20100423git13c1043.fc13


Symptoms in /var/log/messages:

PFIFO_DMA_PUSHER - Ch 2 (most common)
PFIFO_DMA_PUSHER - Ch 127  (one lockup)
PGRAPH_DATA_ERROR (one lockup)

also a lot of "nouveau_channel_free: freeing fifo" near the errors, but also when nothing goes fatally wrong, so not sure if it is a clue.

Last Hang in /var/log/Xorg.0.log.old
 
EQ overflowing. The server is probably stuck in an infinite loop.
Backtrace:
0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x45cef8]
1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x49a4c4]
2: /usr/bin/Xorg (xf86PostMotionEventP+0xc4) [0x46f0e4]
3: /usr/lib64/xorg/modules/input/evdev_drv.so [0x7f4b0af77dbf]
4: /usr/bin/Xorg (0x400000+0x6d747) [0x46d747]
5: /usr/bin/Xorg (0x400000+0x11ccf3) [0x51ccf3]
6: /lib64/libc.so.6 (0x3898000000+0x32a20) [0x3898032a20]
7: /lib64/libc.so.6 (ioctl+0x7) [0x38980d95a7]
8: /usr/lib64/libdrm.so.2 (drmIoctl+0x28) [0x38af003388]
9: /usr/lib64/libdrm.so.2 (drmCommandWrite+0x1b) [0x38af00360b]
10: /usr/lib64/libdrm_nouveau.so.1 (0x7f4b0cb38000+0x2dfd) [0x7f4b0cb3adfd]
11: /usr/lib64/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xfe) [0x7f4b0cb3afee]
12: /usr/lib64/libdrm_nouveau.so.1 (0x7f4b0cb38000+0x207a) [0x7f4b0cb3a07a]
13: /usr/lib64/libdrm_nouveau.so.1 (nouveau_pushbuf_flush+0x190) [0x7f4b0cb3a450]
14: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f4b0cd5a000+0x1f7c8) [0x7f4b0cd797c8]
15: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f4b0cd5a000+0x6c8e) [0x7f4b0cd60c8e]
16: /usr/lib64/xorg/modules/libexa.so (0x7f4b0c702000+0x523f) [0x7f4b0c70723f]
17: /usr/lib64/xorg/modules/libexa.so (0x7f4b0c702000+0x7494) [0x7f4b0c709494]
18: /usr/lib64/xorg/modules/libexa.so (0x7f4b0c702000+0xd117) [0x7f4b0c70f117]
19: /usr/lib64/xorg/modules/libexa.so (0x7f4b0c702000+0xe0b2) [0x7f4b0c7100b2]
20: /usr/bin/Xorg (0x400000+0xde000) [0x4de000]
21: /usr/lib64/xorg/modules/libexa.so (0x7f4b0c702000+0xcd08) [0x7f4b0c70ed08]
22: /usr/bin/Xorg (0x400000+0xd44e7) [0x4d44e7]
23: /usr/bin/Xorg (0x400000+0x3619c) [0x43619c]
24: /usr/bin/Xorg (0x400000+0x2189a) [0x42189a]
25: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x389801ec5d]
26: /usr/bin/Xorg (0x400000+0x21449) [0x421449]
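
To see at a glance which modules a backtrace like the one above passes through, the frames can be split into their parts. This is a rough Python sketch under stated assumptions: parse_frame is a hypothetical helper, and the pattern only covers the three frame shapes that appear in this report, namely "N: module (symbol+0xoff) [addr]", "N: module (0xbase+0xoff) [addr]" and "N: module [addr]". Other X server versions may print frames differently.

```python
import re

# Frame shapes assumed from the backtraces quoted in this bug only.
FRAME_RE = re.compile(
    r"^(?P<index>\d+): (?P<module>\S+)"
    r"(?: \((?P<base>[^+\s]+)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r" \[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Split one Xorg backtrace frame into its parts, or return None.

    "base" is whatever precedes the "+" inside the parentheses: a symbol
    name such as "mieqEnqueue" or a load address such as "0x400000".
    It is None (as is "offset") when the frame has no parenthesised part.
    """
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    offset = match.group("offset")
    return {
        "index": int(match.group("index")),
        "module": match.group("module"),
        "base": match.group("base"),
        "offset": None if offset is None else int(offset, 16),
        "address": int(match.group("address"), 16),
    }
```

Collecting the "module" value of each parsed frame, in order, shows the path from the input driver through libdrm and libdrm_nouveau without reading every address.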

Comment 22 Charles Butterfield 2010-09-02 05:07:16 UTC

Just noticed that my kernel is "so yesterday", yum update just pulled in 2.6.34.6-47.fc13 from stable updates.  Is there any consensus on whether nouveau.noaccel=1 is needed for this version too?

Comment 23 Ben Skeggs 2010-09-02 05:33:47 UTC

(In reply to comment #22)
> Just noticed that my kernel is "so yesterday", yum update just pulled in
> 2.6.34.6-47.fc13 from stable updates.  Is there any consensus on whether
> nouveau.noaccel=1 is needed for this version too?

There's actually an update which would be in that kernel which fixes the *only* case of this hang I've ever been able to reproduce on any of my cards.  Apparently others still see hangs sometimes.  I've not a clue why, I can't manage to cause it myself.

Comment 24 Charles Butterfield 2010-09-02 05:59:03 UTC

I'm using the new 2.6.34 kernel, but ran into bug #620313 which required a work-around to be bootable.  That appears to be an orthogonal issue related to the Dell T3500 AHCI architecture.  Keeping my fingers crossed.  Will report on my experience after I get some, whee :-)

Comment 25 Charles Butterfield 2010-09-04 00:59:44 UTC

I've been using 2.6.34.6-47 WITHOUT nouveau.noaccel=1 for a day now, and haven't run into the hang.  Can't find any new PFIFO or PGRAPH lines in /var/log/messages either.  I'm cautiously optimistic.

Comment 26 Charles Butterfield 2010-09-05 19:32:12 UTC

Sadly, I just encountered the freeze with plain 2.6.34.6-47.  I'm now adding nouveau.noaccel=1 back to my grub.conf :-(

The freeze occurred while dragging the scrollbar in a gnome-terminal.  The single /var/log/messages entry was:

hpc16 kernel: [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2

Comment 27 John L Magee 2010-09-18 22:50:27 UTC

FWIW, this never happened from installing F13 on 8/15 until yesterday, on 2.6.34.6-54.fc13.x86_64, which was installed on 9/12. Had two freezes yesterday and one today. Set nouveau.noaccel=1 and will see how it goes.

Comment 28 Thomas Jarosch 2010-10-20 20:05:56 UTC

Same thing here on two boxes:

- One dual-head box using 8400 GS and binary nvidia driver
- Single head box using 9400 GT and nouveau driver
  (also tried a 8400 GS on this box)

The issue appears about once a day. The keyboard is unresponsive, but mouse cursor movement still works. Both boxes are still reachable via network!

So ping/ssh still works. When the X server was stuck, I took a backtrace via SysRq. Then I shot it with SIGKILL. The server restarted and hung on startup. I attached gdb to the hanging process and captured a stack trace.

Here's what I got:
X_gdb_backtrace.txt: Stuck X process call trace
sysreq_backtrace_nouveau1.txt: Kernel backtrace before killing X
sysreq_backtrace_nouveau2.txt: Kernel backtrace after killing X + boot output

Hopefully this provides new evidence of what's going on.

Related bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=28320
https://bugzilla.redhat.com/show_bug.cgi?id=642204

Comment 29 Thomas Jarosch 2010-10-20 20:06:33 UTC

Created attachment 454664
Hanging X process call trace

Comment 30 Thomas Jarosch 2010-10-20 20:07:04 UTC

Created attachment 454665
Kernel backtrace before killing X

Comment 31 Thomas Jarosch 2010-10-20 20:07:27 UTC

Created attachment 454666
Kernel backtrace after killing X + boot output

Comment 32 Michael 2011-03-03 01:44:48 UTC

Hi, another "me too". I got this freeze twice today, both times I was using mplayer. It's still a problem in fully updated Fedora 14 as of 02 March 2011.

Details:

# mplayer -V
MPlayer SVN-r31628-4.4.4 (C) 2000-2010 MPlayer Team

# uname -a
Linux magrathea 2.6.35.11-83.fc14.x86_64 #1 SMP Mon Feb 7 07:06:44 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa|grep nouveau
xorg-x11-drv-nouveau-0.0.16-11.20100826git065576d.fc14.x86_64

From lspci -v:

02:00.0 VGA compatible controller: nVidia Corporation G98 [GeForce 8400 GS] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Giga-byte Technology Device 349c
        Flags: bus master, fast devsel, latency 0, IRQ 24
        Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at f8000000 (64-bit, non-prefetchable) [size=32M]
        I/O ports at bc00 [size=128]
        Expansion ROM at fbbe0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
        Kernel modules: nouveau, nvidiafb



And the end of my /var/log/messages:

Mar  2 19:38:36 magrathea kernel: [ 4417.529250] [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x00200252e4 Put 0x00200252e8 IbGet 0x00000d95 IbPut 0x00000dc3 State 0x8000ae0c Push 0x00406040
Mar  2 19:38:36 magrathea kernel: [ 4417.529261] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0e08 Data 0x00000000:0x00042050
Mar  2 19:38:36 magrathea kernel: [ 4417.529263] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
Mar  2 19:38:36 magrathea kernel: [ 4417.529289] [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x00200416f8 Put 0x00200417bc IbGet 0x00000d96 IbPut 0x00000dc3 State 0x80000000 Push 0x00406040
Mar  2 19:38:36 magrathea kernel: [ 4417.529314] [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x00200423c4 Put 0x0020042e80 IbGet 0x00000d9e IbPut 0x00000dc7 State 0x80004610 Push 0x00406040
Mar  2 19:38:36 magrathea kernel: [ 4417.529324] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0200 Data 0x001c06cd:0x00042050
Mar  2 19:38:36 magrathea kernel: [ 4417.529326] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
Mar  2 19:38:36 magrathea kernel: [ 4417.529338] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x0204 Data 0x001c06cd:0x001c06cd
Mar  2 19:38:36 magrathea kernel: [ 4417.529339] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
Mar  2 19:38:36 magrathea kernel: [ 4417.529353] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x020c Data 0x00000000:0x45a10000
Mar  2 19:38:36 magrathea kernel: [ 4417.529354] [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
Mar  2 19:38:36 magrathea kernel: [ 4417.529379] [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x002002531c Put 0x0020025320 IbGet 0x00000da3 IbPut 0x00000dcb State 0x8000b58
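
The longer PFIFO_DMA_PUSHER lines printed by this kernel carry several named hex values. As a hedged illustration only, the Python sketch below extracts them into a dictionary. parse_dma_pusher_line is a hypothetical name, the field names (Get, Put, IbGet, IbPut, State, Push) are simply the labels that appear in the log above, and no meaning is assigned to them here. Note that a line cut off in the log, like the last one above, will still parse, but its final value is then incomplete and should not be trusted.

```python
import re

# Labels taken verbatim from the log lines in this report.
CH_RE = re.compile(r"PFIFO_DMA_PUSHER - Ch (\d+)")
FIELD_RE = re.compile(r"\b(Get|Put|IbGet|IbPut|State|Push) (0x[0-9a-fA-F]+)")


def parse_dma_pusher_line(line):
    """Return {"Ch": n, "Get": ..., ...} for a PFIFO_DMA_PUSHER line, or None.

    Older kernels in this report print only "PFIFO_DMA_PUSHER - Ch 2";
    for those lines the result contains just the "Ch" key.
    """
    match = CH_RE.search(line)
    if match is None:
        return None
    result = {"Ch": int(match.group(1))}
    # Only look at the text after the channel number.
    for name, value in FIELD_RE.findall(line[match.end():]):
        result[name] = int(value, 16)
    return result
```

Comparing the Get/Put pairs across successive lines is then a matter of looking at dictionary values rather than columns of text.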

Comment 33 Hans 2011-03-17 11:25:14 UTC

Hi,
I have the same problem (for about a month now) with an F14 workstation:

# uname -a
Linux foton 2.6.35.11-83.fc14.x86_64 #1 SMP Mon Feb 7 07:06:44 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa | grep nouveau
xorg-x11-drv-nouveau-0.0.16-11.20100826git065576d.fc14.x86_64

# lspci -v
04:00.0 VGA compatible controller: nVidia Corporation G80 [GeForce 8800 GTX] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device 8233
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at ea000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at e8000000 (64-bit, non-prefetchable) [size=32M]
        I/O ports at ac00 [size=128]
        Expansion ROM at ebfe0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: nouveau
        Kernel modules: nouveau, nvidiafb

The graphical system hangs once a day on average and it happens with different applications. Access over ssh is still possible and all non-graphical stuff seems to be unaffected.

End of dmesg:

[11085.373151] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020021c94 Put 0x0020021c98 IbGet 0x0000047d IbPut 0x0000049f State 0x8000a3d0 Push 0x00406040
[11085.373457] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1360 Data 0x00000000:0x00000001
[11085.373459] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373469] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1340 Data 0x00004001:0x00008006
[11085.373471] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373481] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1344 Data 0x00004001:0x00004001
[11085.373483] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373493] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1348 Data 0x00008006:0x00004001
[11085.373495] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373505] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x134c Data 0x00008006:0x00008006
[11085.373507] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373517] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1350 Data 0x00000000:0x00004001
[11085.373519] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373528] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1358 Data 0x00000000:0x00004001
[11085.373530] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373549] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x155c Data 0x00000000:0x00000000
[11085.373551] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373561] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1560 Data 0x00000000:0x40a63000
[11085.373563] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373572] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1564 Data 0x00000000:0x00000000
[11085.373574] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.373594] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x002006142c Put 0x0020062634 IbGet 0x00000482 IbPut 0x000004a7 State 0x80004214 Push 0x00406040
[11085.373692] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x020c Data 0x00042050:0x00042050
[11085.373695] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
[11085.373704] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x0210 Data 0x00000000:0x000a1240
[11085.373706] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1360 Data 0x00000000:0x00000001
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1340 Data 0x00004001:0x00008006
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1344 Data 0x00004001:0x00004001
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1348 Data 0x00008006:0x00004001
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x134c Data 0x00008006:0x00008006
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1350 Data 0x00000000:0x00004001
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x1358 Data 0x00000000:0x00004001
[11085.374002] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374103] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x5097 Mthd 0x155c Data 0x00000000:0x00000000
[11085.374106] [drm] nouveau 0000:04:00.0: PGRAPH_DATA_ERROR - unknown value 0x0000000d
[11085.374142] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020021cc4 Put 0x0020021cc8 IbGet 0x00000489 IbPut 0x000004af State 0x8000af05 Push 0x00406040
[11085.374297] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020050010 Put 0x0020051484 IbGet 0x0000048c IbPut 0x000004b3 State 0x80000020 Push 0x00406040
[11085.374315] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020051494 Put 0x0020051efc IbGet 0x0000048e IbPut 0x000004b3 State 0x40000004 Push 0x00406040
[11085.374331] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020051efc Put 0x0020052efc IbGet 0x00000490 IbPut 0x000004b3 State 0x80002054 Push 0x00406040
[11085.374348] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020060130 Put 0x0020060258 IbGet 0x00000494 IbPut 0x000004b3 State 0x8000b28c Push 0x00406040
[11085.374364] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020060258 Put 0x0020060384 IbGet 0x00000496 IbPut 0x000004b3 State 0x80002054 Push 0x00406040
[11085.374382] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020021d0c Put 0x0020021d10 IbGet 0x0000049b IbPut 0x000004b3 State 0x8000a684 Push 0x00406040
[11085.374399] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x00200605dc Put 0x0020060728 IbGet 0x0000049c IbPut 0x000004b3 State 0x80000000 Push 0x00406040
[11085.374416] [drm] nouveau 0000:04:00.0: PFIFO_DMA_PUSHER - Ch 2 Get 0x0020060728 Put 0x002006141c IbGet 0x0000049e IbPut 0x000004b3 State 0x80002054 Push 0x00406040
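
With this much output, a simple tally can make two hangs easier to compare. The following Python sketch is an illustration under assumptions, not an established tool: count_nouveau_errors is a made-up name, and it counts only lines of the form "nouveau <pci-address>: PFIFO_DMA_PUSHER - Ch ..." or "nouveau <pci-address>: PGRAPH_DATA_ERROR - Ch ...", so the follow-up reason lines (INVALID_VALUE, "unknown value ...") are not counted a second time.

```python
import re
from collections import Counter

# Matches only the lines that start a new error report (those with "- Ch").
ERROR_RE = re.compile(r"nouveau \S+ (PFIFO_DMA_PUSHER|PGRAPH_DATA_ERROR) - Ch ")


def count_nouveau_errors(lines):
    """Count PFIFO_DMA_PUSHER and PGRAPH_DATA_ERROR reports in log lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match is not None:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of strings, so the saved output of dmesg or an open /var/log/messages file object can be passed in directly.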

Comment 34 Bug Zapper 2011-06-01 12:50:16 UTC

This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 35 Bug Zapper 2011-06-29 12:58:21 UTC

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
