Red Hat Bugzilla – Bug 849440
Last modified: 2013-01-02 12:48:12 EST
Created attachment 605489 [details]
some logs that seem relevant
Description of problem:
while using fedora on my home computers, i see a lot of kernel crashes, and some of them take down the whole system.
the message on screen looks like this:
kernel BUG at drivers/gpu/drm/ttm/ttm_ko.c:16591
unfortunately, it is never saved to /var/log/messages, and the pictures i took last time with my camera are unclear, so my transcription may be wrong.
What is saved is the Xorg Tainted part of the problem. Most (all?) of those traces seem to be related to nouveau, and the problem seems to have gotten worse with the new 3.5.x kernels.
The messages look like this:
The latest week's log is 1.8 GB:
-rw-r--r--. 1 root root 1875742682 Aug 19 10:18 /var/log/messages-20120819
I've seen logs this large before on this machine. I've also seen another computer crashing (even at boot) after switching to linux 3.5.x.
Unlike on that computer, on this one i use qemu a lot, so that may trigger more crashes.
Both have NVIDIA video cards.
On the computer that crashes a lot:
# lspci -nn | grep -i VGA
02:00.0 VGA compatible controller : nVidia Corporation G73 [GeForce 7300 GT] [10de:0393] (rev a1)
The other one has an integrated 6150.
# cat /var/log/messages-20120819 | grep -i taint | wc -l
# cat /var/log/messages-20120819 | grep BUG | wc -l
The BUG line is a few lines above the Tainted line, in the picture. I wonder why so many Tainted lines are in the kernel and not that many BUG lines :)
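The two counts above can also be taken without the extra `cat` and `wc`, since `grep -c` prints the number of matching lines directly. A minimal sketch: the helper name `count_matches` is made up for illustration, and the log path is the one from this report.

```shell
# count_matches PATTERN FILE
# Print how many lines of FILE match PATTERN (case-sensitive).
# Hypothetical helper for illustration, not part of any package.
count_matches() {
    grep -c -- "$1" "$2"
}

# Example usage (run as root, the log is readable by root only):
#   count_matches '[Tt]aint' /var/log/messages-20120819
#   count_matches 'BUG'      /var/log/messages-20120819
```

Comparing the two numbers this way reads an almost 2 GB log once per pattern and does not pipe it through extra processes.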
Version-Release number of selected component (if applicable):
# uname -a
Linux guzu.dyndns.org 3.5.2-1.fc17.i686.PAE #1 SMP Wed Aug 15 16:30:14 UTC 2012 i686 i686 i386 GNU/Linux
Steps to Reproduce:
# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.5.2-1.fc17.i686.PAE root=UUID=18748fc0-abfd-403c-ad5c-1adf27e9d5e8 ro rd.md=0 rd.lvm=0 rd.dm=0 KEYTABLE=us quiet SYSFONT=latarcyrheb-sun16 rhgb rd.luks=0 LANG=en_US.UTF-8
Maybe these crashes point to defective hardware?
# rpm -qa | grep nouveau
There are indeed many problems here. Most of the traces aren't actually useful, because of the 'tainted' flags. They indicate that a problem already happened before that one.
The only 'not tainted' problem in those logs is the networking related oops, so let's focus on that one for now. I just reported it upstream.
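One way to pick the untainted reports out of a large log is to search for the `Not tainted` marker that the kernel prints in a trace header when no taint flag is set. This is only a rough sketch, assuming the traces in the log carry that marker; `untainted_traces` is a made-up name.

```shell
# untainted_traces FILE
# Print every line of FILE containing the kernel's "Not tainted" marker,
# prefixed with its line number so the full trace can be looked up.
# Hypothetical helper for illustration only.
untainted_traces() {
    grep -n -- 'Not tainted' "$1"
}

# Example usage (as root):
#   untainted_traces /var/log/messages-20120819
```

The line numbers it prints can then be used to open the log at the right place and read the surrounding trace.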
You might want to try running memtest86 for a while just to rule out some of the more obvious hardware problems.
Are you still seeing these problems with 3.5.5 or 3.6?
I do not have 3.5.5 nor 3.6, but i'll check if there's an update coming :)
Truth is, my latest four 'messages' files are at most 3.1 MB each, so i'd say either the problem is gone, or it's because i haven't played Starcraft lately.
$ ls -lh /var/log/messages*
-rw-------. 1 root root 848K Oct 8 21:14 /var/log/messages
-rw-------. 1 root root 566K Sep 16 09:28 /var/log/messages-20120916
-rw-------. 1 root root 3.0M Sep 23 08:15 /var/log/messages-20120923
-rw-------. 1 root root 2.3M Sep 30 09:12 /var/log/messages-20120930
-rw-------. 1 root root 3.1M Oct 7 08:33 /var/log/messages-20121007
i get a lot of crashes with 3.6 kernels, and every time i looked at the freeze screen, nouveau was there. unfortunately, nothing can be found in /var/log/messages.