Bug 195769 - Kernel BUG at include/linux/list.h:180
Summary: Kernel BUG at include/linux/list.h:180
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-06-17 15:39 UTC by David Highley
Modified: 2015-01-04 22:27 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-10-19 20:25:55 UTC
Type: ---
Embargoed:


Attachments
Part of /var/log/message file. (8.15 KB, text/plain)
2006-10-19 08:41 UTC, David Highley

Description David Highley 2006-06-17 15:39:35 UTC
Description of problem:
Kernel BUG at include/linux/list.h:180

Version-Release number of selected component (if applicable):
2.6.16-1.2133_FC5

How reproducible:
It has occurred 3 times in the last month, each time in the middle of the
night. The header list.h has only 167 lines, so line 180 does not exist in it.

Steps to Reproduce:
1. Not sure, but it may be related to mythtv recordings and/or how full the disk is.
  
Actual results:
Jun 17 01:06:33 redwood kernel: List corruption. prev->next should be
ffff810035c8d000, but was ffff8100171a1814
Jun 17 01:06:33 redwood kernel: ----------- [cut here ] --------- [please bite
here ] ---------
Jun 17 01:06:33 redwood kernel: Kernel BUG at include/linux/list.h:180
Jun 17 01:06:33 redwood kernel: invalid opcode: 0000 [1] SMP
Jun 17 01:06:33 redwood kernel: last sysfs file:
/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Jun 17 01:06:33 redwood kernel: CPU 0
Jun 17 01:06:33 redwood kernel: Modules linked in: nfs lockd nfs_acl autofs4
sunrpc video button battery ac sg ipv6 lp parport_pc parport floppy nvram usblp
ehci_hcd usb_storage ohci_hcd snd_emu10k1_synth snd_emux_synth snd_seq_virmidi
snd_seq_midi_emul snd_emu10k1 bt878 tuner snd_intel8x0 snd_rawmidi bttv
snd_ac97_codec snd_ac97_bus snd_seq_dummy video_buf snd_seq_oss nvidia(U)
compat_ioctl32 i2c_algo_bit snd_seq_midi_event snd_seq snd_pcm_oss v4l2_common
btcx_risc ir_common tveeprom videodev emu10k1_gp gameport snd_mixer_oss
i2c_nforce2 i2c_core snd_pcm snd_seq_device snd_util_mem snd_hwdep forcedeth
snd_timer snd soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3
jbd sata_nv libata sd_mod scsi_mod
Jun 17 01:06:33 redwood kernel: Pid: 202, comm: kswapd0 Tainted: P      2.6.16-1.2133_FC5 #1
Jun 17 01:06:33 redwood kernel: RIP: 0010:[<ffffffff8017c39d>]
<ffffffff8017c39d>{free_block+135}
Jun 17 01:06:33 redwood kernel: RSP: 0018:ffff81000235fb68  EFLAGS: 00010086
Jun 17 01:06:33 redwood kernel: RAX: 0000000000000054 RBX: ffff810035c8d000 RCX:
000000000000aa52
Jun 17 01:06:33 redwood kernel: RDX: 0000000000000000 RSI: 0000000000000096 RDI:
ffffffff803c89e0
Jun 17 01:06:33 redwood kernel: RBP: ffff810037ff0f40 R08: ffffffff803c89f8 R09:
ffff81003ea87480
Jun 17 01:06:33 redwood kernel: R10: 0000000000000010 R11: 0000000000000000 R12:
ffff810035c8dad8
Jun 17 01:06:34 redwood kernel: R13: 0000000000000000 R14: ffff81003ffd9400 R15:
ffff81003ffdacd0
Jun 17 01:06:34 redwood kernel: FS:  0000000046e0c940(0000)
GS:ffffffff80514000(0000) knlGS:00000000f7fd68e0
Jun 17 01:06:34 redwood kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jun 17 01:06:34 redwood kernel: CR2: 00002aaab0c8a000 CR3: 000000002e2b0000 CR4:
00000000000006e0
Jun 17 01:06:34 redwood kernel: Process kswapd0 (pid: 202, threadinfo
ffff81000235e000, task ffff810002241820)
Jun 17 01:06:34 redwood kernel: Stack: ffff8100032ac090 000000000000002f
000000150000003c ffff810037ff0f40
Jun 17 01:06:34 redwood kernel:        000000000000003c ffff81003ffdac00
0000000000000000 ffff81003ffd9400
Jun 17 01:06:34 redwood kernel:        ffff810037ff0f90 ffffffff8017c0e9

Expected results:


Additional info:
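The "List corruption" line in the trace above appears to come from a sanity
check made when an entry is unlinked from a doubly linked list: before the
entry is removed, both neighbours are checked to see that they still point
back at it. A minimal Python model of that kind of check follows. It is only
a sketch to illustrate the idea; Node, list_add, list_del and ListCorruption
are illustrative names and not the kernel's code.

```python
class ListCorruption(Exception):
    """Raised when a neighbour no longer points back at the entry."""


class Node:
    """One entry in a circular doubly linked list (illustrative model)."""

    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self


def list_add(new, head):
    """Insert `new` directly after `head`."""
    nxt = head.next
    new.prev = head
    new.next = nxt
    head.next = new
    nxt.prev = new


def list_del(entry):
    """Unlink `entry`, first verifying that both neighbours point back at it."""
    if entry.prev.next is not entry:
        raise ListCorruption(
            "prev->next should be %s, but was %s"
            % (entry.name, entry.prev.next.name)
        )
    if entry.next.prev is not entry:
        raise ListCorruption(
            "next->prev should be %s, but was %s"
            % (entry.name, entry.next.prev.name)
        )
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.prev = entry.next = None


def demo():
    """Build head -> c -> a, overwrite c.next, then try to delete a."""
    head, a, c = Node("head"), Node("a"), Node("c")
    list_add(a, head)
    list_add(c, head)
    c.next = Node("stray")  # simulate a stray write into a neighbour
    try:
        list_del(a)
    except ListCorruption as err:
        return str(err)
    return None
```

In this model the check fires on the first delete that touches the damaged
neighbour, which may be long after the stray write itself happened.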

Comment 1 Dave Jones 2006-06-26 14:41:16 UTC

*** This bug has been marked as a duplicate of 73733 ***

Comment 2 David Highley 2006-07-03 17:32:24 UTC
Since I removed the nvidia kernel module and upgraded the kernel to
2.6.17-1.2139_FC5, we have a different issue. Here is the latest kernel trace.

Jul  2 04:03:12 redwood kernel: List corruption. next->prev should be
ffff81000cb10000, but was 0556401754cc3938
Jul  2 04:03:12 redwood kernel: ----------- [cut here ] --------- [please bite
here ] ---------
Jul  2 04:03:12 redwood kernel: Kernel BUG at include/linux/list.h:185
Jul  2 04:03:12 redwood kernel: invalid opcode: 0000 [1] SMP
Jul  2 04:03:12 redwood kernel: last sysfs file:
/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Jul  2 04:03:12 redwood kernel: CPU 0
Jul  2 04:03:12 redwood kernel: Modules linked in: nfs lockd nfs_acl ipv6
autofs4 sunrpc sg video button battery acpi_memhotplug ac lp parport_pc parport
usblp
usb_storage ohci_hcd ehci_hcd floppy snd_intel8x0 snd_emu10k1_synth bt878
snd_emux_synth tuner snd_seq_virmidi snd_seq_midi_emul bttv snd_emu10k1
video_buf ir_common compat_ioctl32 emu10k1_gp gameport i2c_algo_bit v4l2_common
btcx_risc tveeprom videodev snd_rawmidi snd_ac97_codec snd_ac97_bus snd_util_mem
snd_seq_dummy snd_hwdep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss snd_mixer_oss snd_pcm forcedeth i2c_nforce2 snd_timer snd soundcore
snd_page_alloc
i2c_core dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd sata_nv libata sd_mod
scsi_mod
Jul  2 04:03:12 redwood kernel: Pid: 31471, comm: beagle-build-in Not tainted
2.6.17-1.2139_FC5 #1
Jul  2 04:03:12 redwood kernel: RIP: 0010:[<ffffffff80261fd7>]
<ffffffff80261fd7>{cache_alloc_refill+337}
Jul  2 04:03:12 redwood kernel: RSP: 0018:ffff81000dff3b08  EFLAGS: 00010086
Jul  2 04:03:12 redwood kernel: RAX: 0000000000000054 RBX: 0000000000000027 RCX:
ffffffff80548a98
Jul  2 04:03:12 redwood kernel: RDX: 0000000000000000 RSI: 0000000000000096 RDI:
ffffffff80548a80
Jul  2 04:03:12 redwood kernel: RBP: ffff81000cb10000 R08: ffffffff80548a98 R09:
ffff81000dff3858
Jul  2 04:03:12 redwood kernel: R10: 0000000000000010 R11: 0000000000000000 R12:
ffff81003efc6e40
Jul  2 04:03:12 redwood kernel: R13: ffff8100011bd000 R14: ffff81003efc6e50 R15:
0000000000000015
Jul  2 04:03:12 redwood kernel: FS:  0000000040485940(0063)
GS:ffffffff8069c000(0000) knlGS:00000000f7fd68e0
Jul  2 04:03:12 redwood kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul  2 04:03:12 redwood kernel: CR2: 00002aaaaaaac000 CR3: 00000000366fa000 CR4:
00000000000006e0
Jul  2 04:03:12 redwood kernel: Process beagle-build-in (pid: 31471, threadinfo
ffff81000dff2000, task ffff81003105e100)
Jul  2 04:03:12 redwood kernel: Stack: 000000d03e0ab4e0 ffff81003efc9500
ffff81003efc6e80 00000000000000d0
Jul  2 04:03:12 redwood kernel:        ffff81003efc9500 0000000000000246
ffff810019221830 ffff81003e04f400
Jul  2 04:03:12 redwood kernel:        ffff810001170a28 ffffffff8020a35f
Jul  2 04:03:12 redwood kernel: Call Trace: <ffffffff8020a35f>{kmem_cache_alloc+127}
Jul  2 04:03:12 redwood kernel:       
<ffffffff80313c50>{selinux_inode_alloc_security+44}
Jul  2 04:03:12 redwood kernel:        <ffffffff8022692a>{alloc_inode+257}
<ffffffff8022426c>{iget_locked+114}
Jul  2 04:03:12 redwood kernel:        <ffffffff8809077c>{:ext3:ext3_lookup+81}
<ffffffff8020ca3f>{do_lookup+206}
Jul  2 04:03:12 redwood kernel:        <ffffffff802098b1>{__link_path_walk+2590}
<ffffffff8020e6a7>{link_path_walk+92}
Jul  2 04:03:12 redwood kernel:       
<ffffffff803170af>{selinux_inode_getattr+80} <ffffffff8020c7fe>{do_path_lookup+633}
Jul  2 04:03:12 redwood kernel:        <ffffffff802249c9>{__user_walk_fd+55}
<ffffffff8022988c>{vfs_stat_fd+27}
Jul  2 04:03:12 redwood kernel:        <ffffffff802246b3>{sys_newstat+25}
<ffffffff80262d8e>{system_call+126}
Jul  2 04:03:12 redwood kernel:
Jul  2 04:03:12 redwood kernel: Code: 0f 0b 68 73 8e 47 80 c2 b9 00 48 8b 55 00
48 8b 45 08 48 89
Jul  2 04:03:12 redwood kernel: RIP <ffffffff80261fd7>{cache_alloc_refill+337}
RSP <ffff81000dff3b08>
Jul  2 04:03:12 redwood kernel:  <3>BUG: sleeping function called from invalid
context at include/linux/rwsem.h:43
Jul  2 04:03:12 redwood kernel: in_atomic():0, irqs_disabled():1
Jul  2 04:03:12 redwood kernel:
Jul  2 04:03:12 redwood kernel: Call Trace:
<ffffffff80299bc2>{blocking_notifier_call_chain+31}
Jul  2 04:03:12 redwood kernel:        <ffffffff80215c97>{do_exit+32}
<ffffffff80270b57>{kernel_math_error+0}
Jul  2 04:03:12 redwood kernel:        <ffffffff802710f4>{do_invalid_op+173}
<ffffffff80261fd7>{cache_alloc_refill+337}
Jul  2 04:03:12 redwood kernel:        <ffffffff8021b4f2>{bad_range+16}
<ffffffff8020a13f>{get_page_from_freelist+841}
Jul  2 04:03:12 redwood kernel:        <ffffffff80290991>{printk+82}
<ffffffff80263c65>{error_exit+0}
Jul  2 04:03:12 redwood kernel:       
<ffffffff80261fd7>{cache_alloc_refill+337}
<ffffffff80261fd7>{cache_alloc_refill+337}
Jul  2 04:03:12 redwood kernel:        <ffffffff8020a35f>{kmem_cache_alloc+127}
<ffffffff80313c50>{selinux_inode_alloc_security+44}
Jul  2 04:03:12 redwood kernel:        <ffffffff8022692a>{alloc_inode+257}
<ffffffff8022426c>{iget_locked+114}
Jul  2 04:03:12 redwood kernel:        <ffffffff8809077c>{:ext3:ext3_lookup+81}
<ffffffff8020ca3f>{do_lookup+206}
Jul  2 04:03:12 redwood kernel:        <ffffffff802098b1>{__link_path_walk+2590}
<ffffffff8020e6a7>{link_path_walk+92}
Jul  2 04:03:12 redwood kernel:       
<ffffffff803170af>{selinux_inode_getattr+80} <ffffffff8020c7fe>{do_path_lookup+633}
Jul  2 04:03:12 redwood kernel:        <ffffffff802249c9>{__user_walk_fd+55}
<ffffffff8022988c>{vfs_stat_fd+27}
Jul  2 04:03:12 redwood kernel:        <ffffffff802246b3>{sys_newstat+25}
<ffffffff80262d8e>{system_call+126}
Jul  2 04:03:12 redwood kernel: BUG: beagle-build-in/31471, lock held at task
exit time!
Jul  2 04:03:12 redwood kernel:  [ffff810019221830] {inode_init_once}
Jul  2 04:03:12 redwood kernel: .. held by:   beagle-build-in:31471
[ffff81003105e100, 134]
Jul  2 04:03:12 redwood kernel: ... acquired at:               do_lookup+0x8b/0x188



Comment 3 Orion Poplawski 2006-07-24 15:53:41 UTC
Also seen on our FC4 webserver with 2.6.17-1.2141_FC4smp

Jul 22 05:09:32 hawk kernel: kernel BUG at include/linux/list.h:180!
Jul 22 05:09:32 hawk kernel: invalid opcode: 0000 [#1]
Jul 22 05:09:32 hawk kernel: SMP
Jul 22 05:09:32 hawk kernel: last sysfs file: /class/vc/vcsa5/dev
Jul 22 05:09:32 hawk kernel: Modules linked in: ipv6 autofs4 nfs lockd nfs_acl sunrpc ip_conntrack_ftp ipt_REJECT ipt_LOG xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables uhci_hcd i2c_piix4 i2c_core e100 mii floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod aic7xxx scsi_transport_spi sd_mod scsi_mod
Jul 22 05:09:32 hawk kernel: CPU:    1
Jul 22 05:09:32 hawk kernel: EIP:    0060:[<c0462f0a>]    Not tainted VLI
Jul 22 05:09:32 hawk kernel: EFLAGS: 00010092   (2.6.17-1.2141_FC4smp #1)
Jul 22 05:09:32 hawk kernel: EIP is at free_block+0x69/0x189
Jul 22 05:09:32 hawk kernel: eax: 00000044   ebx: e04a9000   ecx: 00000046   edx: 00000006
Jul 22 05:09:32 hawk kernel: esi: e04a9980   edi: effe11e0   ebp: 00000000   esp: effeeef4
Jul 22 05:09:32 hawk kernel: ds: 007b   es: 007b   ss: 0068
Jul 22 05:09:32 hawk kernel: Process events/1 (pid: 9, threadinfo=effee000 task=c18a0050)
Jul 22 05:09:32 hawk kernel: Stack: c0615a58 e04a9000 e04e9000 efe116a0 0000001e c1872d00 00000005 efe0f034
Jul 22 05:09:32 hawk kernel:        efe0f020 0000001e efe0f000 00000000 c04630af 00000000 00000000 c1872d00
Jul 22 05:09:32 hawk kernel:        effe1204 c1792a60 effe11e0 effe1170 c1872d00 c046315c 00000000 00000000
Jul 22 05:09:32 hawk kernel: Call Trace:
Jul 22 05:09:32 hawk kernel:  <c04630af> drain_array+0x85/0xb1  <c046315c> cache_reap+0x81/0x1c0
Jul 22 05:09:32 hawk kernel:  <c04310ef> run_workqueue+0x77/0xb4  <c04630db> cache_reap+0x0/0x1c0
Jul 22 05:09:32 hawk kernel:  <c04311b9> worker_thread+0x0/0x106  <c043128e> worker_thread+0xd5/0x106
Jul 22 05:09:32 hawk kernel:  <c041ee2d> default_wake_function+0x0/0xc  <c0434001> kthread+0x9d/0xc9
Jul 22 05:09:32 hawk kernel:  <c0433f64> kthread+0x0/0xc9  <c0402005> kernel_thread_helper+0x5/0xb
Jul 22 05:09:32 hawk kernel: Code: 03 8b 52 0c 8b 5a 24 8b 4c 24 08 8b 54 24 28 8b 43 04 8b bc 91 90 00 00 00 8b 00 39 d8 74 17 50 53 68 58 5a 61 c0 e8 e4 13 fc ff <0f> 0b b4 00 8e 5a 61 c0 83 c4 0c 8b 03 8b 40 04 39 d8 74 17 50
Jul 22 05:09:32 hawk kernel: EIP: [<c0462f0a>] free_block+0x69/0x189 SS:ESP 0068:effeeef4
Jul 22 05:09:32 hawk kernel:  <3>BUG: sleeping function called from invalid context at include/linux/rwsem.h:43
Jul 22 05:09:32 hawk kernel: in_atomic():0, irqs_disabled():1
Jul 22 05:09:32 hawk kernel:  <c042f3ce> blocking_notifier_call_chain+0x18/0x4b  <c0425a44> do_exit+0x1c/0x78d
Jul 22 05:09:32 hawk kernel:  <c040541e> die+0x25b/0x263  <c04056e3> do_invalid_op+0x0/0x9d
Jul 22 05:09:32 hawk kernel:  <c0405774> do_invalid_op+0x91/0x9d  <c0462f0a> free_block+0x69/0x189
Jul 22 05:09:32 hawk kernel:  <c04242ca> vprintk+0x2a5/0x2c9  <c04048cb> error_code+0x4f/0x54
Jul 22 05:09:32 hawk kernel:  <c0462f0a> free_block+0x69/0x189  <c04630af> drain_array+0x85/0xb1
Jul 22 05:09:32 hawk kernel:  <c046315c> cache_reap+0x81/0x1c0  <c04310ef> run_workqueue+0x77/0xb4
Jul 22 05:09:32 hawk kernel:  <c04630db> cache_reap+0x0/0x1c0  <c04311b9> worker_thread+0x0/0x106
Jul 22 05:09:32 hawk kernel:  <c043128e> worker_thread+0xd5/0x106  <c041ee2d> default_wake_function+0x0/0xc
Jul 22 05:09:32 hawk kernel:  <c0434001> kthread+0x9d/0xc9  <c0433f64> kthread+0x0/0xc9
Jul 22 05:09:32 hawk kernel:  <c0402005> kernel_thread_helper+0x5/0xb
Jul 22 05:09:32 hawk kernel: BUG: events/1/9, lock held at task exit time!
Jul 22 05:09:32 hawk kernel:  [c06ff200] {cache_chain_mutex}
Jul 22 05:09:32 hawk kernel: .. held by:          events/1:    9 [c18a0050, 110]
Jul 22 05:09:32 hawk kernel: ... acquired at:               cache_reap+0x11/0x1c0


Comment 4 David Highley 2006-08-01 03:47:25 UTC
Since the kernel trace was removed I have had two kernel faults in two days.
Both seemed to involve cpu scaling.
Linux redwood 2.6.17-1.2157_FC5 #1 SMP Tue Jul 11 22:53:56 EDT 2006 x86_64
x86_64 x86_64 GNU/Linux

Jul 30 16:45:59 redwood kernel: List corruption. prev->next should be
ffff810033e1b000, but was ffff810098aaa6af
Jul 30 16:45:59 redwood kernel: ----------- [cut here ] --------- [please bite
here ] ---------
Jul 30 16:45:59 redwood kernel: Kernel BUG at include/linux/list.h:180
Jul 30 16:45:59 redwood kernel: invalid opcode: 0000 [1] SMP
Jul 30 16:45:59 redwood kernel: last sysfs file:
/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Jul 30 16:45:59 redwood kernel: CPU 0
Jul 30 16:45:59 redwood kernel: Modules linked in: nfs lockd nfs_acl autofs4
sunrpc video button battery acpi_memhotplug ac sg ipv6 lp parport_pc parport
usb_storage usblp snd_intel8x0 snd_emu10k1_synth snd_emux_synth snd_seq_virmidi
snd_seq_midi_emul bt878 ehci_hcd tuner bttv video_buf ir_common compat_ioctl32
i2c_algo_bit snd_emu10k1 ohci_hcd snd_rawmidi emu10k1_gp gameport v4l2_common
btcx_risc tveeprom snd_ac97_codec snd_ac97_bus snd_util_mem floppy videodev
snd_hwdep nvidia(U) snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_nforce2 i2c_core snd_timer
forcedeth snd soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3
jbd sata_nv libata
sd_mod scsi_mod
Jul 30 16:45:59 redwood kernel: Pid: 207, comm: kswapd0 Tainted: P     
2.6.17-1.2157_FC5 #1
Jul 30 16:45:59 redwood kernel: RIP: 0010:[<ffffffff802c6622>]
<ffffffff802c6622>{free_block+159}
Jul 30 16:45:59 redwood kernel: RSP: 0018:ffff81003ed35b38  EFLAGS: 00010086
Jul 30 16:45:59 redwood kernel: RAX: 0000000000000054 RBX: ffff810033e1b000 RCX:
ffffffff80549a98
Jul 30 16:45:59 redwood kernel: RDX: 0000000000000000 RSI: 0000000000000096 RDI:
ffffffff80549a80
Jul 30 16:45:59 redwood kernel: RBP: ffff810037ff5ec0 R08: ffffffff80549a98 R09:
ffff81003ed35888
Jul 30 16:45:59 redwood kernel: R10: 0000000000000010 R11: 0000000000000000 R12:
ffff810033e1b4f0
Jul 30 16:45:59 redwood kernel: R13: 0000000000000000 R14: ffff81003f1c1440 R15:
ffff8100011bd968
Jul 30 16:45:59 redwood kernel: FS:  0000000040a00940(0000)
GS:ffffffff8069d000(0000) knlGS:00000000f7fd68e0
Jul 30 16:45:59 redwood kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b



Jul 31 20:00:44 redwood kernel: List corruption. next->prev should be
ffff810033f37000, but was 00e0084156075805
Jul 31 20:00:44 redwood kernel: ----------- [cut here ] --------- [please bite
here ] ---------
Jul 31 20:00:44 redwood kernel: Kernel BUG at include/linux/list.h:185
Jul 31 20:00:44 redwood kernel: invalid opcode: 0000 [1] SMP
Jul 31 20:00:44 redwood kernel: last sysfs file:
/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Jul 31 20:00:44 redwood kernel: CPU 0
Jul 31 20:00:44 redwood kernel: Modules linked in: nfs lockd nfs_acl autofs4
sunrpc video button battery acpi_memhotplug ac sg ipv6 lp parport_pc parport usblp
usb_storage snd_intel8x0 snd_emu10k1_synth snd_emux_synth snd_seq_virmidi
snd_seq_midi_emul snd_emu10k1 ohci_hcd bt878 ehci_hcd tuner emu10k1_gp gameport
snd_rawmidi snd_ac97_codec snd_ac97_bus snd_util_mem bttv video_buf ir_common
compat_ioctl32 nvidia(U) snd_hwdep i2c_algo_bit v4l2_common floppy snd_seq_dummy
btcx_risc tveeprom videodev snd_seq_oss snd_seq_midi_event snd_seq
snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm forcedeth snd_timer snd
soundcore i2c_nforce2 i2c_core snd_page_alloc dm_snapshot dm_zero dm_mirror
dm_mod ext3 jbd sata_nv libata
sd_mod scsi_mod
Jul 31 20:00:44 redwood kernel: Pid: 5, comm: events/0 Tainted: P     
2.6.17-1.2157_FC5 #1
Jul 31 20:00:44 redwood kernel: RIP: 0010:[<ffffffff802c6649>]
<ffffffff802c6649>{free_block+198}
Jul 31 20:00:44 redwood kernel: RSP: 0000:ffff810037e17d48  EFLAGS: 00010086
Jul 31 20:00:44 redwood kernel: RAX: 0000000000000054 RBX: ffff810033f37000 RCX:
ffffffff80549a98
Jul 31 20:00:44 redwood kernel: RDX: 0000000000000000 RSI: 0000000000000092 RDI:
ffffffff80549a80
Jul 31 20:00:44 redwood kernel: RBP: ffff81003f1c6c40 R08: ffffffff80549a98 R09:
ffff810037e17a98
Jul 31 20:00:44 redwood kernel: R10: 0000000000000010 R11: 0000000000000000 R12:
ffff810033f376d0
Jul 31 20:00:44 redwood kernel: R13: ffff81003f1c6c80 R14: ffff81003f1d10c0 R15:
ffff81003f1d21b0
Jul 31 20:00:44 redwood kernel: FS:  0000000048c0d940(0000)
GS:ffffffff8069d000(0000) knlGS:0000000000000000
Jul 31 20:00:44 redwood kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jul 31 20:00:44 redwood kernel: CR2: 00002aaaaaaac000 CR3: 000000001fb83000 CR4:
00000000000006e0
Jul 31 20:00:44 redwood kernel: Process events/0 (pid: 5, threadinfo
ffff810037e16000, task ffff810037fef820)
Jul 31 20:00:44 redwood kernel: Stack: ffffffffffffffff 0000000037fef820
0000003100000060 ffff81003f1d2028
Jul 31 20:00:45 redwood kernel:        0000000000000060 ffff81003f1d2000
ffff81003f1c6c80 0000000000000000
Jul 31 20:00:45 redwood kernel:        ffff81003f1d10c0 ffffffff802c6813
Jul 31 20:00:45 redwood kernel: Call Trace: <ffffffff802c6813>{drain_array+139}
<ffffffff802c801d>{cache_reap+250}
Jul 31 20:00:45 redwood kernel:        <ffffffff802c7f23>{cache_reap+0}
<ffffffff80250f16>{run_workqueue+159}
Jul 31 20:00:45 redwood kernel:        <ffffffff8024d69e>{worker_thread+0}
<ffffffff8024d78e>{worker_thread+240}
Jul 31 20:00:45 redwood kernel:       
<ffffffff8028b810>{default_wake_function+0} <ffffffff80234d6c>{kthread+246}
Jul 31 20:00:45 redwood kernel:        <ffffffff80263ade>{child_rip+8}
<ffffffff80234c76>{kthread+0}
Jul 31 20:00:45 redwood kernel:        <ffffffff80263ad6>{child_rip+0}
Jul 31 20:00:45 redwood kernel:
Jul 31 20:00:45 redwood kernel: Code: 0f 0b 68 73 9e 47 80 c2 b9 00 48 8b 13 48
8b 43 08 48 89 42
Jul 31 20:00:45 redwood kernel: RIP <ffffffff802c6649>{free_block+198} RSP
<ffff810037e17d48>
Jul 31 20:00:45 redwood kernel:  <3>BUG: sleeping function called from invalid
context at include/linux/rwsem.h:43
Jul 31 20:00:45 redwood kernel: in_atomic():0, irqs_disabled():1
Jul 31 20:00:45 redwood kernel:
Jul 31 20:00:45 redwood kernel: Call Trace:
<ffffffff802998a6>{blocking_notifier_call_chain+31}
Jul 31 20:00:45 redwood kernel:        <ffffffff80215c43>{do_exit+32}
<ffffffff80270817>{kernel_math_error+0}
Jul 31 20:00:45 redwood kernel:        <ffffffff80270db4>{do_invalid_op+173}
<ffffffff802c6649>{free_block+198}
Jul 31 20:00:45 redwood kernel:        <ffffffff80290675>{printk+82}
<ffffffff80263925>{error_exit+0}
Jul 31 20:00:45 redwood kernel:        <ffffffff802c6649>{free_block+198}
<ffffffff802c6649>{free_block+198}
Jul 31 20:00:45 redwood kernel:        <ffffffff802c6813>{drain_array+139}
<ffffffff802c801d>{cache_reap+250}
Jul 31 20:00:45 redwood kernel:        <ffffffff802c7f23>{cache_reap+0}
<ffffffff80250f16>{run_workqueue+159}
Jul 31 20:00:45 redwood kernel:        <ffffffff8024d69e>{worker_thread+0}
<ffffffff8024d78e>{worker_thread+240}
Jul 31 20:00:45 redwood kernel:       
<ffffffff8028b810>{default_wake_function+0} <ffffffff80234d6c>{kthread+246}
Jul 31 20:00:45 redwood kernel:        <ffffffff80263ade>{child_rip+8}
<ffffffff80234c76>{kthread+0}
Jul 31 20:00:45 redwood kernel:        <ffffffff80263ad6>{child_rip+0}
Jul 31 20:00:45 redwood kernel: BUG: events/0/5, lock held at task exit time!
Jul 31 20:00:45 redwood kernel:  [ffffffff80551a80] {cache_chain_mutex}
Jul 31 20:00:45 redwood kernel: .. held by:          events/0:    5
[ffff810037fef820, 110]
Jul 31 20:00:45 redwood kernel: ... acquired at:               cache_reap+0x26/0x2fd
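
For anyone comparing the "List corruption" messages from redwood, a small
sketch that pulls the expected and observed pointer values out of such a line
and reports two rough properties of each. It assumes x86_64 with 48-bit
canonical addresses and that a valid list pointer would normally be 8-byte
aligned; both are assumptions of this sketch, and the function names are
illustrative. The wrapped syslog line must be joined into one line first.

```python
import re

# Matches a joined "List corruption" line as printed in the traces above.
CORRUPTION_RE = re.compile(
    r"List corruption\. (prev->next|next->prev) should be "
    r"([0-9a-f]{16}), but was ([0-9a-f]{16})"
)


def describe_pointer(value):
    """Classify a 64-bit value by address half and 8-byte alignment."""
    top17 = value >> 47  # bits 63..47 must be all equal for a canonical address
    if top17 == 0x1FFFF:
        half = "kernel half"
    elif top17 == 0:
        half = "user half"
    else:
        half = "non-canonical"
    alignment = "8-byte aligned" if value % 8 == 0 else "unaligned"
    return half, alignment


def parse_corruption(line):
    """Return (link, expected_description, found_description) or None."""
    match = CORRUPTION_RE.search(line)
    if match is None:
        return None
    link, expected, found = match.groups()
    return (
        link,
        describe_pointer(int(expected, 16)),
        describe_pointer(int(found, 16)),
    )


if __name__ == "__main__":
    samples = [
        "List corruption. prev->next should be ffff810035c8d000, but was ffff8100171a1814",
        "List corruption. next->prev should be ffff81000cb10000, but was 0556401754cc3938",
        "List corruption. prev->next should be ffff810033e1b000, but was ffff810098aaa6af",
        "List corruption. next->prev should be ffff810033f37000, but was 00e0084156075805",
    ]
    for sample in samples:
        print(parse_corruption(sample))
```

A value that is unaligned or non-canonical cannot be a valid list pointer
under these assumptions, which would point to an overwrite rather than a
stale but otherwise valid link.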


Comment 5 David Highley 2006-08-01 03:53:41 UTC
Additional notes. The motherboard is a Gigabyte GA-K8NS (K8 Triton) and the
processor is an AMD Athlon ADA3000AEP4AX. I'm running two other systems with the
same board and processor, but with Fedora Core 4, which do not have the issue.
Linux spruce 2.6.17-1.2142_FC4 #1 Tue Jul 11 22:41:06 EDT 2006 x86_64 x86_64
x86_64 GNU/Linux

Comment 6 Dave Jones 2006-08-02 19:32:20 UTC
The traces with the nvidia module loaded are pretty uninteresting, as it's not
been unknown for that to cause memory corruption issues in the past. However,
they are similar to the traces in comments #2 & #3.

I assume the affected boxes pass a memtest86 run?

It's unusual because if that list gets corrupted, we notice bad things happening
very quickly, and it only seems to be affecting you and Orion, and bugs here
would tend to affect a lot more people.

Something else that may help isolate this would be to try running without some
of the modules loaded.  Knowing, e.g., "It only happens if bttv has been
loaded" would be a useful datapoint.

But first of all, please rule out the nvidia module from any further traces.
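
One way to get that kind of datapoint from traces already in hand is to
compare the "Modules linked in:" lists of the crashing systems. A rough
sketch, assuming each list has been joined onto a single line and cut off
before the next syslog line; modules_in is an illustrative helper, and the two
sample lists are abbreviated from the redwood and hawk traces in this report.

```python
MARKER = "Modules linked in:"


def modules_in(module_line):
    """Return the set of module names that follow 'Modules linked in:'.

    Annotations such as the '(U)' in 'nvidia(U)' are stripped.
    """
    _, found, rest = module_line.partition(MARKER)
    if not found:
        return set()
    return {name.split("(")[0] for name in rest.split()}


# Abbreviated module lists; the real lists in the traces are much longer.
REDWOOD = (
    "Modules linked in: nfs lockd bttv nvidia(U) forcedeth "
    "ext3 jbd sata_nv sd_mod scsi_mod"
)
HAWK = (
    "Modules linked in: ipv6 nfs lockd e100 ext3 jbd aic7xxx "
    "sd_mod scsi_mod"
)

if __name__ == "__main__":
    redwood, hawk = modules_in(REDWOOD), modules_in(HAWK)
    print("loaded on both:", sorted(redwood & hawk))
    print("only on redwood:", sorted(redwood - hawk))
    print("only on hawk:", sorted(hawk - redwood))
```

A module that is loaded on only one of the crashing machines is a weaker
suspect than one loaded on both, though this comparison alone proves nothing.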

Comment 7 David Highley 2006-08-03 01:50:54 UTC
After the first posting I removed the nvidia module and ran until the system
crashed again, which ruled out the nvidia driver module. I have not run memtest86
but will see if I can figure out a way to do it. The system does not have a
floppy drive, and we have been trying to create bootable thumb drives at work,
which seems to be somewhat of a black art.

Comment 8 Orion Poplawski 2006-08-03 02:53:18 UTC
yum install memtest86+

This should install /boot/memtest86+-1.65.  Then add to /etc/grub.conf:

title Memtest86+
        root (hd0,0)
        kernel /memtest86+-1.65

(use the root and kernel prefix from your existing kernel entries)

then reboot and select Memtest86+.  Voila!

Comment 9 David Highley 2006-08-03 05:26:19 UTC
They had an ISO image, so I burned a CD and ran memory tests for 2.5 hours (6
passes) with no failures. That gave me time to help my son with his Winders virus
clean up :-) I have disabled the cpuspeed service, since the crash lists
cpufreq/scaling_setspeed as the last sysfs file. Since the crash is so
infrequent, it may take a few weeks before I know if I have found the problem.

Comment 10 Dave Jones 2006-08-11 06:20:43 UTC
Note, rmmod nvidia is not enough to ensure an untainted kernel.
It must never have been loaded during boot.
By the time you get to rmmod it, it may already have corrupted kernel text.
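
Whether a given trace is tainted can be read from its "Pid:" or "EIP:" header
line, which carries either "Tainted: <flags>" or "Not tainted". A small
sketch for checking saved traces before posting them; taint_flags is an
illustrative helper name, and the regular expression only covers the header
formats seen in this report.

```python
import re

# "Tainted: P" carries flag letters; "Not tainted" carries none.
TAINT_RE = re.compile(r"Not tainted|Tainted: (\S+)")


def taint_flags(oops_line):
    """Return the taint flags from an oops header line.

    Returns '' for 'Not tainted' and None if the line has no taint field.
    """
    match = TAINT_RE.search(oops_line)
    if match is None:
        return None
    return match.group(1) or ""


if __name__ == "__main__":
    lines = [
        "Pid: 202, comm: kswapd0 Tainted: P      2.6.16-1.2133_FC5 #1",
        "Pid: 31471, comm: beagle-build-in Not tainted 2.6.17-1.2139_FC5 #1",
        "EIP:    0060:[<c0462f0a>]    Not tainted VLI",
    ]
    for line in lines:
        print(repr(taint_flags(line)), "<-", line)
```

Only traces for which this returns an empty string are untainted in the sense
asked for here.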


Comment 11 David Highley 2006-08-11 16:52:11 UTC
I did reboot the system and went through a couple of kernel updates. So the
issue I'm experiencing is not related to the nvidia module.

Comment 12 David Highley 2006-08-22 02:30:59 UTC
I think turning off the cpuspeed service has stopped the issue. There has been a
lengthy discussion on the mailing list:
http://www.ivtvdriver.org/pipermail/ivtv-users/

With the subject: The ivtv DMA error

It appears that others are experiencing related issues on AMD systems with
the powernow-k8 and cpuspeed functions.

Comment 13 Dave Jones 2006-10-16 21:31:34 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 14 David Highley 2006-10-19 08:41:00 UTC
Created attachment 138867
Part of /var/log/message file.

Comment 15 David Highley 2006-10-19 08:50:26 UTC
Damn, I dislike unfriendly modern applications. I entered more comments before
doing the attachment, which were discarded! To re-explain, here are the steps
that go with the above log file fragment.

I upgraded to kernel 2.6.18-1.2200.fc5 and then turned the cpuspeed service back
on. I did not upgrade from FC4 to FC5, but checked the device-mapper & lvm2 rpms
anyway and found two device-mapper rpms (x86-64 and i386) and one lvm2 rpm
installed. I have had one crash since doing this. The trace is in the
attachment. I also see the following at boot time in the file
/var/log/messages:
Oct 16 22:00:37 redwood kernel: powernow-k8: Found 1 AMD Athlon(tm) 64 Processor
3000+ processors (version 2.00.00)
Oct 16 22:00:37 redwood kernel: powernow-k8: invalid freq entries 3900000 kHz
vs. 65535000 kHz
Oct 16 22:00:37 redwood kernel: powernow-k8: invalid freq entries 3900000 kHz
vs. 65535000 kHz
Oct 16 22:00:37 redwood kernel: powernow-k8:    0 : fid 0xc (2000 MHz), vid 0x2
Oct 16 22:00:37 redwood kernel: powernow-k8:    1 : fid 0xa (1800 MHz), vid 0x6
Oct 16 22:00:37 redwood kernel: powernow-k8:    2 : fid 0x2 (1000 MHz), vid 0x12
Oct 16 22:00:37 redwood kernel: ACPI: (supports S0 S1 S4 S5)
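
As a cross-check of the table above: the three fid values are consistent with
the K8 relation frequency = 800 MHz + 100 MHz * fid, and the 65535000 kHz in
the "invalid freq entries" lines is exactly 0xFFFF MHz. A short sketch of that
arithmetic; the relation is taken here as an assumption that happens to match
these three entries, and the helper names are illustrative.

```python
import re

# Matches e.g. "fid 0xc (2000 MHz)" in a powernow-k8 boot message.
ENTRY_RE = re.compile(r"fid (0x[0-9a-f]+) \((\d+) MHz\)")

BOOT_LOG = """\
powernow-k8:    0 : fid 0xc (2000 MHz), vid 0x2
powernow-k8:    1 : fid 0xa (1800 MHz), vid 0x6
powernow-k8:    2 : fid 0x2 (1000 MHz), vid 0x12
"""


def mhz_from_fid(fid):
    """Assumed K8 relation between frequency id and core frequency in MHz."""
    return 800 + 100 * fid


def check_table(log_text):
    """Return (fid, reported_mhz, matches_relation) for each table entry."""
    rows = []
    for fid_hex, mhz in ENTRY_RE.findall(log_text):
        fid = int(fid_hex, 16)
        rows.append((fid, int(mhz), mhz_from_fid(fid) == int(mhz)))
    return rows


if __name__ == "__main__":
    for row in check_table(BOOT_LOG):
        print(row)
    # The rejected entry, converted from kHz to MHz and shown in hex.
    print(hex(65535000 // 1000))
```

So the three accepted P-states look sane, while the rejected entry carries an
all-ones 16-bit frequency field, whatever its origin.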


Comment 16 Dave Jones 2006-10-19 17:53:29 UTC
This bug check has been triggered in a number of different reports with the
nvidia module loaded.  Unfortunately, there's nothing we can do to fix problems
in Nvidia's code.


Comment 17 David Highley 2006-10-19 18:28:32 UTC
This bug has nothing to do with the nvidia module. I have removed the module in
the past and had the same failures occur.

Comment 18 Dave Jones 2006-10-19 19:13:14 UTC
Ok, but please don't post tainted traces. They are worthless, and just add to
the noise.  Apart from comment #2, every trace posted so far has been tainted.

We need an untainted trace from the current kernel.

Comment 19 David Highley 2006-10-19 19:58:29 UTC
Never mind, it seems too hard to figure out. I think it is a BIOS and
powernow-k8 compatibility issue. Just close the bug, as I can work around it by
turning off the cpuspeed service.

