Bug 232621 - Execution of the frysk testsuite triggers kernel BUG at mm/slab.c:610
Summary: Execution of the frysk testsuite triggers kernel BUG at mm/slab.c:610
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 427887
 
Reported: 2007-03-16 13:28 UTC by Kris Van Hees
Modified: 2008-02-08 04:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-02-08 04:27:37 UTC
Type: ---
Embargoed:


Attachments
excerpt from /var/log/messages (9.12 KB, text/plain)
2007-03-26 22:11 UTC, George N. White III

Description Kris Van Hees 2007-03-16 13:28:11 UTC
Description of problem:
The execution of the frysk testsuite (from current CVS) resulted in the
following kernel BUG report on the i386 2.6.20-1.2925.fc6 kernel:

------------[ cut here ]------------
kernel BUG at mm/slab.c:610!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq
freq_table dm_mirror dm_multipath dm_mod video sbs i2c_ec i2c_core dock button
battery ac lp floppy joydev sg snd_hda_intel snd_hda_codec snd_seq_dummy
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
snd_pcm iTCO_wdt pcspkr iTCO_vendor_support tg3 parport_pc ide_cd parport
serio_raw cdrom snd_timer snd soundcore snd_page_alloc ata_piix libata sd_mod
scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    1
EIP:    0060:[<c046a358>]    Not tainted VLI
EFLAGS: 00010046   (2.6.20-1.2925.fc6 #1)
EIP is at free_block+0x46/0xe5
eax: 40000000   ebx: ed981ba0   ecx: 00000001   edx: c171fc28
esi: 00000001   edi: f7e36e00   ebp: f7e5ec80   esp: f7e56f0c
ds: 007b   es: 007b   ss: 0068
Process events/1 (pid: 9, ti=f7e56000 task=f7e076b0 task.ti=f7e56000)
Stack: 00000008 00000000 00000001 00000000 f7e36e20 f7e36e20 00000001 f7e36e00 
       00000000 c046a481 00000000 00000000 f7e5ec80 f7e30764 f7e30740 f7e5ec80 
       00000296 c240a640 c046b696 00000000 00000000 c240a644 f7e51d40 c043067a 
Call Trace:
 [<c046a481>] drain_array+0x8a/0xb5
 [<c046b696>] cache_reap+0x61/0x124
 [<c043067a>] run_workqueue+0x85/0x125
 [<c0617777>] _spin_lock_irqsave+0x9/0xd
 [<c046b635>] cache_reap+0x0/0x124
 [<c0430ff3>] worker_thread+0xf9/0x124
 [<c041e4f2>] default_wake_function+0x0/0xc
 [<c0430efa>] worker_thread+0x0/0x124
 [<c0433607>] kthread+0xb0/0xd9
 [<c0433557>] kthread+0x0/0xd9
 [<c04051a7>] kernel_thread_helper+0x7/0x10
 =======================
Code: 00 00 00 8b 44 24 10 8b 18 8d 83 00 00 00 40 c1 e8 0c 6b d0 28 03 15 80 a6
7f c0 8b 02 f6 c4 40 74 03 8b 52 0c 8b 02 84 c0 78 04 <0f> 0b eb fe 8b 72 24 8b
54 24 28 89 f0 8b bc 95 94 00 00 00 e8 
EIP: [<c046a358>] free_block+0x46/0xe5 SS:ESP 0068:f7e56f0c
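
A note on reading the Code: line above: the byte in angle brackets is the one
at EIP. The following is a minimal Python sketch (the helper name
faulting_bytes is made up for illustration) that pulls it out. For this oops
it yields 0f 0b, the x86 ud2 opcode, which is consistent with the
"invalid opcode" trap raised by a BUG() check.

```python
import re

def faulting_bytes(code_line, count=2):
    """Return `count` opcode bytes starting at the <..>-marked byte (the byte at EIP)."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-f]{2})>", tok)
        if m:
            # The marked byte plus the bytes that follow it.
            return [m.group(1)] + tokens[i + 1:i + count]
    return []

# Tail of the Code: line from the oops above.
code = "8b 02 84 c0 78 04 <0f> 0b eb fe 8b 72 24 8b"
at_eip = faulting_bytes(code)
print(at_eip, "ud2" if at_eip == ["0f", "0b"] else "other")
```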

Version-Release number of selected component (if applicable):
kernel 2.6.20-1.2925.fc6

How reproducible:
I have not been able to reproduce this immediately on demand.  Continuous
execution of the testsuite eventually triggers this bug, though I have not had
enough execution runs (yet) to make a determination about probability.  This is
also complicated by a system hang bug in the kernel that is occasionally
triggered as well.  I will try to capture console output when that happens and
file a separate bug for that as well (nothing in log files due to the hang being
in interrupt handling, it seems).

Steps to Reproduce:
1. Compile frysk (CVS HEAD version) using:
     mkdir build
     cd build
     ../frysk/autogen.sh
     make
2. Execute the following in the build directory:
   for i in frysk-*; do ( make -C $i check ); done
3. Repeat if the bug did not get triggered, until it does.
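
The repeat-until-triggered loop in steps 2 and 3 could be automated roughly as
follows. This is only a sketch under assumptions: it must be started from the
build directory, it assumes dmesg is readable by the current user and held no
earlier BUG report, and the function names are invented for illustration. If
the machine hangs instead of logging a BUG, the script will never report
anything.

```python
import glob
import subprocess

BUG_MARKER = "kernel BUG at"

def log_has_kernel_bug(log_text):
    """True if the kernel log text contains a BUG report."""
    return BUG_MARKER in log_text

def read_kernel_log():
    # Assumption: `dmesg` is available and readable by the current user.
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout

def run_until_bug(max_runs=100, run_suite=None, read_log=read_kernel_log):
    """Repeat step 2 until a BUG shows up; return the run count, or None."""
    def default_suite():
        # Same as: for i in frysk-*; do ( make -C $i check ); done
        for d in sorted(glob.glob("frysk-*")):
            subprocess.run(["make", "-C", d, "check"])
    run_suite = run_suite or default_suite
    for n in range(1, max_runs + 1):
        run_suite()
        if log_has_kernel_bug(read_log()):
            return n
    return None
```

Calling run_until_bug() returns the number of testsuite runs it took to hit
the BUG, or None if max_runs passes without one.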
  
Actual results:
Kernel BUG report as shown above.

Expected results:
No kernel BUG being triggered :)

Comment 1 George N. White III 2007-03-17 12:24:08 UTC
Also encountered on fc5, 2.6.19-1.2288.2.4.fc5smp, running on a (uniprocessor)
Dell PowerEdge 600SC.

Mar 17 09:05:21 cerberus kernel: ------------[ cut here ]------------
Mar 17 09:05:21 cerberus kernel: kernel BUG at mm/slab.c:610!
Mar 17 09:05:21 cerberus kernel: invalid opcode: 0000 [#1]
Mar 17 09:05:21 cerberus kernel: SMP
Mar 17 09:05:21 cerberus kernel: last sysfs file: /block/hda/hda1/size
Mar 17 09:05:21 cerberus kernel: Modules linked in: nls_utf8 hfsplus nfsd 
exportfs lockd nfs_acl ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc video 
sbs i2c_ec dock button battery asus_acpi backlight ac lp parport_pc parport 
sd_mod sg sbp2 scsi_mod floppy ohci1394 ieee1394 ehci_hcd ohci_hcd serio_raw 
ide_cd i2c_piix4 i2c_core pcspkr cdrom e1000 dm_snapshot dm_zero dm_mirror 
dm_mod ext3 jbd
Mar 17 09:05:21 cerberus kernel: CPU:    0
Mar 17 09:05:21 cerberus kernel: EIP:    0060:[<c046c1ef>]    Not tainted VLI
Mar 17 09:05:21 cerberus kernel: EFLAGS: 00010046   (2.6.20-1.2300.fc5smp #1)
Mar 17 09:05:21 cerberus kernel: EIP is at free_block+0x46/0xe5
Mar 17 09:05:21 cerberus kernel: eax: 00000400   ebx: c06c6d70   ecx: 00000060   edx: c1010ef0
Mar 17 09:05:21 cerberus kernel: esi: 00000060   edi: c1309800   ebp: f7ef42c0   esp: c1aebf0c
Mar 17 09:05:21 cerberus kernel: ds: 007b   es: 007b   ss: 0068
Mar 17 09:05:21 cerberus kernel: Process events/0 (pid: 5, ti=c1aeb000 task=f7ec0030 task.ti=c1aeb000)
Mar 17 09:05:21 cerberus kernel: Stack: 00000008 00000000 00000060 00000000 c1309820 c1309820 00000060 c1309800
Mar 17 09:05:21 cerberus kernel:        00000000 c046c318 00000000 00000000 f7ef42c0 f7ee4464 f7ee4440 f7ef42c0
Mar 17 09:05:21 cerberus kernel:        00000296 c1a0b4c0 c046d566 00000000 00000000 c1a0b4c4 f7ee42c0 c04349ba
Mar 17 09:05:21 cerberus kernel: Call Trace:
Mar 17 09:05:21 cerberus kernel:  [<c046c318>] drain_array+0x8a/0xb5
Mar 17 09:05:21 cerberus kernel:  [<c046d566>] cache_reap+0x93/0x124
Mar 17 09:05:21 cerberus kernel:  [<c04349ba>] run_workqueue+0x85/0x125
Mar 17 09:05:21 cerberus kernel:  [<c061e1ff>] _spin_lock_irqsave+0x9/0xd
Mar 17 09:05:21 cerberus kernel:  [<c046d4d3>] cache_reap+0x0/0x124
Mar 17 09:05:21 cerberus kernel:  [<c043532c>] worker_thread+0xf9/0x124
Mar 17 09:05:21 cerberus kernel:  [<c0421af2>] default_wake_function+0x0/0xc
Mar 17 09:05:21 cerberus kernel:  [<c0435233>] worker_thread+0x0/0x124
Mar 17 09:05:21 cerberus kernel:  [<c0437933>] kthread+0xb0/0xd9
Mar 17 09:05:21 cerberus kernel:  [<c0437883>] kthread+0x0/0xd9
Mar 17 09:05:21 cerberus kernel:  [<c0404b47>] kernel_thread_helper+0x7/0x10
Mar 17 09:05:21 cerberus kernel:  =======================
Mar 17 09:05:21 cerberus kernel: Code: 00 00 00 8b 44 24 10 8b 18 8d 83 00 00 00 40 c1 e8 0c 6b d0 28 03 15 80 ee 7c c0 8b 02 f6 c4 40 74 03 8b 52 0c 8b 02 84 c0 78 04 <0f> 0b eb fe 8b 72 24 8b 54 24 28 89 f0 8b bc 95 94 00 00 00 e8
Mar 17 09:05:21 cerberus kernel: EIP: [<c046c1ef>] free_block+0x46/0xe5 SS:ESP 0068:c1aebf0c
Mar 17 09:05:46 cerberus kernel:  <0>Bad page state in process 'Xorg'
Mar 17 09:05:46 cerberus kernel: page:c13082c0 flags:0xc1a76664 mapping:c130889c mapcount:810897476 count:-268391426 (Not tainted)
Mar 17 09:05:46 cerberus kernel: Trying to fix it up, but a reboot is needed
Mar 17 09:05:46 cerberus kernel: Backtrace:
Mar 17 09:05:46 cerberus kernel:  [<c0456933>] bad_page+0x6a/0x96
Mar 17 09:05:46 cerberus kernel:  [<c0456978>] destroy_compound_page+0x19/0x45
Mar 17 09:05:46 cerberus kernel:  [<c0456a0e>] free_pages_bulk+0x6a/0x177
Mar 17 09:05:46 cerberus kernel:  [<c0456ceb>] free_hot_cold_page+0x102/0x112
Mar 17 09:05:46 cerberus kernel:  [<c045d6dd>] unmap_vmas+0x31e/0x58b
Mar 17 09:05:46 cerberus kernel:  [<c046089b>] unmap_region+0x93/0xf7
Mar 17 09:05:46 cerberus kernel:  [<c0461291>] do_munmap+0x15a/0x1af
Mar 17 09:05:46 cerberus kernel:  [<c0461316>] sys_munmap+0x30/0x3e
Mar 17 09:05:46 cerberus kernel:  [<c0403f6c>] syscall_call+0x7/0xb
Mar 17 09:05:46 cerberus kernel:  =======================
Mar 17 09:05:46 cerberus kernel: Bad page state in process 'Xorg'
Mar 17 09:05:46 cerberus kernel: page:c13082c0 flags:0xc1a56604 mapping:00000000: mapcount:0 count:-268391426 (Tainted: G    B)
Mar 17 09:05:46 cerberus kernel: Trying to fix it up, but a reboot is needed
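
To check whether this oops and the one in the original report fault at the
same place, the symbol+offset pairs can be compared while ignoring the raw
addresses (which differ between kernel builds). A minimal Python sketch
follows; the regular expression and the frames helper are illustrative
assumptions, not part of any kernel tooling.

```python
import re

# Matches "[<c046a358>] free_block+0x46/0xe5" with or without a syslog prefix.
FRAME = re.compile(r"\[<[0-9a-f]{8}>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

def frames(log_text):
    """Return (symbol, offset, size) for each symbolized address in the log."""
    return [(m.group(1), int(m.group(2), 16), int(m.group(3), 16))
            for m in FRAME.finditer(log_text)]

# Two lines from each report in this bug.
report_a = """EIP: [<c046a358>] free_block+0x46/0xe5 SS:ESP 0068:f7e56f0c
 [<c046a481>] drain_array+0x8a/0xb5"""
report_b = """Mar 17 09:05:21 cerberus kernel: EIP: [<c046c1ef>] free_block+0x46/0xe5 SS:ESP
Mar 17 09:05:21 cerberus kernel:  [<c046c318>] drain_array+0x8a/0xb5"""

print(frames(report_a) == frames(report_b))
```

For these excerpts the comparison holds: both reports fault at
free_block+0x46/0xe5, called from drain_array+0x8a/0xb5.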


Comment 2 George N. White III 2007-03-17 12:27:07 UTC
Oops -- the kernel version above is the old (working) kernel.  The bug occurs for
kernel-smp-2.6.20-1.2300.fc5.


Comment 3 George N. White III 2007-03-26 12:17:37 UTC
I tried kernel-smp-2.6.20-1.2307.fc5.  It ran for about 30 hours, then:

Mar 26 05:08:22 cerberus kernel: Bad page state in process 'beagle-build-in'
Mar 26 05:08:22 cerberus kernel: page:c1307ff0 flags:0x40000000 mapping:f7ee45c8 mapcount:0 count:0 (Not tainted)
Mar 26 05:08:22 cerberus kernel: Trying to fix it up, but a reboot is needed
Mar 26 05:08:22 cerberus kernel: Backtrace:
Mar 26 05:08:22 cerberus kernel:  [<c045693b>] bad_page+0x6a/0x96
Mar 26 05:08:22 cerberus kernel:  [<c045717a>] get_page_from_freelist+0x1e1/0x2a7
Mar 26 05:08:22 cerberus kernel:  [<c04572a8>] __alloc_pages+0x68/0x2aa
Mar 26 05:08:22 cerberus kernel:  [<c045885c>] __do_page_cache_readahead+0xc5/0x1cc
Mar 26 05:08:22 cerberus kernel:  [<c04c1ff4>] avc_has_perm+0x4e/0x58
Mar 26 05:08:22 cerberus kernel:  [<c04c1ff4>] avc_has_perm+0x4e/0x58
Mar 26 05:08:22 cerberus kernel:  [<c04589af>] blockable_page_cache_readahead+0x4c/0x9f
Mar 26 05:08:22 cerberus kernel:  [<c0458bc6>] page_cache_readahead+0x12b/0x196
Mar 26 05:08:22 cerberus kernel:  [<f8860d4e>] ext3_readdir+0x376/0x5f0 [ext3]

There was no kernel bug entry in the logs.  The system has been running normally
using 2.6.19-1.2288.2.4.fc5smp.

Comment 4 George N. White III 2007-03-26 22:11:08 UTC
Created attachment 150970
excerpt from /var/log/messages

initial portion of the message block:

Mar 26 17:26:37 cerberus kernel: kernel BUG at mm/slab.c:610!
Mar 26 17:26:37 cerberus kernel: invalid opcode: 0000 [#1]
Mar 26 17:26:37 cerberus kernel: SMP 
Mar 26 17:26:37 cerberus kernel: last sysfs file: /block/hda/hda1/size

Comment 5 Chuck Ebbert 2007-03-27 15:21:06 UTC
You may have a memory/hardware problem with this machine.
Try running memtest86 overnight on it.


Comment 6 George N. White III 2007-03-29 22:31:26 UTC
Hardware problems are not likely: the machine has been running 24/7 with
2.6.19 kernels and earlier since fc2, has clean power, and has a low risk of
overheating (it is in Canada, where it is still winter, so A/C is not
required).  Just to be sure, I ran memtest86 for over 23 hours (ECC off) and
it found no problems.  Since March 16th the machine has run 2.6.20 kernels for
less than 24 hours, with 3 crashes; for the other 11 days
2.6.19-1.2288.2.4.fc5smp was running without problems.
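
As a rough illustration of why those figures argue against a hardware fault:
if the crashes came from hardware, one would expect a similar crash rate on
both kernels. The Python sketch below assumes, purely for illustration, that
crashes are independent events at a constant rate (a Poisson model) and uses
only the numbers given in this comment; it is not a calculation the commenter
made.

```python
import math

crashes = 3                 # crashes seen while running 2.6.20 kernels
hours_2_6_20 = 24.0         # upper bound on time spent on 2.6.20
hours_2_6_19 = 11 * 24.0    # crash-free time on 2.6.19

# Crashes per hour on 2.6.20; a lower bound, since uptime was under 24 hours.
rate = crashes / hours_2_6_20

# Probability of seeing zero crashes in 11 days at that same rate.
p_no_crash = math.exp(-rate * hours_2_6_19)
print(f"P(no crash in 11 days at the 2.6.20 rate) = {p_no_crash:.1e}")
```

Under these assumptions the probability is vanishingly small (on the order of
1e-15), so a kernel-independent cause fits the observations poorly.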

Comment 7 Jon Stanley 2008-01-08 01:54:30 UTC
(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt
to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug, however this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer exists, please close the bug or I'll do so in a
few days if there is no further information lodged.

Thanks for using Fedora!

Comment 8 Jon Stanley 2008-02-08 04:27:37 UTC
Per the previous comment in this bug, I am closing it as INSUFFICIENT_DATA,
since no information has been lodged for over 30 days.

Please re-open this bug or file a new one if you can provide the requested data,
and thanks for filing the original report!

