110837 – kernel BUG at page_alloc.c:139!

Summary: kernel BUG at page_alloc.c:139!

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	9
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Arjan van de Ven
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2003-11-24 19:18 UTC by Craig Lawson
Modified:	2007-04-18 16:59 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2004-09-30 15:41:43 UTC
Embargoed:

Description Craig Lawson 2003-11-24 19:18:13 UTC

Description of problem:
A process locked up in the kernel.

Version-Release number of selected component (if applicable):
2.4.20-20.9smp

How reproducible:
Rarely.

Steps to Reproduce:
1. Run anytopnm. I have used it many times before with no
difficulties.
  
Actual results:
Kernel lock-up on one processor. The other processor continued to 
function.

Expected results:
No lock-up.

Additional info:
Found this in /var/log/messages (twice):

kernel BUG at page_alloc.c:139!
invalid operand: 0000
usb-uhci nls_iso8859-1 udf snd-pcm-oss ppp_synctty ppp_async 
ppp_generic slhc snd-mixer-oss radeon agpgart ipt_state 
ipt_MASQUERADE iptable_nat ip_conntrack w
CPU:    0
EIP:    0060:[<c01490fb>]    Not tainted
EFLAGS: 00210282

EIP is at __free_pages_ok [kernel] 0xeb (2.4.20-20.9smp)
eax: 02001010   ebx: c1d31f68   ecx: c1000030   edx: e190f680
esi: 00000000   edi: 00000000   ebp: 00000000   esp: f15e3e74
ds: 0068   es: 0068   ss: 0068
Process anytopnm (pid: 4818, stackpage=f15e3000)
Stack: c63eea80 fffee4cc 4213320c c72d8700 3f571067 c0136aeb c72d8700 3f571067
       4213320c 00000001 3e34d163 c1d31f68 00000025 c1d31f68 c0138089 c72d8700
       4213320c c1ddb0e8 c1ddb0e8 df3bdc00 efaeb420 ce1eb680 4213320c 00000001
Call Trace:   [<c0136aeb>] vm_set_pte [kernel] 0x3b (0xf15e3e88))
[<c0138089>] do_wp_page [kernel] 0x159 (0xf15e3eac))
[<c0138fce>] handle_mm_fault [kernel] 0x11e (0xf15e3ed4))
[<c011c508>] do_page_fault [kernel] 0x188 (0xf15e3f04))
[<c0130bf9>] sys_rt_sigaction [kernel] 0xa9 (0xf15e3f60))
[<c0127565>] sys_wait4 [kernel] 0x1e5 (0xf15e3f74))
[<c0108d2d>] sys_sigreturn [kernel] 0xed (0xf15e3f94))
[<c011c380>] do_page_fault [kernel] 0x0 (0xf15e3fb0))
[<c01099c0>] error_code [kernel] 0x34 (0xf15e3fb8))


Code: 0f 0b 8b 00 6d 62 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04 
 ------------[ cut here ]------------
kernel BUG at page_alloc.c:139!
invalid operand: 0000
usb-uhci nls_iso8859-1 udf snd-pcm-oss ppp_synctty ppp_async 
ppp_generic slhc snd-mixer-oss radeon agpgart ipt_state 
ipt_MASQUERADE iptable_nat ip_conntrack w
CPU:    0
EIP:    0060:[<c01490fb>]    Not tainted
EFLAGS: 00210282

EIP is at __free_pages_ok [kernel] 0xeb (2.4.20-20.9smp)
eax: 02001010   ebx: c1d31f68   ecx: c1000030   edx: e190f680
esi: 00000000   edi: 00000000   ebp: 00000000   esp: f15e3bb0
ds: 0068   es: 0068   ss: 0068
Process anytopnm (pid: 4821, stackpage=f15e3000)
Stack: c0344680 00000002 c016a300 df3bd200 c0344680 c1c40030 c0345850 fffee4cc
       00000163 00000000 c1d31f68 00000163 00000000 fffee4cc c0136bec c1d31f68
       3c521045 c0139886 cb745780 42000000 fffee4cc c1df20d8 00000001 00000000
Call Trace:   [<c016a300>] dput [kernel] 0x30 (0xf15e3bb8))
[<c0136bec>] __free_pte [kernel] 0x4c (0xf15e3be8))
[<c0139886>] zap_pte_range [kernel] 0x226 (0xf15e3bf4))
[<c01373cb>] zap_page_range [kernel] 0x10b (0xf15e3c20))
[<c013b050>] exit_mmap [kernel] 0xd0 (0xf15e3c64))
[<c015cf3d>] exec_mmap [kernel] 0x1fd (0xf15e3c88))
[<c0120e31>] unshare_files [kernel] 0x31 (0xf15e3c90))
[<c015cf9d>] flush_old_exec [kernel] 0x4d (0xf15e3ca4))
[<c0178bcc>] load_elf_binary [kernel] 0x2cc (0xf15e3cbc))
[<f8823237>] ext3_do_update_inode [ext3] 0x177 (0xf15e3ce4))
[<f880e02c>] journal_get_write_access_Rsmp_2b583cf6 [jbd] 0x5c (0xf15e3d04))
[<c014986d>] __alloc_pages [kernel] 0x7d (0xf15e3da8))
[<c0178900>] load_elf_binary [kernel] 0x0 (0xf15e3df0))
[<c015d674>] search_binary_handler [kernel] 0x124 (0xf15e3dfc))
[<c015d88b>] do_execve [kernel] 0x17b (0xf15e3e44))
[<c0107d90>] sys_execve [kernel] 0x50 (0xf15e3fa4))
[<c01098cf>] system_call [kernel] 0x33 (0xf15e3fc0))


Code: 0f 0b 8b 00 6d 62 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04
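
Note on the Code: line above: its leading bytes appear to encode the BUG() site itself. Assuming this kernel was built with verbose BUG() reporting, as 2.4-era i386 kernels commonly were, BUG() expands to a ud2 instruction (0f 0b) followed by a 16-bit line number and a 32-bit pointer to the file name. A minimal Python sketch under that assumption (decode_bug_bytes is a hypothetical helper, not part of any kernel tool):

```python
import struct

def decode_bug_bytes(code_line):
    """Decode an oops 'Code:' line as ud2 + line number + file-name pointer.

    Assumes the 2.4-era i386 verbose BUG() layout; returns None when the
    bytes do not start with ud2 (0f 0b) or are too short.
    """
    # Keep only the hex bytes that follow "Code:" (a syslog prefix may precede it).
    hex_part = code_line.split("Code:", 1)[-1]
    raw = bytes(int(tok, 16) for tok in hex_part.split())
    if len(raw) < 8 or raw[:2] != b"\x0f\x0b":
        return None
    # Little-endian: 2-byte line number, then 4-byte pointer to the file name.
    line_no, file_ptr = struct.unpack_from("<HI", raw, 2)
    return line_no, file_ptr

code = "Code: 0f 0b 8b 00 6d 62 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04"
line_no, file_ptr = decode_bug_bytes(code)
print(line_no, hex(file_ptr))  # 139 0xc028626d
```

Under that reading, 8b 00 decodes to 139, which agrees with the "page_alloc.c:139" in the BUG message. The next four bytes would then be the address of the file-name string in this particular kernel build.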

Comment 1 Arjan van de Ven 2003-11-24 19:19:12 UTC

Does it happen without ALSA too?

Comment 2 Craig Lawson 2003-11-25 03:19:25 UTC

I don't know.

I use Alsa all the time, which could explain the occasional lock-ups. 
However, I recently upgraded from a Pentium-3 to a Pentium-4 and 
switched to an SMP kernel, and I have experienced many more lock-ups 
since then, so I assumed the cause was miscoordination between the 
processors. For example, see Bug 68673 -- I doubt that had anything 
to do with Alsa.

On the other hand, when the lock-up described by this bug occurred, I 
was running a script that heavily loaded the filesystem for 20 
minutes, and it has locked up several times in the past.

If you think it could still be Alsa (I am using 0.9.8), then I can 
try disabling it, though it could take a while for another lock-up as 
they are unpredictable.

Also, I sure would appreciate some tips on tracking down kernel 
lock-ups. I usually don't get anything in the log or an Oops screen. 
For all I know, most of them could be X lock-ups.

Comment 3 jcpeck 2004-03-24 02:17:09 UTC

I have recently encountered this issue running the SMP kernel on a 
dual-Opteron system.  In this case, the process running was crond.

[root@kazeon log]# uname -r
2.4.20-30.9smp

Excerpt from /var/log/messages

Mar 21 04:30:02 kazeon kernel: ------------[ cut here ]------------
Mar 21 04:30:02 kazeon kernel: kernel BUG at page_alloc.c:139!
Mar 21 04:30:02 kazeon kernel: invalid operand: 0000
Mar 21 04:30:02 kazeon kernel: nfsd ide-cd cdrom parport_pc lp parport autofs nfs lockd sunrpc tg3 keybdev mousedev hid input usb-ohci usbcore ext3 jbd dpt_i2o sd_mod scsi_mod
Mar 21 04:30:02 kazeon kernel: CPU:    1
Mar 21 04:30:02 kazeon kernel: EIP:    0060:[<c014935b>]    Not tainted
Mar 21 04:30:02 kazeon kernel: EFLAGS: 00010286
Mar 21 04:30:02 kazeon kernel:
Mar 21 04:30:02 kazeon kernel: EIP is at __free_pages_ok [kernel] 0xeb (2.4.20-30.9smp)
Mar 21 04:30:02 kazeon kernel: eax: 02001018   ebx: c1dc7818   ecx: c1000030   edx: f6ba8d00
Mar 21 04:30:02 kazeon kernel: esi: 00000000   edi: 00000000   ebp: 00000000   esp: f591be0c
Mar 21 04:30:02 kazeon kernel: ds: 0068   es: 0068   ss: 0068
Mar 21 04:30:02 kazeon kernel: Process crond (pid: 22479, stackpage=f591b000)
Mar 21 04:30:02 kazeon kernel: Stack: f07c9e80 ef550045 bfc02000 ffffffff 00000002 bfc02000 c01397e0 fffe4ffc
Mar 21 04:30:02 kazeon kernel:        ffffffff 00053e2e c1dc7818 00000003 c0000000 00000003 c0136c0c c1dc7818
Mar 21 04:30:02 kazeon kernel:        00000002 c0137485 c80c2680 de8c2bfc bfc00000 00003000 c03fd000 f591a000
Mar 21 04:30:02 kazeon kernel: Call Trace:   [<c01397e0>] zap_pte_range [kernel] 0x160 (0xf591be24))
Mar 21 04:30:02 kazeon kernel: [<c0136c0c>] __free_pte [kernel] 0x4c (0xf591be44))
Mar 21 04:30:02 kazeon kernel: [<c0137485>] zap_page_range [kernel] 0x1a5 (0xf591be50))
Mar 21 04:30:02 kazeon kernel: [<c013b080>] exit_mmap [kernel] 0xd0 (0xf591be94))
Mar 21 04:30:02 kazeon kernel: [<c0120812>] mmput [kernel] 0x62 (0xf591beb8))
Mar 21 04:30:02 kazeon kernel: [<c0126b86>] do_exit [kernel] 0x136 (0xf591bec8))
Mar 21 04:30:02 kazeon kernel: [<c0126f0b>] do_group_exit [kernel] 0x8b (0xf591bee4))
Mar 21 04:30:02 kazeon kernel: [<c012fd2f>] get_signal_to_deliver [kernel] 0x1df (0xf591bef8))
Mar 21 04:30:02 kazeon kernel: [<c0109634>] do_signal [kernel] 0x64 (0xf591bf20))
Mar 21 04:30:02 kazeon kernel: [<c010ee5a>] call_reschedule_interrupt [kernel] 0x5 (0xf591bf64))
Mar 21 04:30:02 kazeon kernel: [<c011e3ef>] schedule [kernel] 0x19f (0xf591bf8c))
Mar 21 04:30:02 kazeon kernel: [<c011c380>] do_page_fault [kernel] 0x0 (0xf591bfb0))
Mar 21 04:30:02 kazeon kernel: [<c011c380>] do_page_fault [kernel] 0x0 (0xf591bfbc))
Mar 21 04:30:02 kazeon kernel: [<c0109908>] signal_return [kernel] 0x14 (0xf591bfc0))
Mar 21 04:30:02 kazeon kernel:
Mar 21 04:30:02 kazeon kernel:
Mar 21 04:30:02 kazeon kernel: Code: 0f 0b 8b 00 1a 66 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04
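
For comparison with the second trace in the original description, the call-trace entries can be extracted and intersected mechanically. The following is only a sketch: the regular expression is an assumption based on the trace format shown in this report, frames() is a hypothetical helper, and the excerpts are shortened copies of the two traces. Traces from 2.4 kernels may also include stale stack entries, so a shared function name is a hint rather than proof of a shared code path.

```python
import re

# Matches entries such as "[<c0136bec>] __free_pte [kernel] 0x4c".
# \s+ also spans line breaks, so wrapped log lines still match.
FRAME_RE = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)")

def frames(oops_text):
    """Return the function names listed in an oops call trace, in order."""
    return [m.group(2) for m in FRAME_RE.finditer(oops_text)]

# Shortened excerpts copied from the traces in this report.
trace_description = """
Call Trace:   [<c016a300>] dput [kernel] 0x30 (0xf15e3bb8))
[<c0136bec>] __free_pte [kernel] 0x4c (0xf15e3be8))
[<c0139886>] zap_pte_range [kernel] 0x226 (0xf15e3bf4))
[<c01373cb>] zap_page_range [kernel] 0x10b (0xf15e3c20))
[<c013b050>] exit_mmap [kernel] 0xd0 (0xf15e3c64))
"""
trace_comment3 = """
Call Trace:   [<c01397e0>] zap_pte_range [kernel] 0x160 (0xf591be24))
[<c0136c0c>] __free_pte [kernel] 0x4c (0xf591be44))
[<c0137485>] zap_page_range [kernel] 0x1a5 (0xf591be50))
[<c013b080>] exit_mmap [kernel] 0xd0 (0xf591be94))
[<c0120812>] mmput [kernel] 0x62 (0xf591beb8))
"""

# Keep the order of the first trace; report names present in both.
second = set(frames(trace_comment3))
common = [name for name in frames(trace_description) if name in second]
print(common)  # ['__free_pte', 'zap_pte_range', 'zap_page_range', 'exit_mmap']
```

Both reports hit the BUG while tearing down an address space (exit_mmap down to __free_pages_ok), once during exec and once during process exit.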

Comment 4 Bugzilla owner 2004-09-30 15:41:43 UTC

Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
