92013 – kernel BUG at vmscan.c

Summary: kernel BUG at vmscan.c

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	9
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Arjan van de Ven
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:	90990
Blocks:

Reported:	2003-05-31 07:42 UTC by Toni Parviainen
Modified:	2007-04-18 16:54 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-09-30 15:41:01 UTC
Embargoed:

Attachments
vmscan.c:118 kernel panic with 2.4.20-20.9 + bootup info (22.77 KB, text/plain) 2003-10-23 09:18 UTC, Janne Pikkarainen

Description Toni Parviainen 2003-05-31 07:42:13 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.2)
Gecko/20030208 Netscape/7.02

Description of problem:
This morning I couldn't log into my server over ssh or open xterm windows, and
samba didn't work. The machine replied to ping, but nothing else.
I was forced to reboot. After that I looked into /var/log/messages
(timestamp information removed):

May 31 04:02:39 pullasorsa kernel: ------------[ cut here ]------------
kernel: kernel BUG at vmscan.c:118!
kernel: invalid operand: 0000
kernel: loop ip_nat_irc ip_conntrack_irc ip_nat_ftp ip_conntrack_ftp autofs
3c59x ipt_REJECT
 ipt_limit ipt_LOG ipt_state iptable_nat ip_conntrack iptable_filter ip_ta
kernel: CPU:    0
kernel: EIP:    0060:[<c0135fc0>]    Not tainted
kernel: EFLAGS: 00010206
kernel:
kernel: EIP is at reclaim_page [kernel] 0x2a0 (2.4.20-9)
kernel: eax: 010c020c   ebx: c1292a48   ecx: 00000100   edx: 0000059d
kernel: esi: c02ec860   edi: c1292a2c   ebp: 0000052a   esp: cf6cbe68
kernel: ds: 0068   es: 0068   ss: 0068
kernel: Process rpmq (pid: 6516, stackpage=cf6cb000)
kernel: Stack: c02eca44 00000000 c02ec860 c02ecdc4 00000003 00000001 c0138da0
c02ecdcc
kernel:        0000031f 000001d2 00000000 c0138e61 c02ecdc0 00000000 00000003
00000001
kernel:        c124968c c7859148 c124968c c124968c 00000001 c02ecdc0 c7859148
0000190d
kernel: Call Trace:   [<c0138da0>] __alloc_pages_limit [kernel] 0x70 (0xcf6cbe80))
kernel: [<c0138e61>] __alloc_pages [kernel] 0xa1 (0xcf6cbe94))
kernel: [<c012e5b0>] page_cache_read [kernel] 0x80 (0xcf6cbed0))
kernel: [<c012edcb>] generic_file_readahead [kernel] 0xdb (0xcf6cbee8))
kernel: [<c012f253>] do_generic_file_read [kernel] 0x353 (0xcf6cbf10))
kernel: [<c012f610>] file_read_actor [kernel] 0x0 (0xcf6cbf3c))
kernel: [<c012f732>] generic_file_read [kernel] 0x82 (0xcf6cbf5c))
kernel: [<c012f610>] file_read_actor [kernel] 0x0 (0xcf6cbf6c))
kernel: [<c013ff92>] sys_pread [kernel] 0xa2 (0xcf6cbf94))
kernel: [<c0109103>] system_call [kernel] 0x33 (0xcf6cbfc0))
kernel:
kernel:
kernel: Code: 0f 0b 76 00 e1 f9 23 c0 e9 9f fd ff ff 8d 76 00 57 56 ba 01
kernel:  ------------[ cut here ]------------
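
When comparing oopses like the one above, it can help to reduce the call trace to a list of symbols and offsets. The following is only a minimal sketch, assuming the frame layout seen in this log ("[<address>] symbol [module] offset"). The names FRAME_RE, parse_frames and SAMPLE are made up for illustration and are not part of any kernel tool.

```python
import re

# Matches frames such as:
#   [<c0138da0>] __alloc_pages_limit [kernel] 0x70 (0xcf6cbe80))
# Assumption: this layout is taken from the log in this report only.
FRAME_RE = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\S+)\]\s+(0x[0-9a-f]+)")


def parse_frames(text):
    """Return (address, symbol, module, offset) for every frame found in text."""
    return [
        (int(addr, 16), sym, mod, int(off, 16))
        for addr, sym, mod, off in FRAME_RE.findall(text)
    ]


# First three frames of the call trace from this report.
SAMPLE = """\
kernel: Call Trace:   [<c0138da0>] __alloc_pages_limit [kernel] 0x70 (0xcf6cbe80))
kernel: [<c0138e61>] __alloc_pages [kernel] 0xa1 (0xcf6cbe94))
kernel: [<c012e5b0>] page_cache_read [kernel] 0x80 (0xcf6cbed0))
"""

if __name__ == "__main__":
    for addr, sym, mod, off in parse_frames(SAMPLE):
        print(f"{sym}+{off:#x} [{mod}] at {addr:#010x}")
```

Two oopses with the same symbol list in the same order are likely to be the same failure path, which is easier to check on the parsed list than on raw log text.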


Looking at the older messages files, I can see that this happened
once before, on May 16. The time, messages and process are exactly the same.
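
Checking older log files for earlier occurrences, as described above, can be scripted. This is a rough sketch, assuming each occurrence is logged as a line containing "kernel BUG at <file>:<line>!" as in this report. BUG_RE, count_bugs and the sample lines are illustrative only; the sample repeats the reported line twice to stand in for the two occurrences.

```python
import re
from collections import Counter

# Assumption: the BUG line looks like "kernel BUG at vmscan.c:118!".
BUG_RE = re.compile(r"kernel BUG at (\S+?):(\d+)!")


def count_bugs(lines):
    """Count BUG reports per (source file, line number) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = BUG_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Illustrative input, not a copy of the real log files.
SAMPLE_LINES = [
    "kernel: kernel BUG at vmscan.c:118!",
    "kernel: invalid operand: 0000",
    "kernel: kernel BUG at vmscan.c:118!",
]

if __name__ == "__main__":
    for (src, lineno), n in count_bugs(SAMPLE_LINES).items():
        print(f"{src}:{lineno} seen {n} time(s)")
```

To run it against a real log, pass an open file instead of the sample list, for example `count_bugs(open("/var/log/messages"))`.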

There is a cron job in cron.daily which caused this to happen:
[root@pullasorsa cron.daily]# cat rpm
#!/bin/sh
rpm -qa --qf '%{name}-%{version}-%{release}.%{arch}.rpm\n' 2>&1 \
        | sort > /var/log/rpmpkgs

The rpm package version is:
[root@pullasorsa cron.daily]# rpm -q rpm
rpm-4.2-0.69



Version-Release number of selected component (if applicable):
2.4.20-9

How reproducible:
Couldn't Reproduce


Additional info:

I had some crashes before this bug (reported as bug 90990), so I ran memtest86
for about 48 hours (with all tests and with only the standard set of tests) and it
couldn't find anything. I'm not sure if this is related to it.

At least these crashes seem to happen when there is a lot of disk activity (see
bug 90990: backup script, samba, rpm, updatedb). I shut down the machine with
-F to force fsck on reboot. It didn't find anything wrong with the hard drives.

Comment 1 Janne Pikkarainen 2003-10-23 09:18:57 UTC

Created attachment 95428
vmscan.c:118 kernel panic with 2.4.20-20.9 + bootup info

The very same bug hit my friend's server last night. This is what he got. It
seems that there's something fishy going on with reclaim_page() ... that's my
guess, anyway.

My friend's server has Red Hat 9.0 with the latest kernel errata installed. The same
system used to work without a hitch even under heavier load, but a couple of days
after he installed the new kernel this oops hit. And during the last few
days he didn't even stress the server.

Comment 2 Janne Pikkarainen 2003-11-04 09:13:12 UTC

Oh dear. I just heard that my friend's server is quite a bit
overclocked. The box used to run Windows and the overclock settings were
tuned for it ("Does this thing still boot? Good, let's O/C a bit more!
Does it still boot? No... let's go down a step."), but since Linux
actually uses the hardware :-), overclocking made the box very crashy.

He will restore it to default settings and see if that helps with this
problem. Personally I think that calming down the server will resolve
this case for him.

Comment 3 Bugzilla owner 2004-09-30 15:41:01 UTC

Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
