Bug 65608 - VM oops in 2.4.18-3
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: athlon Linux
Priority: medium  Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2002-05-28 10:26 EDT by Need Real Name
Modified: 2007-04-18 12:42 EDT (History)
Doc Type: Bug Fix
Last Closed: 2002-05-28 12:45:39 EDT


Description Need Real Name 2002-05-28 10:26:26 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020408

Description of problem:
This is a dual Athlon with 1 GB of registered ECC DDR RAM. I will try
2.4.18-4, but the oops doesn't look ext3-related (the only big local
filesystem is reiserfs over software raid0).

I do suspect the hardware on this machine. If someone could tell me "that
looks like a bad x", I'd be very grateful. More details on request :-/

Unable to handle kernel paging request at virtual address 0200f82b
 printing eip:
c0137dc0
*pde = 00000000
Oops: 0000
nls_iso8859-1 nls_cp437 vfat fat soundcore nfs tuner tvaudio bttv videodev i2c
CPU:    0
EIP:    0010:[<c0137dc0>]    Not tainted
EFLAGS: 00010206

EIP is at page_remove_rmap [kernel] 0x50 (2.4.18-3)
eax: 0200f827   ebx: c1df9c38   ecx: c1000030   edx: c3a19168
esi: c3a19168   edi: c33bc618   ebp: 3fe37025   esp: c6b87eb0
ds: 0018   es: 0018   ss: 0018
Process crond (pid: 7463, stackpage=c6b87000)
Stack: 00100000 c3a19168 0005a000 c0126ab1 00000020 00000000 42100000 c6b85420
       42000000 00000000 42100000 c6b85420 c011c6e6 00000000 c6b86000 00000000
       00000000 00000000 c6b86000 c6b860b4 00100000 0012c000 42000000 00000001
Call Trace: [<c0126ab1>] do_zap_page_range [kernel] 0x181
[<c011c6e6>] sys_wait4 [kernel] 0x396
[<c0127010>] zap_page_range [kernel] 0x50
[<c01297da>] exit_mmap [kernel] 0xca
[<c0117e36>] mmput [kernel] 0x26
[<c011c183>] do_exit [kernel] 0xb3
[<c011c6e6>] sys_wait4 [kernel] 0x396
[<c0108913>] system_call [kernel] 0x33


Code: 39 70 04 75 0d 53 57 50 e8 a3 02 00 00 83 c4 0c eb 08 89 c7
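
A rough way to cross-check the oops above is to decode the first instruction in the Code: line and compare its memory operand with the reported fault address. The sketch below is only an illustration, not part of the original report. It assumes the Code: bytes start at EIP (as 2.4-era oops output is generally understood to do) and that the kernel uses the default i386 3G/1G split, so kernel virtual addresses start at 0xC0000000. The helper names are hypothetical, and the decoder handles only the single encoding seen here.

```python
# Values copied from the oops text above.
FAULT_ADDR = 0x0200F82B
EAX = 0x0200F827
ESI = 0xC3A19168
CODE = bytes.fromhex("39 70 04 75 0d 53 57 50 e8 a3 02 00 00 83 c4 0c eb 08 89 c7")

# Assumption: default i386 3G/1G split; not stated in the report.
PAGE_OFFSET = 0xC0000000

# 32-bit register numbering used by the x86 ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_cmp_disp8(code):
    """Decode `39 /r` (CMP r/m32, r32) with mod=01 and no SIB byte.

    Returns (base_register, compared_register, displacement).
    Any other encoding raises ValueError; this is not a general decoder.
    """
    if code[0] != 0x39:
        raise ValueError("first byte is not opcode 0x39")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the mod=01, no-SIB form is handled")
    # The 8-bit displacement is sign-extended.
    disp = code[2] - 256 if code[2] > 127 else code[2]
    return REG32[rm], REG32[reg], disp


def is_kernel_address(addr):
    """True if addr lies in the assumed kernel half of the address space."""
    return addr >= PAGE_OFFSET


if __name__ == "__main__":
    base, reg, disp = decode_cmp_disp8(CODE)
    print(f"instruction compares {reg} with memory at {base}{disp:+d}")
    print(f"operand address: {EAX + disp:#010x}, fault address: {FAULT_ADDR:#010x}")
    print(f"eax looks like a kernel pointer: {is_kernel_address(EAX)}")
```

Under those assumptions the instruction reads memory at eax+4, which equals the reported fault address, and eax itself lies below the assumed kernel range while esi does not. That is consistent with a corrupted pointer being dereferenced in page_remove_rmap, but on its own it does not distinguish faulty hardware from a software bug.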

$ mount
/dev/sda3 on / type ext3 (rw)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/sda2 on /boot type ext3 (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md0 on /export type reiserfs (rw,noatime,notail)
none on /dev/shm type tmpfs (rw)
none on /tmp type tmpfs (rw)

[ plus some autofs / nfs stuff ]


How reproducible:
Didn't try

Steps to Reproduce:
possibly hardware problem?
Comment 1 Arjan van de Ven 2002-05-28 10:39:57 EDT
If you don't trust your memory, the memtest86 program (search on
www.freshmeat.net for it if needed) is a pretty good tester of ram chips.
Comment 2 Need Real Name 2002-05-28 12:25:46 EDT
It had completed a few clean passes of memtest86 before being put into
production. I'll re-run it (we have had RAM go dodgy in the past), but it
is ECC RAM, so (to quote Alan Cox):
  memtest86 will give fairly honest answers on ECC RAM. It'll see errors
  that ECC didn't correct or were caused by chipset/cache/wiring
  capacitance etc. Those are the same errors the kernel will see
Comment 3 Arjan van de Ven 2002-05-28 12:45:33 EDT
Well, you could also check whether the "ecc" kernel module (included in all of
RH's recent kernels) detects ECC faults. It's supposed to report ECC
soft failures to syslog.
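
As a loose illustration of checking syslog for such reports, the sketch below filters log lines that mention ECC. The thread does not show the module's message format, so the pattern simply matches the standalone word "ecc" in any case. The default path /var/log/messages and the function names are assumptions, not something stated in this bug.

```python
import re

# Assumption: the module's exact message format is unknown, so match only
# the standalone word "ecc", case-insensitively.
ECC_WORD = re.compile(r"\becc\b", re.IGNORECASE)


def find_ecc_lines(lines):
    """Return the lines that mention ECC, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if ECC_WORD.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a syslog file for ECC mentions.

    The default path is an assumption. An unreadable or missing file
    yields an empty list rather than an error.
    """
    try:
        with open(path, errors="replace") as fh:
            return find_ecc_lines(fh)
    except OSError:
        return []


if __name__ == "__main__":
    for hit in scan_log():
        print(hit)
```

An empty result here would not prove the RAM is good: the module may not be loaded, may not support the chipset, or may log in a form this pattern misses.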
Comment 4 Need Real Name 2002-05-28 13:33:46 EDT
Ahhh! memtest86 3.0 has ECC "stuff" in it. Bingo. My bad.
