Bug 73880 - multiple kernel panics, kernel null pointers, page faults, etc...
Summary: multiple kernel panics, kernel null pointers, page faults, etc...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2002-09-12 20:57 UTC by Need Real Name
Modified: 2007-04-18 16:46 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-12-17 02:52:49 UTC
Embargoed:


Attachments
console dump (2.54 KB, text/plain), 2002-09-12 20:58 UTC, Need Real Name
console dump (877 bytes, text/plain), 2002-09-12 21:00 UTC, Need Real Name
console dump (28.61 KB, text/plain), 2002-09-12 21:02 UTC, Need Real Name
console dump (4.94 KB, text/plain), 2002-09-12 21:05 UTC, Need Real Name
console dump (17.39 KB, text/plain), 2002-09-12 21:07 UTC, Need Real Name
console dump (3.96 KB, text/plain), 2002-09-12 21:08 UTC, Need Real Name
console dump (2.85 KB, text/plain), 2002-09-12 21:09 UTC, Need Real Name
console dump (5.36 KB, text/plain), 2002-09-12 21:10 UTC, Need Real Name
console dump (4.95 KB, text/plain), 2002-09-12 21:11 UTC, Need Real Name
console dump (1.70 KB, text/plain), 2002-09-12 21:12 UTC, Need Real Name
console dump (3.82 KB, text/plain), 2002-09-12 21:13 UTC, Need Real Name
console dump (4.47 KB, text/plain), 2002-09-12 21:14 UTC, Need Real Name
console dump (3.21 KB, text/plain), 2002-09-12 21:15 UTC, Need Real Name
module code (569.58 KB, application/octet-stream), 2002-09-12 23:39 UTC, Need Real Name

Description Need Real Name 2002-09-12 20:57:14 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020606

Description of problem:
Red Hat 7.3 kernel 2.4.18-5bigmem panics frequently, generating kernel Oopses.
The stack seems to get overwritten with bad data.  Servers are 4-way Xeon 1.6
GHz with hyperthreading turned on.  Systems have 8 GB of physical RAM and 16 GB
of swap.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. boot system
2. run internal app
3. wait
	

Actual Results:  System hangs with a kernel Oops dump to the console.

Expected Results:  no kernel panic

Additional info:

The internal code has run on Red Hat 6.2, 7.0, 7.1, 7.2, the kernel.org 2.4.18
kernel, and SuSE 8.0 without this problem.  I have numerous Oops dumps as
examples.

Comment 1 Need Real Name 2002-09-12 20:58:58 UTC
Created attachment 75964 [details]
console dump

Comment 2 Need Real Name 2002-09-12 21:00:34 UTC
Created attachment 75965 [details]
console dump

Comment 3 Arjan van de Ven 2002-09-12 21:01:50 UTC
What modules are you using?

Comment 4 Need Real Name 2002-09-12 21:02:20 UTC
Created attachment 75966 [details]
console dump

Comment 5 Need Real Name 2002-09-12 21:04:36 UTC
Here is the output of lsmod:

nfs                    89180  47  (autoclean)
lockd                  58080   1  (autoclean) [nfs]
sunrpc                 83444   1  (autoclean) [nfs lockd]
pcmcia_core            55616   0 
eepro100               20848   1 
cpqasm                346016   2 
cpqevt                  6148   0  [cpqasm]
ext3                   69824   2 
jbd                    52896   2  [ext3]
cciss                  36672  10 
sd_mod                 12864   0  (unused)
scsi_mod              114148   1  [cciss sd_mod]
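
For readers triaging the listing above, here is a minimal sketch that parses 2.4-style lsmod output into records and picks out modules to unload before retesting. The `VENDOR_MODULES` set is an assumption made for illustration (guessed from the `cpq` name prefix); it is not something the thread states, and the helper names are hypothetical.

```python
from typing import List, NamedTuple


class Module(NamedTuple):
    name: str
    size: int
    use_count: int
    used_by: List[str]


# The lsmod listing from Comment 5, copied verbatim (whitespace normalized).
LSMOD_OUTPUT = """\
nfs                    89180  47  (autoclean)
lockd                  58080   1  (autoclean) [nfs]
sunrpc                 83444   1  (autoclean) [nfs lockd]
pcmcia_core            55616   0
eepro100               20848   1
cpqasm                346016   2
cpqevt                  6148   0  [cpqasm]
ext3                   69824   2
jbd                    52896   2  [ext3]
cciss                  36672  10
sd_mod                 12864   0  (unused)
scsi_mod              114148   1  [cciss sd_mod]
"""

# ASSUMPTION: treated as the third-party modules under discussion purely
# because of their "cpq" prefix; adjust this set for a real system.
VENDOR_MODULES = {"cpqasm", "cpqevt"}


def parse_lsmod(text: str) -> List[Module]:
    """Parse 'name size use-count [flags] [dependents]' lines."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and blank lines: size and use count must be numeric.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        used_by: List[str] = []
        if "[" in line and "]" in line:
            # Dependent modules are listed between square brackets.
            used_by = line[line.index("[") + 1 : line.index("]")].split()
        modules.append(Module(fields[0], int(fields[1]), int(fields[2]), used_by))
    return modules


modules = parse_lsmod(LSMOD_OUTPUT)
flagged = [m.name for m in modules if m.name in VENDOR_MODULES]
largest = max(modules, key=lambda m: m.size)
print("modules to unload before retesting:", flagged)
print("largest module:", largest.name, largest.size)
```

The sketch only reads the listing; it does not decide whether a module is actually binary-only, which is the question debated later in the thread.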


Comment 6 Need Real Name 2002-09-12 21:05:37 UTC
Created attachment 75967 [details]
console dump

Comment 7 Need Real Name 2002-09-12 21:07:19 UTC
Created attachment 75968 [details]
console dump

Comment 8 Need Real Name 2002-09-12 21:08:18 UTC
Created attachment 75969 [details]
console dump

Comment 9 Need Real Name 2002-09-12 21:09:15 UTC
Created attachment 75970 [details]
console dump

Comment 10 Need Real Name 2002-09-12 21:10:25 UTC
Created attachment 75971 [details]
console dump

Comment 11 Need Real Name 2002-09-12 21:11:16 UTC
Created attachment 75972 [details]
console dump

Comment 12 Need Real Name 2002-09-12 21:12:23 UTC
Created attachment 75973 [details]
console dump

Comment 13 Need Real Name 2002-09-12 21:13:16 UTC
Created attachment 75974 [details]
console dump

Comment 14 Need Real Name 2002-09-12 21:14:04 UTC
Created attachment 75975 [details]
console dump

Comment 15 Arjan van de Ven 2002-09-12 21:14:39 UTC
Please try to reproduce this bug without binary-only kernel modules loaded and
then reopen the bug.

Comment 16 Need Real Name 2002-09-12 21:15:11 UTC
Created attachment 75976 [details]
console dump

Comment 17 Need Real Name 2002-09-12 21:50:56 UTC
It has failed without these kernel modules.  It should also be noted that these
modules are not binary only; they are open source, written by HP and Compaq.  I
will post the code if you want.  Also note that 7.2 works with these modules.

Comment 18 Arjan van de Ven 2002-09-12 22:26:01 UTC
If you have a URL to the code, yes please; I'd like to take a look and check
their stack behavior for one (assuming they are actually open source and not
just a binary-only blob with some glue code).

Comment 19 Need Real Name 2002-09-12 23:39:52 UTC
Created attachment 76004 [details]
module code

Comment 20 Need Real Name 2002-09-12 23:42:25 UTC
Attached the module code.  For anything in it that is not open source, I can
probably get HP to work with you off-line.

Comment 21 Need Real Name 2002-09-13 00:36:30 UTC
It should be noted that this problem also occurs without the kernel module loaded.

Comment 22 Arjan van de Ven 2002-09-13 08:19:45 UTC
This is a very HUGE binary-only module with a tiny bit of source code ;(
Anyway, please try the 2.4.18-12.5 or 2.4.18-14 kernel from the rawhide portion
of our FTP site; it has a stack overflow detector that might actually give a
backtrace BEFORE the stack overflows (the trace AFTER it does is basically
useless ;( )

Comment 23 dnd 2002-09-17 12:47:38 UTC
In order to get a reliable netdump "vmcore" on Red Hat 7.3 we need
the "netconsole" module from Red Hat. It seems that Red Hat 7.3
bundles "netdump-server-0.6.4-1.i386.rpm" and "netdump-0.6.4-1.i386.rpm" but
the bundled 2.4.18-3 kernel _does_not_ include the "netconsole" module
required for the "netdump" client to function correctly.

Where can we get the "netconsole" module required for Red Hat 7.3 (2.4.18-3 or 
any errata kernels) to function with bundled "netdump-0.6.4-1.i386.rpm" ??

Comment 24 Arjan van de Ven 2002-09-17 12:49:47 UTC
the supported 7.3 kernel does support netdump

Comment 25 Need Real Name 2002-09-17 12:53:22 UTC
why provide the client then?

Comment 26 Arjan van de Ven 2002-09-17 12:54:28 UTC
because if you actually use the supported kernel you DO have the netdump module.
The HP guy just isn't using that....

Comment 27 dnd 2002-09-17 14:25:33 UTC
Hi RedHat,

> because if you actually use the supported kernel you DO have the
> netdump module. The HP guy just isn't using that....

I'm using the supported kernel from Red Hat 7.3, i.e. the bundled 2.4.18-3
kernel, and this bundled release _does_not_ contain the 'netconsole' module.

Do you mean 'supported' as in 2.4.18-10 errata kernel??



Comment 28 Arjan van de Ven 2002-09-17 14:29:09 UTC
correct

Comment 29 dnd 2002-09-17 14:35:28 UTC
Thanks, that explains a lot...i.e. supported kernel is 2.4.18-10.

Questions:

Does this 'supported' kernel 2.4.18-10 contain the 'stack overflow detector'
code?  If not, is it, as stated earlier, only in the 2.4.18-12.5 rawhide
kernel?  We can only locate the 2.4.18-12.5 rawhide kernel, not 2.4.18-14.
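
Since several kernel releases are being compared in this thread, the following rough sketch orders them by their numeric components. It is a simplification written for illustration and is not rpm's version comparison algorithm; `release_key` is a hypothetical helper. The `KERNELS_WITH_DETECTOR` set records only what Comment 22 states; whether 2.4.18-10 has the detector is not answered in this thread.

```python
import re
from typing import Tuple

# Kernel releases mentioned in this bug report.
KERNELS = ["2.4.18-12.5", "2.4.18-3", "2.4.18-14", "2.4.18-10", "2.4.18-5bigmem"]

# Per Comment 22 only; nothing in the thread confirms or denies other releases.
KERNELS_WITH_DETECTOR = {"2.4.18-12.5", "2.4.18-14"}


def release_key(version: str) -> Tuple[int, ...]:
    """Map '2.4.18-12.5' to (2, 4, 18, 12, 5).

    Non-numeric suffixes such as 'bigmem' are ignored, so this is only a
    rough ordering and not a replacement for rpm's own comparison.
    """
    return tuple(int(n) for n in re.findall(r"\d+", version))


ordered = sorted(KERNELS, key=release_key)
for kernel in ordered:
    note = "detector per Comment 22" if kernel in KERNELS_WITH_DETECTOR else "not stated"
    print(f"{kernel:16} stack overflow detector: {note}")
```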


Comment 30 Dave Jones 2003-12-17 02:52:49 UTC
Ye olde bug with no activity in well over a year. Closing.


