Bug 114445

Summary: Kernel crashes on HP DL360 G3
Product: [Retired] Red Hat Linux
Reporter: Leonid Mamtchenkov <leonid>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 9
CC: riel
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-09-30 15:41:49 UTC

Description Leonid Mamtchenkov 2004-01-28 10:18:28 UTC
Description of problem:
Kernel crashes.  Probably related to the Compaq array controller.

Version-Release number of selected component (if applicable):
Linux billnode01.linux.dom 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST
2003 i686 i686 i386 GNU/Linux

How reproducible:
Sometimes

Steps to Reproduce:
1.
2.
3.
  
Actual results:
Jan 28 04:07:18 billnode01 kernel:  #req FMinit on LST, LPSM 4h# #LDn# invalid operand: 0000
Jan 28 04:07:18 billnode01 kernel: tg3 ipchains keybdev mousedev hid input usb-ohci usbcore ext3 jbd cpqfc cciss sd_mod scsi_mod
Jan 28 04:07:18 billnode01 kernel: CPU:    0
Jan 28 04:07:18 billnode01 kernel: EIP:    0060:[<f8843fc1>]    Not tainted
Jan 28 04:07:18 billnode01 kernel: EFLAGS: 00010246
Jan 28 04:07:18 billnode01 kernel:
Jan 28 04:07:18 billnode01 kernel: EIP is at SendLogins [cpqfc] 0x41 (2.4.20-8smp)
Jan 28 04:07:18 billnode01 kernel: eax: 0000007e   ebx: 00000000   ecx: 0001a9a9   edx: f7bc0000
Jan 28 04:07:18 billnode01 kernel: esi: a00001f8   edi: f7c00080   ebp: f7bffe74   esp: f7bffd98
Jan 28 04:07:18 billnode01 kernel: ds: 0068   es: 0068   ss: 0068
Jan 28 04:07:18 billnode01 kernel: Process cpqfcTS_wt_0 (pid: 25, stackpage=f7bff000)
Jan 28 04:07:18 billnode01 kernel: Stack: f7bffdd8 c011db26 c03b9880 00000000 00000001 f7bffdc8 00000000 f7bffd98
Jan 28 04:07:18 billnode01 kernel:        00000000 f7bc0000 f7c00080 c03b9880 c0384000 c0385fd0 c011e26f 00000000
Jan 28 04:07:18 billnode01 kernel:        f7b99000 c03b9880 c03b9880 00000100 00000000 00000000 00000000 00000000
Jan 28 04:07:18 billnode01 kernel: Call Trace:   [<c011db26>] load_balance [kernel] 0x36 (0xf7bffd9c))
Jan 28 04:07:18 billnode01 kernel: [<c011e26f>] schedule [kernel] 0x19f (0xf7bffdd0))
Jan 28 04:07:18 billnode01 kernel: [<f8846bb2>] cpqfcTSStartExchange [cpqfc] 0x102 (0xf7bffe0c))
Jan 28 04:07:18 billnode01 kernel: [<f8844694>] IssueReportLunsCommand [cpqfc] 0x74 (0xf7bffe4c))
Jan 28 04:07:18 billnode01 kernel: [<f88426a2>] cpqfcTS_WorkTask [cpqfc] 0xb2 (0xf7bffe78))
Jan 28 04:07:18 billnode01 kernel: [<c011db26>] load_balance [kernel] 0x36 (0xf7bffeb4))
Jan 28 04:07:18 billnode01 kernel: [<c011e26f>] schedule [kernel] 0x19f (0xf7bffee8))
Jan 28 04:07:18 billnode01 kernel: [<c0108622>] __down_interruptible [kernel] 0xd2 (0xf7bfff1c))
Jan 28 04:07:18 billnode01 kernel: [<f884251c>] cpqfcTSWorkerThread [cpqfc] 0x1dc (0xf7bfff50))
Jan 28 04:07:18 billnode01 kernel: [<f884827a>] .rodata.str1.1 [cpqfc] 0x32e (0xf7bfff58))
Jan 28 04:07:18 billnode01 kernel: [<c0109882>] ret_from_fork [kernel] 0x6 (0xf7bfffbc))
Jan 28 04:07:18 billnode01 kernel: [<f8842340>] cpqfcTSWorkerThread [cpqfc] 0x0 (0xf7bfffe0))
Jan 28 04:07:18 billnode01 kernel: [<c010759d>] kernel_thread_helper [kernel] 0x5 (0xf7bffff0))
Jan 28 04:07:18 billnode01 kernel:
Jan 28 04:07:18 billnode01 kernel:
Jan 28 04:07:18 billnode01 kernel: Code: c6 88 f8 56 10 29 d4 89 f2 8d 4c 24 23 83 e1 f0 89 8d 3c ff


Expected results:
No crash.

Additional info:
I have been running the same operating system on another server
(DL360 G1) with no problems whatsoever for a few months.  When the new
server (DL360 G3) arrived, I backed up the operating system and
restored it to the new server.  Shared storage is attached to the
server via Fibre Channel, if that helps.  The FC controller was
transferred from the old server to the new one.

I saw a few hangs after about 2 hours of uptime, so the FC controller
firmware was upgraded as suggested by HP.  Everything then ran fine
for about 18 hours until the crash with this trace.

Rolling back to the old hardware shows that everything works fine there.
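
For triage, the frames in the trace above can be pulled out and grouped by module. The following is only a minimal sketch, assuming the 2.4-era Red Hat oops layout seen in this report ("[<address>] symbol [module] offset"). The names parse_oops, FRAME_RE, and EIP_RE are hypothetical helpers, not part of any kernel tool, and the sample is a short excerpt of the log above.

```python
import re
from collections import Counter

# Assumed frame layout: "[<f8846bb2>] cpqfcTSStartExchange [cpqfc] 0x102".
# \s+ also matches newlines, so frames split by line wrapping still parse.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)"
)
# Assumed faulting-instruction line: "EIP is at SendLogins [cpqfc] 0x41".
EIP_RE = re.compile(r"EIP is at\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)")


def parse_oops(text):
    """Return (faulting_frame_or_None, list_of_call_trace_frames)."""
    eip = EIP_RE.search(text)
    fault = None
    if eip:
        fault = {
            "symbol": eip.group(1),
            "module": eip.group(2),
            "offset": int(eip.group(3), 16),
        }
    frames = [
        {
            "address": int(m.group(1), 16),
            "symbol": m.group(2),
            "module": m.group(3),
            "offset": int(m.group(4), 16),
        }
        for m in FRAME_RE.finditer(text)
    ]
    return fault, frames


# Excerpt from the log in this report, wrapped the way a pasted log often is.
SAMPLE = """\
Jan 28 04:07:18 billnode01 kernel: EIP is at SendLogins [cpqfc] 0x41
(2.4.20-8smp)
Jan 28 04:07:18 billnode01 kernel: Call Trace:   [<c011db26>]
load_balance [kernel] 0x36 (0xf7bffd9c))
Jan 28 04:07:18 billnode01 kernel: [<c011e26f>] schedule [kernel]
0x19f (0xf7bffdd0))
Jan 28 04:07:18 billnode01 kernel: [<f8846bb2>] cpqfcTSStartExchange
[cpqfc] 0x102 (0xf7bffe0c))
"""

fault, frames = parse_oops(SAMPLE)
# Count how many call-trace frames belong to each module.
by_module = Counter(frame["module"] for frame in frames)
print(fault)
print(by_module)
```

Run against the full log, this kind of grouping makes it easier to see which frames belong to the cpqfc module and which to the core kernel. It does not by itself identify a root cause.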

Comment 1 Bugzilla owner 2004-09-30 15:41:49 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/