Bug 609975 - kernel BUG at kernel/timer.c:951! EIP: SendIocReset... [mptbase]
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Tomas Henzl
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks: 582286
Reported: 2010-07-01 07:19 EDT by Jan Stodola
Modified: 2010-07-27 06:46 EDT (History)

Doc Type: Bug Fix
Last Closed: 2010-07-27 06:46:45 EDT
Description Jan Stodola 2010-07-01 07:19:02 EDT
Description of problem:
This is what I found in console.log when running installation test:

...
running /sbin/loader 
 %Gdetecting hardware... 
waiting for hardware to initialize... 
[-- MARK -- Thu Jul  1 06:55:00 2010] 
BUG: unable to handle kernel  
tg3.c:v3.108 (February 17, 2010) 
  alloc irq_desc for 31 on node -1 
  alloc kstat_irqs on node -1 
tg3 0000:0e:03.0: PCI INT A -> GSI 31 (level, low) -> IRQ 31 
NULL pointer dereference at (null) 
IP: [<f8068a16>] SendIocReset+0x46/0x110 [mptbase] 
*pdpt = 00000000359e2001 *pde = 0000000000000000  
Oops: 0000 [#1] SMP  
last sysfs file: /sys/module/mptspi/initstate 
Modules linked in: tg3(+)(U) mptspi(U) mptscsih(U) mptbase(U) scsi_transport_spi(U) sr_mod(U) cdrom(U) ata_generic(U) pata_acpi(U) pata_amd(U) ipv6(U) iscsi_ibft(U) pcspkr(U) edd(U) floppy(U) iscsi_tcp(U) libiscsi_tcp(U) libiscsi(U) scsi_transport_iscsi(U) squashfs(U) cramfs(U) 
 
Pid: 334, comm: scsi_scan_2 Not tainted (2.6.32-37.el6.i686 #1) Quartet 
EIP: 0060:[<f8068a16>] EFLAGS: 00010246 CPU: 0 
EIP is at SendIocReset+0x46/0x110 [mptbase] 
EAX: 00000000 EBX: c1e2c000 ECX: c0b7a500 EDX: 00000000 
ESI: 20000000 EDI: 00000001 EBP: 00001389 ESP: c1e07b60 
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 
Process scsi_scan_2 (pid: 334, ti=c1e06000 task=f58f8560 task.ti=c1e06000) 
Stack: 
 f58f8560 fffbe1d2 00000286 00000000 c1e2c000 00000001 000003e7 00000001 
<0> f8069157 00000000 00000000 00000000 00000000 30f6ea3b 00000000 c1e2c000 
<0> 20000000 00000003 c1e2c000 00000000 00000001 f8069449 f80729e4 c1e2c008 
Call Trace: 
 [<f8069157>] ? KickStart+0x677/0x8f0 [mptbase] 
 [<f8069449>] ? MakeIocReady+0x79/0x380 [mptbase] 
 [<f806a469>] ? mpt_do_ioc_recovery+0x299/0x18d0 [mptbase] 
 [<c04421a3>] ? finish_task_switch+0x33/0xa0 
 [<c0816a7f>] ? schedule+0x42f/0xae0 
 [<c0474ebb>] ? up+0xb/0x40 
 [<c044ff26>] ? release_console_sem+0x1a6/0x1f0 
 [<c045f640>] ? process_timeout+0x0/0x10 
 [<f80a6c57>] ? mptspi_ioc_reset+0x17/0x50 [mptspi] 
 [<f806bb53>] ? mpt_HardResetHandler+0xb3/0x220 [mptbase] 
 [<f806c220>] ? mpt_config+0x380/0x550 [mptbase] 
 [<f80a7b50>] ? mptspi_write_spi_device_pg1+0x160/0x450 [mptspi] 
 [<c081afd0>] ? do_page_fault+0x1c0/0x480 
 [<c043671a>] ? kmap_atomic_prot+0x11a/0x150 
 [<c05e5404>] ? vsnprintf+0xd4/0x400 
 [<f80a7e9d>] ? mptspi_write_width+0x5d/0x70 [mptspi] 
 [<f80a8020>] ? mptspi_target_alloc+0x170/0x270 [mptspi] 
 [<c06a60d1>] ? attribute_container_add_device+0x51/0x180 
 [<c06ba988>] ? scsi_alloc_target+0x248/0x2b0 
 [<c06bb956>] ? __scsi_scan_target+0x66/0x6d0 
 [<c04081e7>] ? __switch_to+0xd7/0x1a0 
 [<c06bc037>] ? scsi_scan_channel+0x77/0x90 
 [<c06bc131>] ? scsi_scan_host_selected+0xe1/0x170 
 [<c06bc236>] ? do_scsi_scan_host+0x76/0x80 
 [<c06bc251>] ? do_scan_async+0x11/0x120 
 [<c06bc240>] ? do_scan_async+0x0/0x120 
 [<c0470094>] ? kthread+0x74/0x80 
 [<c0470020>] ? kthread+0x0/0x80 
 [<c040a547>] ? kernel_thread_helper+0x7/0x10 
Code: f2 8b 83 e8 00 00 00 c1 e6 18 89 30 89 fa 89 d8 e8 a0 e7 ff ff 31 ed 85 c0 0f 88 89 00 00 00 8d b6 00 00 00 00 8b 83 e8 00 00 00 <8b> 30 81 e6 00 00 00 f0 81 fe 00 00 00 10 89 b3 0c 01 00 00 74  
EIP: [<f8068a16>] SendIocReset+0x46/0x110 [mptbase] SS:ESP 0068:c1e07b60 
CR2: 0000000000000000 
------------[ cut here ]------------ 
kernel BUG at kernel/timer.c:951! 
invalid opcode: 0000 [#2] SMP  
last sysfs file: /sys/module/mptspi/initstate 
Modules linked in: tg3(+)(U) mptspi(U) mptscsih(U) mptbase(U) scsi_transport_spi(U) sr_mod(U) cdrom(U) ata_generic(U) pata_acpi(U) pata_amd(U) ipv6(U)


Version-Release number of selected component (if applicable):
RHEL6.0-20100622.1 / i386 / Server
kernel-2.6.32-37.el6.i686.rpm  

How reproducible:
tried only once

Steps to Reproduce:
1. try to install RHEL6.0-20100622.1 / i386 / Server in Beaker

Actual results:
kernel panic

Expected results:
successful installation and boot
Comment 3 Tom Coughlan 2010-07-19 18:30:56 EDT
LSI folks: this looks like it is in mpt fusion. Please take a look. 

(In reply to comment #0)

> How reproducible:
> tried only once

Jan, it looks like the system tries to do a write, then lands in mpt_HardResetHandler, and somewhere in the error recovery path we get a NULL pointer dereference.

This could be caused by a hardware error. The system should not crash, but it would be helpful to know whether the problem is reproducible, and whether the hardware works with other versions of the OS. If you can give it a try, that would be appreciated.
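
For reference, the driver path described above can be pulled out of the Call Trace mechanically. The following is only a minimal sketch in Python, not a tool used on this bug: the names FRAME_RE, parse_frames and driver_path are made up for illustration, and SAMPLE is a subset of the frames from comment #0. Every frame in the trace is marked "?", which means the unwinder considers it unreliable, so the resulting chain is a hint about the path and not a proven call sequence.

```python
import re

# Matches oops frames such as:
#   " [<f8069157>] ? KickStart+0x677/0x8f0 [mptbase]"
# The "?" marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(text):
    """Return (function, module-or-None, offset) for every frame line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group("func"), m.group("module"),
                           int(m.group("off"), 16)))
    return frames

def driver_path(text, modules=("mptbase", "mptspi")):
    """Functions from the given modules, outermost caller first.

    The oops lists the innermost frame first, so the list is reversed.
    """
    return [func for func, mod, _ in reversed(parse_frames(text))
            if mod in modules]

# Subset of the Call Trace from comment #0.
SAMPLE = """\
 [<f8069157>] ? KickStart+0x677/0x8f0 [mptbase]
 [<f8069449>] ? MakeIocReady+0x79/0x380 [mptbase]
 [<f806a469>] ? mpt_do_ioc_recovery+0x299/0x18d0 [mptbase]
 [<c04421a3>] ? finish_task_switch+0x33/0xa0
 [<f80a6c57>] ? mptspi_ioc_reset+0x17/0x50 [mptspi]
 [<f806bb53>] ? mpt_HardResetHandler+0xb3/0x220 [mptbase]
 [<f806c220>] ? mpt_config+0x380/0x550 [mptbase]
 [<f80a7b50>] ? mptspi_write_spi_device_pg1+0x160/0x450 [mptspi]
"""

if __name__ == "__main__":
    print(" -> ".join(driver_path(SAMPLE)))
```

On this subset the chain starts at mptspi_write_spi_device_pg1 (the write), goes through mpt_config into mpt_HardResetHandler, and ends in the recovery functions (mpt_do_ioc_recovery, MakeIocReady, KickStart) just before the faulting SendIocReset.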
Comment 4 kashyap 2010-07-20 03:09:54 EDT
(In reply to comment #3)
I agree with Tom. Meanwhile, can you attach an object dump for mptbase?
"objdump -Sd mptbase.o > mptbase.dump"

Thanks, Kashyap
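
Until the object dump is available, the faulting instruction can already be read from the "Code:" line in comment #0, where the byte at EIP is wrapped in <>. The snippet below is a rough sketch in Python under stated assumptions: the helper names (marked_byte, decode_mov_load) are invented for this example, and the decoder handles only the single case needed here (opcode 0x8b with a plain register-indirect ModRM byte), not general x86 decoding.

```python
# 32-bit register numbering used by the ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# "Code:" line copied from the oops in comment #0.
CODE_LINE = (
    "Code: f2 8b 83 e8 00 00 00 c1 e6 18 89 30 89 fa 89 d8 e8 a0 e7 ff ff "
    "31 ed 85 c0 0f 88 89 00 00 00 8d b6 00 00 00 00 8b 83 e8 00 00 00 "
    "<8b> 30 81 e6 00 00 00 f0 81 fe 00 00 00 10 89 b3 0c 01 00 00 74"
)

def marked_byte(code_line):
    """Return (index of the <xx> byte, all bytes as integers)."""
    tokens = code_line.split(":", 1)[1].split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return i, [int(t.strip("<>"), 16) for t in tokens]
    raise ValueError("no <xx> marker found in Code: line")

def decode_mov_load(opcode, modrm):
    """Decode 'mov (%reg),%reg' (opcode 0x8b, mod == 0) in AT&T syntax.

    Deliberately limited: rm values 4 and 5 (SIB / disp32) are rejected.
    """
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0 or rm in (4, 5):
        raise ValueError("encoding not covered by this sketch")
    return "mov (%{}),%{}".format(REG32[rm], REG32[reg])

if __name__ == "__main__":
    idx, data = marked_byte(CODE_LINE)
    print(idx, decode_mov_load(data[idx], data[idx + 1]))
```

This yields "mov (%eax),%esi" for the marked bytes 8b 30. With EAX: 00000000 in the register dump, that load is the NULL pointer dereference at SendIocReset+0x46, and the same instruction should appear at that offset in the objdump output.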
Comment 6 Tomas Henzl 2010-07-27 06:46:45 EDT
The latest information indicates that this is most probably a hardware problem, so I'm closing this bug now.
