Bug 55564 - (SCSI DPT_I2O)kernel: Oops: 0002
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.0
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Reported: 2001-11-01 20:13 EST by Need Real Name
Modified: 2007-04-18 12:37 EDT (History)
Doc Type: Bug Fix
Last Closed: 2003-06-07 16:14:27 EDT
Attachments
lspci -vvv (5.64 KB, text/plain), 2001-11-01 20:22 EST, Need Real Name
df -k (256 bytes, text/plain), 2001-11-01 20:23 EST, Need Real Name
dmesg (6.01 KB, text/plain), 2001-11-01 20:44 EST, Need Real Name
lsmod (407 bytes, text/plain), 2001-11-02 09:26 EST, Need Real Name
lsof (just fyi, about 50k) (49.54 KB, text/plain), 2001-11-02 09:27 EST, Need Real Name
top (5.00 KB, text/plain), 2001-11-02 09:31 EST, Need Real Name
ps auxw (5.14 KB, text/plain), 2001-11-02 09:36 EST, Need Real Name

Description Need Real Name 2001-11-01 20:13:43 EST
Description of Problem:
RH 7.0 / kernel 2.2.16-22 freezes up and requires a reboot.


Version-Release number of selected component (if applicable):
Linux bnfs01.photopoint.com 2.2.16-22 #1 Tue Aug 22 16:49:06 EDT 2000 i686
unknown
Red Hat Linux release 7.0 (Guinness)


How Reproducible:
Occurs, apparently, at random intervals (including twice today).


Steps to Reproduce:
1. -
2. -
3. -


Actual Results:
-


Expected Results:
-


Additional Information:


My main question is:
Would this be a kernel issue, or a hardware issue?


Snippet from /var/log/messages for one occurrence:
	Nov  1 04:08:11 bnfs01 sshd2[23268]: protocol version not supported in
local: 'Illegal protocol version.'
***	Nov  1 04:08:35 bnfs01 kernel: kfree: Bad obj 82eb6340
***	Nov  1 04:08:35 bnfs01 kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000000
***	Nov  1 04:08:35 bnfs01 kernel: current->tss.cr3 = 04fad000, %cr3 =
04fad000
***	Nov  1 04:08:35 bnfs01 kernel: *pde = 00000000
***	Nov  1 04:08:35 bnfs01 kernel: Oops: 0002
***	Nov  1 04:08:35 bnfs01 kernel: CPU:    0
***	Nov  1 04:08:35 bnfs01 kernel: EIP:    0010:[kfree+403/424]
***	Nov  1 04:08:35 bnfs01 kernel: EFLAGS: 00010286
***	Nov  1 04:08:35 bnfs01 kernel: eax: 0000001b   ebx: c56307c0   ecx:
0000001e   edx: 0000001d
***	Nov  1 04:08:35 bnfs01 kernel: esi: 82eb6340   edi: c336d760   ebp:
000002aa   esp: c0c55e68
***	Nov  1 04:08:35 bnfs01 kernel: ds: 0018   es: 0018   ss: 0018
***	Nov  1 04:08:35 bnfs01 kernel: Process updatedb (pid: 23103, process
nr: 68, stackpage=c0c55000)
***	Nov  1 04:08:35 bnfs01 kernel: Stack: c56307c0 c2eb62e0 c336d760
000002aa c2eb62e0 c336d760 c01313c4 82eb6340
***	Nov  1 04:08:35 bnfs01 kernel:        c0c55ed0 c0c55ed0 c021e264
00000fff c0c55ed0 00000001 00000fff c013235b
***	Nov  1 04:08:35 bnfs01 kernel:        fffff2ab 00000fff 00000000
c025a5c0 c021e264 c025a5c0 c3477180 c327cdd0
***	Nov  1 04:08:35 bnfs01 kernel: Call Trace: [prune_dcache+220/300]
[try_to_free_inodes+199/264] [grow_inodes+30/384] [get_new_inode+173/280]
[get_new_inode+185/280] [iget+88/96] [ext2_lookup+84/124]
***	Nov  1 04:08:35 bnfs01 kernel:        [real_lookup+79/160]
[lookup_dentry+296/488] [__namei+40/88] [sys_newlstat+14/96]
[system_call+52/56] [startup_32+43/285]
***	Nov  1 04:08:35 bnfs01 kernel: Code: c7 05 00 00 00 00 00 00 00 00 83
c4 08 5b 5e 5f 5d 83 c4 08
	Nov  1 07:08:36 bnfs01 syslogd 1.3-3: restart.


Snippet from /var/log/messages for another occurrence:
	Nov  1 11:12:29 bnfs01 kernel: svc: unknown program 100227 (me 100003)
***	Nov  1 15:26:55 bnfs01 kernel: kfree: Bad obj 80aeefa0
***	Nov  1 15:26:55 bnfs01 kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000000
***	Nov  1 15:26:55 bnfs01 kernel: current->tss.cr3 = 00101000, %cr3 =
00101000
***	Nov  1 15:26:55 bnfs01 kernel: *pde = 00000000
***	Nov  1 15:26:55 bnfs01 kernel: Oops: 0002
***	Nov  1 15:26:55 bnfs01 kernel: CPU:    0
***	Nov  1 15:26:55 bnfs01 kernel: EIP:    0010:[kfree+403/424]
***	Nov  1 15:26:55 bnfs01 kernel: EFLAGS: 00010282
***	Nov  1 15:26:55 bnfs01 kernel: eax: 0000001b   ebx: c0541720   ecx:
00000000   edx: 0000003b
***	Nov  1 15:26:55 bnfs01 kernel: esi: 80aeefa0   edi: c2eb2440   ebp:
0000041c   esp: c6105df8
***	Nov  1 15:26:55 bnfs01 kernel: ds: 0018   es: 0018   ss: 0018
***	Nov  1 15:26:55 bnfs01 kernel: Process nfsd (pid: 573, process nr: 40,
stackpage=c6105000)
***	Nov  1 15:26:55 bnfs01 kernel: Stack: c0541720 c2eb62e0 c2eb2440
0000041c c2eb62e0 c2eb2440 c01313c4 80aeefa0
***	Nov  1 15:26:55 bnfs01 kernel:        c6105e60 c6105e60 c021e264
00000dd7 c6105e60 00000001 00000dd7 c013235b
***	Nov  1 15:26:55 bnfs01 kernel:        fffff43b 00000dd7 00000000
c025a280 c021e264 c025a280 c11954e0 c6762330
***	Nov  1 15:26:55 bnfs01 kernel: Call Trace: [prune_dcache+220/300]
[try_to_free_inodes+199/264] [grow_inodes+30/384] [inet_sendmsg+0/144]
[get_new_inode+185/280] [iget+88/96] [ext2_lookup+84/124]
***	Nov  1 15:26:55 bnfs01 kernel:        [real_lookup+79/160]
[lookup_dentry+296/488] [<c8163ac4>] [<c816a960>] [<c8161b08>] [<c816a960>]
[<c8161437>] [<c816a960>]
***	Nov  1 15:26:55 bnfs01 kernel:        [<c814f468>] [<c816ad20>]
[<c816a88c>] [<c8161235>] [kernel_thread+35/48]
***	Nov  1 15:26:55 bnfs01 kernel: Code: c7 05 00 00 00 00 00 00 00 00 83
c4 08 5b 5e 5f 5d 83 c4 08
	Nov  1 16:52:34 bnfs01 syslogd 1.3-3: restart.
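
For reference, the number after "Oops:" in both traces is the i386 page-fault error code. A minimal sketch of how its low three bits are usually read (the helper name decode_oops_code is made up for illustration; the bit meanings follow the x86 page-fault convention):

```python
def decode_oops_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code.

    Hypothetical helper for illustration only.
    bit 0: 0 = page not present, 1 = protection violation
    bit 1: 0 = read access,      1 = write access
    bit 2: 0 = kernel mode,      1 = user mode
    """
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


# Both traces above report "Oops: 0002".
info = decode_oops_code("0002")
print(info)
```

Read this way, 0002 is a kernel-mode write to a page that is not present, which matches the "NULL pointer dereference at virtual address 00000000" line in both traces.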


cat /proc/cpuinfo:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 1
cpu MHz         : 551.265
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov
pat pse36 mmx fxsr xmm
bogomips        : 1101.00


lspci -v:
00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge
(rev 03)
        Subsystem: Asustek Computer, Inc.: Unknown device 8024
        Flags: bus master, medium devsel, latency 64
        Memory at e4000000 (32-bit, prefetchable)
        Capabilities: <available only to root>

00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev
03) (prog-if 00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 64
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
        I/O behind bridge: 0000d000-0000dfff
        Memory behind bridge: dd800000-dfefffff
        Prefetchable memory behind bridge: e3f00000-e3ffffff

00:04.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
        Flags: bus master, medium devsel, latency 0

00:04.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
(prog-if 80 [Master])
        Flags: bus master, medium devsel, latency 32
        I/O ports at b800

00:04.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01)
(prog-if 00 [UHCI])
        Flags: bus master, medium devsel, latency 0, IRQ 12
        I/O ports at b400

00:04.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
        Flags: medium devsel

00:0b.0 PCI bridge: Distributed Processing Technology PCI Bridge (rev 02)
(prog-if 00 [Normal decode])
        Flags: bus master, medium devsel, latency 32
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
        Capabilities: <available only to root>

00:0b.1 I2O: Distributed Processing Technology SmartRAID V Controller (rev
02) (prog-if 01)
        Subsystem: Distributed Processing Technology: Unknown device c05a
        Flags: bus master, medium devsel, latency 64, IRQ 10
        BIST result: 00
        Memory at e0000000 (32-bit, prefetchable)
        Capabilities: <available only to root>

00:0d.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink]
(rev 74)
        Subsystem: 3Com Corporation 3C905C-TX Fast Etherlink for PC
Management NIC
        Flags: bus master, medium devsel, latency 32, IRQ 12
        I/O ports at a800
        Memory at dd000000 (32-bit, non-prefetchable)
        Capabilities: <available only to root>

01:00.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP
1X/2X (rev 5c) (prog-if 00 [VGA])
        Subsystem: ATI Technologies Inc: Unknown device 0084
        Flags: bus master, stepping, medium devsel, latency 64, IRQ 11
        Memory at de000000 (32-bit, non-prefetchable)
        I/O ports at d800
        Memory at dd800000 (32-bit, non-prefetchable)
        Expansion ROM at e3fe0000 [disabled]
        Capabilities: <available only to root>
Comment 1 Need Real Name 2001-11-01 20:22:41 EST
Created attachment 36121 [details]
lspci -vvv
Comment 2 Need Real Name 2001-11-01 20:23:34 EST
Created attachment 36122 [details]
df -k
Comment 3 Need Real Name 2001-11-01 20:44:37 EST
Created attachment 36123 [details]
dmesg
Comment 4 Need Real Name 2001-11-01 21:16:46 EST
Since the calltrace mentions inodes and ext2, I guessed it could be some disk
failure, or maybe a bug in the scsi driver.

I've been browsing around the Net quite a bit, looking for similar occurrences,
but haven't found any that look like the same problem.  Here's one example:
    http://www.uwsg.indiana.edu/hypermail/linux/kernel/9907.1/0823.html

Any help/comments/input would be greatly appreciated.  :-)
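
One thing that may bear on the kernel-versus-hardware question: both "Bad obj" values (82eb6340 and 80aeefa0) lie below the range where the other kernel addresses in the traces live (c0......). The sketch below checks which single-bit flips would move each value into that range. It assumes the default i386 3G/1G split (PAGE_OFFSET = 0xC0000000), which is a guess based on the stack and text addresses in the logs; the function name is made up for illustration. It is a sanity check, not a diagnosis.

```python
# Assumption: default i386 3G/1G split, so kernel addresses start here.
PAGE_OFFSET = 0xC0000000


def single_bit_repairs(addr, bits=32):
    """Return (bit, candidate) pairs where flipping exactly one bit of
    addr yields a value at or above PAGE_OFFSET. Hypothetical helper."""
    repairs = []
    for bit in range(bits):
        candidate = addr ^ (1 << bit)
        if candidate >= PAGE_OFFSET:
            repairs.append((bit, candidate))
    return repairs


# The two pointers reported by "kfree: Bad obj" in the logs above.
for bad in (0x82EB6340, 0x80AEEFA0):
    for bit, candidate in single_bit_repairs(bad):
        print(f"{bad:08x}: flip bit {bit} -> {candidate:08x}")
```

Under that assumption, each value has exactly one such flip, bit 30 in both cases, giving c2eb6340 and c0aeefa0. The first candidate is close to c2eb62e0, which appears on the stack of the first oops. That pattern could fit either a memory fault or a software bug that clears that bit, so on its own it does not settle the question.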
Comment 5 Arjan van de Ven 2001-11-02 04:20:07 EST
Can you also attach the output of "lsmod" ?
Comment 6 Need Real Name 2001-11-02 09:26:14 EST
Created attachment 36183 [details]
lsmod
Comment 7 Need Real Name 2001-11-02 09:27:11 EST
Created attachment 36184 [details]
lsof (just fyi, about 50k)
Comment 8 Need Real Name 2001-11-02 09:31:08 EST
Created attachment 36185 [details]
top
Comment 9 Need Real Name 2001-11-02 09:36:09 EST
Created attachment 36186 [details]
ps auxw
Comment 10 Need Real Name 2001-11-02 09:40:45 EST
cat /etc/modules.conf:

alias scsi_hostadapter dpt_i2o 
alias eth0 3c59x 
options 3c59x options=4 full_duplex=1 debug=1
alias parport_lowlevel parport_pc 
alias eth1 3c90x 
alias usb-controller usb-uhci
Comment 11 Need Real Name 2001-11-02 17:40:31 EST
Here's a (potentially) missing piece of information:

The lsmod output shows dpt_i2o, which is the driver for the Adaptec ATA RAID
controller, Model 2400A, that's in the box.

Relevant links are:
http://linux.adaptec.com/
http://www.adaptec.com/worldwide/support/driverdetail.html?cat=%2fOperating+System%2fLinux&filekey=aar2400_linux_v221_drv.rpm

Adaptec's driver is specifically for RH 7.0, which is the reason we're using
that release in this case.

I hope their driver isn't the cause of the failure, but I'll send them a
link to this page, so they know.
Comment 12 Alan Cox 2003-06-07 16:14:27 EDT
The later 2.4.x kernels have DPT i2o support as standard and somewhat cleaned
up. Please re-open the bug if the problem is still occurring with 2.4.x kernels.
Thanks
