Bug 147269 - Unbootable system with 2.4.21-27.0.x kernels and sym53c8xx_2
Summary: Unbootable system with 2.4.21-27.0.x kernels and sym53c8xx_2
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-02-05 22:30 UTC by Joshua Baker-LePain
Modified: 2007-11-30 22:07 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-02-09 20:01:36 UTC
Target Upstream Version:
Embargoed:



Description Joshua Baker-LePain 2005-02-05 22:30:16 UTC
Description of problem:

I have a system with an Overland Powerloader AIT3 tape library hanging
off a Tekram DC-390U3W controller.  With recent kernels, this system
becomes unbootable.  Trying to boot kernel-smp-2.4.21-27.0.2.EL
results in a string of SCSI bus resets and finally an OOPS.  Trying to
boot kernel-smp-2.4.21-27.0.1.EL results in (seemingly) neverending
SCSI bus resets.  kernel-smp-2.4.21-20.0.1.EL works just fine.  I'm
using the sym53c8xx_2 driver, which seems to be the same version among
these kernels.

System info: dual 2.4 GHz Xeons on Supermicro X5DPI-G2 motherboard. 
lspci info for Tekram controller:
02:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)
        Subsystem: Tekram Technology Co.,Ltd. DC-390U3W
        Flags: bus master, medium devsel, latency 140, IRQ 48
        I/O ports at 3000 [size=256]
        Memory at fc204000 (64-bit, non-prefetchable) [size=1K]
        Memory at fc200000 (64-bit, non-prefetchable) [size=8K]
        Expansion ROM at <unassigned> [disabled] [size=64K]
        Capabilities: [40] Power Management version 2

02:01.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)
        Subsystem: Tekram Technology Co.,Ltd. DC-390U3W
        Flags: bus master, medium devsel, latency 140, IRQ 48
        I/O ports at 3400 [size=256]
        Memory at fc204400 (64-bit, non-prefetchable) [size=1K]
        Memory at fc202000 (64-bit, non-prefetchable) [size=8K]
        Expansion ROM at <unassigned> [disabled] [size=64K]
        Capabilities: [40] Power Management version 2

cat /proc/scsi/scsi:
Attached devices: 
Host: scsi0 Channel: 00 Id: 03 Lun: 00
  Vendor: OVERLAND Model: LIBRARYPRO       Rev: 0420
  Type:   Medium Changer                   ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: SONY     Model: SDX-700C         Rev: 0202
  Type:   Sequential-Access                ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: SONY     Model: SDX-700C         Rev: 0202
  Type:   Sequential-Access                ANSI SCSI revision: 02


Version-Release number of selected component (if applicable):


How reproducible: Always


Steps to Reproduce:
1.  Boot system with recent kernel
  
Actual results:  SCSI bus resets/OOPS


Expected results: Normal operation


Additional info:

Comment 1 Suzanne Hillman 2005-02-07 20:52:42 UTC
The version of the RHEL3 kernel which was last released was
kernel-smp-2.4.21-27.EL, not the versions you mention above.

Please check if you can repeat the problem on that kernel.

Comment 2 Ernie Petrides 2005-02-07 22:37:53 UTC
Actually, 2.4.21-27.0.2.EL is the latest release (2nd security
errata release after U4), so this is fine.


Comment 5 Tom Coughlan 2005-02-09 18:45:01 UTC
Joshua,

You are right, there were no changes to the driver. There were also no changes
to the SCSI midlayer, including the SCSI whitelist, that look relevant. 

Are the SCSI bus resets preceded by command timeout messages? Please post the
messages.

My guess is that interrupts are not getting delivered, causing command timeouts,
which cause bus resets. Let us know what type of system it is.

Tom
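
One way to test the interrupt-delivery guess on a system that does boot is to
take two snapshots of /proc/interrupts while the tape library is busy and see
whether the controller's counter moves. The following is only a minimal sketch:
the helper names (irq_counts, stalled_irqs) are hypothetical, and it assumes
the driver shows up in /proc/interrupts under the name "sym53c8xx", which
should be checked on the machine in question.

```python
def irq_counts(interrupts_text, device_substring):
    """Sum the per-CPU counters for every /proc/interrupts line naming the device.

    Returns a dict mapping the IRQ label (e.g. "48") to the total count.
    The device name to look for is an assumption supplied by the caller.
    """
    counts = {}
    for line in interrupts_text.splitlines():
        if device_substring not in line or ":" not in line:
            continue
        label, rest = line.split(":", 1)
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # per-CPU counters end at the first non-numeric field
            total += int(field)
        counts[label.strip()] = total
    return counts


def stalled_irqs(before, after):
    """Return IRQ labels whose count did not increase between two snapshots."""
    return sorted(irq for irq, n in after.items() if n <= before.get(irq, 0))


# Hypothetical usage on the affected host (run while the library is in use):
#   import time
#   first = irq_counts(open("/proc/interrupts").read(), "sym53c8xx")
#   time.sleep(5)
#   second = irq_counts(open("/proc/interrupts").read(), "sym53c8xx")
#   print(stalled_irqs(first, second))
```

A counter that stays flat while commands are outstanding would be consistent
with the timeout-then-reset pattern described above; a counter that keeps
rising would point elsewhere.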

Comment 6 Joshua Baker-LePain 2005-02-09 20:01:36 UTC
Well color me confused.  Now that I'm trying to capture logs, I can't
recreate the bug.  27.0.1 and 27.0.2 boot in the same manner 20.0.1
did.    What's even better is that I haven't touched the library since
those kernels wouldn't boot.  Before I opened this bug, I had tried
all the obvious things (checked cables, power-cycled the loader, etc),
and it still wouldn't boot.  Now, a few days (and backup cycles, but
no power cycles of the loader) later, it seems to work just fine.

I guess I'll close this and mark it as a Heisenbug.  Sorry for the noise.

