Bug 92056 - Symbios SCSI timeout
Summary: Symbios SCSI timeout
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-06-02 08:53 UTC by neil
Modified: 2007-04-18 16:54 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-06-09 08:53:02 UTC
Embargoed:



Description neil 2003-06-02 08:53:38 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208
Netscape/7.02

Description of problem:
I am using a dual 1.8GHz Xeon HP X4000 workstation. With the kernel listed
below (kernel-2.4.20-13.7smp), a SCSI timeout occurs when the machine has been
left idle for a number of hours (usually overnight).

Sample from Messages:

May 30 22:39:37 xmasterz kernel: scsi : aborting command due to timeout : pid 72609, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 04 be e3 00 00 58 00
May 30 22:39:37 xmasterz kernel: sym53c8xx_abort: pid=72609 serial_number=72609 serial_number_at_timeout=72609
May 30 22:39:37 xmasterz kernel: SCSI host 0 abort (pid 72609) timed out - resetting
May 30 22:39:37 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.
May 30 22:39:37 xmasterz kernel: sym53c8xx_reset: pid=72609 reset_flags=2 serial_number=72609 serial_number_at_timeout=72609
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: restart (scsi reset).
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: handling phase mismatch from SCRIPTS.
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: Downloading SCSI SCRIPTS.
May 30 22:40:32 xmasterz kernel: sym53c1010-66-0-<0,0>: ordered tag forced.
May 30 22:40:39 xmasterz kernel: SCSI host 0 abort (pid 72610) timed out - resetting
May 30 22:40:39 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.
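To see how often the aborts recur overnight, the "abort ... timed out" lines can be pulled out of the log. What follows is only a rough sketch, assuming Python 3 and that each syslog entry sits on one line; the function name find_abort_timeouts is made up for illustration and is not part of any tool mentioned in this report.

```python
import re

# Matches the SCSI mid-layer message seen in the sample above, e.g.
# "SCSI host 0 abort (pid 72609) timed out - resetting"
ABORT_RE = re.compile(r"SCSI host (\d+) abort \(pid (\d+)\) timed out")


def find_abort_timeouts(lines):
    """Return (timestamp, host, pid) for every abort that timed out.

    Hypothetical helper: assumes the classic syslog prefix, whose
    timestamp ("May 30 22:39:37") occupies the first 15 characters.
    """
    events = []
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            timestamp = line[:15]
            events.append((timestamp, int(match.group(1)), int(match.group(2))))
    return events


# Lines copied from the sample in this report.
sample = [
    "May 30 22:39:37 xmasterz kernel: SCSI host 0 abort (pid 72609) timed out - resetting",
    "May 30 22:39:37 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.",
    "May 30 22:40:39 xmasterz kernel: SCSI host 0 abort (pid 72610) timed out - resetting",
]

for event in find_abort_timeouts(sample):
    print(event)
```

Fed the whole messages file instead of the three sample lines, the timestamps would show whether the aborts cluster at a particular time of night.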

This has actually resulted in the BIOS being unable to pick up the disk
signatures. I have flashed the BIOS, but the error persists.

Version-Release number of selected component (if applicable):
kernel-2.4.20-13.7smp

How reproducible:
Always

Steps to Reproduce:
1. get mentioned hardware
2. update to mentioned kernel ver.
3. leave overnight on gas mark 1
4. system nicely cooked
    

Actual Results:  porkage

Expected Results:  porkage

Additional info:

The problem is not apparent during the day (when I am using the system).

Comment 1 Alan Cox 2003-06-08 12:19:39 UTC
I can think of two causes for this. Firstly, the abort could be because of a
genuine problem, a drive aborting a command for example. Secondly, it might be a
really weird interaction with BIOS power management. The BIOS not finding the
disk sounds like the disk firmware crashed.

The first thing to try would be disabling any power management in the BIOS and
then booting with apm=off as a boot option. I'm not sure it will change
anything, but it eliminates one suspicion.
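
After rebooting, it is worth confirming that the option actually reached the kernel. A minimal sketch, assuming Python 3 and that /proc/cmdline is readable on the machine; the helper names and the example command line are made up for illustration.

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    # Hypothetical example string; on the real machine pass read_cmdline() instead.
    example = "ro root=/dev/sda1 apm=off"
    print(has_boot_option(example, "apm=off"))
```

Matching on whole words keeps a differently spelled option such as apm=offline (made up here) from being mistaken for apm=off.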


Comment 2 neil 2003-06-09 08:53:02 UTC
I had to get a quick resolution for this so I slapped in a HP NetRaid
controller, disabled the on-board SCSI and rebuilt the system. No more SCSI
time-outs. I'm unlikely to return the system to its faulty state but I really
appreciated the comments. The bios power management sounds a good contender if
somewhat troubling (it really shouldn't do that!). If I do get my hands on a
couple of extra disks I'll re-enable the on-board controller - slap them in and
let you know what happens.
Cheers :n) 

