From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208
Description of problem:
I am using a dual 1.8GHz Xeon HP X4000 workstation. Using the aforementioned
kernel causes a SCSI timeout when the station has been left idle for a number of
hours (usually overnight).
Sample from Messages:
May 30 22:39:37 xmasterz kernel: scsi : aborting command due to timeout : pid
72609, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 04 be e3 00 00 58 00
May 30 22:39:37 xmasterz kernel: sym53c8xx_abort: pid=72609 serial_number=72609
May 30 22:39:37 xmasterz kernel: SCSI host 0 abort (pid 72609) timed out - resetting
May 30 22:39:37 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.
May 30 22:39:37 xmasterz kernel: sym53c8xx_reset: pid=72609 reset_flags=2
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: restart (scsi reset).
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: handling phase mismatch from
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: Downloading SCSI SCRIPTS.
May 30 22:40:32 xmasterz kernel: sym53c1010-66-0-<0,0>: ordered tag forced.
May 30 22:40:39 xmasterz kernel: SCSI host 0 abort (pid 72610) timed out - resetting
May 30 22:40:39 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.
This has actually resulted in the BIOS being unable to pick up disc signatures.
I have flashed the BIOS but the error persists.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Get the mentioned hardware.
2. Update to the mentioned kernel version.
3. Leave overnight on gas mark 1.
4. System nicely cooked.
Actual Results: porkage
Expected Results: porkage
The problem is not apparent during the day (when I am using the system).
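One way to check the "only when idle overnight" pattern is to count the SCSI
abort/reset messages per hour of day. The following is a minimal Python sketch,
not something from the original report: the function name scsi_events_by_hour
is hypothetical, the marker strings are copied from the log sample above, and
it assumes the classic syslog line format shown there.

```python
import re
from collections import Counter

# Classic syslog prefix, e.g. "May 30 22:39:37 xmasterz kernel: <message>".
# Assumption: every kernel line in the log follows this layout.
SYSLOG_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+(\S+)\s+kernel:\s+(.*)$"
)

# Substrings taken from the log sample in this report.
MARKERS = (
    "aborting command due to timeout",
    "timed out - resetting",
    "SCSI bus is being reset",
)


def scsi_events_by_hour(lines):
    """Count kernel lines that contain a SCSI timeout/reset marker, per hour."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_RE.match(line)
        if match is None:
            # Wrapped continuation lines and non-kernel lines are skipped.
            continue
        message = match.group(7)
        if any(marker in message for marker in MARKERS):
            counts[int(match.group(3))] += 1
    return counts
```

It could be fed an open log file, for example
scsi_events_by_hour(open("/var/log/messages")), where the path is an assumption
about where syslog writes on this system. A cluster of counts in the night
hours and none during working hours would support the idle-time theory.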
I can think of two causes for this. Firstly, the abort could be due to a
genuine problem, for example a drive aborting a command. Secondly, it might be a
really weird interaction with BIOS power management. The BIOS not finding the
disk sounds like the disk firmware crashed.
The first thing to try would be disabling any power management in the BIOS and
then booting with apm=off as a boot option. I'm not sure it will change
anything, but it eliminates one suspicion.
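After rebooting, it may be worth confirming that the option actually reached the
kernel before drawing conclusions from the next overnight run. A small Python
sketch, assuming the running kernel exposes its command line at /proc/cmdline;
the helper names has_boot_option and read_cmdline are hypothetical:

```python
def has_boot_option(cmdline, option="apm=off"):
    """Return True if `option` is one of the whitespace-separated boot options."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the kernel command line; the path is an assumption about this system."""
    with open(path) as fh:
        return fh.read()
```

Calling has_boot_option(read_cmdline()) on the rebooted machine should return
True if the boot loader passed apm=off through.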
I had to get a quick resolution for this, so I slapped in an HP NetRaid
controller, disabled the on-board SCSI and rebuilt the system. No more SCSI
time-outs. I'm unlikely to return the system to its faulty state, but I really
appreciated the comments. The BIOS power management sounds like a good contender,
if somewhat troubling (it really shouldn't do that!). If I do get my hands on a
couple of extra disks, I'll re-enable the on-board controller, slap them in and
let you know what happens.