From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02

Description of problem:
I am using a dual 1.8GHz Xeon HP X4000 workstation. Using the kernel listed below causes a SCSI timeout when the station has been left idle for a number of hours (usually overnight).

Sample from Messages:
May 30 22:39:37 xmasterz kernel: scsi : aborting command due to timeout : pid 72609, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 04 be e3 00 00 58 00
May 30 22:39:37 xmasterz kernel: sym53c8xx_abort: pid=72609 serial_number=72609 serial_number_at_timeout=72609
May 30 22:39:37 xmasterz kernel: SCSI host 0 abort (pid 72609) timed out - resetting
May 30 22:39:37 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.
May 30 22:39:37 xmasterz kernel: sym53c8xx_reset: pid=72609 reset_flags=2 serial_number=72609 serial_number_at_timeout=72609
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: restart (scsi reset).
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: handling phase mismatch from SCRIPTS.
May 30 22:39:37 xmasterz kernel: sym53c1010-66-0: Downloading SCSI SCRIPTS.
May 30 22:40:32 xmasterz kernel: sym53c1010-66-0-<0,0>: ordered tag forced.
May 30 22:40:39 xmasterz kernel: SCSI host 0 abort (pid 72610) timed out - resetting
May 30 22:40:39 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.

This has actually resulted in the BIOS being unable to pick up disc signatures. I have flashed the BIOS but the error persists.

Version-Release number of selected component (if applicable):
kernel-2.4.20-13.7smp

How reproducible:
Always

Steps to Reproduce:
1. get mentioned hardware
2. update to mentioned kernel ver.
3. leave overnight on gas mark 1
4. system nicely cooked

Actual Results: porkage

Expected Results: porkage

Additional info:
problem is not apparent during the day (when I am using the system).
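Since the report says the timeouts only show up after hours of idle time, it may help to tally them by hour of day. The following is a minimal sketch, not part of the original report: it assumes the syslog line layout shown in the sample above, and the function name `timeout_hours` is made up for illustration.

```python
import re
from collections import Counter

# Assumed syslog layout, taken from the sample in the report:
#   "May 30 22:39:37 xmasterz kernel: <message>"
LINE_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+kernel:\s+(?P<msg>.*)$"
)

def timeout_hours(lines):
    """Count 'aborting command due to timeout' messages per hour of day."""
    hours = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "aborting command due to timeout" in m.group("msg"):
            hours[int(m.group("hour"))] += 1
    return hours

# Three lines copied from the sample in the report.
sample = [
    "May 30 22:39:37 xmasterz kernel: scsi : aborting command due to timeout : "
    "pid 72609, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 04 be e3 00 00 58 00",
    "May 30 22:39:37 xmasterz kernel: SCSI bus is being reset for host 0 channel 0.",
    "May 30 22:40:39 xmasterz kernel: SCSI host 0 abort (pid 72610) timed out - resetting",
]
print(timeout_hours(sample))
```

Run over the full messages file (passing the open file object as `lines`), a cluster of counts in the overnight hours would back up the idle-time observation.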
I can think of two causes for this. Firstly, the abort could be because of a genuine problem, a drive aborting a command for example. Secondly, it might be a really weird interaction with BIOS power management. The BIOS not finding the disk sounds like the disk firmware crashed. The first thing to try would be disabling any power management in the BIOS and then booting with apm=off as a boot option. I'm not sure it will change anything, but it eliminates one suspicion.
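After rebooting, it is worth confirming that the apm=off option actually reached the kernel. A rough sketch of one way to check, assuming the kernel command line is readable from /proc/cmdline; the helper names and the example command line string (including its root device) are hypothetical:

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()

def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()

# Hypothetical command line for illustration; on the affected machine,
# pass read_cmdline() instead of this string.
example = "ro root=/dev/sda1 apm=off"
print(has_boot_option(example, "apm=off"))
```

Splitting on whitespace rather than doing a substring search avoids a false match on a longer option that merely contains the text.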
I had to get a quick resolution for this, so I slapped in an HP NetRaid controller, disabled the on-board SCSI and rebuilt the system. No more SCSI time-outs. I'm unlikely to return the system to its faulty state, but I really appreciated the comments. The BIOS power management sounds a good contender, if somewhat troubling (it really shouldn't do that!). If I do get my hands on a couple of extra disks I'll re-enable the on-board controller, slap them in and let you know what happens. Cheers :n)