Bug 89164

Summary: sym53c8xx-1.7.3c-20010512 loops forever timing out
Product: [Retired] Red Hat Linux
Reporter: Joe Keller <joseph.keller>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium    
Version: 7.3   
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Last Closed: 2003-12-17 01:41:50 UTC

Description Joe Keller 2003-04-18 20:57:51 UTC
Description of problem:

Periodically, when the reset button is pressed on a Pentium 3 cPCI blade that is
connected to a dual IDE disk drive carrier with a SCSI interface, the boot hangs
with the following timeout messages:

sym53c1010-66-0: rev 0x1 on pci bus 1 device 4 function 0 irq 26
sym53c1010-66-0: Symbios format NVRAM, ID 7, Fast-80, Parity Checking
sym53c1010-66-0: on-chip RAM at 0xfe000000
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
sym53c1010-66-1: rev 0x1 on pci bus 1 device 4 function 1 irq 27
sym53c1010-66-1: Symbios format NVRAM, ID 7, Fast-80, Parity Checking
sym53c1010-66-1: on-chip RAM at 0xfe002000
sym53c1010-66-1: restart (scsi reset).
sym53c1010-66-1: handling phase mismatch from SCRIPTS.
sym53c1010-66-1: Downloading SCSI SCRIPTS.
scsi0 : sym53c8xx-1.7.3c-20010512
scsi1 : sym53c8xx-1.7.3c-20010512
blk: queue f7b6c018, I/O limit 4095Mb (mask 0xffffffff)
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun 0 Inquiry 00 00 00 ff 00
sym53c8xx_abort: pid=0 serial_number=1 serial_number_at_timeout=1
SCSI host 0 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
sym53c8xx_reset: pid=0 reset_flags=2 serial_number=1 serial_number_at_timeout=1
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
SCSI host 0 abort (pid 1) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
sym53c8xx_reset: pid=1 reset_flags=2 serial_number=2 serial_number_at_timeout=2
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
SCSI host 0 abort (pid 2) timed out - resetting
SCSI bus is being reset for host 0 channel 0.

The problem does not clear until the disk carrier is reseated.  Reseating the
Pentium 3 blade does not clear it.  The problem has never appeared after a
Linux reboot command.



Version-Release number of selected component (if applicable):

kernel-smp-2.4.18-10

How reproducible:

Periodically (see the description above).

Steps to Reproduce:
1. Press the reset button on the Pentium 3 cPCI blade.
2. Watch the boot messages from the sym53c8xx driver.
Actual results:

The following messages repeat indefinitely, with XXX incrementing by 1 each time:

SCSI host 0 abort (pid XXX) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
sym53c8xx_reset: pid=XXX reset_flags=2 serial_number=XXX serial_number_at_timeout=XXX
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
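
For anyone who wants to spot this loop in a captured console log, here is a
minimal sketch. It only assumes the abort line format shown above; the function
name find_reset_loops and the min_repeats threshold are hypothetical choices
for illustration and are not part of the driver or any Red Hat tool.

```python
import re

# Matches the abort-timeout line exactly as it appears in this report.
ABORT_RE = re.compile(r"SCSI host (\d+) abort \(pid (\d+)\) timed out - resetting")


def find_reset_loops(log_text, min_repeats=3):
    """Return {host: [pids]} for hosts whose abort pids climb by 1
    at least min_repeats times in a row (hypothetical threshold)."""
    runs = {}      # host -> current run of consecutive pids
    looping = {}   # host -> run that reached the threshold
    for line in log_text.splitlines():
        match = ABORT_RE.search(line)
        if not match:
            continue
        host, pid = int(match.group(1)), int(match.group(2))
        run = runs.get(host, [])
        if run and pid == run[-1] + 1:
            run.append(pid)      # pid incremented by 1: the run continues
        else:
            run = [pid]          # anything else starts a new run
        runs[host] = run
        if len(run) >= min_repeats:
            looping[host] = list(run)
    return looping
```

Fed the messages from the description (pid 0, 1, 2 on host 0), this would flag
host 0. A single abort followed by a successful probe would not be flagged.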


Expected results:

The disk carrier should be recognized as follows:

blk: queue f7b6c018, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: ADTRON    Model: SC6x2 Q16015_045  Rev: 2s04
  Type:   Direct-Access                      ANSI SCSI revision: 02
blk: queue f7541e18, I/O limit 4095Mb (mask 0xffffffff)
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
sym53c1010-66-0-<0,0>: phase change 6-7 9@003f34b8 resid=5.
sym53c1010-66-0-<0,*>: FAST-10 SCSI 10.0 MB/s (100.0 ns, offset 7)
SCSI device sda: 39070080 512-byte hdwr sectors (20004 MB)
 sda: sda1 sda2
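
As a sanity check on the numbers in the expected output, the sketch below
recomputes the reported capacity and transfer rate. It assumes decimal
megabytes (10**6 bytes) and a narrow 8-bit bus (one byte per transfer); the
helper names are made up for this example and are not how the kernel computes
these values.

```python
def capacity_mb(sectors, sector_bytes=512):
    """Capacity in decimal megabytes, rounded to the nearest integer."""
    return round(sectors * sector_bytes / 10**6)


def transfer_rate_mb_s(period_ns, bus_bytes=1):
    """Synchronous transfer rate in MB/s for a given period.

    bus_bytes=1 assumes a narrow (8-bit) SCSI bus.
    """
    return 1000.0 / period_ns * bus_bytes


print(capacity_mb(39070080))        # 20004, matching "(20004 MB)"
print(transfer_rate_mb_s(100.0))    # 10.0, matching "FAST-10 SCSI 10.0 MB/s"
```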


Additional info:

Comment 1 Arjan van de Ven 2003-04-19 10:39:37 UTC
Please try the erratum kernel as well.
In addition, we ship an alternate driver (sym53c8xx_2) that might behave better
in this case. However, it really sounds like the hardware is getting confused
by the reset button's power break.