Bug 89164 - sym53c8xx-1.7.3c-20010512 loops forever timing out
Summary: sym53c8xx-1.7.3c-20010512 loops forever timing out
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
 
Reported: 2003-04-18 20:57 UTC by Joe Keller
Modified: 2007-04-18 16:53 UTC
Last Closed: 2003-12-17 01:41:50 UTC

Description Joe Keller 2003-04-18 20:57:51 UTC
Description of problem:

Periodically, after the reset button is pressed on a Pentium 3 cPCI blade that is
connected to a dual IDE disk drive carrier with a SCSI interface, the boot hangs
with the following timeout messages:

sym53c1010-66-0: rev 0x1 on pci bus 1 device 4 function 0 irq 26
sym53c1010-66-0: Symbios format NVRAM, ID 7, Fast-80, Parity Checking
sym53c1010-66-0: on-chip RAM at 0xfe000000
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
sym53c1010-66-1: rev 0x1 on pci bus 1 device 4 function 1 irq 27
sym53c1010-66-1: Symbios format NVRAM, ID 7, Fast-80, Parity Checking
sym53c1010-66-1: on-chip RAM at 0xfe002000
sym53c1010-66-1: restart (scsi reset).
sym53c1010-66-1: handling phase mismatch from SCRIPTS.
sym53c1010-66-1: Downloading SCSI SCRIPTS.
scsi0 : sym53c8xx-1.7.3c-20010512
scsi1 : sym53c8xx-1.7.3c-20010512
blk: queue f7b6c018, I/O limit 4095Mb (mask 0xffffffff)
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun 0 Inquiry 00 00 00 ff 00
sym53c8xx_abort: pid=0 serial_number=1 serial_number_at_timeout=1
SCSI host 0 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
sym53c8xx_reset: pid=0 reset_flags=2 serial_number=1 serial_number_at_timeout=1
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
SCSI host 0 abort (pid 1) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
sym53c8xx_reset: pid=1 reset_flags=2 serial_number=2 serial_number_at_timeout=2
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
SCSI host 0 abort (pid 2) timed out - resetting
SCSI bus is being reset for host 0 channel 0.

The problem does not clear until the disk carrier is reseated.  Reseating the
Pentium 3 blade does not clear it.  The problem has never appeared when
rebooting with the Linux reboot command.



Version-Release number of selected component (if applicable):

kernel-smp-2.4.18-10

How reproducible:

Intermittent (occurs periodically when the reset button is pressed)

Steps to Reproduce:
1. Press the reset button on the Pentium 3 cPCI blade
    
Actual results:

The following messages repeat indefinitely, with XXX incrementing by 1 each time:

SCSI host 0 abort (pid XXX) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
sym53c8xx_reset: pid=XXX reset_flags=2 serial_number=XXX serial_number_at_timeout=XXX
sym53c1010-66-0: restart (scsi reset).
sym53c1010-66-0: handling phase mismatch from SCRIPTS.
sym53c1010-66-0: Downloading SCSI SCRIPTS.
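
For anyone triaging a captured boot log for this symptom, the repeating pattern above can be detected mechanically. The following is only a minimal sketch, not part of the driver or of any Red Hat tool: the function name find_reset_loops and the threshold of three consecutive timeouts are assumptions made for illustration, and the regular expression is based solely on the message text quoted in this report.

```python
import re

# Matches the message quoted in this report, e.g.
# "SCSI host 0 abort (pid 2) timed out - resetting"
ABORT_RE = re.compile(r"SCSI host (\d+) abort \(pid (\d+)\) timed out - resetting")


def find_reset_loops(lines, threshold=3):
    """Return {host: [pids]} for hosts that logged at least `threshold`
    abort timeouts whose pid increments by 1 each time.

    `threshold` is an arbitrary cutoff chosen for this sketch.
    """
    runs = {}      # host number -> current run of consecutive pids
    looping = {}   # host number -> longest run seen so far that met the threshold
    for line in lines:
        match = ABORT_RE.search(line)
        if not match:
            continue
        host, pid = int(match.group(1)), int(match.group(2))
        run = runs.setdefault(host, [])
        if run and pid != run[-1] + 1:
            run.clear()  # pid did not increment by 1, so start a new run
        run.append(pid)
        if len(run) >= threshold:
            looping[host] = list(run)
    return looping


if __name__ == "__main__":
    # Lines copied from the description in this report.
    sample = [
        "SCSI host 0 abort (pid 0) timed out - resetting",
        "SCSI bus is being reset for host 0 channel 0.",
        "SCSI host 0 abort (pid 1) timed out - resetting",
        "SCSI bus is being reset for host 0 channel 0.",
        "SCSI host 0 abort (pid 2) timed out - resetting",
    ]
    print(find_reset_loops(sample))
```

Run against the messages quoted in the description, the sketch reports host 0 with pids 0, 1 and 2. A log from a successful boot, which contains no abort timeouts, yields an empty result.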


Expected results:

The disk carrier should be recognized as follows:

blk: queue f7b6c018, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: ADTRON    Model: SC6x2 Q16015_045  Rev: 2s04
  Type:   Direct-Access                      ANSI SCSI revision: 02
blk: queue f7541e18, I/O limit 4095Mb (mask 0xffffffff)
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
sym53c1010-66-0-<0,0>: phase change 6-7 9@003f34b8 resid=5.
sym53c1010-66-0-<0,*>: FAST-10 SCSI 10.0 MB/s (100.0 ns, offset 7)
SCSI device sda: 39070080 512-byte hdwr sectors (20004 MB)
 sda: sda1 sda2


Additional info:

Comment 1 Arjan van de Ven 2003-04-19 10:39:37 UTC
Please try the erratum kernel as well.
In addition, we ship an alternate driver (sym53c8xx_2) that might behave better
in this case. However, it really sounds like the hardware is getting
confused by the reset button's power break.

