Bug 26633

Summary: [noapic] SCSI Timeout Errors on NCR53c1510D / sym53c1510D
Product: [Retired] Red Hat Linux
Reporter: Richard Black <richard.black>
Component: kernel
Assignee: Michael K. Johnson <johnsonm>
Status: CLOSED RAWHIDE
QA Contact: Brock Organ <borgan>
Severity: high
Priority: high
Version: 7.1
CC: bryan.leopard, john.cagle
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard: Florence Gold
Doc Type: Bug Fix
Last Closed: 2001-03-23 23:10:42 UTC

Description Richard Black 2001-02-08 02:41:57 UTC
Red Hat Linux Beta - beta3
Feb 7, 2001
Passing HW: ML370 (no ROC)
Failing HW: DL360 (no ROC), ML570 (no ROC)

SCSI Timeout Errors on NCR53c1510D / sym53c1510D

After successful installation and after rebooting, the system displays the 
following repeating error:

SCSI host 0 abort (PID 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0
sym53c8xx_reset PID=0, reset flags=2

Tried 3 separate ML570s and one DL360.

ML370 has the same chipset and installs OK.

Duplicatable: Yes.

Compaq Bug #: 194115

Compaq considers this defect a MUST-FIX for Florence.

Comment 1 Glen Foster 2001-02-08 23:43:19 UTC
This defect is considered MUST-FIX for Florence Gold release

Comment 2 Glen Foster 2001-02-09 23:19:01 UTC
This defect is considered MUST-FIX for Florence Gold release

Comment 3 Michael K. Johnson 2001-02-14 20:18:10 UTC
Does this hardware work with the 2.2 kernel?
Does the ncr53c8xx driver work with this hardware?

Comment 4 Michael K. Johnson 2001-02-28 15:11:46 UTC
Does the boot argument "noapic" affect this?

Comment 5 Michael K. Johnson 2001-03-01 03:23:33 UTC
Also, have you tried changing the OS type in the BIOS?

Answers to these questions would make it a lot easier for us to
try to track these down...

Comment 6 Michael K. Johnson 2001-03-12 20:45:37 UTC
If we do not get feedback on this report, we won't have much else to
do with it other than close it...

Comment 7 Richard Black 2001-03-14 17:53:54 UTC
The following SCSI issues should prove to be dependent upon the APIC fix found 
in Linux Kernel 2.4.2-patch5.  These SCSI issues are being tested at this time, 
now that the latest beta seems to have the needed patch.

30170 - Full-Table-Mapped APIC mode causes a problem with installing Kernel 
Linux version 2.4.1-0.1.9smp or later.
26632 - Cannot install to a Compaq array controller - hang
26633 - SCSI Timeout Errors on NCR53c1510D / sym53c1510D
26634 - [aic7xxx] Kernel Bug, 64bit slot, adaptec ctrlr, DL360


Comment 8 Richard Black 2001-03-19 15:14:38 UTC
qa0309 does not have the fix needed.  Drivers still error out (timeouts, 
hangs, ...)

Comment 9 Bryan Leopard 2001-03-23 00:14:20 UTC
This information was already forwarded to Red Hat.  Just wanted to update 
Bugzilla.

I think we've found the problem that's breaking SMP/APIC support on ServerWorks 
servers.
It appears that when the 2.4.0 kernel was released (on Jan. 4), an old 
version of drivers/ide/osb4.c crept back in.  Whenever we build a kernel with 
this suspect version enabled, it breaks SMP APIC interrupts.
If you do a diff of osb4.c from 2.4.0-prerelease and 2.4.0, you'll find a ton 
of changes.  Our guess is that it's one of these changes (probably chipset 
initialization) that's breaking APIC support with SMP kernels.
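
The comparison described above can be reproduced with any diff tool. As a minimal sketch, assuming both source trees are unpacked side by side (the directory names in the example are hypothetical, as are the helper names `unified_osb4_diff` and `diff_files`), Python's standard `difflib` module can produce the unified diff:

```python
import difflib


def unified_osb4_diff(old_text, new_text,
                      old_label="2.4.0-prerelease", new_label="2.4.0"):
    """Return a unified diff (list of lines) between two versions of a file."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=old_label,
            tofile=new_label,
        )
    )


def diff_files(old_path, new_path):
    """Read two files from disk and diff them; the caller supplies the paths."""
    # errors="replace" avoids failing on stray non-UTF-8 bytes in old sources.
    with open(old_path, errors="replace") as old, \
         open(new_path, errors="replace") as new:
        return unified_osb4_diff(old.read(), new.read(), old_path, new_path)


if __name__ == "__main__":
    # Hypothetical locations of the two unpacked kernel trees.
    for line in diff_files(
        "linux-2.4.0-prerelease/drivers/ide/osb4.c",
        "linux-2.4.0/drivers/ide/osb4.c",
    ):
        print(line, end="")
```

An empty result would mean the two files are identical; any hunks touching chipset initialization are the ones worth reading first, given the guess above.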


Here is some more information from our developers.

The below fix works!

I see 2 basic ways for this bug to be fixed:

1) Make sure that the CONFIG_BLK_DEV_OSB4 line in the kernel .config file is
   NOT set, or

2) Replace the osb4.c file with the osb4.c file from the 2.4.0-prerelease
   source tree (this is probably a better solution).
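
Option 1 can be checked mechanically before building a kernel. The following is a minimal Python sketch, assuming the usual kernel .config conventions in which an enabled option appears as `CONFIG_X=y` (or `=m`) and a disabled one appears as `# CONFIG_X is not set` or is absent altogether. The function name `option_enabled` and the sample text are hypothetical and not part of any kernel tooling.

```python
def option_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options show up as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == option:
            return value.strip() in ("y", "m")
    # Option not mentioned at all: treat as not set.
    return False


# Hypothetical sample; in practice, read the text from the kernel tree's .config.
sample = """\
CONFIG_BLK_DEV_IDEPCI=y
# CONFIG_BLK_DEV_OSB4 is not set
"""

if __name__ == "__main__":
    if option_enabled(sample, "CONFIG_BLK_DEV_OSB4"):
        print("CONFIG_BLK_DEV_OSB4 is enabled; disable it before building")
    else:
        print("CONFIG_BLK_DEV_OSB4 is not set")
```

Running the same check against each shipped config file is one way to confirm that no kernel variant enables the driver.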


Comment 10 Bob Matthews 2001-03-23 21:25:21 UTC
CONFIG_BLK_DEV_OSB4 is off in all our config files

Comment 11 Bob Matthews 2001-03-23 21:39:39 UTC
Waiting for Compaq to officially confirm fix.

Comment 12 John Cagle 2001-03-23 23:10:37 UTC
Compaq confirms that the 2.4.2-0.1.35smp kernel (with CONFIG_BLK_DEV_OSB4 
turned OFF) resolves the SCSI Timeout and CPQARRAY lockups during 
post-installation boot.

HOWEVER, Compaq would also like to see the CONFIG_BLK_DEV_OSB4 option marked as 
EXPERIMENTAL in drivers/ide/Config.in in all Florence kernel packages, as well 
as in the "ac" series kernels.  This will keep customers from easily stumbling 
across this problem.


Comment 13 Arjan van de Ven 2001-03-26 14:17:50 UTC
This driver is now marked as "Dangerous" in our kernel source.
Alan has agreed to this concept and the patch will be mailed to him in a few
minutes.