Bug 128765 - System hangs on fiber channel errors with qla2100 HBA's
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 2
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Dave Jones
QA Contact: Brian Brock
Reported: 2004-07-29 05:03 EDT by John Bass
Modified: 2015-01-04 17:08 EST
Last Closed: 2005-04-16 00:10:45 EDT
Description John Bass 2004-07-29 05:03:38 EDT
Description of problem:
Hardware errors and load-induced errors result in system hangs.

Version-Release number of selected component (if applicable):


How reproducible:
Very. Hang several drives off T-Cards with copper cables, initialize
them as a RAID5 /dev/md array, and use it as the root drive on a very
busy server, particularly if the drives have read errors. The system is
stable with Emulex LP8000s under RH9 but dies under FC2 with QLA2100s.

Steps to Reproduce:
1.
2.
3.
  
Actual results:
The kernel hangs, looping with this error text in the system log:

Jul 28 05:42:22 rmhit kernel: qla2100 0000:04:04.0: qla2xxx_eh_abort scsi(2:0:5:0): cmd_timeout_in_sec=0x1e.
Jul 28 05:42:22 rmhit kernel: Debug: sleeping function called from invalid context at include/asm/semaphore.h:119
Jul 28 05:42:22 rmhit kernel: in_atomic():0, irqs_disabled():1
Jul 28 05:42:22 rmhit kernel:  [<0211e8e7>] __might_sleep+0x80/0x8a
Jul 28 05:42:22 rmhit kernel:  [<328951c5>] qla2x00_mailbox_command+0x1b9/0x3fc [qla2xxx]
Jul 28 05:42:22 rmhit kernel:  [<32894ffc>] qla2x00_mbx_sem_timeout+0x0/0x10 [qla2xxx]
Jul 28 05:42:22 rmhit kernel:  [<32895b17>] qla2x00_abort_command+0xca/0xe5 [qla2xxx]
Jul 28 05:42:23 rmhit kernel:  [<02120e8b>] call_console_drivers+0xbe/0xe3
Jul 28 05:42:23 rmhit kernel:  [<0212109b>] printk+0x122/0x134
Jul 28 05:42:23 rmhit kernel:  [<3288ce40>] qla2xxx_eh_abort+0xdd/0x7d9 [qla2xxx]
Jul 28 05:42:23 rmhit kernel:  [<328a7000>] qla2100_probe_one+0x0/0x8 [qla2100]
Jul 28 05:42:23 rmhit kernel:  [<3288d358>] qla2xxx_eh_abort+0x5f5/0x7d9 [qla2xxx]
Jul 28 05:42:23 rmhit kernel:  [<3282fe14>] scsi_try_to_abort_cmd+0x4d/0x5e [scsi_mod]
Jul 28 05:42:23 rmhit kernel:  [<3282ff3c>] scsi_eh_abort_cmds+0x52/0xc1 [scsi_mod]
Jul 28 05:42:23 rmhit kernel:  [<32830c69>] scsi_unjam_host+0x167/0x18b [scsi_mod]
Jul 28 05:42:23 rmhit kernel:  [<32830dbe>] scsi_error_handler+0x131/0x176 [scsi_mod]
Jul 28 05:42:23 rmhit kernel:  [<32830c8d>] scsi_error_handler+0x0/0x176 [scsi_mod]
Jul 28 05:42:23 rmhit kernel:  [<021051f1>] kernel_thread_helper+0x5/0xb
Jul 28 05:42:23 rmhit kernel:
Jul 28 05:42:37 rmhit kernel: qla2100 0000:04:04.0: qla2xxx_eh_abort Exiting: status=Failed
Jul 28 05:42:37 rmhit kernel: qla2100 0000:04:04.0: qla2xxx_eh_abort scsi(2:0:5:0): cmd_timeout_in_sec=0x1e.
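
The trace above is easier to read once the wrapped lines are rejoined and the frames listed by module. The following is a minimal sketch of a log summarizer, not part of the driver or any Red Hat tool. The helper names (`unwrap`, `summarize`) and the regular expressions are illustrative assumptions based only on the lines quoted in this report, and the sample is a shortened excerpt of that log.

```python
import re

# Excerpt copied from the log in this report, with the wrapped lines kept
# exactly as they were pasted.
SAMPLE = """\
Jul 28 05:42:22 rmhit kernel: Debug: sleeping function called from
invalid context at include/asm/semaphore.h:119
Jul 28 05:42:22 rmhit kernel: in_atomic():0, irqs_disabled():1
Jul 28 05:42:22 rmhit kernel:  [<0211e8e7>] __might_sleep+0x80/0x8a
Jul 28 05:42:22 rmhit kernel:  [<328951c5>]
qla2x00_mailbox_command+0x1b9/0x3fc [qla2xxx]
Jul 28 05:42:22 rmhit kernel:  [<32895b17>]
qla2x00_abort_command+0xca/0xe5 [qla2xxx]
Jul 28 05:42:23 rmhit kernel:  [<3288ce40>]
qla2xxx_eh_abort+0xdd/0x7d9 [qla2xxx]
"""

# Assumed syslog prefix shape: "Mon DD HH:MM:SS host kernel:".
SYSLOG_PREFIX = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d \S+ kernel: ?")
# A stack frame: "[<address>] symbol+0xoff/0xsize [module]" (module optional).
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)
CONTEXT = re.compile(r"in_atomic\(\):(\d+), irqs_disabled\(\):(\d+)")


def unwrap(text):
    """Join wrapped continuation lines onto the syslog line they belong to."""
    records = []
    for line in text.splitlines():
        if SYSLOG_PREFIX.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line.strip()
    return records


def summarize(text):
    """Return the reported context flags and a list of (symbol, module) frames."""
    context, frames = None, []
    for record in unwrap(text):
        body = SYSLOG_PREFIX.sub("", record)
        flags = CONTEXT.search(body)
        if flags:
            context = {
                "in_atomic": int(flags.group(1)),
                "irqs_disabled": int(flags.group(2)),
            }
        frame = FRAME.search(body)
        if frame:
            # Frames without a "[module]" suffix belong to the core kernel.
            frames.append((frame.group(1), frame.group(2) or "kernel"))
    return context, frames


if __name__ == "__main__":
    context, frames = summarize(SAMPLE)
    print(context)
    for symbol, module in frames:
        print(f"{module:10s} {symbol}")
```

On this excerpt the summary shows `irqs_disabled` set while `qla2x00_mailbox_command`, reached from `qla2xxx_eh_abort`, triggers the `__might_sleep` check, which is what the "sleeping function called from invalid context" message reports.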


Expected results:


Additional info:

It would be nice to integrate Emulex's GPL LP8000 driver.
Comment 1 Andrew Vasquez 2004-08-03 12:01:38 EDT
Please try a more recent driver that has been forwarded along
upstream.  This particular eh_abort issue was addressed in 8.00.00b13.
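
To check whether an installed driver is at least the version named above, a comparison along these lines may help. This is a minimal sketch under two assumptions that the report does not confirm: that version strings follow the `major.minor.patch[bN]` shape seen in `8.00.00b13`, and that a version without a `bN` suffix sorts after every beta of the same number. The function names are hypothetical, and how the installed version string is obtained is deliberately left out.

```python
import re

FIXED_IN = "8.00.00b13"  # version named in this comment

# Assumed version shape: three numeric fields plus an optional "b<N>" beta tag.
VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:b(\d+))?$")


def parse_version(text):
    """Turn a version string into a tuple that compares in release order."""
    match = VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized driver version: {text!r}")
    major, minor, patch, beta = match.groups()
    # Assumption: no beta tag means a final release, ranked above any beta.
    beta_rank = int(beta) if beta is not None else float("inf")
    return (int(major), int(minor), int(patch), beta_rank)


def has_eh_abort_fix(installed, fixed_in=FIXED_IN):
    """True if `installed` is the fixed version or a later one."""
    return parse_version(installed) >= parse_version(fixed_in)


if __name__ == "__main__":
    for candidate in ("8.00.00b9", "8.00.00b13", "8.00.00"):
        print(candidate, has_eh_abort_fix(candidate))
```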
Comment 2 Dave Jones 2005-04-16 00:10:45 EDT
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat.  The Fedora legacy project will be producing further kernel
updates for security problems only.

If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.

Thank you.
