Bug 235336 - qla3200 HBA Card hangs with a memory error
Summary: qla3200 HBA Card hangs with a memory error
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Red Hat Kernel Manager
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2007-04-05 09:24 UTC by Qian Shen
Modified: 2015-10-26 01:54 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-09-27 22:12:34 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Qian Shen 2007-04-05 09:24:32 UTC
Description of problem:

The customer uses a qla2300 HBA card to connect to a storage box. when one of
memory chips becomes fault, the qla2300 card reports waring to /var/log/message
and doesn't work. After replaced a new memory chip, the card become work then.

The customer think that it's a big issue that HBA card doesn't work, so it's not
enough to report warning message only one time. He think it will be good that
HBA card can report warning when each i/o request arrived.

The error message like this:

> Uhhuh. NMI received. Dazed and confused, but trying to continue

> > cpqphp: power fault interrupt

> > cpqphp: power fault bit 1 set

> > You probably have a hardware problem with your RAM chips qla2300 

> > 0000:06:02.0: qla2xxx_eh_abort scsi(1:0:0:26): cmd_timeout_in_sec=0x1e.

> > qla2300 0000:06:02.0: Parity error -- HCCR=ffff.

> > qla2300 0000:06:02.0: Mailbox command timeout occured. Issuing ISP abort.

> > qla2300 0000:06:02.0: Performing ISP error recovery - ha= f6f9c258.

> > cpqphp: power fault interrupt

> > cpqphp: power fault bit 0 set

> > qla2300 0000:06:01.0: qla2xxx_eh_abort scsi(0:0:0:26): cmd_timeout_in_sec=0x1e.

> > qla2300 0000:06:01.0: Parity error -- HCCR=ffff.

> > qla2300 0000:06:01.0: Mailbox command timeout occured. Issuing ISP abort.

> > qla2300 0000:06:01.0: Performing ISP error recovery - ha= f7e0c258.

Version-Release number of selected component (if applicable):
RHEL AS 4.4

How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Ernie Petrides 2007-09-27 22:12:34 UTC
This is a hardware problem.  The operating system cannot be expected
to continue to function properly when there are disk controller errors.


Note You need to log in before you can comment on or make changes to this bug.