Bug 223198 - qla2400 Failed to load segment 0 of firmware
Status: CLOSED CANTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: x86_64 Linux
Priority: medium
Severity: urgent
Assigned To: Andrew Vasquez
QA Contact: Brian Brock
Depends On:
Blocks: 216986
Reported: 2007-01-18 06:26 EST by Didier Belhomme
Modified: 2007-11-16 20:14 EST
Doc Type: Bug Fix
Last Closed: 2007-01-19 17:37:35 EST
Attachments
dmesg output (24.25 KB, text/plain)
2007-01-18 06:26 EST, Didier Belhomme
/var/log/messages file (339.01 KB, text/plain)
2007-01-18 06:26 EST, Didier Belhomme
Description Didier Belhomme 2007-01-18 06:26:10 EST
Our system (Sun Fire X4200) is connected through two Sun FC-AL 4 Gb/s cards
(OEM versions of the QLogic 2460) and two Brocade 200E switches to a Sun
StorageTek 6340 storage array. After a while, during high I/O activity, the
system disconnects from the SAN, as reported in /var/log/messages:

Jan 18 09:58:22 hector kernel: qla2400 0000:05:01.0: ISP Request Transfer Error.
Jan 18 09:58:22 hector kernel: qla2400 0000:05:02.0: ISP Request Transfer Error.
Jan 18 09:58:22 hector kernel: qla2400 0000:05:01.0: Performing ISP error
recovery - ha= 00000101fbd983c8.
Jan 18 09:58:22 hector kernel: qla2400 0000:05:02.0: Performing ISP error
recovery - ha= 00000101fbed03c8.
Jan 18 09:58:52 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load
segment 0 of firmware
Jan 18 09:58:52 hector kernel: Mailbox registers:
Jan 18 09:58:52 hector kernel: scsi(1): mbox 0 0x0000
Jan 18 09:58:52 hector kernel: scsi(1): mbox 1 0x0000
Jan 18 09:58:52 hector kernel: scsi(1): mbox 2 0x0001
Jan 18 09:58:52 hector kernel: scsi(1): mbox 3 0x4000
Jan 18 09:58:52 hector kernel: scsi(1): mbox 4 0x0040
Jan 18 09:58:52 hector kernel: scsi(1): mbox 5 0x0000
Jan 18 09:58:52 hector kernel: qla2400 0000:05:02.0: [ERROR] Failed to load
segment 0 of firmware
Jan 18 09:58:52 hector kernel: Mailbox registers:
Jan 18 09:58:52 hector kernel: scsi(2): mbox 0 0x0000
Jan 18 09:58:52 hector kernel: scsi(2): mbox 1 0x0000
Jan 18 09:58:52 hector kernel: scsi(2): mbox 2 0x0001
Jan 18 09:58:52 hector kernel: scsi(2): mbox 3 0x4000
Jan 18 09:58:52 hector kernel: scsi(2): mbox 4 0x0040
Jan 18 09:58:52 hector kernel: scsi(2): mbox 5 0x0000
Jan 18 09:59:22 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load
segment 0 of firmware
Jan 18 09:59:22 hector kernel: Mailbox registers:
Jan 18 09:59:22 hector kernel: scsi(1): mbox 0 0x0000
Jan 18 09:59:22 hector kernel: scsi(1): mbox 1 0x0000
Jan 18 09:59:22 hector kernel: scsi(1): mbox 2 0x0001
Jan 18 09:59:22 hector kernel: scsi(1): mbox 3 0x4000
Jan 18 09:59:22 hector kernel: scsi(1): mbox 4 0x0040
Jan 18 09:59:22 hector kernel: scsi(1): mbox 5 0x0000
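
For anyone triaging a similar log, the excerpt above can be summarized per HBA
with a short script. This is only a sketch: the function name and the sample
list are illustrative, and the regular expression assumes the message format
shown in this report, with each syslog entry on a single line (the excerpt
above is wrapped for display).

```python
import re
from collections import Counter

# Pattern based solely on the error line quoted in this report.
FW_FAIL = re.compile(
    r"kernel: qla2400 (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"\[ERROR\] Failed to load segment (?P<seg>\d+) of firmware"
)

def count_firmware_failures(lines):
    """Return a Counter mapping PCI address -> number of failed firmware loads."""
    counts = Counter()
    for line in lines:
        match = FW_FAIL.search(line)
        if match:
            counts[match.group("pci")] += 1
    return counts

# Unwrapped copies of lines from the excerpt above.
sample = [
    "Jan 18 09:58:22 hector kernel: qla2400 0000:05:01.0: ISP Request Transfer Error.",
    "Jan 18 09:58:52 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load segment 0 of firmware",
    "Jan 18 09:58:52 hector kernel: qla2400 0000:05:02.0: [ERROR] Failed to load segment 0 of firmware",
    "Jan 18 09:59:22 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load segment 0 of firmware",
]

failures = count_firmware_failures(sample)
for pci, count in sorted(failures.items()):
    print(pci, count)
```

To run it against a real log, pass an open file object for /var/log/messages
instead of the sample list.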

The problem is reported on BOTH cards, which prevents any failover (mpp
driver).

Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 7 [RAIDarray.mpp]SAN1:1:0 Path Failed
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039014 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039019 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039022 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039027 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039030 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039035 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039039 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039043 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: st: I/O error, dev sdb, sector 485504928
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485502648
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485505944
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485503664
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485498568
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485496440
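
To see which block devices and sectors are affected, the I/O error lines above
can be grouped by device. The following is a minimal sketch; the helper name is
hypothetical and the pattern assumes the "I/O error, dev ..., sector ..."
wording shown in this log.

```python
import re

# Pattern based on the end_request lines quoted in this report.
IO_ERR = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")

def io_error_sectors(lines):
    """Return a dict mapping device name -> list of failing sector numbers."""
    result = {}
    for line in lines:
        match = IO_ERR.search(line)
        if match:
            result.setdefault(match.group("dev"), []).append(int(match.group("sector")))
    return result

# Lines copied from the excerpt above.
sample = [
    "Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000",
    "Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485502648",
    "Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000",
    "Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485505944",
]

errors = io_error_sectors(sample)
for dev, sectors in errors.items():
    print(dev, len(sectors), "errors, sectors", min(sectors), "to", max(sectors))
```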

Of course, after a while, the device goes into an error state (the ext3 journal
aborts, etc.).
Comment 1 Didier Belhomme 2007-01-18 06:26:11 EST
Created attachment 145904
dmesg output
Comment 2 Didier Belhomme 2007-01-18 06:26:43 EST
Created attachment 145905
/var/log/messages file
Comment 3 Didier Belhomme 2007-01-19 03:30:46 EST
I should add that with the recommended driver downloaded from QLogic (as
indicated in the documentation from Sun Microsystems), I get slightly
different errors. I have reverted to the "standard" driver from the kernel in
order to simplify updates. The file downloaded from QLogic is
qla2xxx-v8.01.06-dist.tgz.
Comment 4 Andrew Vasquez 2007-01-19 12:30:57 EST
This issue has been reported to QLogic by Sun and its customers.
The issue stems from this platform's (X4200) inability to support
modifications to the PCI Max-Memory-Read-Byte-Count.

I can also see that the card is installed in one of the host's
66 MHz slots:

 QLogic Fibre Channel HBA Driver: 8.01.04-d7
  QLogic QLA2460 - Sun PCI-X 2.0 to 4Gb FC, Single Channel
  ISP2422: PCI-X Mode 1 (66 MHz) @ 0000:05:01.0 hdma+, host#=1, fw=4.00.18 [IP] 


A potential workaround for this issue is to place the HBA in a
133 MHz slot.  Beyond that, I'd suggest the customer work directly
with Sun.
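
To check which slot speed each HBA negotiated, the driver banner quoted above
can be parsed from dmesg output. This is a rough sketch, not part of the
driver: the function name is hypothetical, the pattern assumes the banner
format shown in this comment, and the 66 MHz threshold reflects only the
observation reported in this bug.

```python
import re

# Pattern based on the qla2xxx banner line quoted in this comment.
BANNER = re.compile(
    r"PCI-X Mode (?P<mode>\d+) \((?P<mhz>\d+) MHz\) @ (?P<pci>\S+)"
)

def slow_slot_hbas(lines, max_mhz=66):
    """Return (pci_address, mhz) for each HBA running at max_mhz or below."""
    found = []
    for line in lines:
        match = BANNER.search(line)
        if match and int(match.group("mhz")) <= max_mhz:
            found.append((match.group("pci"), int(match.group("mhz"))))
    return found

# Banner line copied from this comment.
sample = [
    "  ISP2422: PCI-X Mode 1 (66 MHz) @ 0000:05:01.0 hdma+, host#=1, fw=4.00.18 [IP]",
]

slow = slow_slot_hbas(sample)
for pci, mhz in slow:
    print(f"{pci} is running at {mhz} MHz; consider moving it to a faster slot")
```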

Comment 5 Didier Belhomme 2007-01-19 15:15:07 EST
The Sun X4200 has three 66 MHz PCI-X slots, one 133 MHz slot and one 100 MHz
slot. Since I have two cards to connect (in order to introduce redundancy in
the SAN connection), I can put one in the 133 MHz slot and the other in the
100 MHz slot. Do you think that workaround would work?

Meanwhile, I'll report the problem to Sun.

And thanks to Andrew for the fast reply.
Comment 6 Andrew Vasquez 2007-01-19 15:54:17 EST
We've only seen the issue when FC HBA cards are attached to
the 66 MHz slots.