Bug 223198 - qla2400 Failed to load segment 0 of firmware
Summary: qla2400 Failed to load segment 0 of firmware
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Andrew Vasquez
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 216986
 
Reported: 2007-01-18 11:26 UTC by Didier Belhomme
Modified: 2007-11-17 01:14 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-01-19 22:37:35 UTC
Target Upstream Version:
Embargoed:


Attachments
dmesg output (24.25 KB, text/plain)
2007-01-18 11:26 UTC, Didier Belhomme
/var/log/messages file (339.01 KB, text/plain)
2007-01-18 11:26 UTC, Didier Belhomme

Description Didier Belhomme 2007-01-18 11:26:10 UTC
Our system (Sun Fire X4200) is connected through two Sun FC-AL 4 Gb/s cards (OEM
version of the QLogic 2460) and two Brocade 200E switches to a Sun StorageTek
6340 storage array. After a while, during high I/O activity, we experience
disconnects from the SAN, as reported in /var/log/messages:

Jan 18 09:58:22 hector kernel: qla2400 0000:05:01.0: ISP Request Transfer Error.
Jan 18 09:58:22 hector kernel: qla2400 0000:05:02.0: ISP Request Transfer Error.
Jan 18 09:58:22 hector kernel: qla2400 0000:05:01.0: Performing ISP error
recovery - ha= 00000101fbd983c8.
Jan 18 09:58:22 hector kernel: qla2400 0000:05:02.0: Performing ISP error
recovery - ha= 00000101fbed03c8.
Jan 18 09:58:52 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load
segment 0 of firmware
Jan 18 09:58:52 hector kernel: Mailbox registers:
Jan 18 09:58:52 hector kernel: scsi(1): mbox 0 0x0000
Jan 18 09:58:52 hector kernel: scsi(1): mbox 1 0x0000
Jan 18 09:58:52 hector kernel: scsi(1): mbox 2 0x0001
Jan 18 09:58:52 hector kernel: scsi(1): mbox 3 0x4000
Jan 18 09:58:52 hector kernel: scsi(1): mbox 4 0x0040
Jan 18 09:58:52 hector kernel: scsi(1): mbox 5 0x0000
Jan 18 09:58:52 hector kernel: qla2400 0000:05:02.0: [ERROR] Failed to load
segment 0 of firmware
Jan 18 09:58:52 hector kernel: Mailbox registers:
Jan 18 09:58:52 hector kernel: scsi(2): mbox 0 0x0000
Jan 18 09:58:52 hector kernel: scsi(2): mbox 1 0x0000
Jan 18 09:58:52 hector kernel: scsi(2): mbox 2 0x0001
Jan 18 09:58:52 hector kernel: scsi(2): mbox 3 0x4000
Jan 18 09:58:52 hector kernel: scsi(2): mbox 4 0x0040
Jan 18 09:58:52 hector kernel: scsi(2): mbox 5 0x0000
Jan 18 09:59:22 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load
segment 0 of firmware
Jan 18 09:59:22 hector kernel: Mailbox registers:
Jan 18 09:59:22 hector kernel: scsi(1): mbox 0 0x0000
Jan 18 09:59:22 hector kernel: scsi(1): mbox 1 0x0000
Jan 18 09:59:22 hector kernel: scsi(1): mbox 2 0x0001
Jan 18 09:59:22 hector kernel: scsi(1): mbox 3 0x4000
Jan 18 09:59:22 hector kernel: scsi(1): mbox 4 0x0040
Jan 18 09:59:22 hector kernel: scsi(1): mbox 5 0x0000

The problem is reported on BOTH cards, which prevents any failover (mpp
driver).

Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 7 [RAIDarray.mpp]SAN1:1:0 Path Failed
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039014 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039019 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039022 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039027 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039030 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039035 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039039 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: 495 [RAIDarray.mpp]SAN1:1:0:0 Cmnd failed-retry
on a new path. vcmnd SN 25039043 pdev H1:C0:T1:L0 0x00/0x00/0x00 0x00010000
mpp_statu
Jan 18 10:00:56 hector kernel: 94 [RAIDarray.mpp]SAN1:1:0:0 Selection Retry
count exhausted
Jan 18 10:00:56 hector kernel: st: I/O error, dev sdb, sector 485504928
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485502648
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485505944
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485503664
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485498568
Jan 18 10:00:56 hector kernel: SCSI error : <3 0 0 0> return code = 0x10000
Jan 18 10:00:56 hector kernel: end_request: I/O error, dev sdb, sector 485496440

Of course, after a while, the device goes into an error state (ext3 journal
aborts, etc.).
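
To confirm that both HBAs fail in the same window, the firmware-load errors can be tallied per PCI address. The following is only a minimal sketch: the regular expression is an assumption derived from the log lines quoted in this report, and the function name `count_firmware_load_failures` is hypothetical. It matches on the first half of the message because the pasted log wraps it across two lines.

```python
import re
from collections import Counter

# Pattern assumed from the qla2400 lines quoted in this report only.
QLA_LINE = re.compile(
    r"kernel: qla2400 (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?P<msg>.*)$"
)

def count_firmware_load_failures(lines):
    """Count '[ERROR] Failed to load' messages per PCI address."""
    failures = Counter()
    for line in lines:
        m = QLA_LINE.search(line.rstrip("\n"))
        if m and m.group("msg").startswith("[ERROR] Failed to load"):
            failures[m.group("pci")] += 1
    return failures

# Sample lines copied from the description above.
sample = [
    "Jan 18 09:58:22 hector kernel: qla2400 0000:05:01.0: ISP Request Transfer Error.",
    "Jan 18 09:58:52 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load",
    "Jan 18 09:58:52 hector kernel: qla2400 0000:05:02.0: [ERROR] Failed to load",
    "Jan 18 09:59:22 hector kernel: qla2400 0000:05:01.0: [ERROR] Failed to load",
]
failures = count_firmware_load_failures(sample)
for pci, count in sorted(failures.items()):
    print(pci, count)
```

On a live system the same function could be fed the lines of /var/log/messages instead of the sample list.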

Comment 1 Didier Belhomme 2007-01-18 11:26:11 UTC
Created attachment 145904
dmesg output

Comment 2 Didier Belhomme 2007-01-18 11:26:43 UTC
Created attachment 145905
/var/log/messages file

Comment 3 Didier Belhomme 2007-01-19 08:30:46 UTC
I should add that with the recommended driver downloaded from QLogic (as
indicated in the Sun Microsystems documentation), I keep getting slightly
different errors. I've reverted to the "standard" driver from the kernel in
order to simplify updates. The file downloaded from QLogic is
qla2xxx-v8.01.06-dist.tgz.

Comment 4 Andrew Vasquez 2007-01-19 17:30:57 UTC
This issue has been reported to QLogic by Sun and its customers.
The issue stems from this platform's (X4200) inability to support
modifications to the PCI Max-Memory-Read-Byte-Count.
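
For readers unfamiliar with the field, here is a small sketch of how the Maximum Memory Read Byte Count is encoded. It assumes the standard PCI-X command register layout, where bits 3:2 select 512, 1024, 2048 or 4096 bytes. The helper names are hypothetical, and this report does not say which value the driver attempts to set.

```python
# Assumption: standard PCI-X command register layout, where bits 3:2
# hold the Maximum Memory Read Byte Count (MMRBC) selector.
PCI_X_CMD_MAX_READ = 0x000C

def mmrbc_bytes(pcix_cmd):
    """Decode the MMRBC, in bytes, from a PCI-X command register value."""
    return 512 << ((pcix_cmd & PCI_X_CMD_MAX_READ) >> 2)

def with_mmrbc(pcix_cmd, nbytes):
    """Return the register value with the MMRBC field set to nbytes."""
    selector = {512: 0, 1024: 1, 2048: 2, 4096: 3}[nbytes]
    return (pcix_cmd & ~PCI_X_CMD_MAX_READ) | (selector << 2)

print(mmrbc_bytes(0x0008))             # selector 2 decodes to 2048 bytes
print(hex(with_mmrbc(0x0001, 4096)))   # other bits are left untouched
```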

I can also see that the card is installed in one of the host's
66 MHz slots:

 QLogic Fibre Channel HBA Driver: 8.01.04-d7
  QLogic QLA2460 - Sun PCI-X 2.0 to 4Gb FC, Single Channel
  ISP2422: PCI-X Mode 1 (66 MHz) @ 0000:05:01.0 hdma+, host#=1, fw=4.00.18 [IP] 


A potential workaround for this issue is to place the HBA in a
133 MHz slot.  Beyond that, I'd suggest the customer work directly
with Sun.
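
The slot speed can be read from the driver banner quoted above. As a rough sketch, the following flags any HBA whose banner reports a bus speed at or below 66 MHz. The regular expression is an assumption based on the single banner line in this comment, and `hbas_in_slow_slots` is a hypothetical helper name.

```python
import re

# Pattern assumed from the one banner line quoted in this bug.
BANNER = re.compile(
    r"(?P<isp>ISP\w+): (?P<bus>PCI-X Mode \d) \((?P<mhz>\d+) MHz\) "
    r"@ (?P<pci>[0-9a-f:.]+)"
)

def hbas_in_slow_slots(lines, max_affected_mhz=66):
    """Return (pci_address, mhz) for HBAs at or below max_affected_mhz."""
    hits = []
    for line in lines:
        m = BANNER.search(line)
        if m and int(m.group("mhz")) <= max_affected_mhz:
            hits.append((m.group("pci"), int(m.group("mhz"))))
    return hits

banner = [
    "  ISP2422: PCI-X Mode 1 (66 MHz) @ 0000:05:01.0 hdma+, host#=1, fw=4.00.18 [IP] ",
]
slow = hbas_in_slow_slots(banner)
print(slow)
```

The banner lines would typically come from dmesg output captured after the driver loads.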



Comment 5 Didier Belhomme 2007-01-19 20:15:07 UTC
The Sun X4200 has three 66 MHz PCI-X slots, one 133 MHz slot and one 100 MHz
slot. Since I have two cards to connect (to provide redundancy in the SAN
connection), I can put one in the 133 MHz slot and the other in the 100 MHz
slot. Do you think that workaround could work?

Meanwhile, I'll report the problem to Sun.

And thanks to Andrew for the fast reply.

Comment 6 Andrew Vasquez 2007-01-19 20:54:17 UTC
We've only seen the issue when FC HBA cards are attached to
the 66 MHz slots.

