Bug 602826

Summary: QLogic 2x00 is unreliable with recent kernel
Product: [Fedora] Fedora
Reporter: CrystalCowboy <fake>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 12
CC: anton, dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, tcallawa
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-12-03 13:55:37 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                             Flags
dmesg output from a different system    none
dmesg output from a third machine       none

Description CrystalCowboy 2010-06-10 20:22:02 UTC
Description of problem: dmesg shows many qla2x00-related error messages during startup. Mounting of drives over the interface is delayed, and reliability is poor.


Version-Release number of selected component (if applicable):
Kernel 2.6.32.12-115.fc12.x86_64 #1 SMP

How reproducible: I have 4 systems with 2 TB or greater RAID space on qla23xx cards. Three of them have crashed in the last two weeks. The crashes are not reliably reproducible, but the mounting problems are.


Steps to Reproduce:
1. Install qla23xx card with attached RAID
2. reboot
3. Watch dmesg or /var/log/messages for various messages
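
Step 3 can be partly automated. The following is a minimal Python sketch, not part of the original report, that tallies the kinds of qla2xxx messages seen in the dmesg entries quoted in this report. The `summarize` helper and the category names are hypothetical, and the patterns are copied from the log lines shown here, so they may not match the output of other driver versions.

```python
import re
from collections import Counter

# Patterns copied from the dmesg lines in this report.
# The category names on the left are hypothetical labels, not driver terms.
PATTERNS = {
    "loop_down": re.compile(r"LOOP DOWN|Loop down"),
    "loop_up": re.compile(r"LOOP UP"),
    "lip": re.compile(r"LIP RESET|LIP (?:reset )?occurred"),
    "mailbox_failed": re.compile(r"qla2x00_mailbox_command\(\d+\): \*{4} FAILED"),
    "rport_timeout": re.compile(r"blocked FC remote port time out"),
}


def summarize(lines):
    """Count how many lines match each category in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines taken from the report, used here as sample input.
SAMPLE = [
    "scsi(3): Loop down - aborting ISP.",
    "qla2xxx 0000:13:04.0: Loop down - aborting ISP.",
    "scsi(5): Asynchronous LOOP DOWN (2 e678 3d37).",
    "scsi(5): Asynchronous LIP RESET (f7f7).",
    "scsi(5): Asynchronous LOOP UP (2 Gbps).",
    "qla2x00_mailbox_command(3): **** FAILED. mbx0=4006, mbx1=7e, mbx2=0, cmd=6a ****",
    " rport-5:0-0: blocked FC remote port time out: removing rport",
]

for name, count in sorted(summarize(SAMPLE).items()):
    print(f"{name}: {count}")
```

To run it against a real log, pass any iterable of lines, for example an open file such as `summarize(open("/var/log/messages"))`.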
  
Actual results: Sometimes OK, sometimes the disk is not mounted in time. Sometimes the disk fails in service.


Expected results: RAID disk is mounted during normal startup procedure


Additional info:
dmesg entries:

Ending clean XFS mount for filesystem: sdb1
scsi(3): Loop down - aborting ISP.
qla2xxx 0000:13:04.0: Loop down - aborting ISP.
scsi(3): dpc: sched qla2x00_abort_isp ha = ffff88003d045800
qla2xxx 0000:13:04.0: Performing ISP error recovery - ha= ffff88003d045800.
scsi(3): **** Load RISC code ****
scsi(5): Asynchronous LOOP DOWN (2 e678 3d37).
qla2xxx 0000:13:06.0: LOOP DOWN detected (2 e678 3d37).
scsi(3): Verifying Checksum of loaded RISC code.
scsi(3): Checksum OK, start firmware.
scsi(3): Issue init firmware.
scsi(5): fcport-0 - port retry count: 29 remaining
scsi(5): Asynchronous LIP RESET (f7f7).
qla2xxx 0000:13:06.0: LIP reset occurred (f7f7).
scsi(5): qla2x00_reset_marker()
scsi(5): LIP occurred (f7f7).
qla2xxx 0000:13:06.0: LIP occurred (f7f7).
scsi(5): Asynchronous LOOP UP (2 Gbps).
qla2xxx 0000:13:06.0: LOOP UP detected (2 Gbps).
scsi(3): Asynchronous LIP RESET (f7f7).
qla2xxx 0000:13:04.0: LIP reset occurred (f7f7).
scsi(3): LIP occurred (f7f7).
qla2xxx 0000:13:04.0: LIP occurred (f7f7).
scsi(3): Asynchronous LOOP UP (2 Gbps).
qla2xxx 0000:13:04.0: LOOP UP detected (2 Gbps).
scsi(3): Asynchronous PORT UPDATE.
scsi(3): Port database changed ffff 0006 0000.
scsi(3): F/W Ready - OK 
scsi(3): fw_state=3 (3d04, 8800, ffff, 65a0) curr time=ffff5d0f.
qla2x00_restart_isp(): Start configure loop, status = 0
scsi(3): Configure loop -- dpc flags =0x1249
qla2x00_mailbox_command(3): **** FAILED. mbx0=4006, mbx1=7e, mbx2=0, cmd=6a ****
qla2x00_get_port_name(3): failed=102.
scsi(3): MBC_GET_PORT_NAME Failed, No FL Port
scsi(3): LOOP READY
qla2x00_restart_isp(): Configure loop done, status = 0x0
qla2x00_abort_isp(3): succeeded.
scsi(3): dpc: qla2x00_abort_isp end
scsi(5): fcport-0 - port retry count: 28 remaining
qla2xxx 0000:13:04.0: scsi(3:0:0:0): Queue depth adjusted-up to 4.
scsi(5): fcport-0 - port retry count: 27 remaining
scsi(5): fcport-0 - port retry count: 26 remaining
scsi(5): fcport-0 - port retry count: 25 remaining
scsi(5): fcport-0 - port retry count: 24 remaining
scsi(5): fcport-0 - port retry count: 23 remaining
scsi(5): fcport-0 - port retry count: 22 remaining
scsi(5): fcport-0 - port retry count: 21 remaining
scsi(5): fcport-0 - port retry count: 20 remaining
scsi(5): fcport-0 - port retry count: 19 remaining
scsi(5): fcport-0 - port retry count: 18 remaining
scsi(5): fcport-0 - port retry count: 17 remaining
scsi(5): fcport-0 - port retry count: 16 remaining
scsi(5): fcport-0 - port retry count: 15 remaining
scsi(5): fcport-0 - port retry count: 14 remaining
scsi(5): fcport-0 - port retry count: 13 remaining
scsi(5): fcport-0 - port retry count: 12 remaining
scsi(5): fcport-0 - port retry count: 11 remaining
scsi(5): fcport-0 - port retry count: 10 remaining
scsi(5): fcport-0 - port retry count: 9 remaining
scsi(5): fcport-0 - port retry count: 8 remaining
scsi(5): fcport-0 - port retry count: 7 remaining
scsi(5): fcport-0 - port retry count: 6 remaining
scsi(5): fcport-0 - port retry count: 5 remaining
scsi(5): fcport-0 - port retry count: 4 remaining
scsi(5): fcport-0 - port retry count: 3 remaining
scsi(5): fcport-0 - port retry count: 2 remaining
scsi(5): fcport-0 - port retry count: 1 remaining
scsi(5): fcport-0 - port retry count: 0 remaining
 rport-5:0-0: blocked FC remote port time out: removing rport
 rport-5:0-1: blocked FC remote port time out: removing target and saving binding
qla2x00_mailbox_command(5): **** FAILED. mbx0=4006, mbx1=67, mbx2=8800, cmd=71 ****
qla2x00_fabric_logout(5): failed=102 mbx1=67.
qla2x00_mailbox_command(5): **** FAILED. mbx0=4006, mbx1=0, mbx2=0, cmd=71 ****
qla2x00_fabric_logout(5): failed=102 mbx1=0.
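
The mailbox failure lines above have a regular shape, so they can be split into fields for comparison across machines. What follows is a minimal Python sketch, not part of the original report. It assumes the `mbx0`, `mbx1`, `mbx2` and `cmd` values are printed in hexadecimal (values such as `7e` and `6a` suggest this, but the report does not say so), and that the number in parentheses is the same host number that appears in the `scsi(N)` lines. `parse_mailbox_failure` is a hypothetical helper, not part of the driver or of any existing tool.

```python
import re

# Matches lines such as:
# qla2x00_mailbox_command(3): **** FAILED. mbx0=4006, mbx1=7e, mbx2=0, cmd=6a ****
MBX_RE = re.compile(
    r"qla2x00_mailbox_command\((?P<host_no>\d+)\): \*{4} FAILED\. "
    r"mbx0=(?P<mbx0>[0-9a-f]+), mbx1=(?P<mbx1>[0-9a-f]+), "
    r"mbx2=(?P<mbx2>[0-9a-f]+), cmd=(?P<cmd>[0-9a-f]+)"
)


def parse_mailbox_failure(line):
    """Return the fields of a mailbox failure line as integers, or None."""
    match = MBX_RE.search(line)
    if match is None:
        return None
    result = {"host_no": int(match.group("host_no"))}
    # Assumption: the register and command values are hexadecimal.
    for key in ("mbx0", "mbx1", "mbx2", "cmd"):
        result[key] = int(match.group(key), 16)
    return result


line = (
    "qla2x00_mailbox_command(5): **** FAILED. "
    "mbx0=4006, mbx1=67, mbx2=8800, cmd=71 ****"
)
fields = parse_mailbox_failure(line)
print({key: hex(value) for key, value in fields.items()})
```

The sketch only extracts the numbers. It does not attempt to interpret the status or command codes, which would need the driver source or QLogic documentation.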

Comment 1 CrystalCowboy 2010-06-10 20:24:52 UTC
Created attachment 423038
dmesg output from a different system

Note the large number of entries concerning qla2x00 and scsi, many with ominous warnings.

Comment 2 Tom "spot" Callaway 2010-06-10 20:26:56 UTC
I'm not sure what I can do here. Either it is a driver issue, or it is a firmware issue. If it is the former, it belongs to the kernel folks to debug. If it is the latter, you'll need to bring it up with QLogic.

Comment 3 CrystalCowboy 2010-06-10 20:28:48 UTC
Created attachment 423040
dmesg output from a third machine

dmesg from another machine, showing similar qla2x00 and scsi failures and retries.

Comment 4 CrystalCowboy 2010-06-10 20:29:59 UTC
(In reply to comment #2)
> I'm not sure what I can do here. Either it is a driver issue, or it is a
> firmware issue. If it is the former, it belongs to the kernel folks to debug.
> If it is the latter, you'll need to bring it up with QLogic.    

Send it on to the kernel folks, I guess.

Comment 5 Bug Zapper 2010-11-03 13:21:04 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 6 Bug Zapper 2010-12-03 13:55:37 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.