Bug 503248 - [Emulex 5.4 bug] Update lpfc to version 8.2.0.44 [NEEDINFO]
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: 5.4
Assigned To: Rob Evers
QA Contact: Red Hat Kernel QE team
Keywords: OtherQA
Duplicates: 502195
Depends On:
Blocks: 461676
Reported: 2009-05-29 17:05 EDT by Jamie Wellnitz
Modified: 2011-01-24 18:57 EST

Doc Type: Bug Fix
Last Closed: 2009-09-02 04:34:29 EDT
cward: needinfo? (jamie.wellnitz)


Attachments
Patch to update Emulex lpfc driver 8.2.0.44 (applies on top of 8.2.0.43) (158.15 KB, patch)
2009-05-29 17:05 EDT, Jamie Wellnitz

Description Jamie Wellnitz 2009-05-29 17:05:25 EDT
Created attachment 345960
Patch to update Emulex lpfc driver 8.2.0.44 (applies on top of 8.2.0.43)

Update lpfc inbox driver for RHEL 5.4 to 8.2.0.44.

This patch contains the following changes:

* Changed version number to 8.2.0.44
* Removed temporary RAYWIRE PCI IDs
* Fixed post header template mailbox command timing out (CR 90481)
* Fixed consecutive link up events causing skipped link down processing
* Fixed a target mode discovery bug (CR 89882)
* Removed unused jump table entries
* Fixed a memory leak in lpfc_sli4_read_fcoe_params()
* Added stricter checks for FCF addressing mode
* Increased default WQE count to 256
* Updated FDISC context to VPI
* Fixed crash/hang when doing target or LUN resets
* Fixed immediate SCSI command for LUN reset translation to WQE
* Extended mailbox utility to allow MBX_POLL command in-between async MBQ commands
* Use in-kernel PCI functions where they are provided by the kernel
* Removed FCoE PCI device ID 0705
* Fixed re-taking the same spin lock while already holding that lock in lpfc_sli_eratt_read()
* Added code to send only FLOGI, FDISC and LOGO to Fabric controller as FIP
* Fixed a typo on adding vpi base
* Fix GID_FT timeout
* Remove pseudo SLI3 registers and only access SLI2/3 registers on SLI2/3 HBAs
* Replace DMA_(64|32)BIT_MASK macro with DMA_BIT_MASK(64|32)
* Refactor nested if statements to avoid assignment within conditional
* Fixed default work queue size
* Set the ct field of FDISC to 3
* Finish removal of pseudo SLI3 registers
* Fixed over allocation of SCSI bufs
* Force vport to send LOGO to fabric controller when deleting vport
* Fixes for FIP discovery
* Add missed spin_unlock in error path in lpfc_sli4_sp_handle_rcqe()
* Fix for slow discovery
* Fixed lpfc_sli4_iocb2wqe elsreq64 translation of CT fields
* Fix first remote port does not UNREG_RPI
* Fix REG_VFI failing after link reset
* Fixed device spurious INT causing disabled IRQ due to unhandled interrupts
* Fix npiv_info displays "NPIV Physical" for SLI2 HBAs
* Use fc_fs.h file from kernel tree
* Push hbalock lock/unlock down into lpfc_sli_sp_handle_rspiocb()
* Moved heartbeat mailbox command timer start after queue setup
* Fixed lpfc_sli_post_sgl_block page pairs
* Made both WQ and EQ module configurable for FCP multi-queue support
* Prevent SLI4 from issuing REG_RPI for the fabric port
* Make several calls static and remove unused lpfc_sli_get_sglq
* Remove unneeded and reversed locking around call to lpfc_rampdown_queue_depth
* Remove unnecessary PCI reads that impact performance
* Prevent error message when add_fcf mbox fail due to fcf already present
* Removed FCP default CQ for consume WQE release from slow-path handler
* Implemented FCP fast-path multiple Work Queue support
* Fix VPI and VFI base to work on port 2
* Fixed selection of address mode
* Removed unneeded SGL_ALIGN macros
* Fix missing case in sysfs mailbox read
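
Editorial note on the "Replace DMA_(64|32)BIT_MASK macro with DMA_BIT_MASK(64|32)" entry above: the fixed-width constants give way to a single parameterised macro. The following is a minimal user-space sketch of that idea, assuming the macro has the same shape as the one in the kernel's linux/dma-mapping.h. The helper pick_dma_mask() and its parameter are hypothetical names for illustration and are not part of the lpfc patch.

```c
#include <stdint.h>

/* Same shape as the kernel macro in <linux/dma-mapping.h>, reproduced here
 * so the example builds in user space. The n == 64 case is special-cased
 * because shifting a 64-bit value by 64 bits is undefined in C. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper (not an lpfc function): choose the 64-bit mask when
 * the platform accepts it, otherwise fall back to the 32-bit mask. */
static uint64_t pick_dma_mask(int platform_supports_64bit)
{
	if (platform_supports_64bit)
		return DMA_BIT_MASK(64);
	return DMA_BIT_MASK(32);
}
```

DMA_BIT_MASK(32) and DMA_BIT_MASK(64) evaluate to the same values the old DMA_32BIT_MASK and DMA_64BIT_MASK constants held, so the replacement is a spelling change rather than a behaviour change.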
Comment 1 Rob Evers 2009-06-04 11:21:27 EDT
*** Bug 502195 has been marked as a duplicate of this bug. ***
Comment 2 Rob Evers 2009-06-04 17:15:22 EDT
Jamie,

What is the upstream status of new features included?

* Made both WQ and EQ module configurable for FCP multi-queue support

Rob
Comment 4 Jamie Wellnitz 2009-06-04 22:10:46 EDT
Rob,

The change that entry refers to was posted to linux-scsi on 25 May as part of this patch:
http://marc.info/?l=linux-scsi&m=124301831918924&w=2

Thanks,
Jamie
Comment 8 RHEL Product and Program Management 2009-06-08 12:11:42 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 10 Rob Evers 2009-06-10 07:22:37 EDT
Hi Jamie,

A question came up during review of this patch.  Can you answer this?

Rob

> +static uint32_t
> +lpfc_sli4_scmd_to_wqidx_distr(struct lpfc_hba *phba, struct lpfc_iocbq *piocb)
> +{
> +	static uint32_t fcp_qidx;
> +
> +	return fcp_qidx++ % phba->cfg_fcp_wq_count;
> +}

If the number of queues is not a power of two, this will sometimes
return a queue index that is not sequential. I don't think it's important,
but we had better ask the authors.
Comment 11 Jamie Wellnitz 2009-06-10 12:35:03 EDT
Rob,

Thanks for pointing that out - that little function is odd in a couple of ways.  The rollover of fcp_qidx is not handled well (I believe that's what the reviewer was talking about with the non-powers-of-2).  The static count is also shared by multiple HBA ports, which could have side effects.

We'll modify it in the next lpfc update.

- Jamie
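
Editorial note on the two issues raised in comments 10 and 11: the sketch below reproduces the modulo scheme with the counter passed in explicitly, then shows one possible alternative that keeps a cursor per port. It is an illustration only, not the change Emulex later made. struct toy_hba, wqidx_modulo() and wqidx_per_port() are hypothetical names, and the sketch assumes wq_count is non-zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-port state; not the real struct lpfc_hba. */
struct toy_hba {
	uint32_t wq_count;  /* number of FCP work queues, assumed > 0 */
	uint32_t next_wq;   /* per-port round-robin cursor, in [0, wq_count) */
};

/* Same arithmetic as the reviewed function, with the counter made explicit.
 * When the 32-bit counter wraps from 0xFFFFFFFF to 0, the sequence of
 * results has a discontinuity unless wq_count divides 2^32 evenly. */
static uint32_t wqidx_modulo(uint32_t *counter, uint32_t wq_count)
{
	return (*counter)++ % wq_count;
}

/* One possible alternative: the cursor lives in the port structure and
 * wraps at wq_count, so the sequence is always 0, 1, ..., wq_count - 1
 * and ports do not share state. */
static uint32_t wqidx_per_port(struct toy_hba *hba)
{
	uint32_t idx = hba->next_wq;

	hba->next_wq = (idx + 1 == hba->wq_count) ? 0 : idx + 1;
	return idx;
}
```

With three queues, for example, the modulo version returns index 0 for counter value 0xFFFFFFFF and again for the wrapped value 0, so one queue is picked twice in a row. With four queues the wrap goes from index 3 to index 0 and the order is preserved.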
Comment 12 Rob Evers 2009-06-10 13:26:17 EDT
Thanks for the info.

Another issue also came up:

> +lpfc_sli4_eq_flush(struct lpfc_hba *phba, struct lpfc_queue *eq)
> +{
> +	struct lpfc_eqe *eqe;
> +
> +	/* walk all the EQ entries and drop on the floor */
> +	while ((eqe = lpfc_sli4_eq_get(eq)))
> +		;
> +
> +	/* Clear and re-arm the EQ */
> +	lpfc_sli4_eq_release(eq, LPFC_QUEUE_REARM);
> +}

There is a value assigned to eqe and then not used.
I don't understand what this function is supposed to do.
Can't it loop forever?
Comment 13 Don Zickus 2009-06-11 11:37:25 EDT
in kernel-2.6.18-153.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
Comment 15 Jamie Wellnitz 2009-06-11 13:01:21 EDT
(In reply to comment #12)

The lpfc_sli4_eq_get() function cycles through a finite circular array, so the loop is bounded.  As the "flush" name implies (or is intended to imply), this function is simply clearing the event queue.  It's called when the link has gone down.

The unused variable and the comment about "drop on the floor" was intended to show that the entries are intentionally not being processed here.

Thanks, Jamie
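
Editorial note on why a drain loop over a finite circular array terminates: the following toy model is a simplification and does not reproduce the real lpfc_sli4_eq_get(), whose details are not shown in this report. All names (toy_eqe, toy_eq, toy_eq_get, toy_eq_flush, TOY_EQ_ENTRIES) are hypothetical. The sketch assumes each entry carries a valid flag, that consuming an entry clears the flag, and that nothing posts new entries while the flush runs.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_EQ_ENTRIES 8  /* arbitrary ring size for the sketch */

struct toy_eqe {
	int valid;  /* non-zero while the entry is waiting to be consumed */
};

struct toy_eq {
	struct toy_eqe entries[TOY_EQ_ENTRIES];
	uint32_t host_index;  /* next slot the host will examine */
};

/* Consume and return the next valid entry, or NULL if the next slot is
 * empty. Clearing the flag means a slot can be returned at most once
 * per posting. */
static struct toy_eqe *toy_eq_get(struct toy_eq *eq)
{
	struct toy_eqe *eqe = &eq->entries[eq->host_index];

	if (!eqe->valid)
		return NULL;
	eqe->valid = 0;
	eq->host_index = (eq->host_index + 1) % TOY_EQ_ENTRIES;
	return eqe;
}

/* Drop every pending entry without processing it and report how many
 * were discarded. The loop runs at most TOY_EQ_ENTRIES times because
 * each iteration clears one valid flag. */
static unsigned int toy_eq_flush(struct toy_eq *eq)
{
	unsigned int dropped = 0;

	while (toy_eq_get(eq))
		dropped++;
	return dropped;
}
```

Under those assumptions the bound holds even when every slot is full: the loop walks the ring once, returns to its starting slot, finds the flag already cleared, and stops.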
Comment 16 Chris Ward 2009-06-14 19:21:50 EDT
~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~

RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should
be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner!

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!
Comment 17 Richard A Lary 2009-06-18 10:31:43 EDT
RHEL5.4 Alpha, kernel 2.6.18-152.el5, contains lpfc version 8.2.0.43
version 8.2.0.44 is in kernel -153
Comment 19 Chris Ward 2009-07-03 14:44:46 EDT
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.
Comment 20 Chris Ward 2009-07-10 15:13:43 EDT
~~ Attention Partners - RHEL 5.4 Snapshot 1 Released! ~~

RHEL 5.4 Snapshot 1 has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please notice that there should be a fix available now that addresses this particular request. Please test and report back your results here, at your earliest convenience. The RHEL 5.4 exception freeze is quickly approaching.

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. 

Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.
Comment 21 Richard A Lary 2009-07-13 12:58:02 EDT
Verified EEH recovery functions as expected in RHEL5.4 snapshot 1 with lpfc driver version 8.2.0.46 on Power PC platforms.
Comment 22 Caspar Zhang 2009-07-29 01:18:22 EDT
Verified that the patch is included in kernel-2.6.18-160.el5
Comment 23 Chris Ward 2009-07-29 07:31:42 EDT
Emulex, do you expect to test and report back test results for this item?
Comment 24 laurie barry 2009-07-29 07:42:19 EDT
Verified by Emulex.

Laurie
Comment 25 Chris Ward 2009-07-29 07:54:39 EDT
Thanks. Please make sure to check that all additional emulex bugs are updated after verification.
Comment 27 errata-xmlrpc 2009-09-02 04:34:29 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html
