Bug 509010

Summary: [Emulex 5.4 bug] Update lpfc to version 8.2.0.48 (bug fixes only)
Product: Red Hat Enterprise Linux 5
Reporter: Jamie Wellnitz <jamie.wellnitz>
Component: kernel
Assignee: Rob Evers <revers>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Docs Contact:
Priority: high
Version: 5.4
CC: andriusb, coughlan, cward, dzickus, jbao, laurie.barry, mjenner, revers, rlary, robert.evans, syeghiay, vaios.papadimitriou
Target Milestone: rc
Keywords: OtherQA
Target Release: 5.4
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-09-02 08:34:57 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 461676
Attachments (description: flags):
Patch to update Emulex lpfc driver 8.2.0.48 (applies on top of 8.2.0.46): none
Patch to fix pointer dereferences before NULL check in lpfc_prep_seq: none
patch to fix ctx_idx increment and rollover: none

Description Jamie Wellnitz 2009-06-30 20:38:46 UTC
Created attachment 350020 [details]
Patch to update Emulex lpfc driver 8.2.0.48 (applies on top of 8.2.0.46)

Update lpfc inbox driver for RHEL 5.4 to 8.2.0.48.

This patch contains the following changes:

* Changed version number to 8.2.0.48
* Wait for HBA POST completion before checking Online and UE registers
* Remove cast when using pci_read_config_dword() to access LPFC_SLIREV_CONF_WORD
* Fixed unsolicited CT commands crashing kernel
* Fixed static vport creation on SLI4 HBAs
* Fixed vport create not to send INIT_VPI before REG_VFI
* Restore behavior of lpfc_device_reset_handler to issue target reset (CR 91267)
* Remove duplicated SCSI netlink #defines from lpfc_auth_access.h
* Fixed unsolicited CT commands not being responded to
* Fixed send management command length
* Fixed driver unable to discover targets with DHCHAP enabled (CR 91073)
* Do not issue mailbox command when LPFC_HBA_ERROR with MBX_POLL mode
* Fixed accumulated total length not being filled in on SLI4 unsolicited IOCBs
* Fixed FCoE parameters in region 23 not being read correctly
* Fixed SLI3 in-band remote management (CR 91042)

Comment 1 Rob Evers 2009-07-01 19:59:35 UTC
Jamie,

Should the lines added here be inside the conditional 'if (first_iocbq)'?

@@ -11031,6 +11035,8 @@ lpfc_prep_seq(struct lpfc_vport *vport, 
 	       fc_hdr->fh_s_id[2]);
 	/* Get an iocbq struct to fill in. */
 	first_iocbq = lpfc_sli_get_iocbq(vport->phba);
+	first_iocbq->vport = vport;
+	first_iocbq->iocb.unsli3.rcvsli3.acc_len = 0;
 	if (first_iocbq) {

Rob

Comment 2 Jamie Wellnitz 2009-07-01 20:12:49 UTC
Rob,

It looks that way.  Let me look into this code and I'll let you know.

Thanks, Jamie

Comment 3 Jamie Wellnitz 2009-07-01 20:32:54 UTC
Yup, that's a bug.  Sorry about that.

Do you want me to replace the patch attached here, or just give you the little patch to apply on top of it?

Comment 4 Rob Evers 2009-07-02 12:02:53 UTC
little patch would be better.

Thanks, Rob

Comment 5 Jamie Wellnitz 2009-07-02 13:19:05 UTC
Created attachment 350270 [details]
Patch to fix pointer dereferences before NULL check in lpfc_prep_seq

Rob,

Here's just the fix for the issue in comment 1.  Thanks for pointing that out.

- Jamie
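
For readers following the exchange in comment 1, the ordering problem can be sketched in isolation. The snippet below is a minimal illustration, not the contents of attachment 350270: demo_vport, demo_iocbq, demo_get_iocbq and demo_prep_seq are hypothetical stand-ins for the driver's types and for lpfc_sli_get_iocbq() and lpfc_prep_seq(), and the flat acc_len field stands in for iocb.unsli3.rcvsli3.acc_len. It only shows the two assignments moved inside the 'if (first_iocbq)' check so that a failed allocation is never dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's structures; not the real lpfc types. */
struct demo_vport {
	int id;
};

struct demo_iocbq {
	struct demo_vport *vport;
	unsigned int acc_len;
};

/* Hypothetical getter that can fail, as any pool-backed allocator can. */
static struct demo_iocbq *demo_get_iocbq(int simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return calloc(1, sizeof(struct demo_iocbq));
}

/* Touch the new object only after the NULL check has passed. */
static struct demo_iocbq *demo_prep_seq(struct demo_vport *vport,
					int simulate_failure)
{
	struct demo_iocbq *first_iocbq;

	assert(vport != NULL);	/* caller must supply a valid vport */

	first_iocbq = demo_get_iocbq(simulate_failure);
	if (first_iocbq) {
		first_iocbq->vport = vport;
		first_iocbq->acc_len = 0;
	}
	return first_iocbq;	/* NULL tells the caller the allocation failed */
}
```

With the assignments inside the conditional, the failure path simply returns NULL instead of writing through a NULL pointer.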

Comment 8 RHEL Program Management 2009-07-02 19:03:44 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 9 Rob Evers 2009-07-06 11:55:16 UTC
Jamie,

Can you take a look at this?  This came up during the internal review.

Thanks, Rob

>> +			dfchba->ctx_idx += dfchba->ctx_idx++ % 64;
>> I think we should have here '=' and not '+='
>> ctx_idx is an index in array ct_ctx[64]; I think, so this looks suspicious
>
> Indeed it's suspicious, although merely replacing the assignment
> is not enough: the clipping has to happen to the value after
> the increment. So:
>
> 		dfchba->ctx_idx = (dfchba->ctx_idx + 1) % 64;

This is clearly better, thanks.
The index is used this way:
	evt_dat->immed_dat = dfchba->ctx_idx % 64;
so evt_dat->immed_dat would cycle from 0 to 63,
so by chance even the '=' would work.
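
The rollover discussed in this review can be sketched as a small standalone helper. This is an illustration under assumptions, not the code from the follow-up patch: CTX_SLOTS and next_ctx_idx are hypothetical names, and the size 64 is taken from the ct_ctx[64] array mentioned by the reviewer. Note also that the original expression both increments ctx_idx and assigns to it within a single statement, which C leaves undefined, so its result could not be relied on in any case.

```c
#include <assert.h>

/* Assumed to match the ct_ctx[64] array size mentioned in the review. */
#define CTX_SLOTS 64

/*
 * Hypothetical helper: advance a ring index by one and wrap it back
 * into the range [0, CTX_SLOTS). The increment happens first and the
 * modulo is applied to the incremented value, with one assignment only.
 */
static unsigned int next_ctx_idx(unsigned int idx)
{
	assert(idx < CTX_SLOTS);	/* index must already be in range */
	return (idx + 1) % CTX_SLOTS;
}
```

A caller would write something like `ctx_idx = next_ctx_idx(ctx_idx);`, which keeps the stored index in range at all times, so later users of the index do not need a second `% 64`.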

Comment 10 RHEL Program Management 2009-07-06 12:03:47 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 Jamie Wellnitz 2009-07-06 18:12:10 UTC
Rob,

I agree that increment code looks odd.  I'd like the opinion of the engineer who put it in, but he's out today.  Can I get back to you tomorrow?

Thanks, Jamie

Comment 13 Rob Evers 2009-07-06 20:44:42 UTC
Jamie,

Regarding the code review feedback...

I believe the whole patch set will be postponed to the next build, so you have a few days.

Rob

Comment 15 Rob Evers 2009-07-09 14:01:43 UTC
Hi Jamie,

The deadline for re-submission is tomorrow.  When can I expect an update?

Rob

Comment 16 Jamie Wellnitz 2009-07-09 14:25:40 UTC
Rob,

Yes we'll have an update.  Is tomorrow your internal deadline or our deadline to deliver to you (which would be sooner)?

Thanks for bearing with us. --Jamie

Comment 17 Rob Evers 2009-07-09 15:13:18 UTC
Friday afternoon is my deadline.

Rob

Comment 18 Jamie Wellnitz 2009-07-10 14:14:22 UTC
Created attachment 351263 [details]
patch to fix ctx_idx increment and rollover

This patch addresses the index increment problem pointed out in comment #9.

It also updates the lpfc version to 8.2.0.48.1p.

Comment 19 Don Zickus 2009-07-14 20:58:06 UTC
in kernel-2.6.18-158.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 22 Chris Ward 2009-08-03 15:47:32 UTC
~~ Attention Partners - RHEL 5.4 Snapshot 5 Released! ~~

RHEL 5.4 Snapshot 5 is the FINAL snapshot to be released before RC. It has been 
released on partners.redhat.com. If you have already reported your test results, 
you can safely ignore this request. Otherwise, please notice that there should be 
a fix available now that addresses this particular issue. Please test and report 
back your results here, at your earliest convenience.

If you encounter any issues while testing Beta, please describe the 
issues you have encountered and set the bug into NEED_INFO. If you 
encounter new issues, please clone this bug to open a new issue and 
request it be reviewed for inclusion in RHEL 5.4 or a later update, if it 
is not of urgent severity. If it is urgent, escalate the issue to your partner manager as soon as possible. There is /very/ little time left to get additional code into 5.4 before GA.

Partners, after you have verified, do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. 

Further questions can be directed to your Red Hat Partner Manager or other 
appropriate customer representative.

Comment 23 Chris Ward 2009-08-04 14:52:27 UTC
VERIFIED in  Bug 512266 -  [Emulex 5.4 bug] Update lpfc driver to 8.2.0.48.2p to fix multiple panics

Comment 25 errata-xmlrpc 2009-09-02 08:34:57 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html