Bug 234398

Summary: CRM1214186 - RHEL4-U4, Emulex HBA cannot see LUNs assigned id's greater than 256.
Product: Red Hat Enterprise Linux 4 Reporter: Issue Tracker <tao>
Component: kernel Assignee: Chip Coldwell <coldwell>
Status: CLOSED NOTABUG QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: high    
Version: 4.0 CC: andriusb, bloch, coughlan, james.smart, laurie.barry, sean.murphy, tao
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-07-05 18:23:37 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 217214    
Attachments:
Description Flags
patch that resolves the problem none

Comment 1 Issue Tracker 2007-03-28 21:04:02 UTC
The customer has two Emulex HBAs with four LUNs assigned. The LUNs are numbered 0, 59, 263, and 541. The system can see LUNs 0 and 59, but not 263 and 541. I had the customer add the following options to /etc/modprobe.conf and rebuild the initrd:

options scsi_mod max_luns=1024 max_report_luns=1024
options lpfc lpfc_nodev_tmo=10 lpfc_max_luns=1024

Even after a reboot, the system still cannot see LUNs 263 and 541.
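
For anyone double-checking the configuration, here is a minimal Python sketch that parses "options" lines like the ones above and reports what each module would be given. The helper name parse_module_options is hypothetical, and the parsing is deliberately simple (it assumes plain key=value items with no quoting); it is not how modprobe itself reads the file.

```python
def parse_module_options(text):
    """Parse 'options <module> key=value ...' lines into {module: {key: value}}."""
    options = {}
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2 or fields[0] != "options":
            continue
        module_opts = options.setdefault(fields[1], {})
        for item in fields[2:]:
            key, _, value = item.partition("=")
            module_opts[key] = value
    return options


# The two lines the customer was asked to add.
conf = """\
options scsi_mod max_luns=1024 max_report_luns=1024
options lpfc lpfc_nodev_tmo=10 lpfc_max_luns=1024
"""
parsed = parse_module_options(conf)
```

Values are kept as strings; a caller interested in the LUN limit would look at parsed["lpfc"]["lpfc_max_luns"].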
This event sent from IssueTracker by jwhiter  [Support Engineering Group]
 issue 113538

Comment 2 Issue Tracker 2007-03-28 21:04:18 UTC
One thing I noticed in the dmesg output comes from
scsi_report_lun_scan(), so it does look like the SCSI mid-layer is sending
the REPORT_LUNS command to the device. The device returns a list of the LUNs
that are present. The messages below are generated because these LUN
values from the list are greater than sdev->host->max_lun.

scsi: host 1 channel 0 id 0 lun659 has a LUN larger than allowed by the host adapter
scsi: host 1 channel 0 id 0 lun660 has a LUN larger than allowed by the host adapter
scsi: host 1 channel 0 id 0 lun661 has a LUN larger than allowed by the host adapter

scsi: host 2 channel 0 id 0 lun565 has a LUN larger than allowed by the host adapter
scsi: host 2 channel 0 id 0 lun566 has a LUN larger than allowed by the host adapter
scsi: host 2 channel 0 id 0 lun567 has a LUN larger than allowed by the host adapter
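
The check described in this comment can be sketched roughly as follows. This is a simplified Python illustration, not the kernel code: the function name filter_reported_luns is hypothetical, the comparison is modeled on the "greater than sdev->host->max_lun" wording above, and the limit of 256 is an assumption taken from the bug summary.

```python
def filter_reported_luns(reported_luns, max_lun):
    """Split LUNs from a REPORT LUNS response into accepted and rejected lists.

    A LUN is rejected when it is greater than max_lun, which is the case
    that produces the "larger than allowed by the host adapter" message.
    """
    accepted, rejected = [], []
    for lun in reported_luns:
        if lun > max_lun:
            rejected.append(lun)
        else:
            accepted.append(lun)
    return accepted, rejected


# The customer's LUNs against an assumed host limit of 256.
accepted, rejected = filter_reported_luns([0, 59, 263, 541], max_lun=256)
```

With the assumed limit, LUNs 0 and 59 are accepted and 263 and 541 are rejected, which matches what the customer sees. Raising max_lun to 1024 would accept all four in this model.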

Can they add the scsi_logging_level option below to their current /etc/modprobe.conf, keeping the lpfc options alongside it?

options scsi_mod scsi_logging_level=0x1c0              
options lpfc lpfc_nodev_tmo=10 lpfc_max_luns=1024

Can you double-check and make sure they rebuilt their initrd after
modifying /etc/modprobe.conf?

I didn't see all the SCSI messages I was expecting. Also, looking at the
lpfc driver, lpfc_max_luns does appear to be the right parameter for
increasing host->max_lun.




Internal Status set to 'Waiting on Support'

This event sent from IssueTracker by jwhiter  [Support Engineering Group]
 issue 113538

Comment 3 Issue Tracker 2007-03-28 21:04:24 UTC
Dave,

I think I've found the problem: lpfc sets max_luns as a read-only
attribute, not one that can be changed. This patch should fix the
problem. I'm building a test kernel based on 42.0.10 for the customer to
try, so we don't have to worry about the kernel panicking.


This event sent from IssueTracker by jwhiter  [Support Engineering Group]
 issue 113538

Comment 5 Josef Bacik 2007-03-28 21:07:08 UTC
Created attachment 151164 [details]
patch that resolves the problem

This patch resolves the problem.

Comment 6 Tom Coughlan 2007-05-04 18:03:02 UTC
James, 

Please review and comment on the proposed change. 

Tom

Comment 7 James Smart 2007-05-07 14:57:21 UTC
The LPFC_ATTR_R() vs LPFC_ATTR_RW() change should only affect the sysfs
attribute for the shost. The module parameter should have allowed the change
without any code change. So, if it didn't take effect, it likely means the
driver was loaded via initrd and the initrd image wasn't rebuilt (my guess is
that your code change caused the driver to be rebuilt, which also caused the
initrd to be updated).
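
One rough way to test the stale-initrd theory is to compare modification times. The following is a minimal Python sketch under stated assumptions: the helper name initrd_is_stale is hypothetical, and a newer timestamp on the config file is only a hint that the image predates the edit, not proof of what the image contains.

```python
import os


def initrd_is_stale(modprobe_conf, initrd_image):
    """Return True if modprobe_conf was modified after initrd_image was built."""
    return os.path.getmtime(modprobe_conf) > os.path.getmtime(initrd_image)
```

On the customer's system this might be called as initrd_is_stale("/etc/modprobe.conf", "/boot/initrd-%s.img" % os.uname()[2]), assuming the image follows the usual /boot/initrd-<kernel release>.img naming. A True result would suggest rebuilding the initrd and rebooting before concluding that the module parameter has no effect.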

From a purely technical viewpoint - yes, extending the max_lun # registered by
the driver is the right way to support larger lun #'s. You must be careful,
though: if set via the module parameter, it affects all adapters and all
targets, and it will be used as the upper bound on scan loops in the midlayer,
which can result in longer delays and, on some targets, perhaps odd
behavior (as they only recognize luns in the 0-255 range).

Making the parameter RW isn't sufficient. Ultimately, you need the
shost->max_lun value to change, which the default sysfs "store" routine won't
do. Even if you create a unique store routine for max_luns, you are now
dynamically changing it on the midlayer after the scsi_add_host() call, which
is not always a smart thing to do. Thus, you're starting to see why we allowed
it to be changeable as a module parameter, but not as a sysfs attribute, with
the downside being how global the change is.

The other side to this change is that, although you can do it, it's not always
the smartest thing to do, given how it affects qualifications. I hinted at
device and scan interactions once the value is changed. To address this part,
can we get some insight into who the customer is and what storage they are
using, so we can determine how the "qual" part is to be resolved?


Comment 8 RHEL Program Management 2007-05-09 05:11:43 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 9 Issue Tracker 2007-06-28 17:53:26 UTC
Internal Status set to 'Resolved'
Status set to: Closed by Client
Resolution set to: 'NotABug'

This event sent from IssueTracker by jpayne 
 issue 113538

Comment 11 Andrius Benokraitis 2007-07-05 18:23:37 UTC
Closing issue, since customer has closed the corresponding Issue Tracker on
their side as NOTABUG.