Bug 174441 - [EMC RHEL4 U4 bug] Making sure latest hwtable.c settings for EMC CLARiiON are picked up.
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: device-mapper-multipath
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Assigned To: Ben Marzinski
Keywords: Reopened
Duplicates: 175373 186581
Depends On:
Blocks: 181409 184382 239069
Reported: 2005-11-28 21:36 EST by Ed Goggin
Modified: 2010-01-11 21:22 EST (History)
Fixed In Version: RHEA-2006-0513
Doc Type: Bug Fix
Last Closed: 2007-05-04 14:18:45 EDT

Description Ed Goggin 2005-11-28 21:36:54 EST
Description of problem:

I have made two submissions for updating parameters for the store_hwe_ext macro 
call for the EMC CLARiiON in the last few months to the upstream multipath-
tools git head for libmultipath/hwtable.c.  The first submission establishes
immediate failback as the path group failback policy for the CLARiiON.  The 
second submission changes the product parameter string of the macro call to
"^[^LUN_Z]" from "*" in order to force multipath to recognize all CLARiiON 
logical units but the one which responds to a SCSI inquiry with the model 
string "LUN_Z           ".

This bugzilla is here just to make sure that the changes introduced by these 
submissions are included in the Update 3 release of RH 4.

Comment 1 Ed Goggin 2005-12-08 09:25:27 EST
These two changes do not yet appear to have been integrated into the
device-mapper-multipath rpm for RH 4 Update 3.  These changes should be 
relatively isolated and benign.  Could they be integrated soon?
Comment 2 Ben Marzinski 2006-03-20 11:13:39 EST
They are in cvs now, along with changing FAILBACK_IMMEDIATE to -FAILBACK_IMMEDIATE.
Comment 3 Ed Goggin 2006-03-20 11:19:00 EST
Ben, please note that the change setting the product field of the CLARiiON's 
entry in hwtable.c to the string "^[^LUN_Z]" is wrong.  The correct 
settings are now in hwtable.c in the multipath-tools git head.
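
One property of the "^[^LUN_Z]" pattern worth spelling out: "[^LUN_Z]" is a negated bracket expression, so it tests only the first character of the product string against the set L, U, N, _ and Z rather than testing for the word LUN_Z. The sketch below illustrates this with Python's re module standing in for the regex matching done by multipath; the sample product strings are illustrative assumptions, not values captured from an array.

```python
import re

# Pattern from the original submission. "[^LUN_Z]" matches ONE character
# that is not L, U, N, _ or Z; it does not mean "anything except LUN_Z".
pattern = re.compile(r"^[^LUN_Z]")

# Illustrative product strings (assumptions, not captured from hardware).
samples = ["LUN_Z", "LUNZ", "RAID 5", "ZEBRA", "NEARLINE"]

# True means the pattern accepts the product string.
results = {s: bool(pattern.search(s)) for s in samples}

for name, matched in results.items():
    print(f"{name!r}: {'matched' if matched else 'not matched'}")
```

Under these assumptions, any product whose name merely begins with one of the five listed characters is rejected along with LUN_Z, which is broader than the stated intent.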
Comment 4 Alasdair Kergon 2006-03-24 12:18:38 EST
*** Bug 186581 has been marked as a duplicate of this bug. ***
Comment 5 Andrius Benokraitis 2006-03-24 16:21:46 EST
Can EMC create an Issue Tracker and reference this bug? Thanks!
Comment 6 Andrius Benokraitis 2006-03-24 16:23:00 EST
Link to IT created. Thanks!
Comment 11 Alasdair Kergon 2006-04-06 12:31:31 EDT
*** Bug 175373 has been marked as a duplicate of this bug. ***
Comment 15 Hari Kannan 2006-06-30 10:37:02 EDT
In test.
Comment 16 Hari Kannan 2006-06-30 14:54:47 EDT
The problem has not been fixed at all, that is, (1) dm-multipath will still 
create a multipath mapped device for a CLARiiON LUNZ logical unit, (2) any 
process issuing block IO requests to this mapped device will hang, and (3) 
there is no effective way of preventing dm-multipath from creating a multipath 
mapped device for such logical units.

The fix is (and has been for many months) in Christophe Varoqui's upstream 
multipath-tools source tree.  This solution involves (1) allowing multipath 
device blacklisting based on block device (a little SCSI affinity here maybe) 
vendor and product attributes and (2) allowing a block device config entry in 
libmultipath/hwtable.c to set up a default blacklist entry of this type.

Andrius, please raise the priority of this issue. We will need this fixed in 
RHEL 4 U4. Again, the impact is confined to CLARiiON, so making the change 
should not have any system-wide effect.

Comment 17 Ben Marzinski 2006-06-30 17:54:54 EDT
I don't have access to a CLARiiON, but blacklisting by vendor/product works on
the devices I do have.  The upstream fix depends on a lot of changes that are
fairly disruptive to pull into RHEL4, so I couldn't just copy/paste the upstream
fix into the RHEL4 code.  However, a modified version is in.  For the 'devices'
section of the config file, you can specify a 'bl_product' value.  This will
blacklist any device that matches the 'vendor', 'product' and 'bl_product'
fields. This is different than upstream, where you only have to match the
'vendor' and 'bl_product' fields.
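
The difference between the two matching rules described above can be sketched as follows. This is a minimal illustration only, not the multipath source: the function names and the dictionary layout are hypothetical, Python's re.search stands in for the regex matching multipath performs, and ".*" is used in place of the "*" product value because a bare "*" is not a valid Python pattern.

```python
import re

def matches(pattern, value):
    """Return True if the regex pattern is found anywhere in value."""
    return re.search(pattern, value) is not None

def blacklisted_rhel4(entry, vendor, product):
    # RHEL4 variant: 'vendor', 'product' AND 'bl_product' must all match.
    return (matches(entry["vendor"], vendor)
            and matches(entry["product"], product)
            and matches(entry["bl_product"], product))

def blacklisted_upstream(entry, vendor, product):
    # Upstream: only 'vendor' and 'bl_product' have to match.
    return (matches(entry["vendor"], vendor)
            and matches(entry["bl_product"], product))

# Hypothetical entries used only to show where the two rules diverge.
catch_all = {"vendor": "DGC", "product": ".*", "bl_product": "LUN_Z"}
narrow = {"vendor": "DGC", "product": "^RAID", "bl_product": "LUNZ"}

print(blacklisted_rhel4(catch_all, "DGC", "LUN_Z           "))
print(blacklisted_rhel4(narrow, "DGC", "LUNZ"))
print(blacklisted_upstream(narrow, "DGC", "LUNZ"))
```

With a catch-all 'product' value the two rules agree; they diverge only when 'product' is narrower than 'bl_product', as in the hypothetical second entry.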

However, there is definitely either a bug in the documentation or a bug in
the default config files for the CLARiiON.

multipath.conf.defaults states that the default CLARiiON configuration is

       device {
               vendor                  "DGC"
               product                 "*"
               bl_product              "LUN_Z"
               path_grouping_policy    group_by_prio
               getuid_callout          "/sbin/scsi_id -g -u -s"
               prio_callout            "/sbin/mpath_prio_emc /dev/%n"
               hardware_handler        "1 emc"
               features                "1 queue_if_no_path"
               path_checker            emc_clariion
               failback                immediate
       }

However, in reality, the bl_product value in the default configuration is
"LUNZ". I thought "LUNZ" was the correct value, but since I don't have a
CLARiiON, I might be wrong.  It would be helpful if you could try switching
it, and seeing if that works.

Try putting
devices {
       device {
               vendor                  "DGC"
               product                 "*"
               bl_product              "LUN_Z"
               path_grouping_policy    group_by_prio
               getuid_callout          "/sbin/scsi_id -g -u -s"
               prio_callout            "/sbin/mpath_prio_emc /dev/%n"
               hardware_handler        "1 emc"
               features                "1 queue_if_no_path"
               path_checker            emc_clariion
               failback                immediate
       }
}

in the /etc/multipath.conf configuration file.  Then run
# multipath -v4

and if the LUNZ device isn't blacklisted, copy the results into this bugzilla.
Comment 20 Ben Marzinski 2006-07-07 14:39:17 EDT
I'm still trying to get access to the CLARiiON in Westford.  However, it would
still be very helpful if EMC could try configuring the multipath device to
blacklist "LUN_Z" instead of "LUNZ" using the devices entry I posted in comment
#17.  Also, posting the results of running multipath -v4 would be very useful.
Comment 21 Hari Kannan 2006-07-07 14:50:39 EDT
Hi Ben, 
Sorry about this. For some reason, my previous post to this bug did not get 
updated. [can someone see why comments 18 and 19 are missing?]

Using the modification suggested in comment 17, solved the issue for us.

Hari [eLab EMC]

Comment 22 Ben Marzinski 2006-07-07 15:40:14 EDT
O.k. The default configuration has been changed.
Comment 24 Red Hat Bugzilla 2006-08-10 17:44:41 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

Comment 26 Nick Strugnell 2006-12-06 09:40:59 EST
I can confirm that the device in bl_product should be LUNZ not LUN_Z

I have tested this on Clariion CX700 using device-mapper-multipath--0.4.5-

Comment 27 Ben Marzinski 2006-12-06 13:06:56 EST
Did the default configuration not work for you? By default, multipath blacklists
the device that returns a product name of LUN_Z. If this worked for you without
having to manually set bl_product in /etc/multipath.conf, then LUN_Z is the
correct value. If you needed to manually change bl_product to make this work...
then it appears that not all CLARiiONs return the same product name. EMC could
only get their device (I don't know what model it was) to work using LUN_Z.
Comment 28 Ed Goggin 2006-12-06 13:47:02 EST
This should be LUNZ not LUN_Z.  My earlier references to LUN_Z (opening 
description and comment #3) were incorrect. 
Comment 29 Kari Hautio 2007-02-26 09:21:37 EST
At least EMC Clariion CX3-20 reports LUNZ (not LUN_Z). Maybe the default rule
should be "LUN_?Z" to catch both.
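
A quick check of the suggested "LUN_?Z" rule, as a sketch: it assumes the pattern is interpreted with extended regex syntax, where "?" makes the preceding "_" optional, and uses Python's re module as a stand-in for the matcher in multipath. The padded string and "RAID 5" are illustrative inputs.

```python
import re

# "_?" makes the underscore optional, so both spellings are accepted.
pattern = re.compile(r"LUN_?Z")

# Product strings as reported in this thread, plus one space-padded
# variant and one ordinary LUN name as illustrative assumptions.
reported = ["LUNZ", "LUN_Z", "LUNZ            ", "RAID 5"]

hits = [s for s in reported if pattern.search(s)]
print(hits)
```

If the matcher used basic regex syntax instead, "?" would be taken literally and the rule would match neither spelling, so the syntax in effect should be confirmed before relying on it.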
Comment 30 Ed Goggin 2007-02-26 10:26:53 EST

I submitted the "blacklisting by vendor/product" patch to multipath-tools and
tested it with CLARiiON.  The CLARiiON's bl_product string should be "LUNZ" and
nothing else.  Not sure what Hari meant by "Using the modification suggested in
comment 17, solved the issue for us." in comment #21.

Comment 31 Terje Bless 2007-05-03 17:06:39 EDT
As best I can tell, this bug is still present in
device-mapper-multipath-0.4.5-21.RHEL4 (x86_64).

/usr/share/doc/device-mapper-multipath-0.4.5/multipath.conf.defaults still 
lists 'LUN_Z' in the 'bl_product' parameter, and 'LUNZ' is not blacklisted 
until you manually add 'bl_product "LUNZ"' to the 'device' section 
of /etc/multipath.conf.

The errata (RHEA-2006-0513) referenced in Comment #24 also refers to 'LUN_Z' 
instead of 'LUNZ' (and its updated errata, RHEA-2007:0256, does not appear to 
address this issue at all).

PS, Hari: the howto from EMC (Powerlink) on Configuring Linux Native MPIO has 
some bugs in its suggested multipath.conf (s/blacklist/devnode_blacklist/), and 
it lacks this bl_product entry for LUNZ.
Comment 34 Andrius Benokraitis 2007-05-04 14:24:49 EDT
Keeping this issue closed as it already has an errata attached to the bugzilla.
Cloning this bug for tracking for RHEL 4.6 in bug 239069.
