Bug 438761 - LTC:5.4:201049:DM-MP SCSI Hardware Handlers
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: ppc64 All
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Mike Christie
QA Contact: Martin Jenner
Keywords: OtherQA, Reopened
Duplicates: 240548
Depends On: 460899
Blocks: 361871 RHEL5u3_relnotes
Reported: 2008-03-24 18:08 EDT by IBM Bug Proxy
Modified: 2015-11-30 10:34 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The SCSI device handler infrastructure (scsi_dh) has been updated, providing the following improvements: * A generic ALUA (asymmetric logical unit access) handler has been implemented. * Support for LSI RDAC SCSI-based storage devices has been added.
Story Points: ---
Clone Of:
Clones: 550026
Environment:
Last Closed: 2009-01-20 15:25:06 EST


Attachments
Port bus notifications infrastructure from mainline (4.47 KB, text/plain)
2008-08-12 20:40 EDT, IBM Bug Proxy
ABI workaround for bus notifications (7.38 KB, text/plain)
2008-08-12 20:40 EDT, IBM Bug Proxy
Basic infrastructure for SCSI Hardware Handler (27.45 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
ABI workaround for scsi_device data structure change (2.85 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Handler for RDAC device (18.41 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Handler for alua (23.09 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Changes in dm layer to use SCSI Hardware Handler (9.76 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Do not remove Hardware Handler code (5.40 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Do not allow arguments for SCSI DH Hardware Handlers (842 bytes, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Add a workqueue to handle events (4.07 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Remove RDAC Handler from DM layer (19.22 KB, text/plain)
2008-08-12 20:51 EDT, IBM Bug Proxy
Handler for EMC device (19.45 KB, text/plain)
2008-08-14 14:21 EDT, IBM Bug Proxy
ABI workaround for bus notifications (7.31 KB, text/plain)
2008-08-16 00:21 EDT, IBM Bug Proxy
Set path state to passive if the path is not owned (640 bytes, text/plain)
2008-08-22 14:42 EDT, IBM Bug Proxy
Move bus notification nearlier in device_add (1.25 KB, text/plain)
2008-08-22 14:42 EDT, IBM Bug Proxy
Fix a race situation (3.02 KB, text/plain)
2008-09-12 21:57 EDT, IBM Bug Proxy


External Trackers
Tracker ID Priority Status Summary Last Updated
IBM Linux Technology Center 43337 None None None Never

Description IBM Bug Proxy 2008-03-24 18:08:59 EDT
=Comment: #0=================================================
Emily J. Ratliff <emilyr@us.ibm.com> - 2008-03-21 17:40 EDT
1. Feature Id:	[201049]
Feature Name:	DM-MP SCSI Hardware Handlers
Sponsor:	PPC
Category:	Device Drivers and IO
Request Type:	Kernel - Enhancement from IBM

2. Short Description
The SCSI HW Handler work is a community-driven effort to address issues with how dm-mp currently
handles device-specific errors. The current dm hardware handlers will be migrated to the SCSI
subsystem, where they can obtain more detailed information about both failures and the devices
themselves. This will also allow the dm-mp layer to deal with devices in a more generic fashion,
without having to decode SCSI-specific data. This migration of the handlers is needed to fully
support a number of storage devices and to increase the utility of Linux's multipath solution.

3. Business Case
Follow the Linux storage system strategy of having a single unifying multipath solution that
supports all devices. Currently the DM-MP hardware handler solution cannot meet customer
expectations for devices that have an active/passive path device model. This results in the
use of out-of-distro multipath drivers, which delays storage certifications and increases
the support costs of managing those drivers.

Benefits
Fully compliant and supported multipath I/O with Device Mapper for IBM hardware.

4. Sponsor Priority	1
IBM Confidential:	no
Code Contribution:	IBM code
Upstream Acceptance:	In Progress
Component Version Target:	
Performance Assistance:	no

5. PM Contact:	Stephanie Glass, sglass@us.ibm.com, 512-838-9284

6. Technical contact(s):
Daisy Chang, daisyc@us.ibm.com
Chandra Seetharaman, chandra.seetharaman@us.ibm.com

7. LTC Manager: Wendel Voigt, wvoigt@us.ibm.com
Comment 1 IBM Bug Proxy 2008-03-25 08:02:59 EDT
------- Comment From sglass@us.ibm.com 2008-03-25 08:00 EDT-------
This feature needs to be deferred to RHEL 5.3
Comment 2 IBM Bug Proxy 2008-03-25 12:56:29 EDT
------- Comment From sglass@us.ibm.com 2008-03-25 12:51 EDT-------
Sorry for the confusion, this needs to be deferred to RHEL 5.4.
Comment 3 Emily Ratliff 2008-07-11 10:59:23 EDT
Code is now upstream. Added to tracker. 
Comment 4 IBM Bug Proxy 2008-07-11 11:30:26 EDT
------- Comment From sglass@us.ibm.com 2008-07-11 11:29 EDT-------
Reopening for RHEL 5.4
Comment 5 IBM Bug Proxy 2008-07-29 18:40:37 EDT
------- Comment From chandra.seetharaman@us.ibm.com 2008-07-29 18:35 EDT-------

All the code needed for this feature is now available in 2.6.27-rc1.

Ported all the patches to RHEL 5.2. Had a couple of kABI issues. Made changes to
the code such that none of the existing symbols changed (of course, new symbols
have been added). Currently testing the code.

Here are the commit links (in Linus's tree):
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a6a8d9f87eb8510a8f53672ea87703f62185d75f
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fbd7ab3eb53a3b88fefa7873139a62e439860155
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f6dd337ee4c440f29a873da3779eb3af44bd1623
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5e7dccad3621f6e2b572f309cf830a2c902cae80
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=cfae5c9bb66325cd32d5f2ee41f14749f062a53c
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=bab7cfc733f4453a502b7491b9ee37b091440ec4
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2651f5d7d3bc5120a439e498f131e4d731f99b3e
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=cb520223d7f22c5386aff27a5856a66e2c32aaac
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=688864e29869a71a8183e4e2f96ccf9f2de1375f
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fe9233fb6914a0eb20166c967e3020f7f0fba2c9
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=89a93f2f4834f8c126e8d9dd6b368d0b9e21ec3d
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=33af79d12e0fa25545d49e86afc67ea8ad5f2f40
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=765cbc6dad16b87724803e359d6be792ddf08614
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c05ae52fcb0e27a2ee4a16d1f31f8c547fd4886
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b6ff1b14cdf4b4cb5403f3af2c3272f7e609a241
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2aef6d5c05ee5c02f2e4d737b8738deb118cf892
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ca9f0089867c9e476cf2e6d4615d2aae887171b2
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=057ea7c9683c3d684128cced796f03c179ecf1c2
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ae11b1b36da726a8a93409b896704edc6b4f3402
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7c32c7a2d36c52d2b9ed040a9171364020ecc6a2
Comment 6 IBM Bug Proxy 2008-08-12 20:40:37 EDT
Created attachment 314153 [details]
Port bus notifications infrastructure from mainline

RHEL 5.2 doesn't have support for bus notifications, which is needed
for SCSI Hardware Handlers. 

This is the direct port of the functionality from mainline
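
As a rough illustration of why the handlers need bus notifications, the sketch below models a bus that informs registered callbacks when a device is added or removed, so a handler layer can attach itself without the bus knowing about it. This is not the kernel code: `Bus`, `HandlerCore`, the vendor table, and the numeric event values are all assumptions made for the example, and only the event names echo the kernel's.

```python
# Illustrative action codes, named after the kernel's bus notifier events.
# The numeric values here are arbitrary choices for this sketch.
BUS_NOTIFY_ADD_DEVICE = 1
BUS_NOTIFY_DEL_DEVICE = 2


class Bus:
    """Toy bus that tells registered callbacks about device add/remove."""

    def __init__(self):
        self._notifiers = []

    def register_notifier(self, callback):
        self._notifiers.append(callback)

    def _notify(self, action, dev):
        for callback in self._notifiers:
            callback(action, dev)

    def device_add(self, dev):
        self._notify(BUS_NOTIFY_ADD_DEVICE, dev)

    def device_del(self, dev):
        self._notify(BUS_NOTIFY_DEL_DEVICE, dev)


class HandlerCore:
    """Hypothetical handler layer that attaches a handler by vendor string."""

    def __init__(self, vendor_to_handler):
        self.vendor_to_handler = vendor_to_handler
        self.attached = {}  # device name -> handler name

    def on_bus_event(self, action, dev):
        if action == BUS_NOTIFY_ADD_DEVICE:
            handler = self.vendor_to_handler.get(dev["vendor"])
            if handler is not None:
                self.attached[dev["name"]] = handler
        elif action == BUS_NOTIFY_DEL_DEVICE:
            # Detach on removal; ignore devices that never had a handler.
            self.attached.pop(dev["name"], None)
```

A caller would create a `Bus`, register `HandlerCore.on_bus_event` with it, and then see `attached` change as devices come and go.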
Comment 7 IBM Bug Proxy 2008-08-12 20:40:41 EDT
Created attachment 314154 [details]
ABI workaround for bus notifications

ABI workaround for bus notifications
Comment 8 IBM Bug Proxy 2008-08-12 20:51:06 EDT
Ported and tested all the patches.

Fixed all the ABI issues. This set of patches does not remove or change any existing
kernel ABI. It just adds a few new symbols.

Did _not_ attach the EMC handler and HP handler as they might affect the
user interface (they currently accept arguments, but SCSI_DH doesn't).
Comment 9 IBM Bug Proxy 2008-08-12 20:51:11 EDT
Created attachment 314155 [details]
Basic infrastructure for SCSI Hardware Handler

Basic infrastructure for SCSI Hardware Handler
Comment 10 IBM Bug Proxy 2008-08-12 20:51:14 EDT
Created attachment 314156 [details]
ABI workaround for scsi_device data structure change

ABI workaround for scsi_device data structure change
Comment 11 IBM Bug Proxy 2008-08-12 20:51:19 EDT
Created attachment 314157 [details]
Handler for RDAC device

Handler for RDAC device
Comment 12 IBM Bug Proxy 2008-08-12 20:51:23 EDT
Created attachment 314158 [details]
Handler for alua

Handler for ALUA
Comment 13 IBM Bug Proxy 2008-08-12 20:51:28 EDT
Created attachment 314159 [details]
Changes in dm layer to use SCSI Hardware Handler

Changes in dm layer to use SCSI Hardware Handler
Comment 14 IBM Bug Proxy 2008-08-12 20:51:33 EDT
Created attachment 314160 [details]
Do not remove Hardware Handler code

Do not remove Hardware Handler code for ABI and backward compatibility reasons
Comment 15 IBM Bug Proxy 2008-08-12 20:51:38 EDT
Created attachment 314161 [details]
Do not allow arguments for SCSI DH Hardware Handlers

Do not allow arguments for SCSI DH Hardware Handlers
Comment 16 IBM Bug Proxy 2008-08-12 20:51:43 EDT
Created attachment 314162 [details]
Add a workqueue to handle events

Add a workqueue to handle events
Comment 17 IBM Bug Proxy 2008-08-12 20:51:47 EDT
Created attachment 314163 [details]
Remove RDAC Handler from DM layer

Remove RDAC Handler from DM layer
Comment 19 IBM Bug Proxy 2008-08-14 14:21:56 EDT
Created attachment 314339 [details]
Handler for EMC device

Added the patch as per MikeC's suggestion.
Comment 20 RHEL Product and Program Management 2008-08-14 14:24:19 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 22 Chip Coldwell 2008-08-15 10:57:16 EDT
*** Bug 240548 has been marked as a duplicate of this bug. ***
Comment 23 IBM Bug Proxy 2008-08-16 00:21:36 EDT
Created attachment 314426 [details]
ABI workaround for bus notifications

Removed a debug message and a coding error.
Comment 24 Mike Christie 2008-08-22 14:15:10 EDT
(In reply to comment #23)
> Created an attachment (id=314426) [details]
> ABI workaround for bus notifications
> 
> Removed a debug message and a coding error.

I replaced this patch:
https://bugzilla.redhat.com/attachment.cgi?id=314154

with the new one.

The updated patchset is here
http://people.redhat.com/mchristi/scsi_dh/
Comment 25 IBM Bug Proxy 2008-08-22 14:42:05 EDT
Created attachment 314828 [details]
Set path state to passive if the path is not owned
Comment 26 IBM Bug Proxy 2008-08-22 14:42:10 EDT
Created attachment 314829 [details]
Move bus notification nearlier in device_add
Comment 27 Mike Christie 2008-08-25 13:07:22 EDT
Testers,

I updated the patchset with IBM's patches here:

http://people.redhat.com/mchristi/scsi_dh/scsi-dh3/
Comment 28 Mike Christie 2008-08-28 14:51:08 EDT
Patches sent to internal list for review. Here is the final patchset:
http://people.redhat.com/mchristi/scsi_dh/scsi-dh9/
Comment 29 Don Zickus 2008-09-12 21:46:37 EDT
in kernel-2.6.18-113.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 30 IBM Bug Proxy 2008-09-12 21:57:51 EDT
Created attachment 316641 [details]
Fix a race situation

Path activation code has been moved to a workqueue, and the path to be
activated was read from current_pgpath. If the path changes in the meantime,
current_pgpath can become NULL, which leads to a panic.

This patch fixes the problem.
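
The race described above can be sketched in a few lines. This is a single-threaded model, not the dm-mpath code and not necessarily how the attached patch fixes it: `Multipath`, `PgPath`, and the method names are hypothetical, and real kernel code would also need locking. It only contrasts deferred work that re-reads shared state when it runs with deferred work that captures the path when it is queued.

```python
class PgPath:
    """Hypothetical path object; only carries a device name."""

    def __init__(self, dev):
        self.dev = dev


class Multipath:
    """Toy model of deferred path activation."""

    def __init__(self, current_pgpath):
        self.current_pgpath = current_pgpath
        self._pending = []  # deferred work items (stand-in for a workqueue)

    def queue_activation_racy(self):
        # The work item reads current_pgpath only when it finally runs.
        self._pending.append(lambda: self.current_pgpath.dev)

    def queue_activation_safe(self):
        # Snapshot the path now, so later changes cannot invalidate it.
        pgpath = self.current_pgpath
        self._pending.append(lambda: pgpath.dev)

    def run_pending(self):
        work, self._pending = self._pending, []
        return [item() for item in work]
```

If `current_pgpath` is set to `None` between queueing and `run_pending()`, the racy variant raises `AttributeError` (the analogue of the NULL dereference), while the safe variant still activates the path that was current when the work was queued.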
Comment 31 Andrius Benokraitis 2008-09-15 00:16:31 EDT
IBM - this bug is already in MODIFIED for a previously submitted patchset (Comment #28). You'll have to post your patch attached in Comment #30 in a new bugzilla for post-Beta consideration (if it meets the post-Beta criteria).
Comment 32 John Jarvis 2008-09-15 09:22:35 EDT
This enhancement request was evaluated by the full Red Hat Enterprise Linux team
for inclusion in a Red Hat Enterprise Linux minor release.   As a result of this
evaluation, Red Hat has tentatively approved inclusion of this feature in the
next Red Hat Enterprise Linux Update minor release.   While it is a goal to
include this enhancement in the next minor release of Red Hat Enterprise Linux,
the enhancement is not yet committed for inclusion in the next minor release
pending the next phase of actual code integration and successful Red Hat and
partner testing.
Comment 33 John Jarvis 2008-09-15 09:23:59 EDT
Andrius is correct, any additional patches require a separate bug and separate review on rh-kernel list.
Comment 36 IBM Bug Proxy 2008-09-22 19:51:54 EDT
(In reply to comment #40)
> ------- Comment From dzickus@redhat.com 2008-09-12 21:46:37 EDT-------
> in kernel-2.6.18-113.el5
> You can download this test kernel from http://people.redhat.com/dzickus/el5

Tested this code level and it is working as expected.

Will file a separate bug for the patch provided in Comment #39.
Comment 37 IBM Bug Proxy 2008-09-23 06:53:09 EDT
This needs to be retested once we have the official beta in October, reopening
until then.
Comment 39 Ryan Lerch 2008-11-02 17:47:17 EST
Release note added for 5.3.

Please edit the "Release Notes" field above with any alterations / additions to this release note.
Comment 40 Ryan Lerch 2008-11-02 17:47:17 EST
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
The SCSI device handler infrastructure (scsi_dh) has been updated, providing the following improvements:

* A generic ALUA (asymmetric logical unit access) handler has been implemented.

* Support for LSI RDAC SCSI-based storage devices has been added.
Comment 41 IBM Bug Proxy 2008-11-06 12:32:07 EST
Feature is verified in Beta1
Comment 42 Don 2008-11-17 11:05:58 EST
While testing a Clariion in ALUA mode with 5.3 Beta 1, I ran into the following problem.

Running multipath -ll returned nothing.


I have had multiple engineers look at this, and these are the findings.


> The reason you get no output back from the multipath -ll when 
> the array is in ALUA mode AND the ALUA lines are active is 
> probably related to the illegal CDB issue we saw yesterday 
> (the "rtpg sense code 05/24/00"). Is it possible to get a 
> fibre trace of that?

A Finisar trace of a multipath -ll command showed

>An illegal report target port groups CDB.
>Bytes 2-5 should be reserved and set to 0s.  Instead, it looks
>like the Inq Page 0x83 CDB is re-used without zeroing, as bytes
>2 and 4 contain 83 (page code) and 3C (allocation length).

>The problem is in the RHEL 5.3 beta (2.6.18-120.el5) code. I pulled that code
>and there's no blk_rq_init() call in blk_alloc_request(), which would explain
>the symptoms you're seeing. James Bottomley added blk_rq_init() in 2.6.26 to
>fix several uninitialized buffer issues, including this one.
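
To make the trace finding concrete, the sketch below reproduces the symptom on a plain byte buffer. It is not the kernel code path: the function names and the 16-byte buffer are assumptions for illustration, while the opcodes and field offsets follow the standard INQUIRY and REPORT TARGET PORT GROUPS CDB layouts. Reusing a buffer that last held an INQUIRY for VPD page 0x83, without zeroing it first, leaves 0x83 and 0x3C in the reserved bytes of the new CDB.

```python
INQUIRY = 0x12
MAINTENANCE_IN = 0xA3
MI_REPORT_TARGET_PGS = 0x0A  # service action for REPORT TARGET PORT GROUPS


def build_inquiry_vpd(cdb, page, alloc_len):
    """Fill cdb in place with a 6-byte INQUIRY for a VPD page."""
    cdb[0] = INQUIRY
    cdb[1] = 0x01                      # EVPD bit set
    cdb[2] = page                      # page code
    cdb[3] = (alloc_len >> 8) & 0xFF   # allocation length, high byte
    cdb[4] = alloc_len & 0xFF          # allocation length, low byte
    cdb[5] = 0x00                      # control


def build_rtpg(cdb, alloc_len, zero_first):
    """Fill cdb in place with REPORT TARGET PORT GROUPS (bytes 2-5 reserved)."""
    if zero_first:
        cdb[:] = bytes(len(cdb))       # clear whatever the last command left
    cdb[0] = MAINTENANCE_IN
    cdb[1] = MI_REPORT_TARGET_PGS
    cdb[6:10] = alloc_len.to_bytes(4, "big")


def reserved_bytes_after_reuse(zero_first):
    """Build an INQUIRY, reuse the buffer for RTPG, return CDB bytes 2-5."""
    cdb = bytearray(16)
    build_inquiry_vpd(cdb, 0x83, 0x3C)
    build_rtpg(cdb, 128, zero_first)
    return bytes(cdb[2:6])


print(reserved_bytes_after_reuse(zero_first=False).hex())  # 83003c00
print(reserved_bytes_after_reuse(zero_first=True).hex())   # 00000000
```

The first result matches the trace (bytes 2 and 4 hold 83 and 3C); the second shows the reserved bytes once the buffer is cleared before reuse, which is the effect the comment attributes to blk_rq_init().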
Comment 44 Andrius Benokraitis 2008-11-17 11:27:38 EST
Wayne, can you find out if Comment #42 is a new bug? IBM has verified this bugzilla earlier... if so, a new bug will be required and this put back to VERIFIED.
Comment 45 Andrius Benokraitis 2008-11-17 12:08:17 EST
I'm going on a hunch and a few conversations with folks, and will say that the item in comment #42 is a new bug and requires a new bugzilla. Donald (@EMC), would you create one?
Comment 46 Jerry Levy 2008-11-17 12:19:17 EST
Bear in mind that this bug has been fixed in newer code:

The problem is in the RHEL 5.3 beta (2.6.18-120.el5) code. I pulled that code and there's no blk_rq_init() call in blk_alloc_request(), which would explain the symptoms you're seeing. James Bottomley added blk_rq_init() in 2.6.26 to fix several uninitialized buffer issues, including this one.

jml
Comment 47 Don 2008-11-17 12:33:47 EST
Per request, I opened bug 471920.
Comment 51 errata-xmlrpc 2009-01-20 15:25:06 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
Comment 52 Wayne Berthiaume 2015-11-30 10:34:06 EST
Ack
