=Comment: #0=================================================
Emily J. Ratliff <emilyr.com> - 2008-03-21 17:40 EDT

1. Feature Id: [201049]
Feature Name: DM-MP SCSI Hardware Handlers
Sponsor: PPC
Category: Device Drivers and IO
Request Type: Kernel - Enhancement from IBM

2. Short Description
The SCSI HW Handler work is a community-driven effort to address issues with the current dm-mp handling of device-specific errors. The current dm hardware handlers will be migrated to the SCSI subsystem, where the handlers can obtain more detailed failure information and more detailed information about the devices. This will also allow the dm-mp layer to deal with devices in a more generic fashion without having to decode SCSI-specific data. This migration of the handlers is needed to fully support a number of storage devices and increase the utility of Linux's multipath solution.

3. Business Case
Follow the Linux storage system strategy of having a single unifying multipath solution that supports all devices. Currently the DM-MP hardware handler solution cannot meet customer expectations for devices with an active/passive path device model. This results in the use of out-of-distro multipath drivers, causing delays in storage certifications and increased support costs for managing the out-of-distro drivers.

Benefits
Fully compliant support for multipath I/O with Device Mapper on IBM hardware.

4. Sponsor Priority 1
IBM Confidential: no
Code Contribution: IBM code
Upstream Acceptance: In Progress
Component Version Target:
Performance Assistance: no

5. PM Contact: Stephanie Glass, sglass.com, 512-838-9284

6. Technical contact(s):
Daisy Chang, daisyc.com
Chandra Seetharaman, chandra.seetharaman.com

7. LTC Manager: Wendel Voigt, wvoigt.com
------- Comment From sglass.com 2008-03-25 08:00 EDT------- This feature needs to be deferred to RHEL 5.3
------- Comment From sglass.com 2008-03-25 12:51 EDT------- Sorry for the confusion, this needs to be deferred to RHEL 5.4.
Code is now upstream. Added to tracker.
------- Comment From sglass.com 2008-07-11 11:29 EDT------- Reopening for RHEL 5.4
------- Comment From chandra.seetharaman.com 2008-07-29 18:35 EDT-------
All the code needed for this feature is now available in 2.6.27-rc1. Ported all the patches to RHEL 5.2. Had a couple of KABI issues. Made changes to the code such that none of the existing symbols changed (of course, new symbols have been added). Currently testing the code.

Here are the commit links (in Linus's tree):
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a6a8d9f87eb8510a8f53672ea87703f62185d75f
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fbd7ab3eb53a3b88fefa7873139a62e439860155
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f6dd337ee4c440f29a873da3779eb3af44bd1623
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5e7dccad3621f6e2b572f309cf830a2c902cae80
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=cfae5c9bb66325cd32d5f2ee41f14749f062a53c
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=bab7cfc733f4453a502b7491b9ee37b091440ec4
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2651f5d7d3bc5120a439e498f131e4d731f99b3e
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=cb520223d7f22c5386aff27a5856a66e2c32aaac
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=688864e29869a71a8183e4e2f96ccf9f2de1375f
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fe9233fb6914a0eb20166c967e3020f7f0fba2c9
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=89a93f2f4834f8c126e8d9dd6b368d0b9e21ec3d
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=33af79d12e0fa25545d49e86afc67ea8ad5f2f40
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=765cbc6dad16b87724803e359d6be792ddf08614
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c05ae52fcb0e27a2ee4a16d1f31f8c547fd4886
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b6ff1b14cdf4b4cb5403f3af2c3272f7e609a241
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2aef6d5c05ee5c02f2e4d737b8738deb118cf892
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ca9f0089867c9e476cf2e6d4615d2aae887171b2
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=057ea7c9683c3d684128cced796f03c179ecf1c2
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ae11b1b36da726a8a93409b896704edc6b4f3402
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7c32c7a2d36c52d2b9ed040a9171364020ecc6a2
Created attachment 314153 [details] Port bus notifications infrastructure from mainline RHEL 5.2 doesn't have support for bus notifications, which is needed for SCSI Hardware Handlers. This is the direct port of the functionality from mainline
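For readers unfamiliar with the idea, bus notifications let interested code register a callback that is invoked when a device is added to or removed from a bus, which is how a hardware handler can attach to matching devices as they appear. The following is only a minimal userspace sketch of that callback-chain pattern, not the ported kernel code; every name in it (struct bus, bus_register, bus_notify, attach_counter) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes; the real kernel defines its own constants. */
enum bus_event { EVENT_ADD_DEVICE = 1, EVENT_DEL_DEVICE = 2 };

/* One registered callback in a singly linked chain. */
struct notifier {
    int (*call)(struct notifier *nb, enum bus_event ev, void *dev);
    struct notifier *next;
};

/* A bus owns the head of its notifier chain. */
struct bus {
    struct notifier *head;
};

/* Register a callback by pushing it on the front of the chain. */
static void bus_register(struct bus *b, struct notifier *nb)
{
    nb->next = b->head;
    b->head = nb;
}

/* Deliver an event to every registered callback; returns how many ran. */
static int bus_notify(struct bus *b, enum bus_event ev, void *dev)
{
    int called = 0;
    struct notifier *nb;

    for (nb = b->head; nb != NULL; nb = nb->next) {
        nb->call(nb, ev, dev);
        called++;
    }
    return called;
}

/* Example subscriber: tracks how many devices are currently attached. */
struct attach_counter {
    struct notifier nb;   /* must stay the first member for the cast below */
    int attached;
};

static int count_attach(struct notifier *nb, enum bus_event ev, void *dev)
{
    struct attach_counter *c = (struct attach_counter *)nb;

    (void)dev; /* the device itself is not inspected in this sketch */
    if (ev == EVENT_ADD_DEVICE)
        c->attached++;
    else if (ev == EVENT_DEL_DEVICE)
        c->attached--;
    return 0;
}
```

A subscriber registered with bus_register() sees every later add/remove event; events delivered before registration are simply missed, which is one reason the point at which the notification fires during device setup matters.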
Created attachment 314154 [details] ABI workaround for bus notifications ABI workaround for bus notifications
Ported and tested all the patches. Fixed all the ABI issues. This set of patches does not remove or change any existing kernel ABI; it only adds a few symbols. Did _not_ attach the EMC handler and HP handler, as they might affect the user interface (they currently accept arguments, but SCSI_DH doesn't).
Created attachment 314155 [details] Basic infrastructure for SCSI Hardware Handler Basic infrastructure for SCSI Hardware Handler
Created attachment 314156 [details] ABI workaround for scsi_device data structure change ABI workaround for scsi_device data structure change
Created attachment 314157 [details] Handler for RDAC device Handler for RDAC device
Created attachment 314158 [details] Handler for alua Handler for ALUA
Created attachment 314159 [details] Changes in dm layer to use SCSI Hardware Handler Changes in dm layer to use SCSI Hardware Handler
Created attachment 314160 [details] Do not remove Hardware Handler code Do not remove Hardware Handler code for ABI and backward compatibility reasons
Created attachment 314161 [details] Do not allow arguments for SCSI DH Hardware Handlers Do not allow arguments for SCSI DH Hardware Handlers
Created attachment 314162 [details] Add a workqueue to handle events Add a workqueue to handle events
Created attachment 314163 [details] Remove RDAC Handler from DM layer Remove RDAC Handler from DM layer
Created attachment 314339 [details] Handler for EMC device Added the patch as per MikeC's suggestion.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
*** Bug 240548 has been marked as a duplicate of this bug. ***
Created attachment 314426 [details] ABI workaround for bus notifications Removed a debug message and a coding error.
(In reply to comment #23) > Created an attachment (id=314426) [details] > ABI workaround for bus notifications > > Removed a debug message and a coding error. I replaced this patch: https://bugzilla.redhat.com/attachment.cgi?id=314154 with the new one. The updated patchset is here http://people.redhat.com/mchristi/scsi_dh/
Created attachment 314828 [details] Set path state to passive if the path is not owned
Created attachment 314829 [details] Move bus notification earlier in device_add
Testers, I updated the patchset with IBM's patches here: http://people.redhat.com/mchristi/scsi_dh/scsi-dh3/
Patches sent to internal list for review. Here is the final patchset: http://people.redhat.com/mchristi/scsi_dh/scsi-dh9/
in kernel-2.6.18-113.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5
Created attachment 316641 [details] Fix a race situation Path activation code has been moved to a workqueue, and the path to be activated was read from current_pgpath. If the path changes before the queued work runs, current_pgpath can become NULL, which leads to a panic. This patch fixes the problem.
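To illustrate the kind of race described here, the sketch below models it in plain userspace C: the path to activate is recorded in the work item when the work is queued, so the worker no longer dereferences a shared pointer that may have been cleared in the meantime. This is only an assumption about one common way to close such a race; the attached patch may do it differently, and apart from current_pgpath all names (activate_work, queue_activation, run_activation) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct pgpath {
    int id;
    int activated;
};

/* Shared multipath state; current_pgpath may be reset to NULL at any time. */
struct multipath {
    struct pgpath *current_pgpath;
};

/* Work item that remembers which path it was queued for. */
struct activate_work {
    struct multipath *m;
    struct pgpath *path;
};

/* Queue time: snapshot the path instead of re-reading it in the worker. */
static void queue_activation(struct activate_work *w, struct multipath *m)
{
    w->m = m;
    w->path = m->current_pgpath;
}

/*
 * Worker: uses the snapshot and tolerates "nothing to activate".
 * Returns 0 on success, -1 if no path was recorded.
 * A real kernel fix would also need locking or reference counting so the
 * recorded path cannot be freed before the work runs; that is omitted here.
 */
static int run_activation(struct activate_work *w)
{
    struct pgpath *p = w->path;

    if (p == NULL)
        return -1;
    p->activated = 1;
    return 0;
}
```

With this shape, clearing current_pgpath between queue_activation() and run_activation() no longer causes a NULL dereference in the worker.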
IBM - this bug is already in MODIFIED for a previously submitted patchset (Comment #28). You'll have to post your patch attached in Comment #30 in a new bugzilla for post-Beta consideration (if it meets the post-Beta criteria).
This enhancement request was evaluated by the full Red Hat Enterprise Linux team for inclusion in a Red Hat Enterprise Linux minor release. As a result of this evaluation, Red Hat has tentatively approved inclusion of this feature in the next Red Hat Enterprise Linux Update minor release. While it is a goal to include this enhancement in the next minor release of Red Hat Enterprise Linux, the enhancement is not yet committed for inclusion in the next minor release pending the next phase of actual code integration and successful Red Hat and partner testing.
Andrius is correct, any additional patches require a separate bug and separate review on rh-kernel list.
(In reply to comment #40)
> ------- Comment From dzickus 2008-09-12 21:46:37 EDT-------
> in kernel-2.6.18-113.el5
> You can download this test kernel from http://people.redhat.com/dzickus/el5

Tested this code level and it is working as expected. Will file a separate bug for the patch provided in Comment #39.
This needs to be retested once we have the official beta in October, reopening until then.
Release note added for 5.3. Please edit the "Release Notes" field above with any alterations / additions to this release note.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: the SCSI device handler infrastructure (scsi_dh) has been updated, providing the following improvements: * a generic ALUA (asymmetric logical unit access) handler has been implemented. * added support for LSI RDAC SCSI based storage devices.
Feature is verified in Beta1
While testing a Clariion in ALUA mode with 5.3 Beta 1, I ran into the following problem: running multipath -ll returned nothing. Multiple engineers have looked at this, and these are the findings.

> The reason you get no output back from the multipath -ll when
> the array is in ALUA mode AND the ALUA lines are active is
> probably related to the illegal CDB issue we saw yesterday
> (the "rtpg sense code 05/24/00"). Is it possible to get a
> fibre trace of that?

A Finisar trace of a multipath -ll command showed:

>An illegal report target port groups CDB.
>Bytes 2-5 should be reserved and set to 0s. Instead, it looks
>like the Inq Page 0x83 CDB is re-used without zeroing, as bytes
>2 and 4 contain 83 (page code) and 3C (allocation length).

>The problem is in the RHEL 5.3 beta (2.6.18-120.el5) code. I pulled that code
>and there's no blk_rq_init() call in blk_alloc_request(), which would explain
>the symptoms you're seeing. James Bottomley added blk_rq_init() in 2.6.26 to
>fix several uninitialized buffer issues, including this one.
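The symptom in the trace can be reproduced in a small userspace model: if a command buffer that last held an INQUIRY (VPD page 0x83) CDB is reused for REPORT TARGET PORT GROUPS without being cleared, the stale page code and allocation length land in bytes that are reserved for the new command. This is only a sketch under stated assumptions, not the kernel code path; the function names are hypothetical, and the opcode values (INQUIRY 0x12, MAINTENANCE IN 0xA3 with service action 0x0A) are taken from the SCSI Primary Commands specification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CDB_LEN               12
#define INQUIRY_OPCODE        0x12
#define MAINTENANCE_IN_OPCODE 0xA3
#define SA_REPORT_TARGET_PGS  0x0A

/* Build an INQUIRY CDB for VPD page 0x83 with a one-byte allocation length. */
static void build_inquiry_vpd83(uint8_t *cdb, uint8_t alloc_len)
{
    memset(cdb, 0, CDB_LEN);
    cdb[0] = INQUIRY_OPCODE;
    cdb[1] = 0x01;          /* EVPD bit */
    cdb[2] = 0x83;          /* page code */
    cdb[4] = alloc_len;     /* allocation length (low byte) */
}

/*
 * Models the faulty behaviour: only the fields RTPG uses are written,
 * so whatever was in bytes 2-5 before stays there.
 */
static void build_rtpg_no_clear(uint8_t *cdb, uint32_t alloc_len)
{
    cdb[0] = MAINTENANCE_IN_OPCODE;
    cdb[1] = SA_REPORT_TARGET_PGS;
    cdb[6] = (uint8_t)(alloc_len >> 24);   /* allocation length, big-endian */
    cdb[7] = (uint8_t)(alloc_len >> 16);
    cdb[8] = (uint8_t)(alloc_len >> 8);
    cdb[9] = (uint8_t)alloc_len;
}

/* Zero the whole buffer first, then fill in the RTPG fields. */
static void build_rtpg(uint8_t *cdb, uint32_t alloc_len)
{
    memset(cdb, 0, CDB_LEN);
    build_rtpg_no_clear(cdb, alloc_len);
}

/* Bytes 2-5 of an RTPG CDB are reserved and must be zero. */
static int rtpg_reserved_bytes_clear(const uint8_t *cdb)
{
    return cdb[2] == 0 && cdb[3] == 0 && cdb[4] == 0 && cdb[5] == 0;
}
```

Calling build_inquiry_vpd83(cdb, 0x3C) followed by build_rtpg_no_clear(cdb, 128) leaves 0x83 in byte 2 and 0x3C in byte 4, matching the trace; build_rtpg() yields a CDB whose reserved bytes are all zero.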
Wayne, can you find out if Comment #42 is a new bug? IBM has verified this bugzilla earlier... if so, a new bug will be required and this put back to VERIFIED.
Going on a hunch and a few conversations with folks, I'd say the item in comment #42 is a new bug and requires a new bugzilla. Donald (@EMC), would you create one?
Bear in mind that this bug has been fixed in newer code: The problem is in the RHEL 5.3 beta (2.6.18-120.el5) code. I pulled that code and there's no blk_rq_init() call in blk_alloc_request(), which would explain the symptoms you're seeing. James Bottomley added blk_rq_init() in 2.6.26 to fix several uninitialized buffer issues, including this one. jml
Per request, I opened bug 471920.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-0225.html
Ack