Bug 520065 - [FOCUS] [MRG-1.2] When the dev_loss_tmo fires don't remove devices by default.
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.2
Hardware: x86_64
OS: All
Target Milestone: 1.3
Assignee: John Kacur
QA Contact: David Sommerseth
Depends On: 514541
Reported: 2009-08-28 08:46 UTC by David Sommerseth
Modified: 2016-05-22 23:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 514541
Last Closed: 2010-09-15 09:52:24 UTC


System: IBM Linux Technology Center
ID: 54353
Priority: None
Status: None
Summary: None
Last Updated: Never

Description David Sommerseth 2009-08-28 08:46:55 UTC
+++ This bug was initially created as a clone of Bug #514541 +++

This bug contained both MRG-1.1 and MRG-1.2 work. It is being split so that there is one bug per version: the original, bug #514541, covers the MRG/R 1.1.8 release, and this clone covers the upcoming MRG-1.2 release.

In R2, if you pull a Fibre Channel cable and the dev_loss_tmo expires,
the FC transport class removes the attached SCSI devices (/dev/sdX
is deleted). This behavior is altered in RHEL by the BZ 215797 patch.
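
For reference, the timeout in question can be inspected per remote port through sysfs. The sketch below is only an illustration and is not part of the patch: it assumes the usual FC transport class layout (/sys/class/fc_remote_ports/rport-*/dev_loss_tmo), and the function name and the base-directory parameter are made up for this example.

```python
import glob
import os


def read_dev_loss_tmo(base="/sys/class/fc_remote_ports"):
    """Return {rport_name: dev_loss_tmo_seconds} for every FC remote port.

    `base` is a parameter only so the function can be pointed at a test
    directory; the default assumes the standard sysfs location.
    """
    timeouts = {}
    pattern = os.path.join(base, "rport-*", "dev_loss_tmo")
    for path in sorted(glob.glob(pattern)):
        rport = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                timeouts[rport] = int(f.read().strip())
        except (OSError, ValueError):
            # The port vanished or the attribute was unreadable: skip it.
            continue
    return timeouts


if __name__ == "__main__":
    for rport, tmo in read_dev_loss_tmo().items():
        print(f"{rport}: dev_loss_tmo={tmo}s")
```

On a machine without FC remote ports the function simply returns an empty dict.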

The problem with removing SCSI devices is that userspace
(udev, multipathd) does not handle the hotplug events quickly enough,
which causes various issues.

This defect is to port the patch published as part of Red Hat BZ 215797
and validate it.

Even with this patch, it looks like we may still need to comment out the udev rule that invokes
/sbin/multipath on the add-path udev event.

This patch doesn't stop path-add udev events from being delivered to udev in userspace.
Hence multipathd and multipath are both acting in an unsynchronized fashion.

I think this patch only avoids the need for a modified multipathd (increased delay).
Testing is underway to verify that.

Looks like I drew conclusions a little early. Judging by the dump and the logs, the
system is running fine, but it is very slow to respond because root and blast
are running on a single path.

The patch is working fine, and the events it generates are "change" events,
not "new" events. Hence we can even uncomment the udev rule.
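
To illustrate why the stock rule stays quiet here: a rule that fires only on path-add events does not match a "change" event. Below is a minimal model of that matching, not the actual udev rule or the patch. The function name and the event dicts are hypothetical, and it assumes that the "new" events mentioned above correspond to udev "add" events on the block subsystem.

```python
def should_run_multipath(event):
    """Model of a udev rule that runs /sbin/multipath only on block 'add' events.

    `event` is a hypothetical dict with 'action' and 'subsystem' keys; a real
    rule would match on ACTION and SUBSYSTEM in the udev rules file instead.
    """
    return event.get("action") == "add" and event.get("subsystem") == "block"


# Hypothetical event stream for one path across a port bounce.
events = [
    {"action": "add", "subsystem": "block", "devname": "sdb"},
    {"action": "change", "subsystem": "block", "devname": "sdb"},
    {"action": "remove", "subsystem": "block", "devname": "sdb"},
]

# Only the 'add' event would invoke multipath; 'change' passes through.
triggered = [e["action"] for e in events if should_run_multipath(e)]
print(triggered)
```

In this model a "change" event never invokes multipath, which matches the observation that the rule can stay enabled.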

24-hour tests gave promising results.

- I re-enabled the udev rule so that it is back to the default.
- I also replaced multipathd with the one that came with the RHEL 5.2 installation.

Applied the ported patch, rebuilt the kernel, and booted.
With this, blast runs fine, and port bounces do not cause any
disturbance in the mpath paths.

This looks good; requesting that it be mirrored for RH inclusion.

--- Additional comment from bugproxy@us.ibm.com on 2009-07-29 11:22:05 EDT ---

Created an attachment (id=355565)
MRG 1.1 patch

--- Additional comment from lgoncalv@redhat.com on 2009-08-03 19:17:35 EDT ---

Created an attachment (id=356084)
diff -u -p version of the last patch

JV, could you please check whether this patch is a faithful translation of yours? I needed the patch in diff -u format (and, if possible, -p).


--- Additional comment from bugproxy@us.ibm.com on 2009-08-21 17:40:45 EDT ---

------- Comment From jvrao@us.ibm.com 2009-08-21 17:37 EDT-------
Strange. I don't see this patch in MRG 1.2 either, though I remember seeing it some time back in one of the early MRG 1.2 releases. Anyway, we need this patch
in MRG 1.2. Please absorb it.


--- Additional comment from jkacur@redhat.com on 2009-08-25 18:51:47 EDT ---

Created an attachment (id=358647)
scsi-fc-transport-removal-of-target-configurable.patch ported to MRG 1.2

--- Additional comment from bugproxy@us.ibm.com on 2009-08-25 22:00:43 EDT ---

------- Comment From jvrao@us.ibm.com 2009-08-25 21:56 EDT-------
(In reply to comment #19)
> Created an attachment (id=47783) [details]
> scsi-fc-transport-removal-of-target-configurable.patch ported to MRG 1.2
> ------- Comment (attachment only) From jkacur@redhat.com 2009-08-25 18:51:47
> EDT-------

I have ported the patch to MRG 1.2 and compared it with this one, and they match.
I have tested my patch successfully, which implies this one is tested as well. :)

--- Additional comment from bugproxy@us.ibm.com on 2009-08-26 00:41:56 EDT ---

Created an attachment (id=358667)
MRG1.2-rc6 patch

------- Comment on attachment From jvrao@us.ibm.com 2009-08-26 00:31 EDT-------

This is what I have done, and it looks similar to what jkacur has done.

Comment 1 Luis Claudio R. Goncalves 2009-08-28 11:53:28 UTC
Patch added to kernel-rt-2.6.31-rc7-rt7-mrg12 by Jkacur.

Comment 2 Clark Williams 2010-09-14 20:23:18 UTC
Do we know if this is still an issue for the MRG 1.3 kernel? If not, we should close this with CURRENTRELEASE status.
