Bug 237691 - multipathd can die under the right circumstances
Summary: multipathd can die under the right circumstances
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: device-mapper-multipath
Version: 4.4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ben Marzinski
QA Contact: Corey Marthaler
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-04-24 19:45 UTC by nate.dailey
Modified: 2010-01-12 02:28 UTC (History)

Fixed In Version: RHEA-2007-0808
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-15 16:16:04 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2007:0808 0 normal SHIPPED_LIVE device-mapper-multipath enhancement update 2007-11-14 21:45:30 UTC

Description nate.dailey 2007-04-24 19:45:45 UTC
Description of problem:

multipathd dies.


Version-Release number of selected component (if applicable):

I hit this with both the RHEL4 update 4 and update 5 (RC1) versions of
device-mapper-multipath.


How reproducible:

Every time, following the instructions below


Steps to Reproduce:

1. Server is connected via a switch to an EMC CX300, zoned so that one HBA in
the server can talk to SP A in the CX300, the other HBA in the server can talk
to SP B in the CX300.

2. Pull the fibre channel cable from one SP in the CX300.

3. Run "multipath -l" repeatedly until it hangs. Kill it with ctrl-c, then run
"multipath -l" again (it will hang again). I don't know whether running it a
second time matters, but killing one invocation does: the problem does not
occur if you skip that step.

4. After a little while, the multipath -l command will un-hang itself; at this
point, multipathd dies.

I hit this under update 4 originally. I also hit the same problem in update 4
with Stratus ftScalable Storage, by shutting down and restarting both RAID
controllers.

Things are a bit different under update 5: when I ran "multipath -ll" I was not
able to ctrl-c the command, and in that case multipathd survived. With
"multipath -l" I was able to reproduce the problem. I did my update 5 testing
with Stratus ftScalable Storage, because I didn't have an update 5 system
connected to a Clariion.
  

Actual results:

multipathd dies


Expected results:

multipathd does not die


Additional info:

The following patch fixes the problem. It comes from
http://git.kernel.org/gitweb.cgi?p=linux%2Fstorage%2Fmultipath-tools%2F.git;a=log

[multipathd] ignore SIGPIPE
Christophe Varoqui [Thu, 16 Mar 2006 08:07:02 +0000 (09:07 +0100)]

Maxim Kozover reported daemon segfault when breaking the receiving side
of the unix socket. Credits.

diff --git a/multipathd/main.c b/multipathd/main.c
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1215,6 +1215,7 @@ signal_init(void)
        signal_set(SIGINT, sigend);
        signal_set(SIGTERM, sigend);
        signal_set(SIGKILL, sigend);
+       signal(SIGPIPE, SIG_IGN);
 }
 
 static void
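
For anyone who wants to see the mechanism in isolation, here is a minimal
sketch that is not taken from multipath-tools. It assumes the daemon dies
because it writes its reply to a unix socket whose client was already killed
with ctrl-c. The helper name write_after_peer_close() is hypothetical, and a
socketpair() stands in for the daemon's real listening socket.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of multipath-tools): write to a unix
 * socket after the peer has closed its end, roughly what a daemon does
 * when it replies to a client that has already exited.
 * Returns the errno from write(), 0 if the write unexpectedly succeeded,
 * or -1 if setup failed.
 */
int write_after_peer_close(void)
{
    int fds[2];

    /* Same call the patch adds. Without it, the write() below raises
     * SIGPIPE, whose default action terminates the process. */
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    close(fds[1]);                      /* the client goes away */

    ssize_t n = write(fds[0], "reply", 5);
    int err = (n == -1) ? errno : 0;    /* save errno before close() */

    close(fds[0]);
    return err;
}
```

With SIGPIPE ignored, the write fails with EPIPE and the caller can handle
the error and carry on. If the signal() call is removed, the same write
terminates the process instead, which matches the behaviour reported here.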

Comment 1 Ben Marzinski 2007-04-26 17:52:42 UTC
Patch applied. Thanks!

Comment 2 RHEL Program Management 2007-05-09 04:36:05 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 5 errata-xmlrpc 2007-11-15 16:16:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2007-0808.html


