Bug 494582

Summary: Stopping multipathd hangs the machine
Product: Red Hat Enterprise Linux 5
Component: device-mapper-multipath
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Keywords: ZStream
Target Milestone: rc
Reporter: RHEL Program Management <pm-rhel>
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: Cluster QE <mspqa-list>
CC: agk, bmarzins, bmr, christophe.varoqui, cmarthal, dwysocha, edamato, egoggin, heinzm, jpayne, jplans, junichi.nomura, kueda, lmb, mbroz, msnitzer, pm-eus, prockai, tao, tis, tranlan
Doc Type: Bug Fix
Last Closed: 2009-04-17 13:36:58 UTC
Bug Depends On: 459629

Description RHEL Program Management 2009-04-07 14:35:38 UTC
This bug has been copied from bug #459629 and has been proposed
to be backported to 5.3 z-stream (EUS).

Comment 3 Mike Snitzer 2009-04-07 15:48:32 UTC
Shutting down multipathd hangs the machine fairly regularly on s390x
under RHEL5.2 and RHEL5.3.  Other architectures could theoretically
experience the hang too (in practice this has not been common).

When multipathd shuts down, the main process signals all the waiter
threads to stop.  However, they cannot stop until the main process
unlocks its mutex, and the main process destroys the mutex immediately
after unlocking it.  Unless all the waiter threads finish their work
before the mutex is destroyed, they will attempt to lock and unlock a
destroyed mutex.
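
The ordering problem described above can be illustrated with a minimal
pthreads sketch.  This is not the multipathd source and not necessarily
how the actual fix works; the names demo_state, waiter and
shutdown_waiters_demo are hypothetical.  It only shows one common way to
avoid this class of race: join every waiter thread before destroying the
mutex they share.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared state; these names are not taken from multipathd. */
struct demo_state {
    pthread_mutex_t lock;
    bool stop;      /* set by the main thread to request shutdown */
    int finished;   /* number of waiters that exited cleanly */
};

/* Each waiter needs the shared lock to notice the stop request and
 * finish its work, so it cannot exit while the main thread holds it. */
static void *waiter(void *arg)
{
    struct demo_state *st = arg;
    bool done = false;

    while (!done) {
        pthread_mutex_lock(&st->lock);
        if (st->stop) {
            st->finished++;
            done = true;
        }
        pthread_mutex_unlock(&st->lock);
        if (!done)
            sched_yield();
    }
    return NULL;
}

/* Starts nthreads waiters, requests shutdown, and returns how many
 * waiters exited cleanly (or -1 on setup failure). */
int shutdown_waiters_demo(int nthreads)
{
    struct demo_state st = { .stop = false, .finished = 0 };
    pthread_t *tids;
    int i, started = 0, finished;

    assert(nthreads > 0);
    tids = calloc((size_t)nthreads, sizeof(*tids));
    if (tids == NULL)
        return -1;
    if (pthread_mutex_init(&st.lock, NULL) != 0) {
        free(tids);
        return -1;
    }

    /* The main thread holds the lock while it signals the waiters. */
    pthread_mutex_lock(&st.lock);
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, waiter, &st) == 0)
            started++;
    }
    st.stop = true;
    pthread_mutex_unlock(&st.lock);

    /* Key step: wait for every waiter BEFORE destroying the mutex.
     * Calling pthread_mutex_destroy() right after the unlock above
     * would reproduce the race described in this comment. */
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    finished = st.finished;   /* safe: no other thread is running now */
    pthread_mutex_destroy(&st.lock);
    free(tids);
    return finished;
}
```

Built with -pthread and called from a main(), shutdown_waiters_demo(4)
returns the number of waiters that exited cleanly.  Moving the
pthread_mutex_destroy() call above the join loop gives the unsafe
ordering, where waiters may lock a mutex that no longer exists.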

A known workaround is to send SIGKILL to multipathd so that it doesn't
attempt the graceful thread shutdown which could potentially hang.  The
following demonstrates the workaround:

# service multipathd status
multipathd (pid 2031) is running...
# killall -KILL multipathd
# service multipathd restart
Stopping multipathd daemon:                                [FAILED]
Starting multipathd daemon:                                [  OK  ]
# service multipathd status
multipathd (pid 2450) is running...

Comment 4 Ben Marzinski 2009-04-07 21:03:14 UTC
The fix has been backported.

Comment 7 errata-xmlrpc 2009-04-17 13:36:58 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0432.html