181734 – multipathd out-of-sync with /etc/multipath.conf settings after reconfiguration

Summary: multipathd out-of-sync with /etc/multipath.conf settings after reconfiguration

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	device-mapper-multipath
Sub Component:
Version:	4.3
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Alasdair Kergon
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	181409

Reported:	2006-02-16 02:09 UTC by Lan Tran
Modified:	2010-01-12 02:24 UTC
CC List:	10 users
Fixed In Version:	RHEA-2006-0513
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2006-08-10 21:46:06 UTC
Target Upstream Version:
Embargoed:

Attachments
re-add multipath signal multipathd mechanism (2.54 KB, patch) 2006-02-16 02:13 UTC, Lan Tran

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2006:0513	0	normal	SHIPPED_LIVE	device-mapper-multipath enhancement update	2006-08-10 04:00:00 UTC

Description Lan Tran 2006-02-16 02:09:41 UTC

Description of problem:
If I update my multipath device settings (i.e. change /etc/multipath.conf and
run 'multipath'), the device-mapper maps are updated (as verified through
'dmsetup table'), but multipathd later reloads the maps with stale values when
its uevent handlers are triggered (e.g. when a path goes down during error
injection). In particular, I saw this while experimenting with the
'no_path_retry' setting. As a user, I would expect that whenever I use
'multipath' to update the settings on my multipath devices, they stay in
effect until I update them again.


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.5-12.0.RHEL4

How reproducible:
always

Steps to Reproduce:
1. configure dm-mp devices with 'no_path_retry queue', then run 'multipath'
and '/etc/init.d/multipathd start'
   --> check 'dmsetup table' to make sure maps have queueing on
2. edit /etc/multipath.conf to set 'no_path_retry fail' and run 'multipath'
   --> check 'dmsetup table' to make sure maps were updated (queueing is off)
3. disable a port, which causes half the paths to fail
 
Actual results:
'dmsetup table' output shows that queueing is on again

Expected results:
'dmsetup table' should reflect that the maps still have queueing off 
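
A minimal sketch of how the 'dmsetup table' check in the steps above could be scripted. It assumes that queueing shows up in a multipath map's table line as the dm-multipath feature 'queue_if_no_path'; the function name and the sample table line (device name, sizes, major:minor numbers) are made up for illustration.

```shell
# Return 0 if one line of `dmsetup table` output describes a multipath
# map with the queue_if_no_path feature set, 1 otherwise.
# Usage: map_queues_io "<one line of dmsetup table output>"
map_queues_io() {
    # Pad with spaces so the keywords only match as whole words.
    case " $1 " in
        *" multipath "*" queue_if_no_path "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical table line; real output depends on the storage setup.
line='mpath0: 0 20971520 multipath 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:16 100 8:32 100'
if map_queues_io "$line"; then
    echo "queueing on"
else
    echo "queueing off"
fi
```

On a live system the same function could be fed each line of real 'dmsetup table' output after every step, to see at which point queueing comes back.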

Additional info: 
I was using no_path_retry, but I expect that keeping multipathd's stored
settings/maps in sync is a more general problem for any dm map and/or
multipathd setting that is controlled through /etc/multipath.conf. It
seems to me there are several ways to deal with this:
1. whenever 'multipath' is used to configure, also use the multipathd CLI to
explicitly 'reconfigure' the daemon (but I don't think this is user-friendly)
2. return to the "multipath signal multipathd" mechanism so that multipathd's
info is not stale
3. notify multipathd of any map changes/reloads in dm

I think #3 is probably ideal, and perhaps this is what the NETLINK_DM mechanism
is intended for. But since that is not quite ready yet for RHEL4, I quickly
tried using the original signal code from older multipath-tools again (in the
attached patch), and that seemed to work.
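
Option 1 above could be sketched as a small shell wrapper. This is only an illustration, not part of the attached patch: the function name and the MULTIPATH/MULTIPATHD variables are hypothetical, and it assumes the daemon's interactive command option is invoked as multipathd -k"reconfigure".

```shell
# Hypothetical helper: apply /etc/multipath.conf changes and then tell the
# running daemon to re-read its configuration.
# MULTIPATH and MULTIPATHD can be overridden (e.g. with stubs) for a dry run.
: "${MULTIPATH:=multipath}"
: "${MULTIPATHD:=multipathd}"

apply_multipath_conf() {
    # Rebuild the device-mapper maps from the configuration file.
    "$MULTIPATH" || return 1
    # Ask multipathd to reconfigure so that later uevent-driven reloads
    # do not bring back stale settings.
    "$MULTIPATHD" -k"reconfigure" || return 1
}
```

Calling apply_multipath_conf after editing the configuration file would replace the bare 'multipath' call in step 2 of the reproduction steps; it stops early if the first command fails, so the daemon is not asked to reconfigure against maps that were never rebuilt.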

Comment 1 Lan Tran 2006-02-16 02:13:42 UTC

Created attachment 124734 [details]
re-add multipath signal multipathd mechanism

Comment 2 Lan Tran 2006-02-16 02:18:11 UTC

Oh yeah, I also saw this same issue with multipath-tools git head from Feb. 10
and a 2.6.16 based kernel, but for the upstream multipath-tools I'm assuming
netlink (once it's ready) is going to be used?

Comment 3 Kiyoshi Ueda 2006-02-16 19:25:21 UTC

The patch in comment #1 may deadlock when the thread that receives the
signal already holds vecs->lock.
For example, if multipath(8) is run while multipathd(8) is being
stopped, child() can jump to sighup() while holding vecs->lock.

I think having multipath(8) fork/exec 'multipathd -k"reconfigure"'
is a better way to notify multipathd(8).

Comment 4 Ben Marzinski 2006-02-28 21:03:43 UTC

I posted a long comment in bugzilla #181309 that hopefully clears up why
multipathd, not multipath, is the authoritative source for the no_path_retry
value. There is really no reason to run multipath at all after updating
multipath.conf. Running:

# multipathd -k"reconfigure"

will correctly update the dm device.  You can then run multipath -l to verify
this if you want.

Comment 5 Ben Marzinski 2006-04-19 21:19:15 UTC

I edited the multipath usage documentation to state that multipathd must be
restarted after configuration file changes.

Comment 11 Red Hat Bugzilla 2006-08-10 21:46:07 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2006-0513.html
