Description of problem:
If I update my multipath device settings (i.e. change
/etc/multipath.conf and run 'multipath'), the device-mapper maps get updated (as
verified through 'dmsetup table'), but multipathd later reloads the maps with
stale values when its uevent handlers are triggered (e.g. during error
injection, when a path goes down). In particular, I saw this while playing around
with the 'no_path_retry' setting. As a user, I would expect that whenever I use
'multipath' to update the settings on my multipath devices, they stay
in effect until I update them again.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. configure dm-mp devices with 'no_path_retry queue' - run
'multipath' and '/etc/init.d/multipathd start'
--> check 'dmsetup table' to make sure maps have queueing on
2. edit /etc/multipath.conf with 'no_path_retry fail' and run 'multipath'
--> check 'dmsetup table' to make sure maps were updated (queueing is off)
4. disable a port, which causes half the paths to fail
Actual results:
'dmsetup table' output reflects that queueing is on
Expected results:
'dmsetup table' should reflect that the maps still have queueing off
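The 'dmsetup table' checks in the steps above can be scripted. This is a minimal sketch, assuming that queueing shows up as the 'queue_if_no_path' feature in the multipath table line; the helper name queueing_state and the sample table lines (sizes and device numbers) are made up for illustration.

```shell
# Print "on" if a multipath table line carries the queue_if_no_path
# feature, otherwise "off". The argument is one line in the format
# printed by 'dmsetup table'.
queueing_state() {
    case " $1 " in
        *" queue_if_no_path "*) echo on ;;
        *) echo off ;;
    esac
}

# Illustrative table lines only; sizes and device numbers are invented.
sample_queue='0 71014400 multipath 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:16 1000 8:32 1000'
sample_fail='0 71014400 multipath 0 0 1 1 round-robin 0 2 1 8:16 1000 8:32 1000'

queueing_state "$sample_queue"   # prints: on
queueing_state "$sample_fail"    # prints: off
```

On a live system the same helper could be fed real output, e.g. queueing_state "$(dmsetup table mpath0)", where mpath0 is a hypothetical map name.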
I was using no_path_retry, but I expect the problem of keeping multipathd's
stored settings/maps in-sync would be a more general problem with any dm map
and/or multipathd setting that is controlled through /etc/multipath.conf. It
seems to me there are several ways to deal with this:
1. whenever 'multipath' is used to configure, also use the multipathd CLI to
explicitly 'reconfigure' the daemon (but from a user-friendliness perspective, I
don't think this is good)
2. return to the "multipath signal multipathd" mechanism so that multipathd info
is not stale
3. multipathd is notified of any map changes/reloads in dm
I think #3 is probably ideal and perhaps this is what the NETLINK_DM mechanism
is intended for. But since that is not quite ready yet for RHEL4, I quickly
tried using the original signal code again from older multipath-tools (in the
attached patch) and that seemed to work.
Created attachment 124734
re-add multipath signal multipathd mechanism
Oh yeah, I also saw this same issue with multipath-tools git head from Feb. 10
and a 2.6.16 based kernel, but for the upstream multipath-tools I'm assuming
netlink (once it's ready) is going to be used?
The patch in comment #1 may deadlock when the thread
receiving the signal already holds vecs->lock.
For example, if multipath(8) is executed while multipathd(8) is
being stopped, child() can jump to sighup() while holding vecs->lock.
I think that having multipath(8) fork/exec 'multipathd -k"reconfigure"'
is a better way to notify multipathd(8).
I posted a long comment in bugzilla #181309 that hopefully clears up why
multipathd, not multipath, is the authoritative source for the no_path_retry value.
There is really no reason to run multipath at all after updating
/etc/multipath.conf. Running
# multipathd -k"reconfigure"
will correctly update the dm device. You can then run multipath -l to verify this
if you want.
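As a rough sketch of that workflow: the function below asks the running daemon to re-read its configuration through the multipathd CLI and then reports whether a given map still queues I/O. The function name apply_and_check and the map name in the usage line are hypothetical. It assumes root privileges, a running multipathd, and that the reconfigure has taken effect by the time the command returns.

```shell
# Tell the running daemon to re-read /etc/multipath.conf, then report
# whether the named map still has queue_if_no_path set.
# Needs root and a running multipathd.
apply_and_check() {
    map=$1
    # Use the daemon's CLI; stop here if the daemon cannot be reached.
    multipathd -k"reconfigure" || return 1
    if dmsetup table "$map" | grep -qw queue_if_no_path; then
        echo "$map: queueing on"
    else
        echo "$map: queueing off"
    fi
}

# Usage (mpath0 is a hypothetical map name):
#   apply_and_check mpath0
```

Because the daemon itself reloads the map here, a later path failure should not bring back the old no_path_retry value.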
I edited the multipath usage documentation to state that multipathd must be
restarted after configuration file changes.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.