Description of problem:
If I update my multipath device settings (i.e. change /etc/multipath.conf and run 'multipath'), the device-mapper maps are updated (as verified through 'dmsetup table'), but multipathd later reloads the maps with stale values when uevent handlers are triggered (e.g. during error injection, when a path goes down). In particular, I saw this while experimenting with the 'no_path_retry' setting. As a user, I would expect that whenever I use 'multipath' to update the settings on my multipath devices, they persist until I update them again.

Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.5-12.0.RHEL4

How reproducible:
always

Steps to Reproduce:
1. configure dm-mp devices with 'no_path_retry queue' - run 'multipath' and '/etc/init.d/multipathd start'
   --> check 'dmsetup table' to make sure maps have queueing on
2. edit /etc/multipath.conf with 'no_path_retry fail' and run 'multipath'
   --> check 'dmsetup table' to make sure maps were updated (queueing is off)
3. disable a port, which causes half the paths to fail

Actual results:
'dmsetup table' output shows that queueing is on

Expected results:
'dmsetup table' should show that the maps still have queueing off

Additional info:
I was using no_path_retry, but I expect that keeping multipathd's stored settings/maps in sync is a more general problem, affecting any dm map and/or multipathd setting that is controlled through /etc/multipath.conf. It seems to me there are several ways to deal with this:
1. whenever 'multipath' is used to configure, also use the multipathd CLI to explicitly 'reconfigure' the daemon (but from a user-friendliness perspective, I don't think this is good)
2. return to the "multipath signals multipathd" mechanism so that multipathd's info is not stale
3. multipathd is notified of any map changes/reloads in dm

I think #3 is probably ideal, and perhaps this is what the NETLINK_DM mechanism is intended for. But since that is not quite ready yet for RHEL4, I quickly tried using the original signal code again from older multipath-tools (in the attached patch) and that seemed to work.
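The 'dmsetup table' check used in the steps to reproduce can be scripted. The following is only a sketch: it assumes the multipath target line has the layout "<name>: <start> <length> multipath <number of feature args> <feature args> ...", and that queueing shows up as the feature 'queue_if_no_path'. The helper name queueing_enabled and the two sample lines (map name, sizes, device numbers) are made up for illustration and are not taken from this report.

```python
def queueing_enabled(table_line):
    """Return True if a 'dmsetup table' multipath line lists queue_if_no_path.

    Assumed layout: <name>: <start> <length> multipath <#features> <features...> ...
    """
    fields = table_line.split()
    if "multipath" not in fields:
        raise ValueError("not a multipath target line")
    idx = fields.index("multipath")
    # The field after the target type is the count of feature arguments.
    num_feature_args = int(fields[idx + 1])
    features = fields[idx + 2 : idx + 2 + num_feature_args]
    return "queue_if_no_path" in features


# Hypothetical sample lines, shaped like 'dmsetup table' output.
QUEUE_ON = ("mpath0: 0 41943040 multipath 1 queue_if_no_path 0 2 1 "
            "round-robin 0 1 1 8:16 1000 round-robin 0 1 1 8:32 1000")
QUEUE_OFF = ("mpath0: 0 41943040 multipath 0 0 2 1 "
             "round-robin 0 1 1 8:16 1000 round-robin 0 1 1 8:32 1000")
```

Running this check after step 2 and again after the port is disabled should make the stale reload visible as a change from False back to True.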
Created attachment 124734: re-add multipath signal multipathd mechanism
Oh yeah, I also saw this same issue with multipath-tools git head from Feb. 10 and a 2.6.16 based kernel, but for the upstream multipath-tools I'm assuming netlink (once it's ready) is going to be used?
The patch in comment#1 may deadlock if the thread that receives the signal already holds vecs->lock. For example, if multipath(8) is executed while multipathd(8) is being stopped, child() can jump to sighup() while holding vecs->lock. I think having multipath(8) fork/exec 'multipathd -k"reconfigure"' is a better way to notify multipathd(8).
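The hazard can be illustrated outside multipathd with a small Python sketch. It is not the daemon's code: vecs_lock merely stands in for vecs->lock, the handler is named sighup only to mirror the comment above, and the handler uses a non-blocking acquire so that it reports the conflict instead of hanging the way a blocking lock would.

```python
import signal
import threading

vecs_lock = threading.Lock()   # stand-in for multipathd's vecs->lock (non-reentrant)
handler_saw_lock_free = []     # one entry per signal delivery


def sighup(signum, frame):
    # A handler that blocked on the lock here would never return if the
    # interrupted code already holds it. Try without blocking and record
    # the outcome instead.
    got_lock = vecs_lock.acquire(blocking=False)
    handler_saw_lock_free.append(got_lock)
    if got_lock:
        vecs_lock.release()


signal.signal(signal.SIGHUP, sighup)


def sighup_with_lock_held():
    # The signal arrives while this thread is inside the critical section.
    with vecs_lock:
        signal.raise_signal(signal.SIGHUP)
        return handler_saw_lock_free[-1]


def sighup_without_lock():
    signal.raise_signal(signal.SIGHUP)
    return handler_saw_lock_free[-1]
```

When the signal lands inside the critical section, the handler cannot take the lock. With a blocking acquire that would be a self-deadlock, which is why notifying the daemon through its own command interface avoids the problem.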
I posted a long comment in bugzilla #181309 that hopefully clears up why multipathd, not multipath, is the authoritative source for the no_path_retry value. There is really no reason to run multipath at all after updating multipath.conf; running: # multipathd -k"reconfigure" will correctly update the dm device. You can then run multipath -l to verify this if you want.
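For anyone scripting this, here is a rough sketch of issuing the reconfigure command from Python. It assumes the -k option belongs to the multipathd binary and that multipathd is on PATH; the function names are made up for illustration. Note that the shell form -k"reconfigure" reaches the program as the single argument -kreconfigure.

```python
import shlex
import subprocess

# Command as it would be typed at a root shell (assumed to target the daemon binary).
RECONFIGURE_CMD = 'multipathd -k"reconfigure"'


def reconfigure_argv():
    # shlex.split applies POSIX shell quoting rules, so the quoted word is
    # joined onto the option: ['multipathd', '-kreconfigure'].
    return shlex.split(RECONFIGURE_CMD)


def reconfigure_daemon():
    # Needs root and a running multipathd; raises CalledProcessError on failure.
    return subprocess.run(reconfigure_argv(), check=True)
```

Calling reconfigure_daemon() after editing /etc/multipath.conf replaces the manual step. The result can then be checked with 'multipath -l' or 'dmsetup table' as described above.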
I edited the multipath usage documentation to state that multipathd must be restarted after configuration file changes.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2006-0513.html