Red Hat Bugzilla – Bug 194411
[RHEL4 U5] dm-multipath: multipath command fails when a path is added to a map with failed path.
Last modified: 2010-01-11 21:35:25 EST
Description of problem:
When multipathd(8) is running and a map has a failed path,
running multipath(8) to add a path to that map fails.
Version-Release number of selected component:
Steps to Reproduce:
1. Prepare a storage device that has more than one path.
(e.g. /dev/sda and /dev/sdb are two paths to the same device.)
2. Start multipathd.
# /etc/init.d/multipathd start
3. Remove one path.
# echo 1 > /sys/block/sdb/device/delete
4. Create a multipath map using the remaining path.
(In this example, the multipath map should consist of only /dev/sda.)
5. Make the remaining path in the map fail.
# echo offline > /sys/block/sda/device/state
6. Hot-add the removed path.
# echo "scsi add-single-device <host> <channel> <bus> <lun>" \
7. Run multipath to add the hot-added path to the map.
Actual results:
multipath command fails with the following message.
device-mapper: reload ioctl failed: Invalid argument
Expected results:
multipath command succeeds.
multipath(8) tries to reload a table that includes the failed path
(in the case above, /dev/sda), and the kernel rejects it.
The failed path ends up in the table as follows:
the wwid of the failed path (/dev/sda) is loaded in cache_load() and
removed once in path_discovery(). But in disassemble_map(),
it is copied again from mpp->wwid.
Therefore, the failed path (/dev/sda) is used in coalesce_paths().
Note that if this bug is fixed, adding a path will cause
the failed path to be removed from the existing multipath map
(silently and automatically, by the hotplug script).
So, even when the failed path comes back online, it will no longer
be part of the multipath map.
Users could see this as a regression.
So both problems must be fixed at the same time.
Proposed fix for multipath:
Exclude the failed path from the table in coalesce_paths().
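To illustrate the idea of leaving failed paths out of the table, here is a minimal shell sketch. It is not the attached patch, which changes coalesce_paths() in the multipath C source. It only shows the filtering step, using the same sysfs state file as in the reproduction steps. The function name usable_paths and the SYSFS_ROOT variable are made up for this example.

```shell
#!/bin/sh
# Illustration only, not the attached patch.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a fake directory tree; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print only those block devices whose SCSI device state is "running".
usable_paths() {
    for dev in "$@"; do
        state_file="$SYSFS_ROOT/block/$dev/device/state"
        # No readable state file (e.g. the device was deleted): skip it.
        [ -r "$state_file" ] || continue
        state=$(cat "$state_file")
        if [ "$state" = "running" ]; then
            echo "$dev"
        fi
    done
}
```

With /dev/sda offline and /dev/sdb running, "usable_paths sda sdb" would print only sdb, which is the set of paths the reloaded table should contain.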
Proposed fix for multipathd:
Monitor the failed path even if it isn't included in any map,
as long as its wwid is the same as the wwid of a monitored map.
(This behavior is already implemented.)
When the failed path comes back online, fork() and exec() multipath(8).
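The intended behavior can be approximated in shell as a rough sketch. The proposed patch does this inside multipathd itself, so the loop below is only an analogy. The names readd_when_online, path_state, SYSFS_ROOT, POLL_INTERVAL and MULTIPATH_CMD are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Illustration only: the proposed patch implements this in multipathd (C),
# not as a shell loop.
# SYSFS_ROOT, POLL_INTERVAL and MULTIPATH_CMD are hypothetical knobs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
POLL_INTERVAL="${POLL_INTERVAL:-5}"
MULTIPATH_CMD="${MULTIPATH_CMD:-multipath}"

# Print the SCSI device state of a path; prints nothing if it is missing.
path_state() {
    cat "$SYSFS_ROOT/block/$1/device/state" 2>/dev/null
}

# Poll a failed path and rerun multipath once it is "running" again.
# Note: this loops forever if the device never comes back.
readd_when_online() {
    dev=$1
    until [ "$(path_state "$dev")" = "running" ]; do
        sleep "$POLL_INTERVAL"
    done
    $MULTIPATH_CMD
}
```

For example, "readd_when_online sda" would block until /dev/sda is set back to running and then run multipath once.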
Created attachment 130709
proposed patch for multipath
Created attachment 130710
proposed patch for multipathd
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
I'm not totally happy with this solution.
1. It makes multipathd exec multipath, and ideally we're trying to make
multipathd more and more self-sufficient, and the multipath program more of just
a call into it. This heads in the opposite direction.
2. More importantly, I don't think that failed paths should disappear from the
map when you add new ones.
Alasdair, is there a reason why the kernel cannot allow you to create a
multipath map with a failed path in it?
As a workaround, I believe that customers who want to add a new path while
there is a failed one can kill multipathd, rerun multipath, and start
multipathd back up. (Without multipathd running, multipath will do exactly
what the patch does: it will remove the failed path and add the new path.)
Forcing this sort of manual intervention will keep the customer from being
surprised by losing the path. It is pretty unsightly, I admit, and I'd rather
just be able to reload the map with the failed path.
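The workaround above could be scripted roughly as follows. This is a sketch under assumptions: it stops the daemon through the init script (the counterpart of the start command in the reproduction steps) rather than killing it directly, and the DRY_RUN switch and the function names are invented for this example. It would need to run as root on the affected host.

```shell
#!/bin/sh
# Sketch of the manual workaround; not an official procedure.
# DRY_RUN is a hypothetical switch: when set to 1, commands are printed
# instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

readd_path_workaround() {
    # Stop the daemon so that multipath rebuilds the map on its own.
    run /etc/init.d/multipathd stop || return 1
    # Without multipathd running, this drops the failed path from the
    # map and adds the new path.
    run multipath || return 1
    # Resume path monitoring.
    run /etc/init.d/multipathd start || return 1
}
```

Running "DRY_RUN=1; readd_path_workaround" first prints the three commands without touching the system. Keep in mind that the failed path is gone from the map afterwards, so multipath has to be rerun once that path is healthy again.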
I completely agree with Ben's comment#5.
Being able to reload a map with a failed path is a nice idea,
but it is probably not preferred on the kernel side.
Though I would still like this situation to be handled automatically
by multipathd, if you can't fix it in RHEL4.5, please make sure to
document the workaround in either the release notes
or the man page.
This bugzilla had previously been approved for engineering
consideration but Red Hat Product Management is currently reevaluating
this issue for inclusion in RHEL4.6.
This is not making 4.6.
Unfortunately this bugzilla was not resolved in time for RHEL 4.7 Beta.
It has now been proposed for inclusion in RHEL 4.8 but must regain Product
The benefit associated with this fix does not outweigh the risk at this stage in the life of RHEL 4. I am moving this to RHEL 5.
Actually, this problem can be seen only on RHEL4.
It is a design problem of multipathd(8) in RHEL4,
so I understand that it isn't going to be fixed in RHEL4.
However, there is a workaround for this problem.
If Red Hat doesn't fix it, I would like Red Hat to
provide some documentation about the workaround for users.
So this bugzilla is for a documentation issue in RHEL4.
Please see Comment#5 and Comment#6 for details of
As noted above, this problem is a RHEL 4 only issue. I've cloned this bug to the RHEL4 bug 487443.