Bug 1814858 - Multipath inconsistent with partition table
Summary: Multipath inconsistent with partition table
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: device-mapper-multipath
Version: 8.2
Hardware: ppc64le
OS: Linux
Target Milestone: rc
Target Release: 8.0
Assignee: Ben Marzinski
QA Contact: Lin Li
Depends On:
Reported: 2020-03-18 20:13 UTC by Chris Mackowski
Modified: 2021-09-06 15:24 UTC (History)

Fixed In Version: device-mapper-multipath-0.8.4-3.el8
Doc Type: Bug Fix
Doc Text:
Cause: In some cases, multipathd wasn't reloading devices with no valid paths, and kpartx wasn't updating the partitions when a device gained its first valid path.
Consequence: If multipathd couldn't verify that any paths were valid, reloading a device didn't cause kpartx to update its partitions.
Fix: multipathd will now always reload a device with no active paths, and kpartx will run when the first path becomes valid.
Result: Reloading a multipath device will always update the kpartx partitions.
Clone Of:
Last Closed: 2020-11-04 01:59:31 UTC
Type: Bug
Target Upstream Version:

Attachments
strace log (8.53 KB, text/plain)
2020-03-18 20:14 UTC, Chris Mackowski

System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2020:4540 | 0 | None | None | None | 2020-11-04 01:59:59 UTC

Description Chris Mackowski 2020-03-18 20:13:11 UTC
As partitions are created on a multipath device, the multipath partition mappings may or may not be updated. In this case, I can see the changes in /proc/partitions for sda/sdb (the paths of mpatha), but I cannot see any partitions for mpatha. The only consistent way to refresh multipath without a reboot is to run interactive fdisk and print the partition table; partprobe, partx -u, multipath -F / multipath -r, and dmsetup reload do not work. I have only seen this issue on the ppc64le architecture.

	[root@p9-node9 ~]# multipath -ll
	mpatha (3600601605e203f00540b79373856e911) dm-2 DGC,VRAID
	size=105G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
	|-+- policy='service-time 0' prio=50 status=active
	| `- 1:0:0:0 sda 8:0  active ready running
	`-+- policy='service-time 0' prio=10 status=enabled
	  `- 1:0:1:0 sdb 8:16 active ready running

	[root@p9-node9 ~]# cat /proc/partitions
	major minor  #blocks  name

	   8       32   12582912 sdc
	   8       33       4096 sdc1
	   8       34    1048576 sdc2
	   8       35   11529216 sdc3
	   8        0  110100480 sda
	   8        1   55050220 sda1
	   8        2   55050220 sda2
	   8       16  110100480 sdb
	   8       17   55050220 sdb1
	   8       18   55050220 sdb2
	  11        0     613800 sr0
	 253        0   10264576 dm-0
	 253        1    1261568 dm-1
	 253        2  110100480 dm-2

dmsetup does not report the partitions:
	[root@p9-node9 ~]# dmsetup ls
	mpatha	(253:2)
	rhel-swap	(253:1)
	rhel-root	(253:0)

Using dmsetup to reload multipath:

	[root@p9-node8 ~]# time dmsetup -v reload /dev/mapper/mpatha

	real	11m6.701s
	user	0m0.004s
	sys	0m0.000s

Using partx:
	[root@p9-node9 ~]# ls /dev/mapper/mpatha*
	[root@p9-node9 ~]# partx -u /dev/mapper/mpatha
	[root@p9-node9 ~]# ls /dev/mapper/mpatha*

Reloading multipath:
	[root@p9-node9 ~]# ls /dev/mapper/mpatha*
	[root@p9-node9 ~]# multipath -F;  multipath -r
	[root@p9-node9 ~]# ls /dev/mapper/mpatha*
	[root@p9-node9 ~]# multipath -F;  multipath -r
	[root@p9-node9 ~]# ls /dev/mapper/mpatha*
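
The mismatch shown above (partitions on sda/sdb but none for mpatha) can also be checked mechanically. The following is only a rough sketch, not part of the original report: the function names count_path_parts and count_dm_parts are made up for illustration, and it assumes that kpartx mappings are named like mpatha1 or mpathap1, which may not hold for every naming scheme.

```shell
# Sketch only: helper names are hypothetical, not part of any multipath tool.

# Count kernel partitions of a path device (e.g. sda) in
# /proc/partitions-style input read from stdin.
count_path_parts() {
    awk -v dev="$1" '$4 ~ ("^" dev "[0-9]+$") { n++ } END { print n + 0 }'
}

# Count kpartx mappings of a multipath device (e.g. mpatha) in
# `dmsetup ls`-style input read from stdin.
# Assumption: mappings are named <dev><N> or <dev>p<N>.
count_dm_parts() {
    awk -v dev="$1" '$1 ~ ("^" dev "p?[0-9]+$") { n++ } END { print n + 0 }'
}

# Example use on a live system (needs root for dmsetup):
#   count_path_parts sda < /proc/partitions
#   dmsetup ls | count_dm_parts mpatha
```

If the two counts differ, the multipath device's partition mappings are out of date relative to its paths.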

Comment 1 Chris Mackowski 2020-03-18 20:14:59 UTC
Created attachment 1671208 [details]
strace log

Comment 6 Ben Marzinski 2020-03-19 19:16:12 UTC
Sorry for not looking at this sooner. If you could re-set up a reproducer, that would be helpful. I can't see this on the RHEL8 machines I've tried on.

Comment 9 Ben Marzinski 2020-03-20 04:46:58 UTC
What's happening is that in RHEL8, the multipath command doesn't always perform the reload itself. If multipathd is running, multipath delegates the reload to multipathd, to guarantee that the daemon state is updated. The issue is that on your machines, the path checkers take too long to return while multipathd is reconfiguring the device. Since multipathd is reconfiguring, it doesn't trust the old device state. It waits 100 ms for the path checkers to return something, but that doesn't happen. In this case, multipathd doesn't reload the device, because it doesn't see any known usable paths. This is the way multipathd has always done things, but device-mapper has allowed reloads with failed paths (or even no paths) for years, so there's no point in this check anymore. Also, if you look at your logs, multipathd briefly drops into recovery mode until the checkers finally return. That log message is misleading: the paths are always active as far as the kernel is concerned.

However, this is only an issue here because you are relying on reloading the multipath device to trigger a kpartx update as a side effect. If you run

kpartx -u /dev/mapper/<device>

which updates the kpartx partitions, they will get updated correctly. Multipath has never automatically updated its kpartx partition devices when the paths have their partitions changed directly, because it doesn't get any notification of that. Instead, it has always updated the devices when there was a (non-path state change) uevent for the multipath device itself, which happens when you reload the multipath device.

I'm going to create patches both to allow reloads when all the paths are down (or at least when some of the paths are pending, which is a state that only really happens when a device first appears or multipathd is reconfigured, so it has no prior state to go by) and to make multipathd count pending paths as up when checking whether it needs to start recovery mode. Until then, you can just run "kpartx -u" to update the kpartx devices. (This is, in fact, what you should run if you just want to update the kpartx devices.) It is the same command that gets run when a multipath device is reloaded, only then it is run by udev.
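
As a minimal sketch of that workaround, the kpartx call could be wrapped in a small shell function. The function name refresh_kpartx and the KPARTX override variable are assumptions added here for illustration (the override only exists so the call can be dry-run without root); the only real command involved is "kpartx -u" as quoted above.

```shell
# Sketch only: refresh_kpartx and KPARTX are hypothetical names.
# Refresh the kpartx partition mappings of one multipath device,
# given its map name (e.g. mpatha).
refresh_kpartx() {
    if [ -z "${1:-}" ]; then
        echo "usage: refresh_kpartx <map-name>" >&2
        return 2
    fi
    # KPARTX may be set to another command (e.g. echo) for a dry run;
    # by default the real kpartx binary is used.
    "${KPARTX:-kpartx}" -u "/dev/mapper/$1"
}

# Example use on a live system (needs root):
#   refresh_kpartx mpatha && ls /dev/mapper/mpatha*
```

Running this after changing the partition table on the paths should make the mpatha partition devices appear without reloading the multipath device itself.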

Comment 13 Ben Marzinski 2020-05-28 18:42:02 UTC
multipath and multipathd should now reload devices correctly, even when there are no active paths. Also, kpartx now runs on a device when its first path appears, even if it otherwise wouldn't.

Comment 27 Ben Marzinski 2020-07-07 15:32:50 UTC
kpartx now uses directio to read the devices, to avoid the issue of a stale cache (due to the partitions being updated on another machine).

Comment 32 errata-xmlrpc 2020-11-04 01:59:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (device-mapper-multipath bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

