Cause: When multipathd detected that a path's wwid had changed, it did not check whether the path belonged to a multipath device before dereferencing the multipath device's alias.
Consequence: multipathd could crash if disable_changed_wwids was set and the wwid changed on a path that did not belong to a multipath device.
Fix: multipathd now checks whether a path with a changed wwid belongs to a multipath device before trying to dereference that device pointer.
Result: multipathd no longer crashes when disable_changed_wwids is set and the wwid changes on a path that does not belong to a multipath device.
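The check described under "Fix" can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the structs are cut-down stand-ins for multipathd's struct path and struct multipath, and fail_path_stub() and handle_changed_wwid() are hypothetical names introduced here (the real code calls dm_fail_path() from uev_update_path()).

```c
#include <stdio.h>
#include <stddef.h>

/* Cut-down stand-ins for multipathd's structures (assumption:
 * the real ones carry many more fields). */
struct multipath {
    const char *alias;
};

struct path {
    const char *dev_t;
    int wwid_changed;
    int tick;
    struct multipath *mpp; /* NULL when the path is in no multipath device */
};

/* Hypothetical stub standing in for dm_fail_path(); it only logs. */
static int fail_path_stub(const char *alias, const char *dev_t)
{
    printf("failing path %s in map %s\n", dev_t, alias);
    return 0;
}

/* Handle a detected wwid change on a path.
 * Returns 1 if the path was failed in its map, 0 otherwise. */
static int handle_changed_wwid(struct path *pp)
{
    if (pp->wwid_changed)
        return 0;               /* change was already handled */

    pp->wwid_changed = 1;
    pp->tick = 1;

    /* The guard: only touch pp->mpp->alias when pp->mpp is set. */
    if (pp->mpp == NULL)
        return 0;

    fail_path_stub(pp->mpp->alias, pp->dev_t);
    return 1;
}
```

In this sketch a path with mpp == NULL is still marked as changed, but no attempt is made to fail it in a map, so the NULL pointer is never dereferenced.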
Description of problem:
Customer is running RHEL 7.9 and had to enable disable_changed_wwids. After enabling it, multipathd started dumping core.
Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-134.el7_9.x86_64
How reproducible:
Easy to reproduce at customer
Steps to Reproduce:
1. Enable disable_changed_wwids
2. multipathd segfaults
Actual results:
multipathd segfaults
Expected results:
multipathd does not segfault. The known fix needs to be applied to prevent this.
Additional info:
Analysis of the core file, showing why multipathd is segfaulting:
Program terminated with signal 11, Segmentation fault.
#0 uev_update_path (vecs=0x55ce6e3d7c80, uev=0x7ff2bc000b50) at main.c:859
859 dm_fail_path(pp->mpp->alias, pp->dev_t);
(gdb) list
854 condlog(0, "%s: path wwid changed from '%s' to '%s'. disallowing", uev->kernel, wwid, pp->wwid);
855 strcpy(pp->wwid, wwid);
856 if (!pp->wwid_changed) {
857 pp->wwid_changed = 1;
858 pp->tick = 1;
859 dm_fail_path(pp->mpp->alias, pp->dev_t);
860 }
861 }
862 else {
863 pp->wwid_changed = 0;
(gdb) p pp
$1 = (struct path *) 0x55ce6e402900
(gdb) p pp->mpp->alias
Cannot access memory at address 0x1c0
(gdb) p pp->mpp
$2 = (struct multipath *) 0x0
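The address 0x1c0 in the gdb output follows from pp->mpp being NULL: reading pp->mpp->alias means reading memory at 0 plus the offset of the alias member. The sketch below illustrates that arithmetic with a hypothetical struct whose padding is chosen so that alias lands at offset 0x1c0; the real layout of struct multipath is not reproduced here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout: 0x1c0 bytes of other members, then alias.
 * The padding size is an assumption chosen to match the address
 * gdb reported, not the real struct multipath definition. */
struct multipath_sketch {
    char other_members[0x1c0];
    char *alias;
};

/* Address that would be read for mpp->alias, given the value of mpp. */
static size_t alias_read_address(size_t mpp_value)
{
    return mpp_value + offsetof(struct multipath_sketch, alias);
}

/* Print the address for the NULL case seen in the core file. */
static void show_null_case(void)
{
    printf("read at 0x%zx\n", alias_read_address(0));
}
```

With mpp equal to 0, the read lands at the member offset itself, which is why gdb reports a small address instead of 0x0.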
This seems to be a known bug with disable_changed_wwids:
https://listman.redhat.com/archives/dm-devel/2017-March/msg00189.html
It's not patched.
I will log a BZ to get it patched, but the crash is due to disable_changed_wwids being enabled.
Best Regards
Red Hat
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (device-mapper-multipath bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:3333