Bug 1635819

Summary: mpathpersist crashes when unable to complete persistent registrations quickly
Product: Red Hat Enterprise Linux 7
Reporter: Ben Marzinski <bmarzins>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA
QA Contact: Lin Li <lilin>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.6
CC: agk, bmarzins, heinzm, lilin, loberman, msnitzer, prajnoha, rhandlin, sbradley
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-124.el7
Doc Type: Bug Fix
Doc Text:
Cause: When a reservation conflict occurred, mpathpersist was not correctly iterating over all of the registration threads, and it was not correctly rolling back the registration of multiple devices.
Consequence: mpathpersist could crash on reservation conflicts.
Fix: mpathpersist now correctly iterates over all of the registration threads, and correctly issues the rollback request.
Result: mpathpersist no longer crashes when registrations fail with reservation conflicts.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-08-06 12:56:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1577173

Description Ben Marzinski 2018-10-03 18:14:34 UTC
Description of problem:
mpathpersist can fail with a crash that often looks like this:

Program terminated with signal 11, Segmentation fault.
#0  udev_list_entry_delete (entry=0xbad7aad20, entry@entry=0x558bad7aad20)
    at src/libudev/libudev-list.c:243
243             free(entry->value);

However, the crash can look very different on other systems. The common
factor is that variables on the process stack are corrupted. This happens
because mpathpersist creates threads that access variables on the creating
thread's stack. If the creating function fails to wait for the created
threads, it returns while the threads still hold pointers into its stack
frame. The threads can then overwrite stack variables used by functions
that are called later.
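
For illustration only, here is a minimal sketch of the safe pattern: every thread that was started is joined before the creating function returns, even if some of them report a failure. This is not the mpathpersist code. The names register_all_paths, register_path, struct reg_param, and MAX_PATHS are hypothetical, and the worker only pretends that odd-numbered paths hit a reservation conflict.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_PATHS 32 /* hypothetical upper bound for this sketch */

/* Hypothetical per-path work item. It lives on the creator's stack. */
struct reg_param {
    int path_id; /* which path this thread handles */
    int status;  /* written by the worker: 0 = ok, nonzero = failure */
};

/* Worker: stands in for sending one registration down one path. */
static void *register_path(void *arg)
{
    struct reg_param *p = arg;

    /* Assumption for the demo: odd-numbered paths report a conflict. */
    p->status = (p->path_id % 2) ? -1 : 0;
    return NULL;
}

/*
 * Start one thread per path, then join every started thread before
 * returning. Returns the number of paths that failed.
 */
int register_all_paths(int npaths)
{
    struct reg_param params[MAX_PATHS]; /* stack storage shared with workers */
    pthread_t threads[MAX_PATHS];
    int started[MAX_PATHS] = {0};
    int failures = 0;

    assert(npaths >= 0 && npaths <= MAX_PATHS);

    for (int i = 0; i < npaths; i++) {
        params[i].path_id = i;
        params[i].status = -1; /* treated as failed until the worker says otherwise */
        if (pthread_create(&threads[i], NULL, register_path, &params[i]) == 0)
            started[i] = 1;
    }

    /*
     * Walk the whole array. Breaking out of this loop on the first
     * failure would leave later threads running with pointers into
     * params[], which stops being valid once this function returns.
     */
    for (int i = 0; i < npaths; i++) {
        if (started[i])
            pthread_join(threads[i], NULL);
        if (!started[i] || params[i].status != 0)
            failures++;
    }
    return failures;
}
```

Build with -pthread. Calling register_all_paths(4) in this sketch joins all four workers and reports two failures (paths 1 and 3). The point is only that params[] is never read or released until every worker holding a pointer into it has finished.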

Version-Release number of selected component (if applicable):


How reproducible:
Reliably with the right setup.

Steps to Reproduce:
Run
# /usr/sbin/mpathpersist -o --register --param-sark=<key> -d <dev>
while access to the array is slowed for some reason.

Actual results:
mpathpersist crashes

Expected results:
mpathpersist completes without crashing

Additional info:

Comment 4 Ben Marzinski 2019-02-11 21:11:02 UTC
Fixed crash during mpathpersist when there is a reservation conflict.

Comment 6 Ben Marzinski 2019-02-11 23:09:44 UTC
To reproduce this, you need a multipath device with at least 3 active paths; the more paths, the easier it is to hit. Then use sg_persist to add a registration to the last path in the multipath device table. Finally, use mpathpersist to register a different key.

Comment 13 errata-xmlrpc 2019-08-06 12:56:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2138