Bug 1459370 - Segfault when reading all_devs device with no_path_retry 4
Summary: Segfault when reading all_devs device with no_path_retry 4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: device-mapper-multipath
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Ben Marzinski
QA Contact: Lin Li
Docs Contact: Marek Suchánek
URL:
Whiteboard:
Duplicates: 1462134
Depends On:
Blocks: 1298243 1420851 1469559 1510837
Reported: 2017-06-07 00:37 UTC by Nir Soffer
Modified: 2021-09-03 12:06 UTC
CC List: 14 users

Fixed In Version: device-mapper-multipath-0.4.9-112.el7
Doc Type: Bug Fix
Doc Text:
DM Multipath no longer crashes when adding a feature to an empty string

Previously, the DM Multipath service terminated unexpectedly when it attempted to add a feature to the features string of a built-in device configuration that had no features string. With this update, DM Multipath first checks if the features string exists, and creates one if necessary. As a result, DM Multipath no longer crashes when trying to modify a nonexistent features string.
Clone Of:
Clones: 1510837
Environment:
Last Closed: 2018-04-10 16:10:28 UTC
Target Upstream Version:
Embargoed:


Attachments
Output of "multipathd show config" with the scratch build (20.52 KB, text/plain)
2017-06-09 15:54 UTC, Nir Soffer


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3082081 | 0 | None | None | None | 2017-06-16 16:55:48 UTC
Red Hat Product Errata | RHEA-2018:0884 | 0 | normal | SHIPPED_LIVE | device-mapper-multipath bug fix and enhancement update | 2018-04-10 13:47:14 UTC

Description Nir Soffer 2017-06-07 00:37:30 UTC
Description of problem:

Using this multipath.conf:

# cat /etc/multipath.conf
# VDSM REVISION 1.4

defaults {
    polling_interval            5
    no_path_retry               4
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

devices {
    device {
        all_devs                yes
        no_path_retry           4
    }
}


# multipath -ll
Segmentation fault (core dumped)


# gdb multipath
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/multipath...Reading symbols from /usr/sbin/multipath...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install device-mapper-multipath-0.4.9-99.el7_3.1.x86_64
(gdb) run -ll
Starting program: /usr/sbin/multipath -ll
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff727b4ab in __strstr_sse42 () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff727b4ab in __strstr_sse42 () from /lib64/libc.so.6
#1  0x00007ffff751ff1c in add_feature () from /lib64/libmultipath.so.0
#2  0x00007ffff751e2e6 in factorize_hwtable () from /lib64/libmultipath.so.0
#3  0x00007ffff751efc8 in load_config () from /lib64/libmultipath.so.0
#4  0x0000000000402009 in main ()

Version-Release number of selected component (if applicable):
# rpm -qa | grep device-mapper
device-mapper-libs-1.02.135-1.el7_3.3.x86_64
device-mapper-persistent-data-0.6.3-1.el7.x86_64
device-mapper-multipath-libs-0.4.9-99.el7_3.1.x86_64
device-mapper-event-libs-1.02.135-1.el7_3.3.x86_64
device-mapper-1.02.135-1.el7_3.3.x86_64
device-mapper-event-1.02.135-1.el7_3.3.x86_64
device-mapper-multipath-0.4.9-99.el7_3.1.x86_64

How reproducible:
Always

When no_path_retry in the all_devs device is set to "fail", multipath works normally.
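
The reproduction above can be automated. The following is a minimal sketch, assuming a POSIX system; the helper name run_and_get_signal is hypothetical and not part of any multipath tooling. It runs a command and reports whether the child was killed by a signal, which is how the "Segmentation fault" result shows up to a parent process.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with the given arguments and wait for it.
 * Returns the number of the signal that terminated the child,
 * 0 if the child exited on its own (with any exit status),
 * or -1 if fork() or waitpid() failed.
 */
int run_and_get_signal(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the process image; only returns on failure. */
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: wait for the child and inspect how it ended. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Called with the argument vector { "multipath", "-ll", NULL } on an affected build and with the configuration above, this would be expected to return SIGSEGV; on a fixed build it should return 0.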

Comment 2 Ben Marzinski 2017-06-07 20:54:50 UTC
Can you try the rpms at

http://download-node-02.eng.bos.redhat.com/brewroot/scratch/bmarzins/task_13374131/

and see if they fix the issue?  There is a bug in add_feature when trying to add a new feature to a device configuration that doesn't already have a features string.

Comment 4 Nir Soffer 2017-06-09 15:54:08 UTC
Created attachment 1286465
Output of "multipathd show config" with the scratch build

Comment 5 Ben Marzinski 2017-06-09 22:53:15 UTC
Great. We're in the blockers only phase of rhel-7.4, so how urgent is this bugfix for you?

Comment 6 Yaniv Lavi 2017-06-11 09:58:52 UTC
(In reply to Ben Marzinski from comment #5)
> Great. We're in the blockers only phase of rhel-7.4, so how urgent is this
> bugfix for you?

This can wait until 7.4.z.

Comment 7 Nir Soffer 2017-06-11 11:11:42 UTC
(In reply to Yaniv Lavi from comment #6)
> (In reply to Ben Marzinski from comment #5)
> > Great. We're in the blockers only phase of rhel-7.4, so how urgent is this
> > bugfix for you?
> 
> This can wait until 7.4.z.

This configuration was tested by RHV QE on 2016-07-28:
https://bugzilla.redhat.com/show_bug.cgi?id=1335176#c31

We have been recommending the "no_path_retry 4" option for about a year on the users
mailing list:
http://lists.ovirt.org/pipermail/users/2016-August/041949.html

So this seems to be a regression in 7.3.

I don't know of any customer cases yet, but I don't think we should wait for them.

I would like this fix in 7.3.z.

Comment 8 Yaniv Lavi 2017-06-12 10:09:46 UTC
(In reply to Nir Soffer from comment #7)
> (In reply to Yaniv Lavi from comment #6)
> > (In reply to Ben Marzinski from comment #5)
> > > Great. We're in the blockers only phase of rhel-7.4, so how urgent is this
> > > bugfix for you?
> > 
> > This can wait until 7.4.z.
> 
> This configuration was tested by RHV QE on 2016-07-28:
> https://bugzilla.redhat.com/show_bug.cgi?id=1335176#c31
> 
> We have been recommending the "no_path_retry 4" option for about a year on
> the users mailing list:
> http://lists.ovirt.org/pipermail/users/2016-August/041949.html
> 
> So this seems to be a regression in 7.3.
> 
> I don't know of any customer cases yet, but I don't think we should wait for
> them.
> 
> I would like this fix in 7.3.z.

Nir is correct. My comment was made under the assumption that this isn't a regression.
Bronce, can you mark this as a blocker?

Comment 9 Ben Marzinski 2017-06-12 21:56:47 UTC
For what it's worth, this isn't a regression from rhel-7.3.  It was broken there too. It was working in rhel-7.2, however.

Comment 10 Ben Marzinski 2017-06-16 16:55:49 UTC
*** Bug 1462134 has been marked as a duplicate of this bug. ***

Comment 14 Ben Marzinski 2017-09-20 00:07:56 UTC
multipath wasn't correctly adding features to a configuration if the current features string was NULL.  It now handles this correctly.
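
To illustrate the failure mode and the kind of guard described here, the following is a minimal sketch, not the actual libmultipath code. The names sketch_add_feature and count_words are hypothetical, and the logic assumes the features string has the form "<count> <word> <word> ..." as seen in multipath -ll output. The key point is that the string is created before strstr() is called on it, since passing NULL to strstr() is undefined behavior and matches the crash in the backtrace.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count whitespace-separated words in s. */
int count_words(const char *s)
{
    int n = 0, in_word = 0;

    for (; *s; s++) {
        if (*s == ' ' || *s == '\t') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            n++;
        }
    }
    return n;
}

/*
 * Hypothetical sketch of a NULL-safe "add feature" operation.
 * *features may be NULL (no features string yet) or a heap-allocated
 * string such as "1 queue_if_no_path". Returns 0 on success, -1 on error.
 */
int sketch_add_feature(char **features, const char *feat)
{
    char *end, *out;
    long count;
    size_t len;

    if (features == NULL || feat == NULL || count_words(feat) == 0)
        return -1;

    /* The guard: create an empty features string instead of using NULL. */
    if (*features == NULL) {
        *features = strdup("0");
        if (*features == NULL)
            return -1;
    }

    /* Naive substring check; safe now because *features is never NULL. */
    if (strstr(*features, feat) != NULL)
        return 0;

    /* Parse the leading word count; end points at the rest of the string. */
    count = strtol(*features, &end, 10);
    if (end == *features || count < 0)
        return -1;
    count += count_words(feat);

    /* Rebuild the string as "<new count><existing words> <new feature>". */
    len = strlen(end) + strlen(feat) + 32;
    out = malloc(len);
    if (out == NULL)
        return -1;
    snprintf(out, len, "%ld%s %s", count, end, feat);

    free(*features);
    *features = out;
    return 0;
}
```

Starting from a NULL features pointer, adding "queue_if_no_path" produces "1 queue_if_no_path" rather than a crash. Removing the NULL guard reintroduces the strstr() call on a NULL pointer.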

Comment 17 Lin Li 2017-11-14 12:49:26 UTC
Reproduced on device-mapper-multipath-0.4.9-111.el7
1, # rpm -qa | grep multipath
device-mapper-multipath-0.4.9-111.el7.x86_64
device-mapper-multipath-libs-0.4.9-111.el7.x86_64

2, edit /etc/multipath.conf
# cat /etc/multipath.conf
defaults {
    polling_interval            5
    no_path_retry               4
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

devices {
    device {
        all_devs                yes
        no_path_retry           4
    }
}

3, # multipath -ll
Segmentation fault

4, # service multipathd reload
Redirecting to /bin/systemctl reload multipathd.service
Job for multipathd.service failed because a fatal signal was delivered to the control process. See "systemctl status multipathd.service" and "journalctl -xe" for details.

5, # systemctl status multipathd.service
● multipathd.service - Device-Mapper Multipath Device Controller
   Loaded: loaded (/usr/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
   Active: failed (Result: signal) since Tue 2017-11-14 07:09:22 EST; 42s ago
  Process: 13257 ExecReload=/sbin/multipathd reconfigure (code=exited, status=0/SUCCESS)
 Main PID: 1189 (code=killed, signal=SEGV)

Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c56786f: stop event chec...84)
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c567871: stop event chec...16)
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c567873: stop event chec...48)
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c567875: stop event chec...80)
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[13257]: error receiving packet
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: multipathd.service: main process exited, code=killed, s...SEGV
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: PID 1189 read from file /run/multipathd/multipathd.pid ...bie.
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: Reload failed for Device-Mapper Multipath Device Controller.
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: Unit multipathd.service entered failed state.
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: multipathd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.


6, # journalctl -xe
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com polkitd[1521]: Registered Authentication Agent for unix-process:13241:655805
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: reconfigure (operator)
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c56786d: stop event checker thre
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c56786f: stop event checker thre
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c567871: stop event checker thre
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c567873: stop event checker thre
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[1189]: 360a98000324669436c2b45666c567875: stop event checker thre
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com kernel: multipathd[1222]: segfault at 0 ip 00007f6091eb2f5b sp 00007f6093431
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com multipathd[13257]: error receiving packet
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: multipathd.service: main process exited, code=killed, status=11/
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: PID 1189 read from file /run/multipathd/multipathd.pid does not 
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: Reload failed for Device-Mapper Multipath Device Controller.
-- Subject: Unit multipathd.service has finished reloading its configuration
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit multipathd.service has finished reloading its configuration
-- 
-- The result is failed.
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: Unit multipathd.service entered failed state.
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com systemd[1]: multipathd.service failed.
Nov 14 07:09:22 storageqe-06.rhts.eng.bos.redhat.com polkitd[1521]: Unregistered Authentication Agent for unix-process:13241:6558
Nov 14 07:09:36 storageqe-06.rhts.eng.bos.redhat.com kernel: multipath[13271]: segfault at 0 ip 00007fb9904d1f5b sp 00007ffde827d






Verified on device-mapper-multipath-0.4.9-116
1, # rpm -qa | grep multipath
device-mapper-multipath-libs-0.4.9-116.el7.x86_64
device-mapper-multipath-devel-0.4.9-116.el7.x86_64
device-mapper-multipath-debuginfo-0.4.9-116.el7.x86_64
device-mapper-multipath-0.4.9-116.el7.x86_64
device-mapper-multipath-sysvinit-0.4.9-116.el7.x86_64

2, edit /etc/multipath.conf
# cat /etc/multipath.conf
defaults {
    polling_interval            5
    no_path_retry               4
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

devices {
    device {
        all_devs                yes
        no_path_retry           4
    }
}

3, # multipath -ll
360a98000324669436c2b45666c56786d dm-0 NETAPP  ,LUN             
size=20G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sdg 8:96  active ready running
| `- 4:0:0:0 sdl 8:176 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:0 sdb 8:16  active ready running
  `- 4:0:1:0 sdq 65:0  active ready running
360a98000324669436c2b45666c567875 dm-8 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:4 sdk 8:160 active ready running
| `- 4:0:0:4 sdp 8:240 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:4 sdf 8:80  active ready running
  `- 4:0:1:4 sdu 65:64 active ready running
360a98000324669436c2b45666c567873 dm-7 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:3 sdj 8:144 active ready running
| `- 4:0:0:3 sdo 8:224 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:3 sde 8:64  active ready running
  `- 4:0:1:3 sdt 65:48 active ready running
360a98000324669436c2b45666c567871 dm-6 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:2 sdi 8:128 active ready running
| `- 4:0:0:2 sdn 8:208 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:2 sdd 8:48  active ready running
  `- 4:0:1:2 sds 65:32 active ready running
360a98000324669436c2b45666c56786f dm-5 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:1 sdh 8:112 active ready running
| `- 4:0:0:1 sdm 8:192 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:1 sdc 8:32  active ready running
  `- 4:0:1:1 sdr 65:16 active ready running


4, # service multipathd reload
Reloading multipathd configuration (via systemctl):  [  OK  ]

5, # multipath -ll
360a98000324669436c2b45666c56786d dm-0 NETAPP  ,LUN             
size=20G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sdg 8:96  active ready running
| `- 4:0:0:0 sdl 8:176 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:0 sdb 8:16  active ready running
  `- 4:0:1:0 sdq 65:0  active ready running
360a98000324669436c2b45666c567875 dm-8 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:4 sdk 8:160 active ready running
| `- 4:0:0:4 sdp 8:240 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:4 sdf 8:80  active ready running
  `- 4:0:1:4 sdu 65:64 active ready running
360a98000324669436c2b45666c567873 dm-7 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:3 sdj 8:144 active ready running
| `- 4:0:0:3 sdo 8:224 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:3 sde 8:64  active ready running
  `- 4:0:1:3 sdt 65:48 active ready running
360a98000324669436c2b45666c567871 dm-6 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:2 sdi 8:128 active ready running
| `- 4:0:0:2 sdn 8:208 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:2 sdd 8:48  active ready running
  `- 4:0:1:2 sds 65:32 active ready running
360a98000324669436c2b45666c56786f dm-5 NETAPP  ,LUN             
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:1 sdh 8:112 active ready running
| `- 4:0:0:1 sdm 8:192 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:1 sdc 8:32  active ready running
  `- 4:0:1:1 sdr 65:16 active ready running



Test result:
multipath no longer crashes when trying to modify the features string of built-in device configurations with no feature string.
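
As a sanity check on the verified output, the features strings shown above (for example '4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle') can be checked for internal consistency. The following is a minimal sketch, assuming the leading number is the count of the words that follow it; the function name features_count_ok is hypothetical.

```c
#include <stdlib.h>

/*
 * Hypothetical check: returns 1 if the leading number in a features
 * string equals the number of whitespace-separated words after it,
 * and 0 otherwise (including for a NULL or non-numeric string).
 */
int features_count_ok(const char *features)
{
    char *end;
    long declared, seen = 0;
    int in_word = 0;

    if (features == NULL)
        return 0;

    /* Parse the declared word count at the start of the string. */
    declared = strtol(features, &end, 10);
    if (end == features || declared < 0)
        return 0;

    /* Count the words that actually follow the number. */
    for (; *end; end++) {
        if (*end == ' ' || *end == '\t') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            seen++;
        }
    }
    return declared == seen;
}
```

For the strings in the verified output, the declared count 4 matches the four following words (queue_if_no_path, pg_init_retries, 50, retain_attached_hw_handle), so the check returns 1.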

Comment 23 errata-xmlrpc 2018-04-10 16:10:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0884

