
Bug 1069584

Summary: multipathd core dumped if the fast_io_fail_tmo is set but no value is provided.
Product: Red Hat Enterprise Linux 7
Reporter: Xiaowei Li <xiaoli>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA
QA Contact: yanfu,wang <yanwang>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: agk, bdonahue, bmarzins, heinzm, jcastillo, msnitzer, prajnoha, qcai, zkabelac
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-68.el7
Doc Type: Bug Fix
Doc Text:
Cause: Multipath did not check whether the pointer for the fast_io_fail_tmo value was NULL before using it as a string.
Consequence: Multipath crashed while loading its configuration if the fast_io_fail_tmo option was added to /etc/multipath.conf with no value.
Fix: Multipath now checks these pointers before trying to use them.
Result: Multipath no longer crashes if users add the fast_io_fail_tmo option to /etc/multipath.conf and forget to add a value.
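
The fix described in the Doc Text amounts to a guard placed before any string call on the value. A minimal sketch in C, assuming a hypothetical helper named value_length (it is not a function from device-mapper-multipath, and the actual patch may be structured differently):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from device-mapper-multipath:
 * returns the length of a configuration value, or -1 when the
 * keyword was given without a value (NULL pointer).
 */
static long value_length(const char *value)
{
    if (value == NULL)      /* strlen(NULL) is undefined behaviour */
        return -1;
    return (long)strlen(value);
}
```

A caller can treat -1 as a configuration parse error instead of passing the pointer on to strlen().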
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-05 08:25:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description          Flags
propsed patch v1     none
propsed patch v2     none

Description Xiaowei Li 2014-02-25 10:51:30 UTC
Description of problem:


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-64.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. In /etc/multipath.conf, set fast_io_fail_tmo but omit its value:
defaults {
    user_friendly_names     yes
    fast_io_fail_tmo 
}
2. Run multipathd reconfig. The command fails with:
error receiving packet

Actual results:
Feb 25 05:48:52 sun-x2270m2-01 kernel: [  689.330054] multipathd[1592]: segfault at 0 ip 00007f14ea615a61 sp 00007f14ebb68ae8 error 4 in libc-2.17.so[7f14ea4b3000+1b6000]
Feb 25 05:48:52 sun-x2270m2-01 kernel: multipathd[1592]: segfault at 0 ip 00007f14ea615a61 sp 00007f14ebb68ae8 error 4 in libc-2.17.so[7f14ea4b3000+1b6000]
Feb 25 05:48:52 sun-x2270m2-01 abrt-hook-ccpp: Saved core dump of pid 1589 (/usr/sbin/multipathd) to /var/tmp/abrt/ccpp-2014-02-25-05:48:52-1589 (3534848 bytes)

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/sbin/multipathd'.
Program terminated with signal 11, Segmentation fault.
#0  __strlen_sse2_pminub () at ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:38
38		movdqu	(%rdi), %xmm1
(gdb) bt
#0  __strlen_sse2_pminub () at ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:38
#1  0x00007f0476634a32 in hw_fast_io_fail_handler (strvec=<optimized out>) at dict.c:964
#2  0x00007f047662673c in process_stream (keywords=0x7f0468028fb0) at parser.c:515
#3  0x00007f0476626756 in process_stream (keywords=0x7f0468038550) at parser.c:519
#4  0x00007f0476626756 in process_stream (keywords=0x7f0468038530) at parser.c:519
#5  0x00007f04766267ec in init_data (conf_file=conf_file@entry=0x7f0477543b2b "/etc/multipath.conf", 
    init_keywords=0x7f0476637540 <init_keywords>) at parser.c:571
#6  0x00007f047662c3b6 in load_config (file=file@entry=0x7f0477543b2b "/etc/multipath.conf", 
    udev=<optimized out>) at config.c:581
#7  0x00007f047753e006 in reconfigure (vecs=0x7f047796f000) at main.c:1414
#8  0x00007f0477541157 in parse_cmd (cmd=cmd@entry=0x7f04680827d0 "reconfig \n", 
    reply=reply@entry=0x7f04774ecd58, len=len@entry=0x7f04774ecd40, data=data@entry=0x7f047796f000)
    at cli.c:399
#9  0x00007f047753c7f5 in uxsock_trigger (str=0x7f04680827d0 "reconfig \n", 
    reply=reply@entry=0x7f04774ecd58, len=len@entry=0x7f04774ecd40, 
    trigger_data=trigger_data@entry=0x7f047796f000) at main.c:744
#10 0x00007f0477540052 in uxsock_listen (
    uxsock_trigger=uxsock_trigger@entry=0x7f047753c7b0 <uxsock_trigger>, 
    trigger_data=trigger_data@entry=0x7f047796f000) at uxlsnr.c:168
#11 0x00007f047753d19f in uxlsnrloop (ap=0x7f047796f000) at main.c:905
#12 0x00007f04770ffdf3 in start_thread (arg=0x7f04774ed700) at pthread_create.c:308
#13 0x00007f0475f2d39d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) l
33		mov	%edi, %ecx
34		and	$0x3f, %ecx
35		pxor	%xmm0, %xmm0
36		cmp	$0x30, %ecx
37		ja	L(next)
38		movdqu	(%rdi), %xmm1
39		pcmpeqb	%xmm1, %xmm0
40		pmovmskb %xmm0, %edx
41		test	%edx, %edx
42		jnz	L(exit_less16)
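
Frame #0 is strlen() faulting at address 0 ("segfault at 0" in the kernel log), which is consistent with a NULL string pointer reaching the handler in frame #1. The sketch below shows how a keyword-only line could produce such a pointer. It assumes a simplified tokenizer; value_token is a made-up name and is not the parser in parser.c:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical tokenizer (illustration only): splits a configuration
 * line in place on spaces, tabs and newlines and returns the second
 * token (the value), or NULL when the line holds only a keyword.
 * Note that strtok() modifies its input and is not reentrant.
 */
static char *value_token(char *line)
{
    const char *delims = " \t\n";

    if (strtok(line, delims) == NULL)   /* empty line: no keyword either */
        return NULL;
    return strtok(NULL, delims);        /* NULL if nothing follows the keyword */
}
```

With the line from step 1, which contains only the keyword, this helper returns NULL, and any handler that passes the result straight to strlen() would fault as in the trace.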

Expected results:


Additional info:

Comment 3 Jose Castillo 2014-02-25 12:12:50 UTC
Created attachment 867376
propsed patch v1

Comment 5 Jose Castillo 2014-02-25 15:51:53 UTC
Created attachment 867494
propsed patch v2

Comment 6 Jose Castillo 2014-02-25 18:18:14 UTC
Xiaowei, I've created a RHEL 7 x86_64 build with the latest version of the patch, in case you want to try it:

https://brewweb.devel.redhat.com/taskinfo?taskID=7105361

When I ran some tests, I got:

[root@client abrt]# multipathd reconfigure
fail
[root@client abrt]# systemctl status multipathd.service
multipathd.service - Device-Mapper Multipath Device Controller
   Loaded: loaded (/usr/lib/systemd/system/multipathd.service; enabled)
   Active: active (running) since Tue 2014-02-25 19:11:26 CET; 30s ago
  Process: 4104 ExecStart=/sbin/multipathd (code=exited, status=0/SUCCESS)
  Process: 4103 ExecStartPre=/sbin/modprobe dm-multipath (code=exited, status=0/SUCCESS)
 Main PID: 4108 (multipathd)
   CGroup: /system.slice/multipathd.service
           └─4108 /sbin/multipathd

Feb 25 19:11:25 client.example.org modprobe[4103]: Executing: /sbin/modprobe dm-multipath
Feb 25 19:11:26 client.example.org multipathd[4104]: Executing: /sbin/multipathd
Feb 25 19:11:44 client.example.org multipathd[4108]: path checkers start up
Feb 25 19:11:44 client.example.org multipathd[4108]: reconfigure (operator)
Feb 25 19:11:44 client.example.org multipathd[4108]: error parsing config file

Comment 7 Xiaowei Li 2014-02-26 02:16:33 UTC
(In reply to Jose Castillo from comment #2)
> I think the attached patch solves the problem for both fast_io_fail_tmo and
> dev_loss_tmo. Xiaowei, did you have any problems with dev_loss_tmo as well?
> 
> The patch is based on the same approach used for the options "rr_min_io_rq"
> and "flush_on_last_del", but let's see if Ben thinks it is correct.

When testing dev_loss_tmo, I got the following results. Please let me know whether this is the same issue and whether you want me to file a separate bug for dev_loss_tmo.

1. No value provided for dev_loss_tmo -- pass
For example:
defaults {
    user_friendly_names     yes
    dev_loss_tmo  
}

# multipathd reconfig
fail

# service multipathd status
>>> snip >>>
: error parsing config file
>>> snip >>>

2. Invalid string value provided for dev_loss_tmo -- fail, core dump
For example:
defaults {
    user_friendly_names     yes
    dev_loss_tmo  aaa
}

Core was generated by `/sbin/multipathd'.
Program terminated with signal 11, Segmentation fault.
#0  __GI___libc_free (mem=0x63687461706d) at malloc.c:2903
2903	  if (chunk_is_mmapped(p))                       /* release mmapped memory. */
(gdb) bt
#0  __GI___libc_free (mem=0x63687461706d) at malloc.c:2903
#1  0x00007fca831f4a9a in xfree (string=<optimized out>) at ../xfree.c:49
#2  0x00007fca82b7d09e in free_config (conf=conf@entry=0x7fca7400f920) at config.c:473
#3  0x00007fca83a8f033 in reconfigure (vecs=0x7fca840acf00) at main.c:1418
#4  0x00007fca83a92157 in parse_cmd (cmd=cmd@entry=0x7fca7400cee0 "reconfig \n", 
    reply=reply@entry=0x7fca83a3dd58, len=len@entry=0x7fca83a3dd40, data=data@entry=0x7fca840acf00)
    at cli.c:399
#5  0x00007fca83a8d7f5 in uxsock_trigger (str=0x7fca7400cee0 "reconfig \n", 
    reply=reply@entry=0x7fca83a3dd58, len=len@entry=0x7fca83a3dd40, 
    trigger_data=trigger_data@entry=0x7fca840acf00) at main.c:744
#6  0x00007fca83a91052 in uxsock_listen (
    uxsock_trigger=uxsock_trigger@entry=0x7fca83a8d7b0 <uxsock_trigger>, 
    trigger_data=trigger_data@entry=0x7fca840acf00) at uxlsnr.c:168
#7  0x00007fca83a8e19f in uxlsnrloop (ap=0x7fca840acf00) at main.c:905
#8  0x00007fca83650df3 in start_thread (arg=0x7fca83a3e700) at pthread_create.c:308
#9  0x00007fca8247e39d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) l
2898	  if (mem == 0)                              /* free(0) has no effect */
2899	    return;
2900	
2901	  p = mem2chunk(mem);
2902	
2903	  if (chunk_is_mmapped(p))                       /* release mmapped memory. */
2904	  {
2905	    /* see if the dynamic brk/mmap threshold needs adjusting */
2906	    if (!mp_.no_dyn_threshold
2907		&& p->size > mp_.mmap_threshold
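
The address passed to free() in frame #0, 0x63687461706d, does not look like a heap address. Read byte by byte in x86_64 (little-endian) memory order it is printable ASCII, which may suggest that string data ended up where a pointer was expected. This is an observation about the trace, not a confirmed diagnosis. A small C sketch for decoding such a value (the helper name is made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copies the eight bytes of a 64-bit value into out[], lowest byte
 * first (the order they occupy in memory on little-endian x86_64),
 * and NUL-terminates the result. out must hold at least 9 bytes.
 */
static void pointer_bytes_as_text(uint64_t value, char out[9])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (char)((value >> (8 * i)) & 0xff);
    out[8] = '\0';
}
```

For 0x63687461706d the bytes are 6d 70 61 74 68 63, which spell "mpathc".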

Comment 8 Xiaowei Li 2014-02-26 02:29:53 UTC
I also hit a similar core dump when passing an invalid string value to fast_io_fail_tmo.
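
One way to reject a value such as "aaa" before it is used is strict numeric parsing. The sketch below is an assumption about how such validation could look, not the multipath implementation; parse_timeout_seconds is a hypothetical name, and the real options may also accept non-numeric keywords, which this sketch does not handle:

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical validator (illustration only): parses a timeout given
 * as a non-negative decimal number of seconds.
 * Returns 0 and stores the result in *seconds on success; returns -1
 * for NULL, empty, non-numeric, negative or out-of-range input and
 * leaves *seconds untouched.
 */
static int parse_timeout_seconds(const char *text, long *seconds)
{
    char *end = NULL;
    long parsed;

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    parsed = strtol(text, &end, 10);
    /* end == text: no digits consumed; *end != '\0': trailing junk */
    if (errno == ERANGE || end == text || *end != '\0' || parsed < 0)
        return -1;

    *seconds = parsed;
    return 0;
}
```

Because the output is written only on success, a failed parse cannot leave a half-initialized setting behind for later cleanup code to trip over.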

Comment 9 Xiaowei Li 2014-02-26 02:58:27 UTC
(In reply to Jose Castillo from comment #6)

I tested the rpm. The original issue is fixed, but the following issues remain:

1. The patch does not fix the issue when fast_io_fail_tmo is set in a device section rather than in the defaults section:

devices {
        device {
                vendor "DGC"
                product ".*"
                product_blacklist "LUNZ"
                fast_io_fail_tmo
        }

}

2. The patch does not fix the issue in comment 7.

Also, let me know if you want me to file a new bug to track these issues, since your patch fixes the original one.

Comment 10 Jose Castillo 2014-02-26 18:15:02 UTC
(In reply to Xiaowei Li from comment #9)
> I tested the rpm. The original issue is fixed, but the following issues
> remain:
> 
> 1. The patch does not fix the issue when fast_io_fail_tmo is set in a
> device section rather than in the defaults section:
> 
> devices {
>         device {
>                 vendor "DGC"
>                 product ".*"
>                 product_blacklist "LUNZ"
>                 fast_io_fail_tmo
>         }
> 
> }
> 

This may be a new, different problem, but see below first.

> 2. The patch does not fix the issue in comment 7.
> 
Were you using the patched version or the original one in this test? I ask because I suspect that the patch itself causes the core dump in comment #7.

> Also, let me know if you want me to file a new bug to track these issues,
> since your patch fixes the original one.

File a new Bugzilla bug only if the test in comment #7 was done with the original multipath package, not the patched one. Otherwise, I'll fix the patch to avoid this problem.

Comment 11 Xiaowei Li 2014-03-19 10:11:21 UTC
Since it's not a blocker and RHEL 7.0 is going to RC, we should move it to 7.1.

Comment 13 Ben Marzinski 2014-08-20 15:18:15 UTC
Patch applied. Thanks!

Comment 19 errata-xmlrpc 2015-03-05 08:25:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0367.html