RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1349376 - multipath -ll crashes
Summary: multipath -ll crashes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-multipath
Version: 6.5
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: rc
: ---
Assignee: Ben Marzinski
QA Contact: Lin Li
Milan Navratil
URL:
Whiteboard:
Depends On:
Blocks: 1269194 1324930
TreeView+ depends on / blocked
 
Reported: 2016-06-23 10:51 UTC by nikhil kshirsagar
Modified: 2021-09-09 11:52 UTC (History)
15 users (show)

Fixed In Version: device-mapper-multipath-0.4.9-94.el6
Doc Type: Bug Fix
Doc Text:
multipath no longer crashes due to *libdevmapper* version mismatches Previously, the multipath code was not linking to the correct libraries during a part of compilation. As a consequence, if *device-mapper-multipath* was used with a newer version of the *libdevmapper* library than it was compiled with, multipath sometimes terminated unexpectedly. Now, multipath correctly links to the necessary libraries during compilation. As a result, multipath no longer crashes due to the library version mismatch.
Clone Of:
Environment:
Last Closed: 2017-03-21 10:51:36 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0697 0 normal SHIPPED_LIVE device-mapper-multipath bug fix and enhancement update 2017-03-21 12:39:06 UTC

Description nikhil kshirsagar 2016-06-23 10:51:12 UTC
Description of problem:
 
# multipath -ll
3514f0c5f27e0019c dm-1 XtremIO,XtremApp
size=500G features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  |- 3:0:0:1 sdb 8:16  active ready running
  |- 3:0:1:1 sdc 8:32  active ready running
  |- 3:0:2:1 sdd 8:48  active ready running
  |- 3:0:4:1 sdf 8:80  active ready running
  |- 3:0:3:1 sde 8:64  active ready running
  |- 3:0:5:1 sdg 8:96  active ready running
  |- 3:0:6:1 sdh 8:112 active ready running
  |- 3:0:7:1 sdi 8:128 active ready running
  |- 4:0:0:1 sdj 8:144 active ready running
  |- 4:0:1:1 sdk 8:160 active ready running
  |- 4:0:2:1 sdl 8:176 active ready running
  |- 4:0:3:1 sdm 8:192 active ready running
  |- 4:0:4:1 sdn 8:208 active ready running
  |- 4:0:5:1 sdo 8:224 active ready running
  |- 4:0:6:1 sdp 8:240 active ready running
  `- 4:0:7:1 sdq 65:0  active ready running
*** glibc detected *** multipath: free(): invalid next size (fast): 0x0000000000e12a50 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e91a75e66]
/lib64/libc.so.6[0x3e91a789b3]
/lib64/libmultipath.so(free_multipath+0x5d)[0x32dda16f5d]
/lib64/libmultipath.so(free_multipathvec+0x4d)[0x32dda1701d]
multipath[0x402195]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3e91a1ed5d]
multipath[0x401b99]


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-80.el6.x86_64



Additional info:


]# valgrind --tool=memcheck multipath -ll
==7783== Memcheck, a memory error detector
==7783== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==7783== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==7783== Command: multipath -ll
==7783==
Jun 22 09:58:31 | can't set open fds limit to 8192 : Operation not permitted

==7783== Invalid write of size 8
==7783==    at 0x32DD641581: dm_task_get_info (string3.h:85)
==7783==    by 0x32DDA10492: dm_get_info (devmapper.c:1048)
==7783==    by 0x32DDA10E09: dm_get_maps (devmapper.c:803)
==7783==    by 0x402385: main (main.c:171)
==7783==  Address 0x4cbfce8 is 0 bytes after a block of size 40 alloc'd
==7783==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==7783==    by 0x32DDA0E955: zalloc (memory.c:41)
==7783==    by 0x32DDA10459: dm_get_info (devmapper.c:1019)
==7783==    by 0x32DDA10E09: dm_get_maps (devmapper.c:803)
==7783==    by 0x402385: main (main.c:171)
==7783==
==7783== Invalid write of size 4
==7783==    at 0x32DD6415D9: dm_task_get_info (libdm-iface.c:702)
==7783==    by 0x32DDA10492: dm_get_info (devmapper.c:1048)
==7783==    by 0x32DDA10E09: dm_get_maps (devmapper.c:803)
==7783==    by 0x402385: main (main.c:171)
==7783==  Address 0x4cbfcec is 4 bytes after a block of size 40 alloc'd
==7783==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==7783==    by 0x32DDA0E955: zalloc (memory.c:41)
==7783==    by 0x32DDA10459: dm_get_info (devmapper.c:1019)
==7783==    by 0x32DDA10E09: dm_get_maps (devmapper.c:803)
==7783==    by 0x402385: main (main.c:171)
==7783==
==7783== Invalid write of size 4
==7783==    at 0x32DD6415DC: dm_task_get_info (libdm-iface.c:701)
==7783==    by 0x32DDA10492: dm_get_info (devmapper.c:1048)
==7783==    by 0x32DDA10E09: dm_get_maps (devmapper.c:803)
==7783==    by 0x402385: main (main.c:171)
==7783==  Address 0x4cbfce8 is 0 bytes after a block of size 40 alloc'd
==7783==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==7783==    by 0x32DDA0E955: zalloc (memory.c:41)
==7783==    by 0x32DDA10459: dm_get_info (devmapper.c:1019)
==7783==    by 0x32DDA10E09: dm_get_maps (devmapper.c:803)
==7783==    by 0x402385: main (main.c:171)
==7783==
3514f0c5f27e0019c dm-1 XtremIO,XtremApp
size=500G features='0' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  |- 3:0:0:1 sdb 8:16  active ready running
  |- 3:0:1:1 sdc 8:32  active ready running
  |- 3:0:2:1 sdd 8:48  active ready running
  |- 3:0:4:1 sdf 8:80  active ready running
  |- 3:0:3:1 sde 8:64  active ready running
  |- 3:0:5:1 sdg 8:96  active ready running
  |- 3:0:6:1 sdh 8:112 active ready running
  |- 3:0:7:1 sdi 8:128 active ready running
  |- 4:0:0:1 sdj 8:144 active ready running
  |- 4:0:1:1 sdk 8:160 active ready running
  |- 4:0:2:1 sdl 8:176 active ready running
  |- 4:0:3:1 sdm 8:192 active ready running
  |- 4:0:4:1 sdn 8:208 active ready running
  |- 4:0:5:1 sdo 8:224 active ready running
  |- 4:0:6:1 sdp 8:240 active ready running
  `- 4:0:7:1 sdq 65:0  active ready running
==7783==
==7783== HEAP SUMMARY:
==7783==     in use at exit: 5,556 bytes in 17 blocks
==7783==   total heap usage: 1,770 allocs, 1,753 frees, 666,062 bytes allocated
==7783==
==7783== LEAK SUMMARY:
==7783==    definitely lost: 0 bytes in 0 blocks
==7783==    indirectly lost: 0 bytes in 0 blocks
==7783==      possibly lost: 0 bytes in 0 blocks
==7783==    still reachable: 5,556 bytes in 17 blocks
==7783==         suppressed: 0 bytes in 0 blocks
==7783== Rerun with --leak-check=full to see details of leaked memory
==7783==
==7783== For counts of detected and suppressed errors, rerun with: -v
==7783== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 17 from 9)


-----------------------------------------------------------------------------



The core just shows the point at which the system has detected the corruption causing the segmentation fault. The gdb stack trace will not help analyse where the corruption has occurred. 

When you malloc some memory, malloc allocates a little more memory before and after your memory requested, for its internal bookkeeping and memory management. We see the mpp->dmi as a valid structure with a valid address. (non null). Free() has no problems freeing this area. But when you free a chunk of memory, free() checks if the previous chunk is free and whether the following chunk is free. If either of those are free then free() must merge the chunk you just freed with adjacent free chunks. For non free chunks, it needs to be able to figure out how big it is.

This is where free seems to be crashing because the information storing the size of the next chunk has been corrupted. 

(gdb) where
#0  0x0000003e91a32625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003e91a33e05 in abort () at abort.c:92
#2  0x0000003e91a70537 in __libc_message (do_abort=2, fmt=0x3e91b58780 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x0000003e91a75e66 in malloc_printerr (action=3, str=0x3e91b58aa8 "free(): invalid next size (fast)", ptr=<value optimized out>) at malloc.c:6336
#4  0x0000003e91a789b3 in _int_free (av=0x3e91d8fe80, p=0xf89a40, have_lock=0) at malloc.c:4832
#5  0x00000032dda16f5d in free_multipath (mpp=0xf832e0, free_paths=0) at structs.c:241
#6  0x00000032dda1701d in free_multipathvec (mpvec=0xf7f1b0, free_paths=0) at structs.c:285
#7  0x0000000000402195 in configure (argc=<value optimized out>, argv=<value optimized out>) at main.c:373
#8  main (argc=<value optimized out>, argv=<value optimized out>) at main.c:586


(gdb) p *mpp->dmi
$4 = {exists = 1, suspended = 0, live_table = 1, inactive_table = 0, open_count = 1, event_nr = 1, major = 253, minor = 1, read_only = 0, target_count = 1}
(gdb) ptype mpp->dmi
type = struct dm_info {
    int exists;
    int suspended;
    int live_table;
    int inactive_table;
    int32_t open_count;
    uint32_t event_nr;
    uint32_t major;
    uint32_t minor;
    int read_only;
    int32_t target_count;
} *
(gdb) 


int dm_task_get_info(struct dm_task *dmt, struct dm_info *info)
{
        if (!dmt->dmi.v4)
                return 0;

        memset(info, 0, sizeof(*info));  

        info->exists = dmt->dmi.v4->flags & DM_EXISTS_FLAG ? 1 : 0;
        if (!info->exists)
                return 1;

        info->suspended = dmt->dmi.v4->flags & DM_SUSPEND_FLAG ? 1 : 0;
        info->read_only = dmt->dmi.v4->flags & DM_READONLY_FLAG ? 1 : 0; 
        info->live_table = dmt->dmi.v4->flags & DM_ACTIVE_PRESENT_FLAG ? 1 : 0;
        info->inactive_table = dmt->dmi.v4->flags & DM_INACTIVE_PRESENT_FLAG ?
            1 : 0;
        info->deferred_remove = dmt->dmi.v4->flags & DM_DEFERRED_REMOVE;
        info->internal_suspend = (dmt->dmi.v4->flags & DM_INTERNAL_SUSPEND_FLAG) ? 1 : 0;
        info->target_count = dmt->dmi.v4->target_count;
        info->open_count = dmt->dmi.v4->open_count;
        info->event_nr = dmt->dmi.v4->event_nr;
        info->major = MAJOR(dmt->dmi.v4->dev);
        info->minor = MINOR(dmt->dmi.v4->dev);

        return 1;
}

Comment 10 Jerrin Jose 2016-06-28 16:05:16 UTC
Customer has updated that, he did upgrade the device-mapper-multipath package using the "yum update device-mapper-multipath" and the issue got resolved.


=========================================================================================================================
 Package                                  Arch               Version                  Repository                    Size
=========================================================================================================================
Updating:
 device-mapper-multipath                  x86_64             0.4.9-93.el6             rhel6-local-repo             128 k
Updating for dependencies:
 device-mapper-multipath-libs             x86_64             0.4.9-93.el6             rhel6-local-repo             198 k
 kpartx                                   x86_64             0.4.9-93.el6             rhel6-local-repo              67 k

Transaction Summary
=========================================================================================================================

Comment 14 Ben Marzinski 2016-10-17 13:56:30 UTC
updated makefile to link to link to the necessary libraries during compilation.

Comment 25 errata-xmlrpc 2017-03-21 10:51:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0697.html


Note You need to log in before you can comment on or make changes to this bug.