Bug 895110 - multipathd crash when having many interfaces
Summary: multipathd crash when having many interfaces
Keywords:
Status: CLOSED DUPLICATE of bug 880121
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-multipath
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Ben Marzinski
QA Contact: Bruno Goncalves
URL:
Whiteboard:
Duplicates: 723169
Depends On:
Blocks:
 
Reported: 2013-01-14 15:33 UTC by Bruno Goncalves
Modified: 2015-10-14 16:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-14 16:13:04 UTC
Target Upstream Version:
Embargoed:


Attachments
multipath_malloc error (114.89 KB, text/plain)
2013-01-14 15:35 UTC, Bruno Goncalves
multipath.conf (617 bytes, text/plain)
2013-02-01 08:03 UTC, Bruno Goncalves

Description Bruno Goncalves 2013-01-14 15:33:15 UTC
Description of problem:
Running the system with 100 iSCSI interfaces caused:
*** glibc detected *** /sbin/multipathd: malloc(): smallbin double linked list corrupted: 0x00007ff5046861a0 ***

Version-Release number of selected component (if applicable):
rpm -q device-mapper-multipath
device-mapper-multipath-0.4.9-63.el6.x86_64

rpm -q device-mapper
device-mapper-1.02.77-7.el6.x86_64

uname -r
2.6.32-353.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Create 100 interfaces and discover the target
for i in {1..100}; do 
iscsiadm -m iface -o new -I multi_iqn_$i; 
iscsiadm -m iface -I multi_iqn_$i -o update -n iface.initiatorname -v iqn.1994-05.com.redhat:multi-iqn-1; 
iscsiadm -m discovery -t st -p <portal> -I multi_iqn_$i -o new;
done


2. Log in to the target
iscsiadm -m node -l

3. Check the messages on the console
  
Actual results:
multipathd aborts with the glibc malloc corruption error shown under "Additional info".


Additional info:
sd 1085:0:0:0: [sdge] Attached SCSI disk
sd 1093:0:0:0: [sdgh] Attached SCSI disk
sd 1108:0:0:0: [sdgs] Attached SCSI disk
*** glibc detected *** /sbin/multipathd: malloc(): smallbin double linked list corrupted: 0x00007ff5046861a0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x760e6)[0x7ff522d910e6]
/lib64/libc.so.6(+0x79e9f)[0x7ff522d94e9f]
/lib64/libc.so.6(__libc_malloc+0x71)[0x7ff522d95911]
/lib64/libc.so.6(__strdup+0x22)[0x7ff522d9c042]
/lib64/libdevmapper.so.1.02(+0xca93)[0x7ff523fa0a93]
/lib64/libmultipath.so(dm_type+0x47)[0x7ff5236f8d67]
/lib64/libmultipath.so(dm_get_maps+0xc9)[0x7ff5236f9199]
/lib64/libmultipath.so(dm_get_name+0x32)[0x7ff5236f9292]
/lib64/libmultipath.so(__setup_multipath+0xb8)[0x7ff5237167c8]
/lib64/libmultipath.so(add_map_without_path+0x3d)[0x7ff5237171cd]
/sbin/multipathd[0x408470]
/sbin/multipathd(uev_trigger+0x262)[0x408af2]
/lib64/libmultipath.so(service_uevq+0x64)[0x7ff52370f214]
/lib64/libmultipath.so(+0x262b7)[0x7ff52370f2b7]
/lib64/libpthread.so.0(+0x7851)[0x7ff5230b5851]
/lib64/libc.so.6(clone+0x6d)[0x7ff522e0390d]

Comment 1 Bruno Goncalves 2013-01-14 15:35:07 UTC
Created attachment 678263 [details]
multipath_malloc error

Comment 2 RHEL Program Management 2013-01-18 06:47:14 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Ben Marzinski 2013-01-30 20:24:03 UTC
How many LUNs are on the server? I'm not able to recreate this with the identical packages, using 100 interfaces to 10 LUNs.

Comment 4 Bruno Goncalves 2013-01-31 09:36:13 UTC
I was able to reproduce it using tgtd with 1 target with 1 LUN.

cat /etc/tgt/targets.conf
default-driver iscsi
<target iqn.2009-10.com.redhat:storage-1>
    write-cache off
    allow-in-use yes
    <backing-store /var/lib/tgtd/loop-disk-1-1>
        scsi_sn 1
        scsi_id 1
        lun 1
    </backing-store>
</target>

-------------
for i in {1..100}; do 
iscsiadm -m iface -o new -I multi_iqn_$i; 
iscsiadm -m iface -I multi_iqn_$i -o update -n iface.initiatorname -v iqn.1994-05.com.redhat:multi-iqn-1; 
iscsiadm -m discovery -t st -p 127.0.0.1 -I multi_iqn_$i -o new;
done

# multipath -l
#

# iscsiadm -m node -l

# multipath -l | grep sd | wc
     38     303    1824
#


NOTE: I could not see the error being logged in any file, only on the console. It also showed:

mp->params too small
mp->params too small
mp->params too small

Comment 5 Ben Marzinski 2013-01-31 17:17:46 UTC
Hrm.. That was the first thing I tried.  I didn't have the targets on the same node, however.  I'll try that.

Could you also post your /etc/multipath.conf?

Comment 6 Bruno Goncalves 2013-02-01 08:03:57 UTC
Created attachment 691422 [details]
multipath.conf

Comment 7 Ben Marzinski 2013-02-01 18:02:24 UTC
I don't have much hope for this, but can you try the packages at:

http://download.devel.redhat.com/brewroot/scratch/bmarzins/task_5351856/

Those will fix the "mp->params too small" messages.

Comment 8 Bruno Goncalves 2013-02-04 08:04:44 UTC
rpm -q device-mapper-multipath
device-mapper-multipath-0.4.9-64.el6.bz895110.x86_64

With this build there are no error messages, but it seems it can still only handle 51 paths.

iscsiadm -m session -P3 | grep sd | wc
    100     600    4274

multipath -l | grep sd | wc
     51     408    2448

There are some error messages when logging out with iscsiadm -m node -u:

device-mapper: table: 253:2: multipath: error getting device
device-mapper: table: 253:2: multipath: error getting device

Comment 9 Ben Marzinski 2013-02-04 15:23:41 UTC
So, did multipathd crash?  Would it be possible for me to get on this machine to take a look at it myself?

Comment 16 Ben Marzinski 2013-02-07 17:06:27 UTC
Actually, with just some debugging info added, everything worked just fine, with one small issue: "multipath -l" doesn't correctly display a multipath device that big.

# multipath -l | grep sd | wc -l
55

similar to your Comment 8 result.

However looking at the output

# multipath -l

...
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 59:0:0:1  sdau 66:224 active undef running
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 80:0:0:1  sdbh 67:176 active undef running
|-+- policy='round-robin 0' prio=0 s


It's pretty clearly cut off mid-line

Using multipathd to look at the device

# multipathd show topology map mpathc | grep sd | wc -l
100

does show all hundred paths, and looking at /var/log/messages

Feb  7 11:21:08 storageqe-12 multipathd: mpathc: sdcp - directio checker reports path is up
Feb  7 11:21:08 storageqe-12 multipathd: 69:208: reinstated
Feb  7 11:21:08 storageqe-12 multipathd: mpathc: remaining active paths: 100

so the device does have 100 active paths.

I'll fix the multipath -l size issue.

The original corruption issue is more worrying.  I did expand a buffer that wasn't big enough, but I don't see where multipathd writes to it without size limiting, so unless I'm missing something, there still should be an overwrite, or a write after free, or something.  However, I've been totally unable to reproduce it, even when running under valgrind with all the memory zeroing options.  That doesn't mean it's not there anymore, it just means the only way to find it is to notice it while reading the code.

If you can reproduce the corruption, please let me know.
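
The comparison made in this comment (path lines shown by "multipath -l" versus those shown by "multipathd show topology map") can be sketched as a small shell helper. This is only an illustration, not part of the fix: the function names count_paths and compare_counts are hypothetical, and the pattern assumes each path line names its sd device as a separate whitespace-delimited word, as in the output quoted above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of multipath-tools).

# Count the lines on stdin that name an sd device as a separate word,
# e.g. "| `- 59:0:0:1  sdau 66:224 active undef running".
count_paths() {
    grep -Ec '[[:space:]]sd[a-z]+[[:space:]]'
}

# Compare the number of paths expected with the number actually shown.
# Prints a one-line verdict and returns non-zero on a shortfall.
compare_counts() {
    expected=$1
    shown=$2
    if [ "$shown" -lt "$expected" ]; then
        echo "mismatch: $shown of $expected paths shown"
        return 1
    fi
    echo "ok: $shown paths shown"
}

# Possible use on a host with a single multipath map named mpathc
# (the map name is taken from the comment above):
#   compare_counts \
#       "$(multipathd show topology map mpathc | count_paths)" \
#       "$(multipath -l | count_paths)"
```

With the numbers reported in this comment (100 from multipathd, 55 from multipath -l), compare_counts would report a mismatch, which matches the truncated display rather than missing paths.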

Comment 17 Bruno Goncalves 2013-02-08 07:56:10 UTC
Thanks, if I reproduce the crash I'll let you know. When logging out of the sessions, did you reproduce this error?

iscsiadm: initiator reported error (9 - internal error)
iscsiadm: Could not logout of [sid: 119, target: iqn.2009-10.com.redhat:storage-1, portal: 127.0.0.1,3260].
iscsiadm: initiator reported error (9 - internal error)
iscsiadm: Could not logout of [sid: 118, target: iqn.2009-10.com.redhat:storage-1, portal: 127.0.0.1,3260].
iscsiadm: initiator reported error (9 - internal error)
iscsiadm: Could not logout of [sid: 117, target: iqn.2009-10.com.redhat:storage-1, portal: 127.0.0.1,3260].
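
When many sessions fail to log out, it can help to collect the affected session IDs from messages like the ones above. A minimal sketch follows, assuming the message format is exactly as quoted in this comment; the function name failed_logout_sids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one session ID per "Could not logout" line
# read on stdin. Only the "[sid: N," part of the line is needed, so the
# helper also works when the terminal wraps the rest of the message.
failed_logout_sids() {
    sed -n 's/.*Could not logout of \[sid: \([0-9][0-9]*\),.*/\1/p'
}

# Possible use (the logout command is the one from comment 8):
#   iscsiadm -m node -u 2>&1 | failed_logout_sids
```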

Comment 19 Ben Marzinski 2015-10-06 02:20:36 UTC
*** Bug 723169 has been marked as a duplicate of this bug. ***

Comment 20 Ben Marzinski 2015-10-14 16:13:04 UTC
This should be fixed by the fix for 880121

*** This bug has been marked as a duplicate of bug 880121 ***

