Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Cause: If multipathd failed to add a multipath device, in some circumstances, it was freeing the alias, and then accessing it and attempting to free it again.
Consequence: multipathd would crash if it tried to add a multipath device that was too large for it to handle (and far too large to be practical in a real world application)
Fix: multipathd no longer frees the alias twice, or attempts to access the freed alias.
Result: multipathd no longer crashes when it fails to add a multipath device.
Description of problem:
When testing multipath over iscsi, multipathd crashed. And it also prevent the daemon restart with this error messages:
===
multipathd: mpathdx: error getting map status string
===
The crashdump file is attahced.
Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-62
How reproducible:
only hit it twice.
Steps to Reproduce:
1. Create a multipath over iSCSI.
2. Try to clone it's iscsi iface and iscsi node.
===
for X in `seq 1 129`;do
iscsiadm -m iface -o new -I gris_tmp_iface_$X;
iscsiadm -m iface -I gris_tmp_iface_$X -o update \
-n iface.initiatorname -v iqn.1994-05.com.redhat:gris-dev-2;
iscsiadm -m iface -I gris_tmp_iface_$X -o update \
-n iface.transport_name -v tcp;
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.151753773 \
-I gris_tmp_iface_$X -p 10.16.43.127 -o new;
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.151753773 \
-I gris_tmp_iface_$X -l;
done
===
3. waitudev
Actual results:
multipathd crash
Expected results:
multipathd not crash.
Additional info:
The reproduce scripts is not the code I am using when testing, It's just a translate from perl code.
The backtrace from gdb:
===============================
(gdb) bt
#0 0x00007fa4f12db8a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007fa4f12dd085 in abort () at abort.c:92
#2 0x00007fa4f13197b7 in __libc_message (do_abort=2,
fmt=0x7fa4f1400f80 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3 0x00007fa4f131f0e6 in malloc_printerr (action=3, str=0x7fa4f14012c0 "double free or corruption (out)",
ptr=<value optimized out>) at malloc.c:6311
#4 0x00007fa4f1321c13 in _int_free (av=0x7fa4f1637e80, p=0x7fa4cc06e170, have_lock=0) at malloc.c:4811
#5 0x0000000000408501 in ev_add_map (dev=0x7fa4cc001d10, vecs=<value optimized out>) at main.c:304
#6 0x0000000000408af2 in uev_add_map (uev=0x7fa4e80009f0, trigger_data=0x163df00) at main.c:235
#7 uev_trigger (uev=0x7fa4e80009f0, trigger_data=0x163df00) at main.c:731
#8 0x00007fa4f1c9d214 in service_uevq () at uevent.c:109
#9 0x00007fa4f1c9d2b7 in uevq_thread (et=<value optimized out>) at uevent.c:135
#10 0x00007fa4f1643851 in start_thread (arg=0x7fa4f2911700) at pthread_create.c:301
#11 0x00007fa4f139190d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
===============================
Comment 5RHEL Program Management
2012-12-14 07:24:46 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
I got the same error message from multipathd while trying to increase the number of iscsi sessions to ~1000.
I doubled the number of sessions a few times using:
for i in /sys/devices/platform/host*/session* ; do echo $i ; iscsiadm -m session -r $i -o new ; done
[root@lg509 ~]# multipath -ll
Jan 29 10:25:31 | 3514f0c532e00002b: error getting map status string
[root@lg509 ~]# /etc/init.d/multipathd status
multipathd dead but pid file exists
[root@lg509 ~]# /etc/init.d/multipathd restart
ux_socket_connect: Connection refused
Stopping multipathd daemon: [FAILED]
Starting multipathd daemon: [ OK ]
[root@lg509 ~]# multipath -ll
Jan 29 10:26:39 | 3514f0c532e00002b: error getting map status string
[root@lg509 ~]#
Working with 6.5, rpms version:
[root@lg509 ~]# uname -r
2.6.32-431.el6.x86_64
[root@lg509 ~]# rpm -qa | grep multipath
device-mapper-multipath-0.4.9-80.el6.x86_64
device-mapper-multipath-libs-0.4.9-80.el6.x86_64
[root@lg509 ~]#
The cause of this seems pretty likely to be that multipathd can't handle a device-mapper table/status line that big, and instead of failing gracefully, it's failing badly, and multipathd looks to be overwriting memory. It certainly fix the memory corruption. However, I might end up putting a limit on the number of paths that multipath will create in the first place, I can't see any real practical use for 128 paths to a device.
I've fixed the memory corruption issue. But like I mentioned earlier, I did not make multipath able to handle arbitrarily large device tables. Multipath will still fail to create tables that are too large. This happens somewhere between 256 and 1024 paths, depending on how the device is configured.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-1391.html
Description of problem: When testing multipath over iscsi, multipathd crashed. And it also prevent the daemon restart with this error messages: === multipathd: mpathdx: error getting map status string === The crashdump file is attahced. Version-Release number of selected component (if applicable): device-mapper-multipath-0.4.9-62 How reproducible: only hit it twice. Steps to Reproduce: 1. Create a multipath over iSCSI. 2. Try to clone it's iscsi iface and iscsi node. === for X in `seq 1 129`;do iscsiadm -m iface -o new -I gris_tmp_iface_$X; iscsiadm -m iface -I gris_tmp_iface_$X -o update \ -n iface.initiatorname -v iqn.1994-05.com.redhat:gris-dev-2; iscsiadm -m iface -I gris_tmp_iface_$X -o update \ -n iface.transport_name -v tcp; iscsiadm -m node -T iqn.1992-08.com.netapp:sn.151753773 \ -I gris_tmp_iface_$X -p 10.16.43.127 -o new; iscsiadm -m node -T iqn.1992-08.com.netapp:sn.151753773 \ -I gris_tmp_iface_$X -l; done === 3. waitudev Actual results: multipathd crash Expected results: multipathd not crash. Additional info: The reproduce scripts is not the code I am using when testing, It's just a translate from perl code. The backtrace from gdb: =============================== (gdb) bt #0 0x00007fa4f12db8a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007fa4f12dd085 in abort () at abort.c:92 #2 0x00007fa4f13197b7 in __libc_message (do_abort=2, fmt=0x7fa4f1400f80 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198 #3 0x00007fa4f131f0e6 in malloc_printerr (action=3, str=0x7fa4f14012c0 "double free or corruption (out)", ptr=<value optimized out>) at malloc.c:6311 #4 0x00007fa4f1321c13 in _int_free (av=0x7fa4f1637e80, p=0x7fa4cc06e170, have_lock=0) at malloc.c:4811 #5 0x0000000000408501 in ev_add_map (dev=0x7fa4cc001d10, vecs=<value optimized out>) at main.c:304 #6 0x0000000000408af2 in uev_add_map (uev=0x7fa4e80009f0, trigger_data=0x163df00) at main.c:235 #7 uev_trigger (uev=0x7fa4e80009f0, trigger_data=0x163df00) at main.c:731 #8 0x00007fa4f1c9d214 in service_uevq () at uevent.c:109 #9 0x00007fa4f1c9d2b7 in uevq_thread (et=<value optimized out>) at uevent.c:135 #10 0x00007fa4f1643851 in start_thread (arg=0x7fa4f2911700) at pthread_create.c:301 #11 0x00007fa4f139190d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 ===============================