Description of problem:
Segfault in multipathd nine hours into failover/failback test.
Message log:
Apr 12 03:19:29 hb131223 multipathd: mpathd: remaining active paths: 2
Apr 12 03:19:29 hb131223 multipathd: mpathc: sdc - rdac checker reports path is up
Apr 12 03:19:29 hb131223 kernel: multipathd[1646]: segfault at 170 ip 00000000004073f1 sp 00007fcaf062cc60 error 4 in multipathd[400000+10000]
Apr 12 03:19:30 hb131223 abrt[23282]: saved core dump of pid 1616 (/sbin/multipathd) to /var/spool/abrt/ccpp-1302599970-1616.new/coredump (2273280 bytes)
Backtrace:
Core was generated by `/sbin/multipathd'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000004073f1 in update_prio (pp=0x7fcad80009b0, refresh_all=1) at main.c:964
964 vector_foreach_slot (pp->mpp->pg, pgp, i) {
Missing separate debuginfos, use: debuginfo-install device-mapper-libs-1.02.62-3.el6.x86_64 glibc-2.12-1.24.1.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libselinux-2.0.94-4.el6.x86_64 libsepol-2.0.41-3.el6.x86_64 libudev-147-2.35.el6.x86_64 ncurses-libs-5.7-3.20090208.el6.x86_64 readline-6.0-3.el6.x86_64
(gdb)
(gdb)
(gdb) bt
#0 0x00000000004073f1 in update_prio (pp=0x7fcad80009b0, refresh_all=1) at main.c:964
#1 0x0000000000407aff in check_path (vecs=0x111ec50, pp=0x7fcacc01ca70) at main.c:1116
#2 0x0000000000407db5 in checkerloop (ap=0x111ec50) at main.c:1159
#3 0x00000035c56077e1 in start_thread () from /lib64/libpthread.so.0
#4 0x00000035c4ae678d in clone () from /lib64/libc.so.6
(gdb)
(gdb) p *pp
$1 = {dev = '\000' <repeats 255 times>, dev_t = "8:144", '\000' <repeats 27 times>, sysdev = 0x0, scsi_id = {dev_id = 0,
host_unique_id = 0, host_no = 0}, sg_id = {host_no = -1, channel = -1, scsi_id = -1, lun = -1, h_cmd_per_lun = 0,
d_queue_depth = 0, unused1 = 0, unused2 = 0}, wwid = "3600a0b8000477a9c000080a34d54312a", '\000' <repeats 94 times>,
vendor_id = "\000\000\000\000\000\000\000\000", product_id = '\000' <repeats 16 times>, rev = "\000\000\000\000",
serial = '\000' <repeats 63 times>, tgt_node_name = '\000' <repeats 223 times>, size = 0, checkint = 0, tick = 0, bus = 0,
offline = 0, state = 0, dmstate = 1, failcount = 1, priority = 1, pgindex = 1, getuid = 0x0, prio_args = 0x0, prio = 0x111e670,
checker = {node = {next = 0x0, prev = 0x0}, fd = 0, sync = 0, timeout = 0, disable = 0, name = '\000' <repeats 15 times>,
message = '\000' <repeats 255 times>, wwid = '\000' <repeats 127 times>, context = 0x0, mpcontext = 0x0, check = 0, init = 0,
free = 0}, mpp = 0x0, fd = 8, hwe = 0x0}
(gdb)
(gdb) l
959 int oldpriority;
960 struct pathgroup * pgp;
961 int i, j, changed = 0;
962
963 if (refresh_all) {
964 vector_foreach_slot (pp->mpp->pg, pgp, i) {
965 vector_foreach_slot (pgp->paths, pp, j) {
966 oldpriority = pp->priority;
967 pathinfo(pp, conf->hwtable, DI_PRIO);
968 if (pp->priority != oldpriority)
The OS was installed on a multipath device, so the system died after one more cycle of fault injection, once all paths to the root device were lost.
Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-40.el6.x86_64
How reproducible:
Unknown
Steps to Reproduce:
1. Three multipath LUNs on RDAC array, OS installed on one, plus two test LUNs;
Two host-side and two storage-side paths.
2. Run I/O to two test LUNs
3. Run fault injection (switch port disable/enable) on storage-side paths:
Repeat
{
disable port A
sleep 10 minutes
enable port A
sleep 8 minutes
disable port B
sleep 10 minutes
enable port B
sleep 8 minutes
}
Actual results:
Segfault in multipathd and system died after nine hours of fault injection.
Expected results:
Additional info:
Message log showed errors like this throughout the duration of the fault injection:
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:5: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 multipathd: mpathd: failed in domap for removal of path sdl
Apr 12 02:43:18 hb131223 multipathd: uevent trigger error
Apr 12 02:43:18 hb131223 multipathd: sdj: remove path (uevent)
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:5: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:4: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 multipathd: mpathc: failed in domap for removal of path sdj
Apr 12 02:43:18 hb131223 multipathd: uevent trigger error
Apr 12 02:43:18 hb131223 multipathd: sdf: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdf: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdf: no sysfs information
Apr 12 02:43:18 hb131223 multipathd: checker failed path 8:80 in map mpathc
Apr 12 02:43:18 hb131223 multipathd: mpathc: remaining active paths: 0
Apr 12 02:43:18 hb131223 multipathd: sdk: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdk: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdk: no sysfs information
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:4: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 multipathd: 8:144: mark as failed
Comment 2 RHEL Program Management
2011-04-13 06:00:32 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.
Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.