| Summary: | segfault in multipathd update_prio() | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Don Dettke <ddettke> |
| Component: | device-mapper-multipath | Assignee: | LVM and device-mapper development team <lvm-team> |
| Status: | CLOSED DUPLICATE | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.1 | CC: | agk, bmarzins, coughlan, ddettke, dwysocha, heinzm, mbroz, prajnoha, prockai, revers, zkabelac |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2011-04-15 18:22:22 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

*** This bug has been marked as a duplicate of bug 693524 ***
Description of problem:
Segfault in multipathd nine hours into failover/failback test.

Message log:
Apr 12 03:19:29 hb131223 multipathd: mpathd: remaining active paths: 2
Apr 12 03:19:29 hb131223 multipathd: mpathc: sdc - rdac checker reports path is up
Apr 12 03:19:29 hb131223 kernel: multipathd[1646]: segfault at 170 ip 00000000004073f1 sp 00007fcaf062cc60 error 4 in multipathd[400000+10000]
Apr 12 03:19:30 hb131223 abrt[23282]: saved core dump of pid 1616 (/sbin/multipathd) to /var/spool/abrt/ccpp-1302599970-1616.new/coredump (2273280 bytes)

Backtrace:
Core was generated by `/sbin/multipathd'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004073f1 in update_prio (pp=0x7fcad80009b0, refresh_all=1) at main.c:964
964             vector_foreach_slot (pp->mpp->pg, pgp, i) {
Missing separate debuginfos, use: debuginfo-install device-mapper-libs-1.02.62-3.el6.x86_64 glibc-2.12-1.24.1.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libselinux-2.0.94-4.el6.x86_64 libsepol-2.0.41-3.el6.x86_64 libudev-147-2.35.el6.x86_64 ncurses-libs-5.7-3.20090208.el6.x86_64 readline-6.0-3.el6.x86_64
(gdb)
(gdb)
(gdb) bt
#0  0x00000000004073f1 in update_prio (pp=0x7fcad80009b0, refresh_all=1) at main.c:964
#1  0x0000000000407aff in check_path (vecs=0x111ec50, pp=0x7fcacc01ca70) at main.c:1116
#2  0x0000000000407db5 in checkerloop (ap=0x111ec50) at main.c:1159
#3  0x00000035c56077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000035c4ae678d in clone () from /lib64/libc.so.6
(gdb)
(gdb) p *pp
$1 = {dev = '\000' <repeats 255 times>, dev_t = "8:144", '\000' <repeats 27 times>, sysdev = 0x0,
  scsi_id = {dev_id = 0, host_unique_id = 0, host_no = 0},
  sg_id = {host_no = -1, channel = -1, scsi_id = -1, lun = -1, h_cmd_per_lun = 0, d_queue_depth = 0, unused1 = 0, unused2 = 0},
  wwid = "3600a0b8000477a9c000080a34d54312a", '\000' <repeats 94 times>,
  vendor_id = "\000\000\000\000\000\000\000\000", product_id = '\000' <repeats 16 times>, rev = "\000\000\000\000",
  serial = '\000' <repeats 63 times>, tgt_node_name = '\000' <repeats 223 times>,
  size = 0, checkint = 0, tick = 0, bus = 0, offline = 0, state = 0, dmstate = 1, failcount = 1, priority = 1, pgindex = 1,
  getuid = 0x0, prio_args = 0x0, prio = 0x111e670,
  checker = {node = {next = 0x0, prev = 0x0}, fd = 0, sync = 0, timeout = 0, disable = 0, name = '\000' <repeats 15 times>,
    message = '\000' <repeats 255 times>, wwid = '\000' <repeats 127 times>, context = 0x0, mpcontext = 0x0, check = 0, init = 0, free = 0},
  mpp = 0x0, fd = 8, hwe = 0x0}
(gdb)
(gdb) l
959             int oldpriority;
960             struct pathgroup * pgp;
961             int i, j, changed = 0;
962
963             if (refresh_all) {
964                     vector_foreach_slot (pp->mpp->pg, pgp, i) {
965                             vector_foreach_slot (pgp->paths, pp, j) {
966                                     oldpriority = pp->priority;
967                                     pathinfo(pp, conf->hwtable, DI_PRIO);
968                                     if (pp->priority != oldpriority)

OS was on multipath device so system died after one more cycle of fault injection because all paths to root device were lost.

Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-40.el6.x86_64

How reproducible:
Unknown

Steps to Reproduce:
1. Three multipath LUNs on RDAC array, OS installed on one, plus two test LUNs; two host-side and two storage-side paths.
2. Run I/O to two test LUNs.
3. Run fault injection (switch port disable/enable) on storage-side paths:
   Repeat {
     disable port A
     sleep 10 minutes
     enable port A
     sleep 8 minutes
     disable port B
     sleep 10 minutes
     enable port B
     sleep 8 minutes
   }

Actual results:
Segfault in multipathd and system died after nine hours of fault injection.

Expected results:

Additional info:
Message log showed errors like this throughout the duration of the fault injection:
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:5: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 multipathd: mpathd: failed in domap for removal of path sdl
Apr 12 02:43:18 hb131223 multipathd: uevent trigger error
Apr 12 02:43:18 hb131223 multipathd: sdj: remove path (uevent)
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:5: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:4: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 multipathd: mpathc: failed in domap for removal of path sdj
Apr 12 02:43:18 hb131223 multipathd: uevent trigger error
Apr 12 02:43:18 hb131223 multipathd: sdf: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdf: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdf: no sysfs information
Apr 12 02:43:18 hb131223 multipathd: checker failed path 8:80 in map mpathc
Apr 12 02:43:18 hb131223 multipathd: mpathc: remaining active paths: 0
Apr 12 02:43:18 hb131223 multipathd: sdk: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdk: failed to get sysfs information
Apr 12 02:43:18 hb131223 multipathd: sdk: no sysfs information
Apr 12 02:43:18 hb131223 kernel: device-mapper: table: 253:4: multipath: error getting device
Apr 12 02:43:18 hb131223 kernel: device-mapper: ioctl: error adding target to table
Apr 12 02:43:18 hb131223 multipathd: 8:144: mark as failed
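
Illustrative note on the crash site: the gdb dump in the description shows mpp = 0x0 for the path with dev_t "8:144", and main.c:964 dereferences pp->mpp->pg when refresh_all is set. The following is a minimal sketch of the kind of NULL guard that would avoid that dereference. It is not the fix applied for bug 693524, which this report does not show. The struct layouts, get_prio_stub(), refresh_one() and update_prio_guarded() are hypothetical simplifications (the real code walks path groups with vector_foreach_slot and calls pathinfo()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the multipath-tools structures. */
struct path;

struct multipath {
    struct path **paths;   /* flattened: the real code iterates path groups */
    int npaths;
};

struct path {
    int priority;
    struct multipath *mpp; /* NULL when the path is not attached to a map */
};

/* Placeholder for the real priority lookup; returns a constant here. */
static int get_prio_stub(const struct path *pp)
{
    (void)pp;
    return 5;
}

/* Refresh a single path; returns 1 if its priority changed, else 0. */
static int refresh_one(struct path *pp)
{
    int oldpriority = pp->priority;

    pp->priority = get_prio_stub(pp);
    return pp->priority != oldpriority;
}

/*
 * Guarded variant: walk the whole map only when pp->mpp is set.
 * An orphaned path (mpp == NULL) falls back to a single-path refresh
 * instead of dereferencing a NULL map pointer.
 * Returns nonzero if any priority changed.
 */
int update_prio_guarded(struct path *pp, int refresh_all)
{
    int i, changed = 0;

    assert(pp != NULL);

    if (refresh_all && pp->mpp != NULL) {
        for (i = 0; i < pp->mpp->npaths; i++)
            changed |= refresh_one(pp->mpp->paths[i]);
        return changed;
    }

    return refresh_one(pp);
}
```

Whether an orphaned path should be refreshed on its own or skipped entirely is a design choice this sketch does not settle; the point is only that the map pointer is checked before it is used.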