Bug 986767

Summary: multipathd daemon process exits with segfault
Product: Red Hat Enterprise Linux 6 Reporter: yangjun <yang.jun32>
Component: device-mapper-multipath Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA QA Contact: yanfu,wang <yanwang>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 6.2 CC: acathrow, agk, bdonahue, bmarzins, dwysocha, heinzm, jcastillo, msnitzer, prajnoha, prockai, yang.jun32, yanwang, zkabelac
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-69.el6 Doc Type: Bug Fix
Doc Text:
Cause: Multipath wasn't blacklisting tapdev devices, which cannot be multipathed. It tried to multipath them, but their sysfs layout differs from what multipath expects of path devices. Consequence: Multipathd could crash if multiple tapdev devices were removed from the system. Fix: Multipath now blacklists /dev/td[a-z].* devices by default. Result: Multipath no longer tries to set up multipath devices on tapdev devices, and does not crash when they are removed.
Story Points: ---
Clone Of:
: 1036503 Environment:
Last Closed: 2013-11-21 07:50:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
the core dump file none

Description yangjun 2013-07-22 03:38:41 UTC
Created attachment 776672
the core dump file

Description of problem:

The multipathd daemon process exits with a segfault.


Steps to reproduce:
1. Create the block devices (/dev/td*).
2. Delete the block devices (/dev/td*).
3. After some minutes, multipathd exits with a segfault.
cat /var/log/messages:
kernel: multipathd[32369]: segfault at 7f57a9dd6ff8 ip 0000003574a2d3c9 sp 00007f57a9dd7000 error 6 in libmultipath.so[3574a00000+3b000]
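
The kernel line above can be decoded a little further. The following is a minimal sketch, assuming the usual x86 page-fault error-code bits (bit 0 = page present, bit 1 = write access, bit 2 = user mode) and that the error value is printed in hex. The helper name decode_segfault is illustrative only and is not part of any existing tool.

```python
import re

# Log line copied from the report above.
LINE = ("kernel: multipathd[32369]: segfault at 7f57a9dd6ff8 ip 0000003574a2d3c9 "
        "sp 00007f57a9dd7000 error 6 in libmultipath.so[3574a00000+3b000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_segfault(line):
    """Split a kernel segfault message into its numeric fields (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    addr = int(m["addr"], 16)
    ip = int(m["ip"], 16)
    sp = int(m["sp"], 16)
    err = int(m["err"], 16)
    base = int(m["base"], 16)
    return {
        "library": m["lib"],
        "offset": ip - base,             # instruction offset from the mapping base
        "page_present": bool(err & 1),   # bit 0
        "write": bool(err & 2),          # bit 1
        "user_mode": bool(err & 4),      # bit 2
        "addr_minus_sp": addr - sp,      # distance of the faulting address from sp
    }

info = decode_segfault(LINE)
print(hex(info["offset"]), info["write"], info["addr_minus_sp"])
```

For this line the faulting instruction sits at offset 0x2d3c9 from the start of the libmultipath.so mapping, and the fault is a user-mode write to a non-present page 8 bytes below the stack pointer. With the matching debuginfo package installed, that offset can usually be resolved to a source line with a tool such as addr2line, provided the mapping starts at file offset 0.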

Comment 2 yangjun 2013-07-22 04:43:25 UTC
the rpm version:
[root@localhost ~]# rpm -qa|grep multipath
device-mapper-multipath-debuginfo-0.4.9-56.el6.x86_64
device-mapper-multipath-0.4.9-56.el6.x86_64
device-mapper-multipath-libs-0.4.9-56.el6.x86_64

Comment 4 Jose Castillo 2013-07-24 13:18:05 UTC
Could you let us know the specific steps you followed to create these block devices, and how reproducible the problem is?

Comment 5 Ben Marzinski 2013-07-24 17:05:21 UTC
I have a dumb question. What kind of devices are td devices? Are they supposed to be multipathed? Multipath certainly doesn't blacklist them by default, but perhaps it should.

The issue is that these td devices all have the same sysfs parent, /devices/virtual/block/. All devices that multipath currently works with have unique parents. What's happening here is that for all path devices, multipath stores their sysfs devices and their parents in a cache, since it needs to reference them to grab sysfs values. When the second td* path device is added by multipath, it uses the cached version of the parent from the first device instead of creating its own device in the cache. When multipath removes the first device, it deletes that cached parent device. When it removes the second device, it again tries to delete the cached parent device, but that device has already been deleted.
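
The sequence described above can be modelled in a few lines. This is only a toy sketch of the behaviour as described in this comment, not the libmultipath code: SysfsCache, add_path and remove_path are hypothetical names, and a Python KeyError stands in for what would be an access to already-freed memory in C.

```python
class SysfsCache:
    """Toy cache keyed by sysfs path, with no reference counting."""

    def __init__(self):
        self.entries = {}

    def get(self, path):
        # Reuse the cached entry if one exists, otherwise create it.
        return self.entries.setdefault(path, {"path": path})

    def delete(self, path):
        # Raises KeyError if the entry was already deleted.
        del self.entries[path]


SHARED_PARENT = "/devices/virtual/block"


def add_path(cache, name):
    cache.get(SHARED_PARENT + "/" + name)  # the device itself
    cache.get(SHARED_PARENT)               # its parent, shared by every td device


def remove_path(cache, name):
    cache.delete(SHARED_PARENT + "/" + name)
    cache.delete(SHARED_PARENT)            # assumes this path is the sole owner


def reproduce():
    cache = SysfsCache()
    add_path(cache, "tda")
    add_path(cache, "tdb")        # reuses the parent cached for tda
    remove_path(cache, "tda")     # deletes the shared parent
    try:
        remove_path(cache, "tdb") # parent is already gone
    except KeyError:
        return "double delete of shared parent"
    return "ok"


print(reproduce())
```

With devices that each have a unique parent, remove_path never deletes an entry twice, which matches the observation that the problem only shows up once two td devices share /devices/virtual/block.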

If these devices aren't supposed to be multipathed, then the workaround is to add

blacklist {
        devnode "^td"
}

to /etc/multipath.conf, and I can add this to the default blacklist, so that starting in RHEL-6.5, multipath will automatically blacklist these devices.
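
As a quick sanity check of which names the workaround pattern covers, here is a small sketch. It assumes the devnode expression is applied to the bare kernel device name (for example "tda" rather than "/dev/tda") and uses Python's re module as a stand-in for the regex engine multipath actually uses; the function name is illustrative.

```python
import re

# Pattern taken from the workaround blacklist entry above.
WORKAROUND = re.compile(r"^td")


def is_blacklisted(devnode):
    """Return True if the device name matches the workaround pattern."""
    return WORKAROUND.search(devnode) is not None


for name in ("tda", "tdb", "sda", "dm-0"):
    print(name, is_blacklisted(name))
```

Note that "^td" matches any device name beginning with "td", so it is slightly broader than a pattern restricted to td followed by a letter.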

If these devices really are supposed to be multipathed, that's a trickier problem. Multipath would need to either cache separate parents for these devices, or it would need to keep reference counts and only delete the device when the last user drops it. Unfortunately, the way multipath uses these cached devices, neither solution is particularly straightforward.
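
For illustration, the reference-counting option mentioned above might look roughly like the sketch below. This is a generic pattern under assumed names (RefCountedCache, acquire, release), not a proposal for how libmultipath's cache would actually be changed.

```python
class RefCountedCache:
    """Toy cache that frees an entry only when its last user releases it."""

    def __init__(self):
        self.refs = {}  # sysfs path -> number of current users

    def acquire(self, path):
        self.refs[path] = self.refs.get(path, 0) + 1

    def release(self, path):
        remaining = self.refs[path] - 1
        if remaining == 0:
            del self.refs[path]       # last user gone, safe to drop
        else:
            self.refs[path] = remaining


PARENT = "/devices/virtual/block"

cache = RefCountedCache()
for name in ("tda", "tdb"):
    cache.acquire(PARENT + "/" + name)
    cache.acquire(PARENT)             # both devices take a reference on the parent

cache.release(PARENT + "/tda")
cache.release(PARENT)                 # parent survives, tdb still uses it
parent_alive_after_first = PARENT in cache.refs

cache.release(PARENT + "/tdb")
cache.release(PARENT)                 # last reference dropped, parent removed
print(parent_alive_after_first, cache.refs)
```

The second removal no longer touches a deleted entry, because the shared parent is dropped only once its count reaches zero.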

Comment 6 yangjun 2013-08-05 02:20:22 UTC
Sorry for the late reply; I was on holiday for the last few days.

/dev/td* are tapdev devices:
[root@localhost ~]# cat /proc/devices |grep tap
251 blktap2
253 tapdev
[root@localhost ~]# ll /dev/tda 
brw-rw---- 1 root disk 253, 0 Aug  5 09:19 /dev/tda
brw-rw---- 1 root disk 253, 1 Aug  5 09:19 /dev/tdb

/dev/td* devices do not support multipath.

Comment 7 Ben Marzinski 2013-08-05 18:28:21 UTC
So, I'll go ahead and make multipath automatically blacklist these devices. Can you verify that the workaround of adding

blacklist {
        devnode "^td"
}

to /etc/multipath.conf solves the issue for you?

Comment 8 Ben Marzinski 2013-08-13 15:45:57 UTC
I've updated the default blacklist to exclude tapdev devices.

Comment 10 yanfu,wang 2013-10-11 06:56:13 UTC
Reproduced on device-mapper-multipath-0.4.9-64.el6:
[root@storageqe-17 ~]# multipathd show conf|grep td
[root@storageqe-17 ~]# echo $?
1

Verified on device-mapper-multipath-0.4.9-71.el6, "td[a-z]" devices are now blacklisted by default.
[root@storageqe-17 ~]# multipathd show conf|grep td
	devnode "^(td|hd)[a-z]"
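
The same check can be scripted against saved "multipathd show conf" output. The sketch below is a rough illustration: it works on a sample string copied from the output above rather than calling multipathd, it does not distinguish the blacklist section from blacklist_exceptions, and it uses Python's re module as an approximation of the regex matching multipath performs. The function names are hypothetical.

```python
import re

# Sample line copied from the verified output above.
SAMPLE = '\tdevnode "^(td|hd)[a-z]"\n'


def devnode_patterns(conf_text):
    """Collect every devnode pattern that appears in the configuration text."""
    return re.findall(r'^\s*devnode\s+"([^"]*)"', conf_text, flags=re.MULTILINE)


def matches_any(devnode, patterns):
    """Return True if the device name matches at least one pattern."""
    return any(re.search(p, devnode) for p in patterns)


patterns = devnode_patterns(SAMPLE)
for name in ("tda", "hdb", "sda"):
    print(name, matches_any(name, patterns))
```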

Comment 12 errata-xmlrpc 2013-11-21 07:50:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1574.html