Bug 680480

Summary: multipathd SEGV in sysfs_get_timeout() during double path failure
Product: Red Hat Enterprise Linux 6
Reporter: Mike Snitzer <msnitzer>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA
QA Contact: Gris Ge <fge>
Severity: high
Priority: urgent
Version: 6.0
CC: agk, bdonahue, berthiaume_wayne, bmarzins, bugproxy, coughlan, dwysocha, eddie.williams, fge, heinzm, mbroz, msnitzer, prajnoha, prockai, zkabelac
Target Milestone: rc
Target Release: 6.1
Hardware: x86_64
OS: Linux
Fixed In Version: device-mapper-multipath-0.4.9-40.el6
Doc Type: Bug Fix
Doc Text:
During a double path failure, the sysfs device file is removed and the sysdev path attribute is set to NULL. The sysfs device cache is indexed by the actual sysfs directory, and /sys/block/<pathname> is a symlink. Prior to this update, if a path was deleted, multipathd could not resolve the actual directory to which /sys/block/<pathname> pointed, and therefore could not search the cache. With this update, multipathd updates sysdev only when it is NULL, so a valid cached sysfs device is never overwritten and multipathd no longer crashes.
Clone Of: 680140
Last Closed: 2011-05-19 14:13:00 UTC
Attachments:
  coredump info from /var/spool/abrt (none)
  messages (none)
  sosreport (none)

Description Mike Snitzer 2011-02-25 17:25:37 UTC
+++ This bug was initially created as a clone of Bug #680140 +++

Description of problem:

--- Additional comment from eddie.williams on 2011-02-25 10:46:49 EST ---

Created attachment 481021 [details]
gdb output from multipathd coredump

During double-path failure testing, triggered either by physically powering off a Fibre Channel switch or by simulating that failure, multipathd will sometimes core dump.  The failure is a segmentation fault in sysfs_get_timeout() in libmultipath/discovery.c.

Comment 2 Eddie Williams 2011-02-25 18:00:46 UTC
Similar to bug 680140, this is related to devices being removed due to failure.
In each case where I have seen this core dump, devices were being removed after
failing.  When the device is removed, dereferencing its pointers produces
errors like this.  In one case:

Feb 24 17:13:36 fiji kernel: device-mapper: multipath: Failing path 67:160.
Feb 24 17:13:36 fiji kernel: device-mapper: multipath: Failing path 67:176.
Feb 24 17:13:36 fiji kernel: device-mapper: multipath: Failing path 68:208.
Feb 24 17:13:36 fiji kernel: device-mapper: multipath: Failing path 67:224.
Feb 24 17:13:36 fiji kernel: multipathd[10655]: segfault at 8 ip
0000003716447ff7 sp 00007f90f170e020 error 4 in libc-2.12.so[3716400000+175000]
Feb 24 17:13:36 fiji abrt[6329]: saved core dump of pid 10354
(/sbin/multipathd) to /var/spool/abrt/ccpp-1298585616-10354.new/coredump
(5931008 bytes)


In another:

Feb 25 09:59:53 fiji multipathd: sdap: failed to get sysfs information
Feb 25 09:59:53 fiji multipathd: sdap: failed to get sysfs information
Feb 25 09:59:53 fiji kernel: device-mapper: multipath: Failing path 66:176.
Feb 25 09:59:53 fiji kernel: __ratelimit: 5196 callbacks suppressed
Feb 25 09:59:53 fiji kernel: multipathd[4808]: segfault at 8 ip
0000003716447ff7 sp 00007ffe4d8e4020 error 4 in libc-2.12.so[3716400000+175000]

In the second case the sysfs entry is missing, but it looks like
sysfs_get_timeout() then tries to dereference the sysfs entry at:

if (safe_sprintf(attr_path, "%s/device", dev->devpath))

where dev is supposed to point to the sysfs entry (libmultipath/discovery.c).
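
For illustration, a minimal, self-contained sketch of the failing pattern and the obvious guard.  The struct layout, SYSFS_PATH_SIZE, and the safe_sprintf() stand-in are simplified assumptions for this sketch, not the actual libmultipath source:

#include <stdio.h>

#define SYSFS_PATH_SIZE 255

/* Simplified stand-in for libmultipath's sysfs device structure. */
struct sysfs_device {
        char devpath[SYSFS_PATH_SIZE];  /* e.g. "/devices/.../block/sdap" */
};

/* Stand-in for safe_sprintf(): nonzero means the buffer overflowed. */
static int safe_sprintf(char *buf, const char *fmt, const char *arg)
{
        return snprintf(buf, SYSFS_PATH_SIZE, fmt, arg) >= SYSFS_PATH_SIZE;
}

static int sysfs_get_timeout(struct sysfs_device *dev)
{
        char attr_path[SYSFS_PATH_SIZE];

        /* The missing check: after a double path failure the cached
         * sysdev can be NULL, and dev->devpath dereferences NULL. */
        if (!dev)
                return -1;

        if (safe_sprintf(attr_path, "%s/device", dev->devpath))
                return -1;

        printf("would read the timeout attribute under %s\n", attr_path);
        return 0;
}

int main(void)
{
        struct sysfs_device sd = {
                .devpath = "/devices/pci0000:00/host6/target6:0:3/block/sdap"
        };

        sysfs_get_timeout(&sd);         /* healthy path */
        sysfs_get_timeout(NULL);        /* failed path: returns -1 instead of crashing */
        return 0;
}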

Comment 3 Ben Marzinski 2011-02-28 19:20:34 UTC
Can you reliably reproduce this?  It's pretty straightforward to solve this crash; however, I don't understand the root cause.  Every indication points to the sysfs device file being removed, causing the sysdev path attribute to be set to NULL. Then sysfs_get_timeout() doesn't check for it being NULL and crashes.  So far so good.  The only issue is that multipathd maintains a sysdev cache to avoid just such an issue.  The sysdev is only removed from the cache when the path is removed.  This is done under a lock that is also held by the code that is crashing, so these two things couldn't happen at the same time.  And since the sysdev isn't removed until the actual path is removed, it should never get checked after the sysdev is removed.

That leaves two options: either the sysdev isn't getting added to the cache when it should be (which looks pretty unlikely), or the sysdev is getting removed from the cache when it shouldn't be.

If you can reliably reproduce this, I can send you a test package that doesn't have the code to remove the sysdev from the cache.  This should allow us to check if the device really is getting removed when it shouldn't be.
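
For reference, a compilable sketch of the locking invariant described above.  The lock and cache names here are hypothetical stand-ins for multipathd's internals, not its actual source:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t vecs_lock = PTHREAD_MUTEX_INITIALIZER;
static int have_sysdev = 1;     /* stands in for a cached sysdev entry */

/* The path checker runs with the vecs lock held, so the sysdev it
 * dereferences should not be removable out from under it. */
static void check_path(void)
{
        pthread_mutex_lock(&vecs_lock);
        if (have_sysdev)
                printf("checking path via cached sysdev\n");
        pthread_mutex_unlock(&vecs_lock);
}

/* Path removal (driven by a udev remove event) takes the same lock
 * before dropping the sysdev from the cache, so the two callers can
 * never interleave; that is why the crash pointed elsewhere. */
static void remove_path(void)
{
        pthread_mutex_lock(&vecs_lock);
        have_sysdev = 0;
        pthread_mutex_unlock(&vecs_lock);
}

int main(void)
{
        check_path();
        remove_path();
        check_path();   /* after removal the checker no longer touches it */
        return 0;
}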

Comment 4 Eddie Williams 2011-02-28 20:35:56 UTC
I can fairly reliably reproduce it.  :-)

In most cases, when I run tests that cause two paths to fail, I will
see the problem around 50% of the time.  However, it has at times gone
4 or 5 tests without reproducing, so the window seems to be fairly
small.

I will be happy to test it.

If I can go through 10 tests without duplicating the problem I will be
satisfied that the problem has been resolved.

I have two servers set up, so the chances of hitting this are greatly improved; while not 100%, it is close to certain that one or the other will hit it.

Comment 5 Eddie Williams 2011-03-02 18:48:18 UTC
Any estimate when a package may be available for me to test?

Comment 6 Ben Marzinski 2011-03-03 21:41:13 UTC
There are debug packages available at:

http://people.redhat.com/bmarzins/device-mapper-multipath/rpms/RHEL6/x86_64/

and

http://people.redhat.com/bmarzins/device-mapper-multipath/rpms/RHEL6/i686/

These packages never remove a sysfs device from the cache.  If this solves your problem, then it's pretty obvious that these devices are getting removed when they shouldn't be.

Comment 7 Eddie Williams 2011-03-03 22:18:53 UTC
Created attachment 482169 [details]
coredump info from /var/spool/abrt

Updated the multipath packages to 38.el6.bz680480 and retried the test with multiple paths failing (simulating a switch failure).  Multipathd core dumped.  Note that I ran into core dumps on both systems under test on the first attempt.  I may have just been lucky, or with the new bits the failure rate is higher, maybe even 100%.

Comment 8 Ben Marzinski 2011-03-10 19:23:19 UTC
O.k. This all makes sense now.  I forgot that the sysfs device cache is indexed by the actual sysfs directory, and /sys/block/<pathname> is now a symlink.  If the path is deleted, multipathd can no longer resolve the actual directory that /sys/block/<pathname> points to, so it can't search the cache.  Multipathd is still able to clear the cache, because the remove events come in from udev using the actual sysfs directory.

So the answer is simply not to bother updating sysdev if it's not NULL; then we won't have to worry about it going away before these checks.
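
A minimal sketch of that fix, with hypothetical struct and function names modeled on the description above (the cache search itself is elided; this is not the actual multipathd source):

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

struct sysfs_device { char devpath[PATH_MAX]; };
struct path { char dev[32]; struct sysfs_device *sysdev; };

/* Stand-in for the cache lookup: /sys/block/<name> is a symlink, and
 * the cache is keyed by the real directory it points to.  Once the
 * kernel deletes the path, realpath() fails and the lookup misses. */
static struct sysfs_device *sysfs_device_get(const char *name)
{
        char link[PATH_MAX], real[PATH_MAX];

        snprintf(link, sizeof(link), "/sys/block/%s", name);
        if (!realpath(link, real))
                return NULL;    /* can't resolve the actual sysfs directory */
        /* ... search for (or insert) 'real' in the cache; elided here ... */
        return NULL;
}

/* The fix: only refresh sysdev while it is still NULL, so a failed
 * re-lookup after the path vanishes never clobbers a valid entry. */
static void update_sysdev(struct path *pp)
{
        if (!pp->sysdev)
                pp->sysdev = sysfs_device_get(pp->dev);
}

int main(void)
{
        struct path pp = { .dev = "sdap", .sysdev = NULL };

        update_sysdev(&pp);     /* attempts the lookup (always misses in this sketch) */
        update_sysdev(&pp);     /* would never overwrite a non-NULL sysdev */
        printf("sysdev %s\n", pp.sysdev ? "cached" : "not cached");
        return 0;
}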

Comment 9 Ben Marzinski 2011-03-15 04:41:48 UTC
Fixed issue.

Comment 11 Gris Ge 2011-03-15 06:36:16 UTC
Reproduced this issue and got same coredump as reported:

#0  0x0000003716447ff7 in _IO_vfprintf_internal (s=<value optimized out>, format=<value optimized out>, ap=<value optimized out>) at vfprintf.c:1593
1593              process_string_arg (((struct printf_spec *) NULL));
Missing separate debuginfos, use: debuginfo-install libaio-0.3.107-10.el6.x86_64 libselinux-2.0.94-2.el6.x86_64 libsepol-2.0.41-3.el6.x86_64 libudev-147-2.29.el6.x86_64
(gdb) br
Breakpoint 1 at 0x3716447ff7: file vfprintf.c, line 1593.
(gdb) list
1588
1589          /* Process current format.  */
1590          while (1)
1591            {
1592              process_arg (((struct printf_spec *) NULL));
1593              process_string_arg (((struct printf_spec *) NULL));
1594
1595            LABEL (form_unknown):
1596              if (spec == L_('\0'))
1597                {


Test environment:
device-mapper-multipath-0.4.9-39 (hits the problem)
device-mapper-multipath-0.4.9-40 (fixes the problem)
Boot from SAN with 4 multibus paths (2 HBA ports x 2 controller ports).
Bring 1 HBA port down from the switch.

I have tested device-mapper-multipath-0.4.9-40 through 10 rounds of link bouncing: PASS.

Verified this bug.

Comment 12 Eddie Williams 2011-03-15 12:18:36 UTC
Can you send me a pointer to the new package for me to test with?

Comment 13 Ben Marzinski 2011-03-15 15:44:29 UTC
The device-mapper-multipath-0.4.9-40.el6 packages are available at:

http://people.redhat.com/bmarzins/device-mapper-multipath/rpms/RHEL6/

I've put up packages for all supported architectures.

Comment 14 Eddie Williams 2011-03-15 18:43:46 UTC
I downloaded the x86_64 bits and they resolved the problem.  THANKS

Comment 17 Ben Marzinski 2011-03-31 15:47:49 UTC
*** Bug 691658 has been marked as a duplicate of this bug. ***

Comment 18 IBM Bug Proxy 2011-03-31 15:56:34 UTC
Created attachment 489160 [details]
messages

Comment 19 IBM Bug Proxy 2011-03-31 15:56:42 UTC
Created attachment 489161 [details]
sosreport

Comment 20 IBM Bug Proxy 2011-04-05 07:32:12 UTC
------- Comment From christian_may.com 2011-04-05 03:21 EDT-------
All three blades (i386, x86_64, ppc64) were set up with RHEL 6.1 Snapshot 1.
The I/O and port bounce scenario was started. Currently 50 cycles have passed successfully on all architectures; multipathd is still up and running. Looks good... 50 cycles still remaining...

Comment 21 IBM Bug Proxy 2011-04-06 11:23:31 UTC
------- Comment From christian_may.com 2011-04-06 07:10 EDT-------
I've stopped the test after approx. 70 cycles. The multipath daemon is still running.
The problem is fixed with the device-mapper packages from Snapshot 1.
This bug can be closed.

Comment 22 IBM Bug Proxy 2011-04-11 06:41:47 UTC
------- Comment From prem.karat.ibm.com 2011-04-11 02:34 EDT-------
(In reply to comment #30)
> I've stopped the test after approx. 70 cycles. The multipath daemon is still running.
> The problem is fixed with the device-mapper packages from Snapshot 1.
> This bug can be closed.

Since the bug is resolved and the fix is included in RHEL 6.1 Snapshot 1, I am closing this one out as per the previous comment.

Cheers,
Prem

Comment 23 Eva Kopalova 2011-05-02 13:53:18 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
During a double path failure, the sysfs device file is removed and the sysdev path attribute is set to NULL. The sysfs device cache is indexed by the actual sysfs directory, and /sys/block/<pathname> is a symlink. Prior to this update, if a path was deleted, multipathd could not resolve the actual directory to which /sys/block/<pathname> pointed, and therefore could not search the cache. With this update, multipathd updates sysdev only when it is NULL, so a valid cached sysfs device is never overwritten and multipathd no longer crashes.

Comment 24 errata-xmlrpc 2011-05-19 14:13:00 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0725.html