Red Hat Bugzilla – Bug 994277
multipath: fix handling of transport-offline states
Last modified: 2014-03-02 18:41:47 EST
Description of problem:

The iscsi layer uses a long iscsi device state string, "transport-offline", and the multipath tools' state-buffer reading code cannot handle a value that large. This is a request to bring in this patch from upstream: https://www.redhat.com/archives/dm-devel/2013-February/msg00058.html. Without this patch, the logs fill up with messages about not being able to read the file whenever the path is down.
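The failure mode can be sketched in shell. A fixed-size read truncates the long "transport-offline" value, which is what triggers the overflow warnings; the 8-byte buffer and the temp file standing in for the sysfs attribute are illustrative assumptions, not multipath's actual internals.

```shell
# Sketch of the failure mode, assuming an illustrative 8-byte buffer
# (multipath's real internal buffer size differs).
state_file=$(mktemp)
printf 'transport-offline\n' > "$state_file"   # simulate the sysfs state attribute

full=$(cat "$state_file")                            # the whole value: 17 chars
buf=$(dd if="$state_file" bs=8 count=1 2>/dev/null)  # fixed-size read: truncated

if [ "${#full}" -gt 8 ]; then
    # analogous to multipathd's "overflow in attribute ..." log message
    echo "overflow in attribute '$state_file' (got '$buf')"
fi
rm -f "$state_file"
```

A short state like "running" fits the small buffer, which is why the messages only appear once a path has gone transport-offline.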
Patch applied. Thanks.
QA,

To test this, just log in to an iscsi target, create a multipath device using the iscsi paths, then pull a cable for longer than the iscsi replacement/recovery timeout setting (default is 2 minutes, but modifiable in /etc/iscsi/iscsid.conf, or with the iscsiadm -m node -o update command for existing targets). When the iscsi replacement/recovery timeout has expired, you should see "session recovery timed out after %d secs" in /var/log/messages, and if you cat /sys/devices/platform/host3/session1/target3:0:0/3:0:0:0/state it will say transport-offline. In /var/log/messages you will then see these messages start to appear:

Jul 17 10:00:52 IONr8RED2950 multipathd: overflow in attribute '/sys/devices/platform/host3/session1/target3:0:0/3:0:0:0/state'

With the fix, those messages should not appear.
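The timeout the steps above depend on can be read out of an iscsid.conf-style line. A minimal sketch, using a sample line rather than a live system (on real hardware you would grep /etc/iscsi/iscsid.conf or the iscsiadm -m node output instead):

```shell
# Sketch: extract the replacement/recovery timeout from an iscsid.conf-style
# line. The line below is a sample; the 120s value is the documented default.
line='node.session.timeo.replacement_timeout = 120'
timeout=${line##* }    # POSIX parameter expansion: value after the last space
echo "paths go transport-offline after ${timeout}s without a reconnect"
```

This is the value the reproduction below bumps to 180 with iscsiadm -o update.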
Reproduced on device-mapper-multipath-0.4.9-64.el6.

Test setting up a multipath device on top of an iscsi device:

[root@storageqe-17 ~]# multipath -l
mpathc (1IET     00010001) dm-6 IET,VIRTUAL-DISK
size=500M features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
  `- 10:0:0:1 sdd 8:48 active undef running
...
[root@storageqe-17 ~]# cat /sys/devices/platform/host10/session3/target10\:0\:0/10\:0\:0\:1/state
running

Update the iscsi replacement/recovery timeout setting:

[root@storageqe-17 ~]# iscsiadm -m node -T iqn.2013-09.com.redhat:target1 | grep timeout
node.session.timeo.replacement_timeout = 120
[root@storageqe-17 ~]# iscsiadm -m node -T iqn.2013-09.com.redhat:target1 -o update -n node.session.timeo.replacement_timeout -v 180
[root@storageqe-17 ~]# iscsiadm -m node -T iqn.2013-09.com.redhat:target1 | grep timeout
node.session.timeo.replacement_timeout = 180

Down the network on the iscsi target:

[root@storageqe-19 ~]# /etc/init.d/network stop
Shutting down interface eth0:       [  OK  ]
Shutting down loopback interface:   [  OK  ]

When the iscsi replacement/recovery timeout had expired, got the expected messages below:

Oct 14 03:09:04 storageqe-17 kernel: session3: session recovery timed out after 180 secs
Oct 14 03:09:05 storageqe-17 iscsid: connect to 10.16.67.51:3260 failed (No route to host)
Oct 14 03:09:05 storageqe-17 kernel: sd 10:0:0:1: rejecting I/O to offline device
Oct 14 03:09:05 storageqe-17 kernel: device-mapper: multipath: Failing path 8:48.
Oct 14 03:09:05 storageqe-17 multipathd: overflow in attribute '/sys/devices/platform/host10/session3/target10:0:0/10:0:0:1/state'
Oct 14 03:09:05 storageqe-17 multipathd: mpathc: sdd - directio checker reports path is down
Oct 14 03:09:05 storageqe-17 multipathd: checker failed path 8:48 in map mpathc
Oct 14 03:09:05 storageqe-17 multipathd: mpathc: remaining active paths: 0
Oct 14 03:09:10 storageqe-17 kernel: sd 10:0:0:1: rejecting I/O to offline device
Oct 14 03:09:10 storageqe-17 multipathd: overflow in attribute '/sys/devices/platform/host10/session3/target10:0:0/10:0:0:1/state'

[root@storageqe-17 ~]# cat /sys/devices/platform/host10/session3/target10\:0\:0/10\:0\:0\:1/state
transport-offline

Verified on the fixed version: the overflow messages above no longer appear.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1574.html