Description of problem:
We have observed that IO resumption takes a long time (around 3 minutes) when one of the controller paths is down. Our setup is as follows:
1. iSCSI is configured on the host as well as on the target.
2. 10 LUNs from each controller head are mapped to the host.
3. multipath is configured on the host side; there are 4 paths (i.e., 4 iSCSI sessions) to the target.
4. An LV is created on top of these multipathed iSCSI devices.
5. IO is performed on the LV.

When one of the target ports is configured down to simulate a path failure, we observe that IO resumption on the remaining 3 paths takes around 3 minutes. Is this expected behavior?

Version-Release number of selected component (if applicable):
iscsi-initiator-utils-6.2.0.871-0.16.el5
device-mapper-multipath-0.4.7-34.el5
kernel-2.6.18-194.el5

How reproducible:
Always

Steps to Reproduce:
1. Start IO on a multipathed iSCSI device.
2. Configure a port down on the target.
3. IO takes around 3 minutes to resume.

Actual results:
IO resumption takes around 3 minutes.

Expected results:
With multiple paths to the target, IO should not stall for this long when a single path goes down.

Additional info:
Our multipath.conf file:

defaults {
        user_friendly_names no
        max_fds max
        queue_without_daemon no
}
blacklist {
        devnode "^hd[a-z]"
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^cciss!c[0-9]d[0-9]*"
        wwid (SIBM-ESXSMAT3073NC_FAAR9P5A0ADG6)
}
devices {
        device {
                vendor "NETAPP"
                product "LUN"
                path_grouping_policy group_by_prio
                features "1 queue_if_no_path"
                prio_callout "/sbin/mpath_prio_ontap /dev/%n"
                path_checker directio
                failback immediate
                hardware_handler "0"
                rr_weight uniform
                rr_min_io 128
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                flush_on_last_del yes
        }
}
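For reference, a minimal sketch of how this stack is assembled (the discovery portal address and sizes below are illustrative; the WWID, VG, and LV names match the logs quoted later in this bug):

iscsiadm -m discovery -t sendtargets -p 10.0.0.1
iscsiadm -m node -L all                  # one session per discovered portal
pvcreate /dev/mapper/360a98000486e58394434584838684771
vgcreate n_vg /dev/mapper/360a98000486e58394434584838684771
lvcreate -L 10G -n nvol-1 n_vg
mkfs.ext3 /dev/n_vg/nvol-1
mount /dev/n_vg/nvol-1 /mnt/qa/n_vg/nvol-1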
Can you please attach the output of running

# multipath -ll

then reproduce the issue and attach /var/log/messages? If you could make a note of when you failed the paths and when you noticed the IO begin again, that would be even more helpful.
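A capture sequence along these lines would work (the output paths are just suggestions):

multipath -ll > /tmp/multipath-ll.txt
date                      # note the time, then fail the target port
# ... after IO resumes:
date                      # note the resumption time
cp /var/log/messages /tmp/messages-repro.txt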
Attaching the required info. 41 LUNs (20 from one head and 21 from the other) are mapped to the host, and there are 3 paths to the target. multipath.conf is the same as above. I configured the port down at Jun 22 21:46:38 and IO resumed at Jun 22 21:51:08. We use our application dt to perform IO. It logged that there was no IO progress for 215 seconds on the following mount point, i.e., IO was stalled for 215 seconds:

dt (16298): No progress made during open() on /mnt/qa/n_vg/nvol-1/aa-16298 for 215 seconds! (03m35.00s)
Created attachment 426012 [details]
multipath -ll output
Created attachment 426016 [details]
log messages
In this bug, there is a 1 minute 46 second gap between

Jun 22 21:49:13 IBMx336-200-143 multipathd: sdau: directio checker reports path is down

and

Jun 22 21:50:59 IBMx336-200-143 multipathd: checker failed path 66:224 in map 360a98000486e58394434584838684771
One more thing I am observing here is that session recovery takes almost a minute to actually time out; the timeout is set to 5 seconds, but the recovery message appears 55 seconds after the connection error:

Jul 2 11:09:00 IBMx3250-200-174 kernel: connection9:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4528903155, last ping 4528908155, now 4528913155
Jul 2 11:09:00 IBMx3250-200-174 kernel: connection9:0: detected conn error (1011)
Jul 2 11:09:01 IBMx3250-200-174 iscsid: Kernel reported iSCSI connection 9:0 error (1011) state (3)
Jul 2 11:09:55 IBMx3250-200-174 kernel: session9: session recovery timed out after 5 secs

This is seen only when the host is performing IO; in the idle case, recovery times out in 5 seconds as expected:

Jul 2 11:03:18 IBMx3250-200-174 kernel: connection9:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4528561473, last ping 4528566473, now 4528571473
Jul 2 11:03:18 IBMx3250-200-174 kernel: connection9:0: detected conn error (1011)
Jul 2 11:03:18 IBMx3250-200-174 iscsid: Kernel reported iSCSI connection 9:0 error (1011) state (3)
Jul 2 11:03:23 IBMx3250-200-174 kernel: session9: session recovery timed out after 5 secs
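The recovery window here is governed by node.session.timeo.replacement_timeout. A rough sketch of checking and lowering it (the target IQN and portal below are placeholders, not values from this setup):

iscsiadm -m node -T <target-iqn> -p <portal-ip>:3260 \
        | grep node.session.timeo.replacement_timeout
iscsiadm -m node -T <target-iqn> -p <portal-ip>:3260 -o update \
        -n node.session.timeo.replacement_timeout -v 5
# the updated value takes effect on the next login for that node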
That looks like it's probably Bug 570681/583898.
I'm going to build you a debug package that will add some printouts to the functions between the two multipathd messages listed in comment 5, so that I can narrow down where that stall is happening, since it looks to be a separate issue from the iscsi timeout.
You can get the debug packages at:

http://people.redhat.com/bmarzins/device-mapper-multipath/rpms/

If you could install these, recreate the issue, and attach the output from /var/log/messages to this bugzilla, that would be very helpful.
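Roughly, assuming the packages at that URL follow the usual el5 naming, installation would look like:

rpm -Uvh device-mapper-multipath-*.el5.rpm kpartx-*.el5.rpm
service multipathd restart
tail -f /var/log/messages    # watch while recreating the path failure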
Created attachment 430260 [details]
Log messages after installing debug package

Attaching the syslog messages after installing the debug package.
Huh? Well, it's not getting stuck in the checkerloop code.

Jul 8 12:26:29 IBMx336-200-143 multipathd: updating multipath strings for 360a98000486e53774f34585a73434b6c
Jul 8 12:27:26 IBMx336-200-143 multipathd: calling dm_get_map for 360a98000486e53774f34585a73434b6c

There is almost a minute gap between these two messages; however, the second message is the first thing that happens in a function that is called immediately after the first message is printed. So something unrelated is either keeping the checkerloop code from running, or keeping the log daemon code from logging the messages. Are there any console messages that aren't getting picked up in /var/log/messages?
(In reply to comment #11)
> So something unrelated is either keeping the checkerloop code from running, or
> keeping the log daemon code from logging the messages. Are there any console
> messages that aren't getting picked up in /var/log/messages?

I am not seeing any missed messages on the console.
So, can this problem not be recreated when writing to the multipath device itself, instead of through the filesystem?
(In reply to comment #13)
> So, can this problem not be recreated when writing to the multipath device
> itself, instead of through the filesystem?

Correct, the delayed IO resumption is not seen when no filesystem is used.
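For reference, the direct-to-device variant of the test looks roughly like this (the WWID is one from the logs above; note that this write destroys any data on that LUN, so a scratch LUN should be used):

dd if=/dev/zero of=/dev/mapper/360a98000486e53774f34585a73434b6c \
        bs=1M count=1000 oflag=direct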
I've been able to recreate a 3 minute IO stall with a similar iSCSI setup. I'm digging into what's going on right now.
Could you please try reproducing this with the kernel at:

http://people.redhat.com/bmarzins/kernels/kernel-2.6.18-195.el5.x86_64.rpm

This kernel has the fix for Bug 570681. With kernel-2.6.18-194.el5, if I have node.session.timeo.replacement_timeout set to 5 seconds, the longest IO stall I have seen is 85 seconds. With kernel-2.6.18-195.el5, I'm not sure what the longest stall was, but dt was set to warn on any IO stall of more than 30 seconds, and it never did.
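Roughly, assuming the rpm installs cleanly on this setup, the steps would be:

rpm -ivh kernel-2.6.18-195.el5.x86_64.rpm
# reboot into the new kernel, then confirm the running version
uname -r
# confirm the recovery timeout the sessions are actually using
# (assumes this iscsiadm version supports the -P print level)
iscsiadm -m session -P 3 | grep -i "recovery timeout"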
Initial test looks good and I am not seeing any delay in IO resumption. Will run a full-fledged test and update the bugzilla.
Our regression tests passed successfully. We are not seeing any delay in IO resumption after upgrading to the above test kernel.
*** This bug has been marked as a duplicate of bug 570681 ***