Description of problem:
We have observed that IO resumption takes a long time (around 3 minutes) when one of the controller paths is down. Our setup is as follows:
1. iSCSI is configured on the host as well as on the target.
2. 10 LUNs from each controller head are mapped to the host.
3. multipath is configured on the host side; there are 4 paths (i.e., 4 iSCSI sessions) to the target.
4. An LV is created on top of these multipathed iSCSI devices.
5. IO is performed on the LV.

When one of the target ports is configured down to simulate a path failure, we observe that IO resumption on the remaining 3 paths takes around 3 minutes. Is this expected behavior?

Version-Release number of selected component (if applicable):
iscsi-initiator-utils-6.2.0.871-0.16.el5
device-mapper-multipath-0.4.7-34.el5
kernel-2.6.18-194.el5

How reproducible:
Always

Steps to Reproduce:
1. Start IO on a multipathed iSCSI device.
2. Configure a port down on the target.
3. IO takes around 3 minutes to resume.

Actual results:
IO resumption takes around 3 minutes.

Expected results:
With multiple paths to the target, IO should not stall for this long when a single path goes down.

Additional info:
Our multipath.conf file:

defaults {
        user_friendly_names no
        max_fds max
        queue_without_daemon no
}
blacklist {
        devnode "^hd[a-z]"
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^cciss!c[0-9]d[0-9]*"
        wwid (SIBM-ESXSMAT3073NC_FAAR9P5A0ADG6)
}
devices {
        device {
                vendor "NETAPP"
                product "LUN"
                path_grouping_policy group_by_prio
                features "1 queue_if_no_path"
                prio_callout "/sbin/mpath_prio_ontap /dev/%n"
                path_checker directio
                failback immediate
                hardware_handler "0"
                rr_weight uniform
                rr_min_io 128
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                flush_on_last_del yes
        }
}
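For reference, a minimal sketch of how this stack is assembled (the discovery portal address and sizes below are illustrative; the WWID, VG, and LV names match the logs quoted later in this bug):

iscsiadm -m discovery -t sendtargets -p 10.0.0.1
iscsiadm -m node -L all                  # one session per discovered portal
pvcreate /dev/mapper/360a98000486e58394434584838684771
vgcreate n_vg /dev/mapper/360a98000486e58394434584838684771
lvcreate -L 10G -n nvol-1 n_vg
mkfs.ext3 /dev/n_vg/nvol-1
mount /dev/n_vg/nvol-1 /mnt/qa/n_vg/nvol-1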
Can you please attach the output of running

# multipath -ll

then reproduce the issue and attach /var/log/messages? If you could make a note of when you failed the paths and when you noticed the IO begin again, that would be even more helpful.
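A capture sequence along these lines would work (the output paths are just suggestions):

multipath -ll > /tmp/multipath-ll.txt
date                      # note the time, then fail the target port
# ... after IO resumes:
date                      # note the resumption time
cp /var/log/messages /tmp/messages-repro.txt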
Attaching the required info. 41 LUNs (20 from one head and 21 from the other) are mapped to the host, and there are 3 paths to the target. multipath.conf is the same as above. I configured the port down at Jun 22 21:46:38 and IO resumed at Jun 22 21:51:08. We use our application dt to perform IO. It logged that there was no IO progress for 215 seconds on the following mount point, i.e., IO was stalled for 215 seconds:

dt (16298): No progress made during open() on /mnt/qa/n_vg/nvol-1/aa-16298 for 215 seconds! (03m35.00s)
Created attachment 426012 [details]
multipath -ll output
Created attachment 426016 [details]
log messages
In this bug, there is a 1 minute 46 second gap between

Jun 22 21:49:13 IBMx336-200-143 multipathd: sdau: directio checker reports path is down

and

Jun 22 21:50:59 IBMx336-200-143 multipathd: checker failed path 66:224 in map 360a98000486e58394434584838684771
One more thing I am observing here is that session recovery takes almost a minute to actually time out; the timeout is set to 5 seconds, but the recovery message appears 55 seconds after the connection error:

Jul 2 11:09:00 IBMx3250-200-174 kernel: connection9:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4528903155, last ping 4528908155, now 4528913155
Jul 2 11:09:00 IBMx3250-200-174 kernel: connection9:0: detected conn error (1011)
Jul 2 11:09:01 IBMx3250-200-174 iscsid: Kernel reported iSCSI connection 9:0 error (1011) state (3)
Jul 2 11:09:55 IBMx3250-200-174 kernel: session9: session recovery timed out after 5 secs

This is seen only when the host is performing IO; in the idle case, recovery times out in 5 seconds as expected:

Jul 2 11:03:18 IBMx3250-200-174 kernel: connection9:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4528561473, last ping 4528566473, now 4528571473
Jul 2 11:03:18 IBMx3250-200-174 kernel: connection9:0: detected conn error (1011)
Jul 2 11:03:18 IBMx3250-200-174 iscsid: Kernel reported iSCSI connection 9:0 error (1011) state (3)
Jul 2 11:03:23 IBMx3250-200-174 kernel: session9: session recovery timed out after 5 secs
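The recovery window here is governed by node.session.timeo.replacement_timeout. A rough sketch of checking and lowering it (the target IQN and portal below are placeholders, not values from this setup):

iscsiadm -m node -T <target-iqn> -p <portal-ip>:3260 \
        | grep node.session.timeo.replacement_timeout
iscsiadm -m node -T <target-iqn> -p <portal-ip>:3260 -o update \
        -n node.session.timeo.replacement_timeout -v 5
# the updated value takes effect on the next login for that node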
That looks like it's probably Bug 570681/583898.
I'm going to build you a debug package that will add some printouts to the functions between the two multipathd messages listed in comment 5, so that I can narrow down where that stall is happening, since it looks to be a separate issue from the iscsi timeout.
You can get the debug packages at:

http://people.redhat.com/bmarzins/device-mapper-multipath/rpms/

If you could install these, recreate the issue, and attach the output from /var/log/messages to this bugzilla, that would be very helpful.
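Roughly, assuming the packages at that URL follow the usual el5 naming, installation would look like:

rpm -Uvh device-mapper-multipath-*.el5.rpm kpartx-*.el5.rpm
service multipathd restart
tail -f /var/log/messages    # watch while recreating the path failure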
Created attachment 430260 [details]
Log messages after installing debug package

Attaching the syslog messages after installing the debug package.
Huh? Well, it's not getting stuck in the checkerloop code.

Jul 8 12:26:29 IBMx336-200-143 multipathd: updating multipath strings for 360a98000486e53774f34585a73434b6c
Jul 8 12:27:26 IBMx336-200-143 multipathd: calling dm_get_map for 360a98000486e53774f34585a73434b6c

There is almost a minute gap between these two messages; however, the second message is the first thing that happens in a function that is called immediately after the first message is printed. So something unrelated is either keeping the checkerloop code from running, or keeping the log daemon code from logging the messages. Are there any console messages that aren't getting picked up in /var/log/messages?
(In reply to comment #11)
> So something unrelated is either keeping the checkerloop code from running, or
> keeping the log daemon code from logging the messages. Are there any console
> messages that aren't getting picked up in /var/log/messages?

I am not seeing any missed messages on the console.
So, can this problem not be recreated when writing to the multipath device itself, instead of through the filesystem?
(In reply to comment #13)
> So, can this problem not be recreated when writing to the multipath device
> itself, instead of through the filesystem?

Correct, the delayed IO resumption is not seen when no filesystem is used.
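For reference, the direct-to-device variant of the test looks roughly like this (the WWID is one from the logs above; note that this write destroys any data on that LUN, so a scratch LUN should be used):

dd if=/dev/zero of=/dev/mapper/360a98000486e53774f34585a73434b6c \
        bs=1M count=1000 oflag=direct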
I've been able to recreate a 3 minute IO stall with a similar iSCSI setup. I'm digging into what's going on right now.
Could you please try reproducing this with the kernel at:

http://people.redhat.com/bmarzins/kernels/kernel-2.6.18-195.el5.x86_64.rpm

This kernel has the fix for Bug 570681. With kernel-2.6.18-194.el5, if I have node.session.timeo.replacement_timeout set to 5 seconds, the longest IO stall I have seen is 85 seconds. With kernel-2.6.18-195.el5, I'm not sure what the longest stall was, but dt was set to warn on any IO stall of more than 30 seconds, and it never did.
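Roughly, assuming the rpm installs cleanly on this setup, the steps would be:

rpm -ivh kernel-2.6.18-195.el5.x86_64.rpm
# reboot into the new kernel, then confirm the running version
uname -r
# confirm the recovery timeout the sessions are actually using
# (assumes this iscsiadm version supports the -P print level)
iscsiadm -m session -P 3 | grep -i "recovery timeout"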
Initial test looks good and I am not seeing any delay in IO resumption. Will run a full-fledged test and update the bugzilla.
Our regression tests passed successfully. We are not seeing any delay in IO resumption after upgrading to the above test kernel.
*** This bug has been marked as a duplicate of bug 570681 ***