Bug 151097 - Default TCP retransmit timeout too fast on NFS mounts
Summary: Default TCP retransmit timeout too fast on NFS mounts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: util-linux
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 156320
 
Reported: 2005-03-14 21:26 UTC by Chuck Lever
Modified: 2007-11-30 22:07 UTC
CC List: 4 users

Fixed In Version: RHBA-2005-626
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-09-28 15:53:09 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHBA-2005:626 (SHIPPED_LIVE): util-linux and mount bug fix update, last updated 2005-09-28 04:00:00 UTC

Description Chuck Lever 2005-03-14 21:26:27 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6)
Gecko/20050302 Firefox/1.0.1 Fedora/1.0.1-1.3.2

Description of problem:
If no timeo= option is specified when mounting an NFS file system over
TCP, the mount command supplies a default retransmit timeout of 0.7
seconds.  That value may be appropriate for NFS over UDP, but it is far
too aggressive for TCP, and can result in performance loss or data
corruption.  The correct default settings for NFS over TCP on 2.4
kernels are timeo=600,retrans=2.
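For a sense of scale, here is a minimal sketch, assuming (as the NFS mount documentation describes) that the timeo option is expressed in tenths of a second:

```python
def timeo_to_seconds(timeo_deciseconds):
    """Convert an NFS timeo mount-option value to seconds.

    Assumption: timeo is counted in tenths of a second (deciseconds),
    per the nfs mount-option documentation.
    """
    return timeo_deciseconds / 10.0

# mount's too-aggressive default when no timeo= is given
print(timeo_to_seconds(7))    # 0.7 s
# the recommended TCP default from this report
print(timeo_to_seconds(600))  # 60.0 s
```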

Note that RHEL AS 2.1 also has this problem.  The RHEL 4 mount command
should already include the patches added for NFSv4 support, which fix
this issue.

Version-Release number of selected component (if applicable):
util-linux-2.11y-31.2

How reproducible:
Always

Steps to Reproduce:
1.  Add a printk in the NFS client's mount logic to show the timeo
2.  mount -o tcp
3.  look at the output of the printk
    

Actual Results:  The printk shows that the mount command passes in
default timeo and retrans values, and that the timeo value is too small
for NFS-over-TCP mounts.

Expected Results:  The mount command should either pass in no timeo
value (in which case the NFS client will pick an appropriate default)
or pass in a reasonable timeo value such as the one described above.

Additional info:

This is a critical problem for customers who use NFS over TCP.

Comment 1 Chuck Lever 2005-03-16 16:00:33 UTC
One reason this problem has gone on for so long is that /proc/mounts does not
display the actual timeo and retrans values in effect for an NFS mount point. 
As part of the fix for this bug, can we get support for displaying those mount
options added to the NFS client's show_options method?  I'm working on adding
similar support in 2.6 mainline.  Thanks!

Comment 2 Chuck Lever 2005-03-31 22:32:05 UTC
What's the status of this issue?  The problem can potentially result in data
corruption, so we'd like a fix for this in the next update, if possible.

Comment 3 Tom Mitchell 2005-04-01 01:17:24 UTC
This impacts those with 'older' NFS appliances and fileservers.
Those with NIS maps for automount, etc., would do well to set timeouts
explicitly.

Comment 4 Damian Menscher 2005-04-13 01:42:22 UTC
I'm having possibly related problems with this issue under RHEL3, but
using UDP.  As previously mentioned, the UDP timeout is supposed to be
0.7 seconds,
then double repeatedly after each timeout up to a max of 60 seconds.  Looking 
at the source shows the line "data.timeo = tcp ? 70 : 7;", which I take to mean 
UDP has a 0.7 second timeout, and TCP has a 7 second timeout, by default.
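Reading the quoted line under the assumption that timeo is counted in tenths of a second, the nominal UDP retransmit schedule described above (start at 0.7 s, double after each timeout, cap at 60 s) can be sketched as follows. This illustrates the documented doubling behavior, not the RTT-estimator path that the packet capture actually shows:

```python
def retransmit_schedule(initial=0.7, cap=60.0, count=10):
    """Nominal NFS/UDP minor-timeout schedule: double the timeout
    after each retransmission, capped at the maximum (60 s)."""
    timeouts = []
    t = initial
    for _ in range(count):
        timeouts.append(min(t, cap))
        t *= 2
    return timeouts

print(retransmit_schedule())
# [0.7, 1.4, 2.8, 5.6, 11.2, 22.4, 44.8, 60.0, 60.0, 60.0]
```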

Problem is, that doesn't seem to be the case at all.  I used tcpdump to get a 
packet capture that included some timeouts.  The shocking thing is that it's 
not waiting anywhere near 0.7 seconds for the RPC response.  It's actually much 
shorter.  The first timeout seems to fluctuate a bit (latency in packet capture 
makes it hard to be precise), but it's on the order of 0.07 seconds.  I'm not 
sure, but maybe the order-of-magnitude shift is because we're using gigabit?  
Another possible issue is we're using the SMP kernel.

Anyway, this is a serious issue, since a moderately loaded fileserver will 
frequently take more than 0.07 seconds to respond.  I have not yet tested 
whether setting the timeo= option will be respected, but I don't have high 
hopes given how quickly it's timing out right now.

Should I submit this as a separate bug?  It's not clear to me whether it's the 
same bug or a different one.

Comment 5 Chuck Lever 2005-04-18 14:16:57 UTC
damian-

short UDP timeouts are normal.  RHEL 3 uses a request round-trip time estimator
which can trim the timeouts pretty short.  it will ignore the mount command line
setting.  i believe the lower bound was raised in later updates of RHEL 3 to
address the same issue you are reporting, but i can't find the bugzilla report
where this is addressed.  if you report this problem again, be sure to mention
which update of RHEL 3 you are using.

Comment 6 Chuck Lever 2005-04-18 14:19:02 UTC
My management is pressing me pretty hard on this issue, as it increases the
potential for data corruption on NFS/TCP mounts that use the default timeout
setting.  When will we get a fix for this problem?

Comment 7 Damian Menscher 2005-04-19 03:13:46 UTC
Yes, the short UDP timeouts are a result of the RTT estimator trimming it to 
HZ/30.  Recent kernels use HZ/10.  I've submitted Bug 155313 
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=155313) on this issue, 
since it appears to be separate from your bug.  We increased to retrans=10 in 
the meantime.
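For a sense of what those floors mean in wall-clock terms, here is a sketch assuming HZ = 100 (the timer tick rate on RHEL 3's 2.4 x86 kernels) and integer arithmetic on jiffies, as the kernel uses; both assumptions are mine, not stated in this report:

```python
HZ = 100  # assumption: timer tick rate on RHEL 3's 2.4 x86 kernels

old_floor_jiffies = HZ // 30   # RHEL 3 lower bound: HZ/30 -> 3 jiffies
new_floor_jiffies = HZ // 10   # later kernels:      HZ/10 -> 10 jiffies

print(old_floor_jiffies / float(HZ))  # 0.03 s
print(new_floor_jiffies / float(HZ))  # 0.1 s
```

A 0.03 s floor is consistent with the "on the order of 0.07 seconds" first timeout observed in the packet capture above.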

Comment 10 Steve Dickson 2005-06-08 20:57:27 UTC
Should be fixed in util-linux-2.11y-31.8

Comment 14 Red Hat Bugzilla 2005-09-28 15:53:09 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-626.html


