Bug 693485 - mount.nfs retries for 2 minutes in RHEL6
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: nfs-utils
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Steve Dickson
QA Contact: yanfu,wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-04-04 19:29 UTC by Jason
Modified: 2011-04-08 14:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-08 14:26:57 UTC
Target Upstream Version:



Description Jason 2011-04-04 19:29:49 UTC
Description of problem:
When I attempt to automount a local directory that does not exist, autofs calls mount.nfs on that local directory, and it takes 2 minutes for mount to eventually time out. This makes autofs impossible to use on RHEL6 in our environment, where we share all our maps through LDAP and the local directory doesn't always exist on all systems.

In RHEL5, the mount command will fail immediately, as expected.

Version-Release number of selected component (if applicable):
nfs-utils-1.2.3-4.el6.x86_64

How reproducible:
Every time on RHEL6.0 and RHEL6.1 beta

Steps to Reproduce:
1. cd /home/map/key
2. automount sees that /export/map/key does not exist
3. automount tries to mount the directory over nfs instead:
   /sbin/mount.nfs servera:/export/map/key /home/map/key -o rw
  
Actual results:
The "cd" command hangs for 120 seconds while it waits for the mount command to return.  

Expected results:
There is no nfs server running on servera, so it should fail immediately.

Additional info:
This appears to be a recent regression.  The original fix was in this patch:
http://www.redhat.com/archives/fedora-extras-commits/2008-April/msg04838.html

Based on this bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=439807

Comment 2 RHEL Program Management 2011-04-04 19:43:31 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Steve Dickson 2011-04-05 19:06:30 UTC
(In reply to comment #0)
> Description of problem:
> When I attempt to automount a local directory that does not exist, autofs calls
> mount.nfs on that local directory, and it takes 2 minutes for mount to
> eventually timeout.  This makes autofs impossible to use on RHEL6 in our
> environment where we share all our maps with LDAP, and the local directory
> doesn't always exist on all systems.
> 
> In RHEL5, the mount command will fail immediately, as expected.
> 
> Version-Release number of selected component (if applicable):
> nfs-utils-1.2.3-4.el6.x86_64
> 
> How reproducible:
> Every time on RHEL6.0 and RHEL6.1 beta
> 
> Steps to Reproduce:
> 1. cd /home/map/key
> 2. automount sees that /export/map/key does not exist
> 3. automount tries to mount the directory over nfs instead:
>    /sbin/mount.nfs servera:/export/map/key /home/map/key -o rw
> 
> Actual results:
> The "cd" command hangs for 120 seconds while it waits for the mount command to
> return.  
> 
> Expected results:
> There is no nfs server running on servera, so it should fail immediately.
What OS is your server using?

Also please note, you can control this timeout by using
the '-o retry=0' mount option.
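
A minimal sketch of that option in use, assuming the placeholder server and paths from comment 0. Per nfs(5), retry is given in minutes and defaults to 2 for foreground mounts, which matches the delay reported. The helper name nfs_mount_cmd is hypothetical, and it only prints the command line so nothing is actually mounted:

```shell
#!/bin/sh
# Sketch: build a mount.nfs command line that carries retry=0, so a failing
# foreground mount gives up instead of retrying for the default 2 minutes.
# nfs_mount_cmd is a hypothetical helper, not part of nfs-utils.
nfs_mount_cmd() {
    # $1 = server:/export, $2 = mount point, $3 = extra options (optional)
    opts="rw,retry=0"
    if [ -n "${3:-}" ]; then
        opts="$opts,$3"
    fi
    printf '/sbin/mount.nfs %s %s -o %s\n' "$1" "$2" "$opts"
}

# Print the command rather than running it (running needs root and an NFS client).
nfs_mount_cmd servera:/export/map/key /home/map/key
```

With autofs the same option would normally go into the options field of the map entry rather than on a hand-typed command line.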

Comment 4 Jason 2011-04-05 19:39:49 UTC
We're using RHEL6.0, but it continues to occur on RHEL6.1.  

Your retry=0 mount option does help - the cd into our automounted directory now fails in 1 second rather than 120 seconds.  

However, 1 second is still too long as we have users constantly trying to access directories that don't exist.

We believe this is a code regression, as it continues to work as expected in RHEL5.

Comment 5 Steve Dickson 2011-04-06 14:15:32 UTC
(In reply to comment #4)
> We're using RHEL6.0, but it continues to occur on RHEL6.1.  
> 
> Your retry=0 mount option does help - the cd into our automounted directory now
> fails in 1 second rather than 120 seconds.  
> 
> However, 1 second is still too long as we have users constantly trying to
> access directories that don't exist.
> 
> We believe this is a code regression, as it continues to work as expected in
> RHEL5.
Hmm... I think something else is going on here. The only errors
that will cause the mount to be retried are:
    ESTALE, ETIMEDOUT, ECONNREFUSED and EHOSTUNREACH

So could you please add -vv to the mount command line,
which will show what the server is returning?
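
The retry rule described here can be sketched as a small classifier. This is an illustration only, not the nfs-utils source; should_retry is a hypothetical name, and ENOENT stands in for any error outside the list:

```shell
#!/bin/sh
# Sketch of the retry decision: only these four errno values lead to
# another attempt; any other error fails the mount at once.
# should_retry is a hypothetical helper, not an nfs-utils function.
should_retry() {
    case "$1" in
        ESTALE|ETIMEDOUT|ECONNREFUSED|EHOSTUNREACH) return 0 ;;
        *) return 1 ;;
    esac
}

for err in ECONNREFUSED ENOENT; do
    if should_retry "$err"; then
        echo "$err: retry until the timeout expires"
    else
        echo "$err: fail immediately"
    fi
done
```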

Comment 6 Jason 2011-04-06 15:20:20 UTC
Sure:

[root@unixdeva16 ~]# time /bin/mount -t nfs -vv unixdeva16:/export/adm/scripts /home/adm/scripts
mount.nfs: timeout set for Wed Apr  6 11:14:51 2011
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
mount.nfs: mount(2): Connection refused
mount.nfs: Connection timed out

real    2m5.009s
user    0m0.003s
sys     0m0.004s


Note that unixdeva16 is not running an nfs server, but this is the behavior of automount when attempting to mount a local filesystem that does not exist.

This is the behavior on a RHEL5 server:

[root@unixdeva15 ~]# time /bin/mount -t nfs -vv unixdeva15:/export/adm/scripts /home/adm/scripts
mount: mount to NFS server 'unixdeva15' failed: RPC Error: Program not registered.

real    0m0.003s
user    0m0.001s
sys     0m0.001s

Comment 7 Steve Dickson 2011-04-07 14:53:00 UTC
(In reply to comment #6)
> Sure:
> 
> [root@unixdeva16 ~]# time /bin/mount -t nfs -vv unixdeva16:/export/adm/scripts
> /home/adm/scripts
> mount.nfs: timeout set for Wed Apr  6 11:14:51 2011
> mount.nfs: trying text-based options
> 'vers=4,addr=164.55.92.32,clientaddr=164.55.92.32'
> mount.nfs: mount(2): Connection refused
Ah, the "Connection refused" error is the reason the mount is being retried.
The mount interprets the ECONNREFUSED error as meaning the server is down
and could be on its way back up.


> This is the behaviour on a RHEL5 server:
> 
> [root@unixdeva15 ~]# time /bin/mount -t nfs -vv unixdeva15:/export/adm/scripts
> /home/adm/scripts
> mount: mount to NFS server 'unixdeva15' failed: RPC Error: Program not
> registered.
Notice the error is different with RHEL5: "Program not registered".
That error decisively means the server is not up, which
is the reason the mount is not retried.

Also note the 'vers=4' in the RHEL6 mount output, which means version 4 of
the NFS protocol is being tried. This differs from the RHEL5 mount,
where version 3 is tried first. This difference in default protocol
is the reason different errors are being returned.
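
One likely mechanism, which is not spelled out in this thread, is that a v3 mount first asks the server's portmapper (rpcbind) where the NFS service lives and gets an immediate "not registered" answer, while a v4 mount connects straight to the NFS port and sees only a refused connection. A rough sketch for checking what a host has registered, assuming the usual `rpcinfo -p` column layout (program, vers, proto, port, service); the function name and the sample listing are made up for illustration:

```shell
#!/bin/sh
# nfs_versions_registered: reads `rpcinfo -p HOST` output on stdin and prints
# the versions registered for the NFS RPC program (number 100003).
# Hypothetical helper for illustration only.
nfs_versions_registered() {
    awk '$1 == "100003" { print $2 }' | sort -u
}

# Made-up listing from a host that runs rpcbind but no NFS server.
sample='   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   udp    111  portmapper'

if [ -z "$(printf '%s\n' "$sample" | nfs_versions_registered)" ]; then
    echo "NFS is not registered"
fi

# Against a live host: rpcinfo -p unixdeva16 | nfs_versions_registered
```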

To test this theory out, simply add '-o v3' to the RHEL6 mount options,
which should cause the mount to fail immediately.

If this is indeed the case, there are a couple of workarounds:
1) Add the '-o v3' option to all the mount commands
2) Edit /etc/nfsmount.conf so that version 3 mounts are done to
   all the servers that you know don't support v4. Something
   similar to:
   [ Server "unixdeva16" ]
     Defaultvers=3

Option 2 is probably the better way to handle this... IMHO..

Comment 8 Jason 2011-04-08 12:09:51 UTC
Ah I see - thank you for the clarification, Steve.  

We'll push out /etc/nfsmount.conf files to our RHEL6 servers with our configuration manager.  I just tried it on our server using both the hostname and 'localhost', and it works as expected!

[ Server "unixdeva16" ]
  Defaultvers=3
[ Server "localhost" ]
  Defaultvers=3
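
For pushing this out from a configuration manager, the sections above could be generated per host along these lines. This is a sketch only: emit_v3_sections is a hypothetical helper, and it prints to stdout rather than touching /etc/nfsmount.conf:

```shell
#!/bin/sh
# emit_v3_sections: print one [ Server "name" ] section per argument,
# each setting Defaultvers=3, in the same format as the fragment above.
# Hypothetical helper; redirect the output wherever the config is staged.
emit_v3_sections() {
    for host in "$@"; do
        printf '[ Server "%s" ]\n  Defaultvers=3\n' "$host"
    done
}

emit_v3_sections unixdeva16 localhost
```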

Comment 9 Steve Dickson 2011-04-08 14:26:57 UTC
Good to hear things are working. I'm going to close this bug since it appears
the problem is solved.

