Bug 619464

Summary: hang on existing systems when exporting NFS share to new systems
Product: Red Hat Enterprise Linux 4
Reporter: jas
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED CANTFIX
QA Contact: yanfu,wang <yanwang>
Severity: medium
Priority: low
Version: 4.6
CC: bfields, jlayton
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-12-08 15:36:10 UTC

Description jas 2010-07-29 15:29:45 UTC
Description of problem:

My list of NFS exports has been gradually growing over the years. Right now, for example, my home directories are exported to around 800 hosts, even though only a relatively small subset of those will mount at the same time.  I used to just add hosts to /etc/exports on the file server, and run "exportfs -r", and everything would be fine.  New systems would be able to mount everything perfectly, and existing systems would not be affected by the additional export at all.  
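Just to illustrate the workflow (the path and hostnames below are made up):

    # /etc/exports on the file server -- one export line listing ~800 hosts
    /export/home  host001.example.com(rw,sync) host002.example.com(rw,sync) ... host800.example.com(rw,sync)

    # after appending a new host to that line:
    exportfs -r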

As my list of exports has grown, I've been noticing a problem. Now, when I run exportfs -r, there is an approximately 7-10 second hang on the systems that have already mounted the share, and then everything returns to normal.  This doesn't happen *while* exportfs -r is running, but just after it exits.  I figured that maybe exportfs was "unexporting"/re-exporting to hosts that already had the share in use, which might have caused the problem, so I tried manually adding/removing individual hosts, thinking that would only affect those hosts, but it made no difference: exporting to one new host still causes the hang on all existing hosts.

Since I have multiple exports to all of the hosts, adding one new host can hang things for a while.  I can see that reducing the number of exports, or of hosts, would reduce the delay.  What I am wondering is whether there is a better way to add hosts without affecting connectivity for the existing hosts.

The NFS server itself is pretty powerful -- dual quad-core box, lots of memory, many NFS threads, dedicated NFS server, etc.  Running 4.6, I have nfs-utils 1.0.6-84.EL4; I can see that this has been updated to 1.0.6-93.EL4, but nothing in the release notes suggests a fix in this area.

Version-Release number of selected component (if applicable):

nfs-utils 1.0.6-84.EL4

How reproducible:

Export an NFS directory to, say, 800 hosts.  On one of those hosts, mount the directory and run an "ls -R /export" on it.  On the file server, export to one additional host with exportfs.  After exportfs returns, the host doing the ls -R will hang for 10 seconds or so.
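Roughly, with made-up hostnames and paths:

    # on a client that already has the share mounted:
    mount fileserver:/export /export
    ls -R /export

    # on the file server, while the ls -R above is still running,
    # export the same directory to one more (hypothetical) host:
    exportfs -o rw,sync newhost.example.com:/export

    # once exportfs returns, the ls -R on the existing client
    # stalls for ~10 seconds before continuing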

Additional info:

running 4.6 with kernel 2.6.9-89.0.23.EL

I sent a message to the linux-nfs mailing list, and it was suggested that the "-t" option in later versions of rpc.mountd might help.  The -t option was not backported to rpc.mountd in 1.0.6-93.EL4, and I doubt it will be in 4.9.

Comment 1 jas 2010-07-29 16:47:03 UTC
One minor correction -- given a large /etc/exports file where I export many filesystems to over 800 systems each: if I add a new export that goes to only, say, one system, and then use exportfs to manually export it to one additional host (two exports total), I get the same hang on the original host I was exporting to.  This leads me to believe that the sheer TOTAL size of the exports file has more to do with this problem than the size of the host list for any individual export.
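Concretely, something like this (names and paths invented for the example):

    # small export in /etc/exports that currently goes to a single host:
    #   /export/scratch  hostA.example.com(rw,sync)
    # manually exporting it to a second host:
    exportfs -o rw,sync hostB.example.com:/export/scratch
    # -> hostA, which already has the share mounted, hangs for several seconds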

Comment 2 jas 2010-07-29 18:11:34 UTC
Doing more experimentation, I can verify that it is indeed the length of the exports file that makes the difference...  When exportfs runs, it updates /var/lib/nfs/etab.  When it exits, rpc.mountd processes the new etab file, and during this time existing exports are non-functional.  The -t option in later versions of rpc.mountd would probably solve this.  From the man page of a later rpc.mountd on -t:  "This option specifies the number of worker threads that rpc.mountd spawns. The default is 1 thread, which is probably enough. More threads are usually only needed for NFS servers which need to handle mount storms of hundreds of NFS mounts in a few seconds, or when your DNS server is slow or unreliable."
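If -t were available, it would presumably just be a matter of starting mountd with more worker threads, e.g. (sketch only -- the RHEL4 rpc.mountd doesn't accept this option):

    rpc.mountd -t 8    # spawn 8 worker threads instead of the default 1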

Comment 3 J. Bruce Fields 2010-07-29 23:01:32 UTC
If possible, it would be really interesting to see how RHEL5 or 6 behaves in your setup, and whether -t helps there.

Installing a recent nfs-utils from source would be another way to test this.

Comment 4 jas 2010-07-30 01:43:56 UTC
It is tricky to test RHEL5 or 6 in the production environment, but I have a couple of other ideas.  I tried to compile a later nfs-utils, but ran into a problem with a call into libblkid that doesn't exist in the older version of the library.  I tried compiling the RHEL5 version from the source RPM, but that would have required installing all the other dependencies.  I tried to backport -t to RHEL4, and I did manage it, but when I went to build the source RPM I realized that the patches are applied in order, so my patch, which is based on the original mountd.c, wouldn't apply cleanly... I can fix that tomorrow.

I do have a new theory: the time the transaction takes is because mountd is doing a reverse DNS lookup for each of the hostnames.  That's 13,000 DNS lookups, and it does this every single time you add or remove a single host via exportfs.  On another server where I have 30,000 shares total, the delay is actually around 30 seconds!  I didn't even know that before.  I'm interested to see what happens if I convert the exports over to use IPs instead of names... 13,000 or 30,000 DNS lookups might take a while, and it appears that during this process everything hangs.  Hmmm... will try tomorrow.

Comment 5 jas 2010-07-30 16:24:41 UTC
Ok.  I don't get it.  I converted all of /etc/exports to using IPs, and when exportfs is called, it does a reverse lookup on every IP and converts it back to a name before placing it in /var/lib/nfs/etab!  Why!?  There's no option to tell exportfs NOT to do that, either.  This, of course, forces mountd to convert them all back to IPs again.  Ugh.  I guess -t is the only way. :(
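For example (address and name made up), an entry like

    /export/home  10.10.1.23(rw,sync)

still ends up in /var/lib/nfs/etab with the client's hostname instead of the IP, so mountd has to resolve it all over again.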