Bug 619464
Summary: | hang on existing systems when exporting NFS share to new systems | |
---|---|---|---
Product: | Red Hat Enterprise Linux 4 | Reporter: | jas
Component: | nfs-utils | Assignee: | Steve Dickson <steved>
Status: | CLOSED CANTFIX | QA Contact: | yanfu,wang <yanwang>
Severity: | medium | Docs Contact: |
Priority: | low | |
Version: | 4.6 | CC: | bfields, jlayton
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2011-12-08 15:36:10 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
jas, 2010-07-29 15:29:45 UTC
One minor correction: given a large /etc/exports file where I export many filesystems, each to over 800 systems, if I add a new export to only, say, 1 system and then use exportfs to manually export it to one additional host (2 hosts total), I get the same hang on the original host I was exporting to. This leads me to believe that the sheer TOTAL size of the exports file has more to do with this problem than the size of the individual export list for one export.

Doing more experimentation, I can verify that it is indeed the length of the exports file that makes the difference. When exportfs runs, it updates /var/lib/nfs/etab. When it exits, rpc.mountd processes the new etab file, and during this time existing exports are non-functional. The -t option in later versions of rpc.mountd would probably solve this. From the man page of a later rpc.mountd on -t: "This option specifies the number of worker threads that rpc.mountd spawns. The default is 1 thread, which is probably enough. More threads are usually only needed for NFS servers which need to handle mount storms of hundreds of NFS mounts in a few seconds, or when your DNS server is slow or unreliable."

If possible, it would be really interesting to see how RHEL5 or 6 behaves in your setup, and whether -t helps there. Installing a recent nfs-utils from source would also be another way to test this.

It is tricky to test RHEL5 or 6 in the production environment, but I have a couple of other ideas. I tried to compile a later nfs-utils, but ran into a problem with a call to libblkid that doesn't exist in the older version of the library. I tried compiling the RHEL5 version from the source RPM, but that would have required installing all the other dependencies...
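The single-worker behaviour described above can be sketched as a toy model (this is illustrative logic, not nfs-utils code; the per-entry cost figure is an assumption chosen to roughly match the delays reported here):

```python
# Toy model of a single-threaded rpc.mountd: while its one worker
# re-processes the whole etab after an exportfs change, it cannot
# answer mount/auth requests, so existing clients appear to hang.
# The per-entry cost is an illustrative assumption, not a measurement.

def reload_stall_ms(total_etab_entries, cost_per_entry_ms=1.0):
    """Approximate time the single worker is busy re-reading etab.

    The stall scales with the TOTAL number of etab entries, not with
    the size of the one export entry that actually changed.
    """
    return total_etab_entries * cost_per_entry_ms

# Adding one host to one export still forces a full reload:
print(reload_stall_ms(13_000))  # stall grows with the whole file
print(reload_stall_ms(30_000))  # in the right ballpark for the ~30 s
                                # delay reported with 30,000 shares
```

With more worker threads (the later -t option), other workers could keep answering requests while one is busy, which is why the flag is expected to help.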
I tried to backport -t to RHEL4, and I did do it, but when I went to compile the source RPM, I realized that the patches are applied in order, so my patch, which is based on the original mountd.c, wouldn't apply cleanly... but I can fix that tomorrow.

I do have a new theory: the time involved in the transaction is because mountd is doing a reverse DNS lookup for each of the hostnames. That's 13,000 DNS lookups, and it does this every single time you add or remove a single host via exportfs. On another server where I have 30,000 shares total, the delay is actually around 30 seconds! I didn't even know that before. I'm interested to see what happens if I convert the exports over to use IPs instead of names... 13,000 or 30,000 DNS lookups might take a while, and it appears that during this process everything hangs... hmmm... will try tomorrow.

Ok, I don't get it. I converted all of /etc/exports to using IPs, and when exportfs is called, it reverse-looks-up every IP and converts it back to a name to place in /var/lib/nfs/etab! Why!? There's no option to tell exportfs NOT to do that either. This, of course, forces mountd to convert them back to IPs again. Ugh. I guess -t is the only way. :(
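The name/IP round trip described above can be sketched with stub lookup tables standing in for DNS (hypothetical helper and record names; this is a model of the observed behaviour, not actual nfs-utils code):

```python
# Stub DNS data: reverse (PTR) and forward (A) records. The address
# and hostname below are made up for illustration.
PTR = {"192.0.2.10": "client10.example.com"}
A   = {"client10.example.com": "192.0.2.10"}

def exportfs_writes_etab(exports_entries):
    """Model: even when /etc/exports lists IPs, exportfs reverse-resolves
    each one to a hostname before writing /var/lib/nfs/etab -- one DNS
    lookup per entry, every time exportfs runs."""
    return [PTR.get(ip, ip) for ip in exports_entries]

def mountd_reads_etab(etab_entries):
    """Model: mountd then undoes the conversion, resolving each etab
    hostname back to an IP to match incoming clients."""
    return [A.get(name, name) for name in etab_entries]

exports = ["192.0.2.10"]
etab = exportfs_writes_etab(exports)  # hostnames, despite IPs in exports
back = mountd_reads_etab(etab)        # and back to IPs again
print(etab, back)
```

With thousands of entries, each leg of this round trip is thousands of DNS queries, which is consistent with the stall scaling with the total size of the exports file.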