Red Hat Bugzilla – Bug 251368
rpc.mountd crashes with particular netgroup setup
Last modified: 2007-11-30 17:12:12 EST
+++ This bug was initially created as a clone of Bug #250609 +++
+++ This bug was initially created as a clone of Bug #250594 +++
Escalated to Bugzilla from IssueTracker
-- Additional comment from firstname.lastname@example.org on 2007-08-02 07:51 EST --
Description of problem:
rpc.mountd crashes with particular netgroup setup
We have a particular setup for nfs using netgroups on a
large clustered NFS server with many home directories.
Every user has a netgroup of its own and we use member
netgroups to customize exports for each home directory
to sets of machines.
One particular research groups has machines where over 150
home directories and other NFS shares exported via
netgroups are accessible.
This crashes rpc.mountd. We have
located the bug in function auth_authenticate_internal()
in utils/mountd/auth.c in the source for the
RHEL4 nfs-utils-1.0.6-80 RPM. Here are lines 77 to 84:
my_client.m_naddr = 0;
my_client.m_addrlist = caller->sin_addr;
n = client_compose(caller->sin_addr);
*error = unknown_host;
The problem is that my_client.m_hostname is only 1024 bytes
large, strcpy and not strncpy is used and that the string
returned by client_compose() in our case is over 1024
characters long. The call to strcpy corrupts memory.
The bug is still present in the upstream source at
http://linux-nfs.org/ and thuss affects also RHEL5.
This event sent from IssueTracker by sprabhu [SEG - Storage]
-- Additional comment from email@example.com on 2007-08-02 09:12 EST --
Fixing this will take more than just changing this to strncpy. We could (and
maybe should) trivially do that, but we'll still have artificial limits due to
the string length. The ideal thing would be to eliminate the need for this
concatenated string entirely. Unfortunately, my last effort at this was shot
down upstream by Neil B (for good reason -- it didn't work the way it needed
to). I'll have look again at this code and see if we can do it in a way that
takes care of Neil's objections.
Cloning for fedora to track upstream work...
Created attachment 160915 [details]
patch 1 -- make m_hostname a dynamically allocated string
This patch just makes it so that m_hostname is a dynamically allocated string,
rather than a fixed length array of NFSCLNT_IDMAX.
Created attachment 160916 [details]
patch 2 -- make e_hostname a dynamically allocated string
This makes the exportent.e_hostname a dynamically allocated string. This needs
to be done since we copy m_hostname into it in some places.
Created attachment 160917 [details]
patch3 -- make a client_resolve function and have client_compose take a hostent
This patch may not be strictly needed, but it should get rid of one set of
hostname resolution from the auth_authenticate codepath. Rather than resolving
the name in auth_authenticate and then doing it again in client_compose, we
just use reuse the first one in client_compose.
These patches should get rid of the userspace limitations. It also adds a bit of
optimization. The current code does a lot of string copying that can be avoided,
and also should help eliminate some extra hostname resolution. We should be able
to have arbitrarily long m_hostname strings with this. I believe, however, we
have a limitation on how long this string can be when it's written to the kernel
So the big issue is what to do now:
1) hash these strings and use the hashes as cache keys, but that means pulling
in a hashing routine (maybe libcrypt.a?). This is also doesn't deal with another
problem which is that when using a lot of netgroups, you have to resolve them
any that are listed in any export.
2) use Neil B's idea and use the ipaddr as a cache key. This means some
duplication in the export cache, but maybe that's acceptable to cut down on the
number of netgroup lookups.
Reproducer is apparently here. Though by inspection, I can see what the issue is
Created attachment 161172 [details]
patch 4 -- add "use_ipaddr" command line option
This patch adds a use_ipaddr command-line option and makes it so that ipaddr
maps to ipaddr in auth.unix.ip cache when it's enabled. When we go to look up
in the nfsd.export cache, we have an IP address and a path. mountd can then
just compare the ip address/hostname to entries that match the path. In
addition to working around the string length limitations of the kernel caches,
this should prevent the (rather severe) performance penalty incurred when
dealing with an export table that contains a large number of netgroups.
As a side note, NFSCLNT_IDMAX is set to 1024, but that seems to be a number that
was grabbed out of thin air. The actual string length limitation seems to be
~945 or so (between 940 and 949, determined emprirically). I suspect that the
total string length of a cache line maxes out at 1024).
I'm not thrilled with using a command-line flag for this, but I don't see a
great way to automatically switch between the two modes. I need to do some more
testing and then I'll post it to nfs list as an RFC. Perhaps someone there will
have a good way to switch between the two modes.
My testing hasn't shown any obvious problems. Sent patches to upstream list as
an RFC to see if the general approach is valid. Awaiting feedback...
Created attachment 161874 [details]
patch4 -- add new "mode" for mountd to change how ipaddrs are mapped
Neil Brown commented that the patches looked good to him. My testing has been
successful with them so far, so I've told him to go ahead and commit the first
3 patches as is.
I've respun the 4th patch though to make it so that the "use_ipaddr" mode gets
enabled automatically when the list of netgroups is more than 1/2 of
NFSCLNT_IDMAX. This is the explanation I sent with the upstream patch:
If the total length of all m_hostnames in the MCL_NETGROUP list is more
than half NFSCLNT_IDMAX, then switch to the "use_ipaddr" mode.
I considered doing what Neil suggested and just using all of the
m_hostnames from all of the clientlists. Consider however a
diskless client setup where someone has a ton of exports all
exported to different IP addresses. We probably don't want to enable
this there, but it would probably be enabled if we based this on
all of the m_hostnames.
I figure someone will want to enable this new mode for 2 reasons:
1) a client is likely to overflow NFSCLNT_IDMAX (which is a decent
approximation of the max length of a line in the kernel cache)
2) we have a lot of netgroups in /etc/exports and resolving them all
for a single mount is prohibitively expensive
I think the only reasonable way to end up in situation #1 is if you
are dealing with a lot of netgroups. A client isn't likely to match
a bunch of different MCL_WILDCARD entries (though maybe it can), nor is
it likely to match a bunch of different MCL_SUBNET entries. So it
really only makes sense to pay attention to the MCL_NETGROUP list
Why NFSCLNT_IDMAX/2? No particular reason. I figured that would mean
that we had a lot of netgroups, but still had enough space in the
string to allow for non-netgroup matches.
Waiting for upstream comments at the moment. Ideally we'd like to make this not
need a new command-line option at all. I think that would entail getting rid of
the "old" behavior altogether, but Neilb seems to want to keep it since it does
reduce the size of the kernel caches. So we may be stuck with a new
command-line option unless he can be convinced.
New patchset posted upstream. Awaiting comments/commits...
Patchset is now upstream in nfs-utils git tree. There is still some ongoing
discussion about whether we should also include a commandline option to turn the
new behavior on and off, but I'm going to close this as UPSTREAM for now.