Bug 453804 - Using nfs4, file group ownership shows up as 'nobody' if file belongs to group with large number of users
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: nfs-utils-lib
Version: 5.3
Hardware: All Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jeff Layton
QA Contact: Martin Jenner
Duplicates: 456180 549730
Blocks: 487029

Reported: 2008-07-02 12:37 EDT by Steve
Modified: 2011-01-22 14:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2009-09-02 05:10:08 EDT


Attachments
tcpdump captures of ls from NFS client within NFS+LDAP setup (1.76 MB, application/octet-stream), 2008-07-02 12:37 EDT, Steve
tcpdump captures of ls from NFS client within files based setup (2.88 KB, application/octet-stream), 2008-07-02 12:40 EDT, Steve
strace -f -tt -T -v -x -o logfile.txt -s 4096 ls -l /mnt/import/ (188.04 KB, text/plain), 2008-07-02 12:41 EDT, Steve
NFS client sosreport (1.37 MB, application/x-bzip2), 2008-07-02 12:42 EDT, Steve
LDAP server sosreport (1.40 MB, application/x-bzip2), 2008-07-02 12:42 EDT, Steve
NFS server sosreport (1.38 MB, application/x-bzip2), 2008-07-02 12:43 EDT, Steve
idmapd.conf (177 bytes, text/plain), 2008-07-29 06:48 EDT, Steve
patch -- retry getgrgid_r with larger buffer if call fails with -ERANGE error (2.59 KB, patch), 2008-10-24 13:08 EDT, Jeff Layton

Description Steve 2008-07-02 12:37:58 EDT
Description of problem:

This is an escalation from issue tracker. I've summarized below ...

--------------------------- Original report ---------------------------
Description of problem:
When using NFSv4 + LDAP, groups that contain too many users are mapped to
'nobody'. *The problem does not exist in NFSv3*.

It seems that the issue comes from the length of the group entry, not the number of
members.

How reproducible:
Always

How to reproduce:
3 machines: LDAP server, NFSv4 server, and a client
LDAP server: Create a group with many users.
NFS server: Create a share with files or folders belonging to that group.
... The client will see them as belonging to nobody.

The problem comes from a length limit and not from a maximum number of members
(see example below).
In a first attempt on other machines, this maximum length was different.

I have not been able to find any relevant bugzillas or Google search results.


Actual results:
[root@dhcp7-165 import]# ll
total 84
-rw-r--r-- 1 root group20longusernames    0 May 27 03:01 20longusernames
-rw-r--r-- 1 root group20users            0 May 27 02:04 20users
-rw-r--r-- 1 root nobody                  0 May 27 02:57 40longusernames
-rw-r--r-- 1 root group50users            0 May 27 02:46 50users
-rw-r--r-- 1 root group60users            0 May 27 02:50 60users
-rw-r--r-- 1 root group65users            0 May 27 02:52 65users
-rw-r--r-- 1 root group66users            0 May 27 02:55 66users
-rw-r--r-- 1 root nobody                  0 May 27 02:53 67users
-rw-r--r-- 1 root nobody                  0 May 27 02:51 70users
drwxr-xr-x 2 root nobody               4096 May 27 01:48 80longusernames
-rw-r--r-- 1 root nobody                  0 May 27 02:48 80users
drwxr-xr-x 2 root nobody               4096 May 27 01:49 90users
drwxr-xr-x 2 root nobody               4096 May 27 01:48 91users
drwxr-xr-x 2 root nobody               4096 May 27 01:48 92users
drwxr-xr-x 2 root nobody               4096 May 27 01:48 93users
-rw-r--r-- 1 root nobody                  0 May 27 02:07 noldap

We can see that group40longusernames, containing 40 users with long names, is
mapped to nobody, whereas group66users, containing 66 users with short names, is
mapped correctly.
In this example, the limit appears to be crossed at group67users.


Expected results:
[root@dhcp7-165 import]# ll
total 84
-rw-r--r-- 1 root group20longusernames    0 May 27 03:01 20longusernames
-rw-r--r-- 1 root group20users            0 May 27 02:04 20users
-rw-r--r-- 1 root group40longusernames    0 May 27 02:57 40longusernames
-rw-r--r-- 1 root group50users            0 May 27 02:46 50users
-rw-r--r-- 1 root group60users            0 May 27 02:50 60users
-rw-r--r-- 1 root group65users            0 May 27 02:52 65users
-rw-r--r-- 1 root group66users            0 May 27 02:55 66users
-rw-r--r-- 1 root group67users            0 May 27 02:53 67users
-rw-r--r-- 1 root group70users            0 May 27 02:51 70users
drwxr-xr-x 2 root group80longusernames 4096 May 27 01:48 80longusernames
-rw-r--r-- 1 root group80users            0 May 27 02:48 80users
drwxr-xr-x 2 root group90users         4096 May 27 01:49 90users
drwxr-xr-x 2 root group91users         4096 May 27 01:48 91users
drwxr-xr-x 2 root group92users         4096 May 27 01:48 92users
drwxr-xr-x 2 root group93users         4096 May 27 01:48 93users
-rw-r--r-- 1 root nobody                  0 May 27 02:07 noldap


Additional info:
[root@dhcp7-165 ~]# getent group group66users |wc -c
473
[root@dhcp7-165 ~]# getent group group67users |wc -c
480
-> the maximum number of characters is between 473 and 480, including commas,
group name, etc.

I am also attaching these files:
fulllogfile.txt : # strace -f -tt -T -v -x -o logfile.txt -s 4096 ls -l /mnt/import/
ls-log.txt : # strace ls -l /mnt/import/
sosreport-client-cbuissar-110307-a41eab.tar.bz2
sosreport-ldap_server-844073-a1a81c.tar.bz2
sosreport-nfs_server-cbuissar-845457-f7115b.tar.bz2
ldif-files.tbz : list of ldif files I used to create the users and groups

My LDAP config:
dn: dc=cedric-domain,dc=com
dn: ou=People,dc=cedric-domain,dc=com
dn: ou=Groups,dc=cedric-domain,dc=com

You may try to connect directly to the virtual machines:
LDAP server: 10.65.7.182
NFS server: 10.65.7.127
Client: 10.65.7.165

What is the impact to the customer when they experience this problem?
Security problems, as the permissions are not correct.
--------------------------- Original report ---------------------------

After a bit of investigation, we noticed that this problem is independent of
LDAP. We were able to reproduce this with a regular (/etc/{passwd,group} files
based) setup. If the number of users in a group is large (the exact number of
username characters is indeterminate), the file group owner shows up as nobody.

As seen from the attached straces and tcpdumps, the problem originates on the NFS server.

I'm escalating this just to verify whether this is something well known and
understood. If this is not a known limitation/feature, I'll keep investigating.

regards,
- steve
Comment 1 Steve 2008-07-02 12:37:58 EDT
Created attachment 310822
tcpdump captures of ls from NFS client within NFS+LDAP setup
Comment 2 Steve 2008-07-02 12:40:23 EDT
Created attachment 310826
tcpdump captures of ls from NFS client within files based setup
Comment 3 Steve 2008-07-02 12:41:26 EDT
Created attachment 310827
strace -f -tt -T -v -x -o logfile.txt -s 4096 ls -l /mnt/import/
Comment 4 Steve 2008-07-02 12:42:00 EDT
Created attachment 310828
NFS client sosreport
Comment 5 Steve 2008-07-02 12:42:32 EDT
Created attachment 310829
LDAP server sosreport
Comment 6 Steve 2008-07-02 12:43:03 EDT
Created attachment 310830
NFS server sosreport
Comment 8 Jeff Layton 2008-07-28 10:18:33 EDT
Looking at the capture from comment #2. The totalgroup70 and totalgroup80 files
are both having their group owners reported as "nobody" by the server. So this
looks more like a server-side problem, but it's possible that there are
artificial limits like this in both directions (the client and server mapping
code is pretty similar).
Comment 9 Jeff Layton 2008-07-28 10:35:25 EDT
Hmm...the sosreports seem to lack the idmapd.conf files from these hosts (or
from your reproducer). Would it be possible to get them?

I assume that they're using "Method = nsswitch" there (not sure if it's possible
to do LDAP direct mapping with the idmapd in RHEL5), but it would be good to know.

Could you also paste in an excerpt from your "group" file for the groups that
owned the files for the capture in comment #2?
Comment 10 Jeff Layton 2008-07-28 11:03:12 EDT
From looking over the code, I *think* the limit may be _SC_GETGR_R_SIZE_MAX,
which is 1024 on my rhel5 x86_64 machine. Does the limit seem to be when the
list of group members is around 1024 chars?

The function nss_gid_to_name() is probably returning -ENOENT and that's making
idmapd fill in "nobody" for the group name. I'm not sure of the fix though. I'll
probably have to experiment some with getgrgid_r() if this turns out to be the
problem.
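
For illustration, a minimal sketch (not the nfs-utils-lib code) of how a single getgrgid_r() call behaves when its buffer is too small for the group entry. The helper name lookup_gid_once() and its return convention are made up for this example; it assumes only the standard POSIX getgrgid_r() interface, which reports ERANGE when the supplied buffer cannot hold the result.

```c
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical probe, for illustration only: call getgrgid_r() exactly once
 * with a caller-chosen buffer size.
 *
 * Returns 0 if the group was found, ENOENT if the lookup succeeded but no
 * such group exists, or the error number reported by getgrgid_r()
 * (ERANGE when the buffer is too small for the entry).
 */
static int lookup_gid_once(gid_t gid, size_t buflen)
{
    struct group grbuf;
    struct group *result = NULL;
    char *buf;
    int err;

    assert(buflen > 0);

    buf = malloc(buflen);
    if (buf == NULL)
        return ENOMEM;

    err = getgrgid_r(gid, &grbuf, buf, buflen, &result);
    free(buf);

    /* A return of 0 with a NULL result means "no matching entry". */
    if (err == 0 && result == NULL)
        return ENOENT;
    return err;
}
```

If the guess above is right, calling lookup_gid_once() with a 1024-byte buffer for one of the large groups would be expected to return ERANGE, while a sufficiently larger buffer would succeed. A caller that treats every failure as "no such group" would then fall back to a default name such as "nobody".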
Comment 11 Issue Tracker 2008-07-29 06:39:04 EDT
Internal Status set to 'Waiting on SEG'

This event sent from IssueTracker by cbuissar 
 issue 182305
Comment 12 Issue Tracker 2008-07-29 06:47:29 EDT
Cedric, thanks for answering Jeff's questions. Sending your response up to
the BZ:

-------------------------------------------------------
I do not know whether the customer is using "Method = nsswitch", but I am
using it in my reproducer:
#  cat /etc/idmapd.conf  |grep Method
Method = nsswitch

I attached the idmapd.conf file.

*However*, I just noticed something which might be important:
- As noted when the IT was created, it is not the number of group members
that matters (group40longusernames fails, but group66users is OK)
- *But* group20longusernames works fine and, surprise:
# getent group group20longusernames |wc -c
559
That is much more than group67users (480), which does not work. My
earlier comment saying the limit is between 473 and 480 is therefore not
correct.
-------------------------------------------------------
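
One possible explanation for these numbers, offered as a rough estimate rather than a confirmed analysis: with glibc, the buffer passed to getgrgid_r() has to hold the gr_mem pointer array as well as the strings, so the member count matters in addition to the text length. Assuming 8-byte pointers and a 1024-byte buffer (the value Jeff reports for his x86_64 machine in comment #10; the architecture of the reproducer is not stated here), the figures above would work out to about:

group66users:         473 + 67 * 8 = 1009 (fits)
group67users:         480 + 68 * 8 = 1024 (does not fit once any padding is added)
group20longusernames: 559 + 21 * 8 = 727  (fits)

The helper below, group_buffer_estimate(), is hypothetical and computes the same kind of lower bound from a struct group. It ignores alignment padding and any backend-specific overhead.

```c
#include <assert.h>
#include <grp.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: rough lower bound on the
 * number of buffer bytes needed to store one group entry, counting the
 * NUL-terminated strings plus the NULL-terminated gr_mem pointer array.
 * Alignment padding and backend-specific overhead are not counted.
 */
static size_t group_buffer_estimate(const struct group *gr)
{
    size_t bytes;
    size_t members = 0;

    assert(gr != NULL && gr->gr_name != NULL);

    bytes = strlen(gr->gr_name) + 1;
    if (gr->gr_passwd != NULL)
        bytes += strlen(gr->gr_passwd) + 1;

    if (gr->gr_mem != NULL) {
        for (; gr->gr_mem[members] != NULL; members++)
            bytes += strlen(gr->gr_mem[members]) + 1;
    }

    /* One pointer per member, plus the terminating NULL pointer. */
    bytes += (members + 1) * sizeof(char *);
    return bytes;
}
```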



This event sent from IssueTracker by sfernand 
 issue 182305
Comment 13 Steve 2008-07-29 06:48:58 EDT
Created attachment 312855
idmapd.conf
Comment 15 Jeff Layton 2008-10-24 08:07:21 EDT
Looks like rawhide (nfs-utils-1.1.4-1.fc10) doesn't have this problem. I can have a rawhide host mount an NFSv4 share from itself and it gets the right group mapping info, even with a group of 1000 users with 8-character names.

When I mount the same share on a RHEL5 machine, the group gets mapped to nobody. The call on the wire shows the right group name, so it seems that RHEL5's nfs-utils isn't picking up the group name correctly.
Comment 16 Jeff Layton 2008-10-24 13:08:09 EDT
Created attachment 321440
patch -- retry getgrgid_r with larger buffer if call fails with -ERANGE error

This patch seems to fix the problem. It's actually a problem with nfs-utils-lib where we don't retry the getgrnam_r call if it fails with -ERANGE (buffer too small).
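
The attached patch is not reproduced here. As a minimal sketch of the general technique it names (retry with a larger buffer when the call fails with ERANGE), something like the following could be used. The function name gid_to_name_retry(), the doubling strategy, and the 1 MB cap are assumptions made for this example, not details taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: look up the name of group
 * `gid` and copy it into `name` (capacity `len` bytes).
 * Returns 0 on success or a negative errno value on failure.
 */
static int gid_to_name_retry(gid_t gid, char *name, size_t len)
{
    /* sysconf() only suggests a starting size and may return -1. */
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);
    size_t buflen = hint > 0 ? (size_t)hint : 1024;
    const size_t max_buflen = 1024 * 1024; /* arbitrary cap for this sketch */
    struct group grbuf;
    struct group *gr = NULL;
    char *buf = NULL;
    int err;

    assert(name != NULL && len > 0);

    for (;;) {
        char *tmp = realloc(buf, buflen);
        if (tmp == NULL) {
            free(buf);
            return -ENOMEM;
        }
        buf = tmp;

        err = getgrgid_r(gid, &grbuf, buf, buflen, &gr);
        /* Only ERANGE means "buffer too small"; stop on anything else. */
        if (err != ERANGE || buflen >= max_buflen)
            break;
        buflen *= 2;
    }

    if (err != 0)
        err = -err;
    else if (gr == NULL)
        err = -ENOENT;          /* lookup worked, but no such group */
    else if (strlen(gr->gr_name) >= len)
        err = -ERANGE;          /* caller's name buffer is too small */
    else
        strcpy(name, gr->gr_name);

    free(buf);
    return err;
}
```

With this shape, a group whose entry does not fit in the initial buffer is retried with a larger one, and a failure is reported only when the lookup itself fails or the cap is reached.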
Comment 19 Simon Vallet 2009-01-12 11:48:59 EST
I can confirm that this patch does seem to fix the problem.

On a RHEL5.2 client, nfs-utils-lib-1.0.8-7.2.z2:

etna14$  ls -l /env/export/nfs5/proj18/
total 976
drwxr-xr-x  5 ericp g_ericp    4096 jan  6 19:41 MetaHit
dr-xrws--- 18 joe   nobody     4096 sep  4 11:33 projet_ABD
dr-xrws--- 21 joe   g_cloaca   4096 déc 23 09:41 projet_ACS
dr-xrws--- 25 joe   nobody   565248 jan 10 10:10 projet_ACX
dr-xrws--- 18 joe   nobody   413696 jan  9 23:57 projet_ADI

After patching:

etna14$ ls -l /env/export/nfs5/proj18/
total 976
drwxr-xr-x  5 ericp g_ericp    4096 jan  6 19:41 MetaHit
dr-xrws--- 18 joe   g_intprj   4096 sep  4 11:33 projet_ABD
dr-xrws--- 21 joe   g_cloaca   4096 déc 23 09:41 projet_ACS
dr-xrws--- 25 joe   g_intprj 565248 jan 10 10:10 projet_ACX
dr-xrws--- 18 joe   g_intprj 413696 jan  9 23:57 projet_ADI

Simon
Comment 22 Brian Pontz 2009-02-09 14:41:39 EST
Just came across this problem last week. The above patch fixes it for me.
Comment 29 Jeff Layton 2009-03-03 10:26:03 EST
Committed in nfs-utils-lib-1.0.8-7.6.el5
Comment 32 David Kovalsky 2009-06-05 12:16:17 EDT
*** Bug 456180 has been marked as a duplicate of this bug. ***
Comment 35 errata-xmlrpc 2009-09-02 05:10:08 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1250.html
Comment 38 Steve Dickson 2011-01-22 14:06:04 EST
*** Bug 549730 has been marked as a duplicate of this bug. ***
