Bug 800679

Summary: NFS4 not mapping some groups from Solaris server
Product: [Fedora] Fedora
Reporter: Darryl Bond <dbond>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 16
CC: ant.starikov, bfields, jlayton, steved
Hardware: Unspecified
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-11-08 11:10:47 EST
Attachments:
  idmapd.conf (flags: none)
  Uncompressed pcap trace (flags: none)

Description Darryl Bond 2012-03-06 17:50:29 EST
Created attachment 568092 [details]
idmapd.conf

Description of problem:
With a Solaris 10 server, a Fedora 16 client does not map some groups correctly.
Group 14 has special significance on Solaris: it is the sysadmin group. On Fedora, group 14 is uucp. We have created a group in NIS, 14:sysadmin_g. The Solaris 10 server has some files and directories whose group is sysadmin. These groups map to nobody on Linux under NFSv4 but map correctly to 14 under NFSv3.

The Solaris server and the Fedora box use the same NIS tables, so they have identical groups available.

The /etc/idmapd.conf Domain is set correctly, so the NIS users and groups map correctly. The /etc/nsswitch.conf file is configured for NIS:
passwd:     files nis
shadow:     files nis
group:      files nis
initgroups: files nis
hosts:      files nis dns

Note that the Solaris group 10 (staff) does map correctly on Linux.

Version-Release number of selected component (if applicable):
nfs-utils-1.2.5-4.fc16.x86_64
libnfsidmap-0.24-7.fc16.x86_64
SunOS host 5.10 Generic_144488-17 sun4u sparc SUNW,SPARC-Enterprise

How reproducible:
Always, including on CentOS 6 (and presumably RHEL 6)

Steps to Reproduce:
1. On Solaris
touch /nfsshare/testfile
chgrp sysadmin /nfsshare/testfile
2. mount solaris:/nfsshare /mnt
3. ls -l /mnt/testfile
  
Actual results:
[xxxx@linux ~]$ ls -l testfile
-rw-rw-r--. 1 xxxx nobody 0 Mar  7 08:39 testfile


Expected results:
The group should be sysadmin_g or uucp, not nobody.

Additional info:
Comment 1 Steve Dickson 2012-03-15 10:21:11 EDT
Would it be possible to get a binary network trace using either
   tcpdump -s0 -w /tmp/data.pcap host <server>
or
   tshark -w /tmp/data.pcap host <server>

Then bzip2 the trace file:
   bzip2 /tmp/data.pcap
Comment 2 Steve Dickson 2012-03-15 11:54:44 EDT
This could be the same problem:

https://bugzilla.redhat.com/show_bug.cgi?id=770490
Comment 3 Darryl Bond 2012-03-15 18:58:04 EDT
Created attachment 570446 [details]
Uncompressed pcap trace

This trace was created with the following commands:
mount 10.4.171.31:/export/stds/genl /mnt
ls -ln /mnt/lib/lts
umount /mnt

The ls displays group 99 rather than 14.
Comment 4 Steve Dickson 2012-03-19 10:13:34 EDT
(In reply to comment #3)
> Created attachment 570446 [details]
> Uncompressed pcap trace
> 
> This trace was created with the following in commands
> mount 10.4.171.31:/export/stds/genl /mnt
> ls -ln /mnt/lib/lts
> umount /mnt
> 
> the ls displays group 99 rather than 14
Well, the server is returning 'username@gps.local', so the question is why those names are not being mapped correctly on the client side.

Are the client and server in the same domain (aka gps.local)? If not, please add a 'Domain = gps.local' line to /etc/idmapd.conf.

If the above does not work, please turn on debugging by setting RPCIDMAPDARGS="-v" in /etc/sysconfig/nfs, and then post what is in dmesg.
Comment 5 Darryl Bond 2012-03-19 17:14:06 EDT
Without the Domain=gps.local the rest of the names do not map.
Here is a tail of /var/log/messages after the idmapd service was restarted and an ls -l issued.

Mar 20 07:08:02 bashful rpc.idmapd[2997]: libnfsidmap: using domain: gps.local
Mar 20 07:08:02 bashful rpc.idmapd[2997]: libnfsidmap: Realms list: 'GPS.LOCAL' 
Mar 20 07:08:02 bashful rpc.idmapd[2997]: rpc.idmapd: libnfsidmap: using domain: gps.local
Mar 20 07:08:02 bashful rpc.idmapd[2997]: libnfsidmap: loaded plugin /lib64/libnfsidmap/nsswitch.so for method nsswitch
Mar 20 07:08:02 bashful rpc.idmapd[2997]: rpc.idmapd: libnfsidmap: Realms list: 'GPS.LOCAL'
Mar 20 07:08:02 bashful rpc.idmapd[2997]: rpc.idmapd: libnfsidmap: loaded plugin /lib64/libnfsidmap/nsswitch.so for method nsswitch
Mar 20 07:08:02 bashful rpc.idmapd[2998]: Expiration time is 600 seconds.
Mar 20 07:08:02 bashful rpc.idmapd[2998]: nfsdopenone: Opening /proc/net/rpc/nfs4.nametoid/channel failed: errno 2 (No such file or directory)
Mar 20 07:08:02 bashful rpc.idmapd[2998]: New client: 0
Mar 20 07:08:02 bashful rpc.idmapd[2998]: New client: 1
Mar 20 07:08:02 bashful rpc.idmapd[2998]: New client: 2
Mar 20 07:08:02 bashful rpc.idmapd[2998]: New client: 3
Mar 20 07:08:02 bashful rpc.idmapd[2998]: New client: 4
Mar 20 07:08:25 bashful automount[1803]: rpc_get_exports_proto
Mar 20 07:08:42 bashful rpc.idmapd[2998]: New client: 5
Mar 20 07:08:42 bashful rpc.idmapd[2998]: Opened /var/lib/nfs/rpc_pipefs//nfs/clnt5/idmap
Mar 20 07:08:42 bashful rpc.idmapd[2998]: New client: 6
Mar 20 07:08:42 bashful rpc.idmapd[2998]: New client: 7
Mar 20 07:08:42 bashful rpc.idmapd[2998]: Stale client: 6
Mar 20 07:08:42 bashful rpc.idmapd[2998]: #011-> closed /var/lib/nfs/rpc_pipefs//nfs/clnt6/idmap
Comment 6 Steve Dickson 2012-03-20 08:52:33 EDT
(In reply to comment #5)
> Without the Domain=gps.local the rest of the names do not map.
> Here is a tail of /var/log/messages after the idmapd service was restarted and
> an ls -l issued.
> 
> Mar 20 07:08:02 bashful rpc.idmapd[2997]: libnfsidmap: using domain: gps.local
> Mar 20 07:08:02 bashful rpc.idmapd[2997]: libnfsidmap: Realms list: 'GPS.LOCAL' 
> Mar 20 07:08:02 bashful rpc.idmapd[2997]: rpc.idmapd: libnfsidmap: using
> domain: gps.local
hmm... I wonder if this is related to bz753930 which was fixed
by the upstream commit:

commit 0c1766af5bf125a56fbf589111aa2f3876a5d709
Author: Steve Dickson <steved@redhat.com>
Date:   Sat Nov 12 10:19:52 2011 -0500

    nss_getpwnam: ignore case when comparing domain names
    
    nss_getpwnam() fails to find the password entry when the
    DNS domain name has both upper and lower characters,
    which is wrong. Case need to be ignored when comparing
    domain names.
    
    Signed-off-by: Steve Dickson <steved@redhat.com>

Just yesterday I fixed a debugging bug in this area. So could you
please try the following update which also contains the above
bug fix:

https://admin.fedoraproject.org/updates/libnfsidmap-0.24-8.fc16?_csrf_token=c4d5506098cc38d762bc76ce63381e2462575ce0
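
For illustration only, the comparison that commit message describes could look like the Python sketch below. This is not the libnfsidmap C code; `same_domain` and `strip_domain` are hypothetical names, and the sketch only assumes that names arrive as 'name@domain' strings like those in the trace.

```python
def same_domain(name_domain, local_domain):
    """Compare two domain names while ignoring case."""
    return name_domain.casefold() == local_domain.casefold()


def strip_domain(name, local_domain):
    """Return the local part of 'name@domain' if the domain matches, else None."""
    local, sep, domain = name.rpartition("@")
    if sep and same_domain(domain, local_domain):
        return local
    return None


if __name__ == "__main__":
    # A mixed-case domain still matches the configured lower-case one.
    print(strip_domain("u1@GPS.Local", "gps.local"))
```

With a case-sensitive comparison the mixed-case name would be rejected before any passwd or group lookup took place.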

> Mar 20 07:08:02 bashful rpc.idmapd[2997]: libnfsidmap: loaded plugin
> /lib64/libnfsidmap/nsswitch.so for method nsswitch
> Mar 20 07:08:02 bashful rpc.idmapd[2997]: rpc.idmapd: libnfsidmap: Realms list:
> 'GPS.LOCAL'
> Mar 20 07:08:02 bashful rpc.idmapd[2997]: rpc.idmapd: libnfsidmap: loaded
> plugin /lib64/libnfsidmap/nsswitch.so for method nsswitch
> Mar 20 07:08:02 bashful rpc.idmapd[2998]: Expiration time is 600 seconds.
> Mar 20 07:08:02 bashful rpc.idmapd[2998]: nfsdopenone: Opening
> /proc/net/rpc/nfs4.nametoid/channel failed: errno 2 (No such file or directory)
I'm assuming the NFS server is not up which is the cause of this error...
Comment 7 Darryl Bond 2012-03-20 17:12:04 EDT
Nope, the nfs server was up the whole time.

I applied the test package and rebooted. The behavior is the same.

I noted that the log no longer mentions the Realm or domain name?

Mar 21 07:03:16 bashful rpc.idmapd[1453]: New client: 0
Mar 21 07:03:16 bashful rpc.idmapd[1453]: New client: 1
Mar 21 07:03:16 bashful rpc.idmapd[1453]: New client: 2
Mar 21 07:05:22 bashful rpc.idmapd[1453]: New client: 3
Mar 21 07:05:22 bashful rpc.idmapd[1453]: New client: 4
Mar 21 07:06:34 bashful rpc.idmapd[1453]: New client: 5
Mar 21 07:06:34 bashful rpc.idmapd[1453]: Opened /var/lib/nfs/rpc_pipefs//nfs/clnt5/idmap
Mar 21 07:06:34 bashful rpc.idmapd[1453]: New client: 6
Mar 21 07:06:34 bashful rpc.idmapd[1453]: New client: 7
Mar 21 07:06:34 bashful rpc.idmapd[1453]: Stale client: 6
Mar 21 07:06:34 bashful rpc.idmapd[1453]: #011-> closed /var/lib/nfs/rpc_pipefs//nfs/clnt6/idmap
Comment 8 Steve Dickson 2012-03-21 13:37:50 EDT
(In reply to comment #7)
> Nope, the nfs server was up the whole time.
> 
> I applied the test package and rebooted. The behavior is the same.
> 
> I noted that the log no longer mentions the Realm or domain name?
> 
hmm... that is strange..... did you restart rpc.idmapd?

I see a lot of libnfsidmap debugging when I either run
rpc.idmapd -f -v or set Verbosity = 1 in /etc/idmapd.conf
Comment 9 Darryl Bond 2012-03-21 17:26:04 EDT
Ok this looks better:

[root@bashful log]# /usr/sbin/rpc.idmapd -v -f
rpc.idmapd: libnfsidmap: using domain: gps.local
rpc.idmapd: libnfsidmap: Realms list: 'GPS.LOCAL' 
rpc.idmapd: libnfsidmap: loaded plugin /lib64/libnfsidmap/nsswitch.so for method nsswitch

rpc.idmapd: Expiration time is 600 seconds.
rpc.idmapd: nfsdopenone: Opening /proc/net/rpc/nfs4.nametoid/channel failed: errno 2 (No such file or directory)
rpc.idmapd: New client: 0
rpc.idmapd: New client: 1
rpc.idmapd: New client: 2
rpc.idmapd: New client: 3
rpc.idmapd: New client: 4
rpc.idmapd: New client: 8
rpc.idmapd: Opened /var/lib/nfs/rpc_pipefs//nfs/clnt8/idmap
rpc.idmapd: New client: 9
rpc.idmapd: Client 8: (user) name "root@gps.local" -> id "0"
rpc.idmapd: Client 8: (group) name "root@gps.local" -> id "0"
rpc.idmapd: Client 8: (group) name "sys@gps.local" -> id "3"
rpc.idmapd: New client: a
rpc.idmapd: Client 8: (group) name "other@gps.local" -> id "1"
[warn] event_del: event has no event_base set.
rpc.idmapd: Stale client: 9
rpc.idmapd: 	-> closed /var/lib/nfs/rpc_pipefs//nfs/clnt9/idmap
rpc.idmapd: Client 8: (user) name "u1@gps.local" -> id "1102"
rpc.idmapd: Client 8: (group) name "sysadmin@gps.local" -> id "60001"
rpc.idmapd: Client 8: (group) name "staff@gps.local" -> id "10"
rpc.idmapd: Client 8: (user) name "u2@gps.local" -> id "1529"
rpc.idmapd: Client 8: (group) name "enghtml@gps.local" -> id "119"
rpc.idmapd: Client 8: (user) name "u3@gps.local" -> id "1351"
rpc.idmapd: Client 8: (group) name "draftsrv@gps.local" -> id "40"
rpc.idmapd: Client 8: (user) name "u4@gps.local" -> id "60953"
rpc.idmapd: Client 8: (group) name "u4@gps.local" -> id "60953"
rpc.idmapd: Client 8: (group) name "u5@gps.local" -> id "117"
rpc.idmapd: Client 8: (group) name "u2@gps.local" -> id "1529"

Note that it is mapping sysadmin to nobody; the others are correct.
Comment 10 Steve Dickson 2012-03-22 07:38:22 EDT
(In reply to comment #9)
> Ok this looks better:
> 
> [root@bashful log]# /usr/sbin/rpc.idmapd -v -f
> rpc.idmapd: libnfsidmap: using domain: gps.local
> rpc.idmapd: libnfsidmap: Realms list: 'GPS.LOCAL' 
> rpc.idmapd: libnfsidmap: loaded plugin /lib64/libnfsidmap/nsswitch.so for
> method nsswitch
> 
> rpc.idmapd: Expiration time is 600 seconds.
> rpc.idmapd: nfsdopenone: Opening /proc/net/rpc/nfs4.nametoid/channel failed:
> errno 2 (No such file or directory)
> rpc.idmapd: New client: 0
> rpc.idmapd: New client: 1
> rpc.idmapd: New client: 2
> rpc.idmapd: New client: 3
> rpc.idmapd: New client: 4
> rpc.idmapd: New client: 8
> rpc.idmapd: Opened /var/lib/nfs/rpc_pipefs//nfs/clnt8/idmap
> rpc.idmapd: New client: 9
> rpc.idmapd: Client 8: (user) name "root@gps.local" -> id "0"
> rpc.idmapd: Client 8: (group) name "root@gps.local" -> id "0"
> rpc.idmapd: Client 8: (group) name "sys@gps.local" -> id "3"
> rpc.idmapd: New client: a
> rpc.idmapd: Client 8: (group) name "other@gps.local" -> id "1"
> [warn] event_del: event has no event_base set.
> rpc.idmapd: Stale client: 9
> rpc.idmapd:  -> closed /var/lib/nfs/rpc_pipefs//nfs/clnt9/idmap
> rpc.idmapd: Client 8: (user) name "u1@gps.local" -> id "1102"
> rpc.idmapd: Client 8: (group) name "sysadmin@gps.local" -> id "60001"
> 
> Note it is mapping sysadmin to nobody, the others are correct.
id "60001" is nobody on your system? On my F16 box nobody has a uid of "99".
Comment 11 Steve Dickson 2012-03-22 15:30:08 EDT
*** Bug 770490 has been marked as a duplicate of this bug. ***
Comment 12 Darryl Bond 2012-03-22 17:07:36 EDT
Oracle Corporation	SunOS 5.10	Generic Patch	January 2005
[user@solarisbox ~]$ grep nobody /etc/group
nobody::60001:
[user@solarisbox ~]$ ypcat group | grep nobody
nobody::60001:
[dbond@solarisbox ~]$
Comment 13 Steve Dickson 2012-03-23 10:41:50 EDT
(In reply to comment #12)
> Oracle Corporation SunOS 5.10 Generic Patch January 2005
> [user@solarisbox ~]$ grep nobody /etc/group
> nobody::60001:
> [user@solarisbox ~]$ ypcat group | grep nobody
> nobody::60001:
> [dbond@solarisbox ~]$
Ok.. so be it... 

What should the uid/gid of sysadmin be? Does it exist locally or in NIS?
Comment 14 Steve Dickson 2012-03-26 15:13:02 EDT
So in theory updating to libnfsidmap-0.24-7.fc16 should fix this problem
Comment 15 Darryl Bond 2012-03-26 17:51:10 EDT
The sysadmin user and the sysadmin group both have ID 14.
Our NIS is hosted by Windows Services for Unix. Active Directory does not allow users and groups to share the same name. Our NIS users and groups for sysadmin look like this:
passwd:
sysadmin:jEOAFknCz1234:14:20014:sys admin user:/homes/host/sysadmin:/bin/ksh
group:
sysadmin_g:*:14:

I see; this might explain the problem. The idmap debug output above shows that it looks up the group by name and doesn't use the GID. There is no group sysadmin in NIS, only on the Solaris server.
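
To make that concrete, here is a rough Python model of a name-based lookup. It is a sketch under assumptions, not the libnfsidmap implementation: `map_group`, `CLIENT_GROUPS` and `NOBODY_GID` are hypothetical names, the group table is copied from values in this report, domain checking is left out, and 60001 is used for nobody because that is the id in the debug output above.

```python
# Hypothetical model of name-based NFSv4 group mapping; not the libnfsidmap code.
NOBODY_GID = 60001  # id that "sysadmin@gps.local" was mapped to in the debug output

# Groups the client can resolve (names and GIDs taken from this report).
CLIENT_GROUPS = {
    "root": 0,
    "other": 1,
    "sys": 3,
    "staff": 10,
    "uucp": 14,
    "sysadmin_g": 14,
}


def map_group(owner_group, groups, nobody_gid=NOBODY_GID):
    """Map a 'name@domain' string to a local GID by name only."""
    name, _, _ = owner_group.partition("@")
    # The numeric GID on the server never enters the lookup.
    return groups.get(name, nobody_gid)


if __name__ == "__main__":
    for wire_name in ("staff@gps.local", "sysadmin@gps.local"):
        print(wire_name, "->", map_group(wire_name, CLIENT_GROUPS))
```

In this model GID 14 is known under two names, but neither is "sysadmin", so the lookup falls through to nobody.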

libnfsidmap-0.24-8.fc16.x86_64 doesn't fix it. Should I downgrade to 0.24-7?
Comment 16 Steve Dickson 2012-03-27 10:29:02 EDT
(In reply to comment #15)
> The user and group of sysadmin is 14
> Our NIS is hosted by Windows Services for Unix. Active Directory does not allow
> users and groups to share the same name. Our NIS users and groups for sysadmin
> look like this:
> passwd:
> sysadmin:jEOAFknCz1234:14:20014:sys admin user:/homes/host/sysadmin:/bin/ksh
> group:
> sysadmin_g:*:14:
> 
> I see this might explain the problem, the idmap debug above shows that it
> requests the group by name and doesn't use the GID. There is no group sysadmin
> in NIS, only on the solaris server.
Ah, this does explain why the group is not being found... But I wonder why
the group is not being found in /etc/group... Generally the lookup is
done against the file first, and then NIS is consulted.
 
> 
> libnfsidmap-0.24-8.fc16.x86_64 doesn't fix it? should I downgrade to 0.24-7?
-8 does have a bug fix that deals with DNS names... So you can leave it...
Comment 17 Darryl Bond 2012-03-27 18:21:40 EDT
Aahhh, I had it working but now I can't get it to work again.
I remember the steps that led to it working, but I can't duplicate it :)

I had the nfs4 mount in place.
I fiddled with /etc/group and changed
# uucp:x:14:
sysadmin:x:14:
ls -l still showed nobody
I ran ldconfig which sometimes triggers a re-read of nsswitch and dumping of caches
ls -l still showed nobody

I ran service nfs-idmapd stop (which is the incorrect name and did not stop anything, just didn't tell me)
I ran 
# rpc.idmapd -f -v (on top of the already running service)
ls -l still showed nobody

I umounted the nfs4 mount and remounted it.
Lo, the ls -l showed sysadmin_g

I attempted to duplicate it on another F16 box with the new libnfsidmap-0.24-7.fc16 in place, no go.

I rebooted mine and have not been able to get it to work again.

It was strange though: when it was working, the output of rpc.idmapd -f -v did not show it mapping the sysadmin group ???

# getent group 14
uucp:x:14:
# getent group sysadmin
# getent group sysadmin_g
sysadmin_g:*:14:
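
The same checks can be scripted. Below is a minimal Python sketch, assuming the standard-library `grp` module, which resolves names through the same NSS configuration that getent uses. `lookup_group` is a hypothetical helper name, and the results depend entirely on the local /etc/group and NIS contents.

```python
import grp


def lookup_group(name):
    """Return the GID that NSS resolves for a group name, or None if unknown."""
    try:
        return grp.getgrnam(name).gr_gid
    except KeyError:
        return None


if __name__ == "__main__":
    # Group names taken from this report; None means the name does not resolve.
    for name in ("uucp", "sysadmin", "sysadmin_g"):
        print(name, "->", lookup_group(name))
```

A name that prints None here is one that a lookup by name cannot resolve on this client.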
Comment 18 Steve Dickson 2012-11-08 11:10:47 EST
I'm going to close this; feel free to reopen if the problem comes back...