Bug 507341

Summary: svcgssd: Add time out argument
Product: Red Hat Enterprise Linux 5
Reporter: Sachin Prabhu <sprabhu>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: yanfu,wang <yanwang>
Severity: medium
Priority: high
Version: 5.8
CC: bfields, cww, cye, iannis, jwest, orion
Target Milestone: rc
Hardware: All
OS: Linux
Fixed In Version: nfs-utils-1.0.9-61.el5
Doc Type: Bug Fix
Last Closed: 2013-01-08 07:33:48 UTC
Bug Blocks: 668957, 807971
Attachments:
Description: RHEL 5: Add timeout option to client side rpc.gssd utility. Flags: none

Description Sachin Prabhu 2009-06-22 12:36:56 UTC
Consider an NFS share that uses kerberos authentication, mounted on a client and accessed with a user's kerberos principal. If the additional groups for this user are modified and the user logs in again with the new group membership, the change in the additional groups is not reflected on the NFS share unless the share is re-mounted.

To reproduce:
1) Export a nfs4 share using kerberos authentication
/export      gss/krb5(sync,rw,fsid=0,insecure,no_subtree_check)
/export/data gss/krb5(sync,rw,nohide,insecure,no_subtree_check)
2) Create user foo as a member of group foo. Make sure that the details for this user are available on both the client and the server.
3) Create group bar.
4) Create 2 directories on the nfs share.
Directory foo with ownership root.foo and mode set to 2775
Directory bar with ownership root.bar and mode set to 2775

On the client:
1) Log in and obtain a kerberos ticket for this user.
2) Mount the share exported from the NFS server in step 1 above.
3) Create a file in /mnt/foo/ and confirm that the user can write to the directory.
4) Try to create a file in /mnt/bar. This should fail since the user doesn't have permission to write to this directory.
5) Modify user foo to add the additional group bar. Make sure that these details are available on both the NFS server and the client.
6) Log in again as foo and confirm that the user is part of group bar using the id command.
7) Try creating a file in /mnt/foo. This is successful.
8) Try creating a file in /mnt/bar. This fails even though the user now has permission to write to this directory.

The problem is caused by the high expiration time of the context cache on the NFS server, which stores the uid/gids for the user. The old values keep being used, so the user will not be able to write to this directory unless the share is remounted on the client or the rsc (context) cache is flushed on the server with the command
echo `date +'%s'` > /proc/net/rpc/auth.rpcsec.context/flush

The timeout value for each cache entry is set to a high value in do_svc_downcall():

static int
do_svc_downcall(gss_buffer_desc *out_handle, struct svc_cred *cred,
                gss_OID mech, gss_buffer_desc *context_token)
{
..
        qword_printint(f, 0x7fffffff); /*XXX need a better timeout */
..
}

Such high expiry times can also result in memory issues on the nfs server. 

The request is to set this expiry time to a sane value.
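
To illustrate why the 0x7fffffff value matters, here is a minimal C model of a cache entry with an absolute expiry time. It is not the kernel's cache code: struct ctx_entry, entry_is_valid(), make_entry_never() and make_entry_ttl() are hypothetical names, and the bounded lifetime is only an example of a "sane value".

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of one cached context entry. The kernel's real
 * structures differ; only the expiry check is of interest here. */
struct ctx_entry {
    time_t expiry;       /* absolute expiry, seconds since the epoch */
    unsigned int ngids;  /* number of gids captured when the entry was made */
};

/* An entry keeps being served (stale gids included) until it expires. */
bool entry_is_valid(const struct ctx_entry *e, time_t now)
{
    return now < e->expiry;
}

/* Entry as written by the code quoted above: expiry is 0x7fffffff. */
struct ctx_entry make_entry_never(unsigned int ngids)
{
    struct ctx_entry e = { (time_t)0x7fffffff, ngids };
    return e;
}

/* Entry with a bounded lifetime of ttl seconds counted from now. */
struct ctx_entry make_entry_ttl(time_t now, unsigned int ttl,
                                unsigned int ngids)
{
    struct ctx_entry e = { now + (time_t)ttl, ngids };
    return e;
}
```

In this model an entry made with make_entry_never() is still valid a day after the user's groups changed, while an entry made with make_entry_ttl(now, 3600, ...) stops being served after an hour and has to be rebuilt.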

Comment 1 Sachin Prabhu 2009-06-22 12:39:19 UTC
Upstream fix:

commit eb3a145789b9eedd39b56e1d76f412435abaa747
Author: Kevin Coffman <kwc.edu>
Date:   Thu Dec 11 11:43:31 2008 -0500

    svcgssd: use the actual context expiration for cache
    
    Instead of sending down an infinite expiration value for the rsi(init) and
    rsc(context) cache entries, use a reasonable value for the rsi cache, and
    the actual context expiration value for the rsc cache.
    
    Prompted by a proposal from Neil Brown as a result of a complaint of a
    server running out of kernel memory when under heavy load of rpcsec_gss
    traffic.  Neil's original patch used one minute for the init cache and one
    hour for the context cache.  Using the actual expiration time prevents
    unnecessary context re-negotiation.
    
    Signed-off-by: Kevin Coffman <kwc.edu>
    Signed-off-by: Steve Dickson <steved>
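
The idea in this commit message can be sketched as follows. This is not the upstream patch: rsi_expiry(), rsc_expiry(), bounded_expiry() and RSI_TTL_SECS are hypothetical names, the 60-second value is only borrowed from the one-minute figure quoted above for Neil Brown's original proposal, and the remaining context lifetime is assumed to be available in seconds.

```c
#include <stdint.h>

/* Hypothetical TTL for rsi (init) entries; the commit message only
 * says "a reasonable value". */
#define RSI_TTL_SECS 60

/* The value the old code always sent down (0x7fffffff), kept here as
 * an upper bound. */
#define EXPIRY_MAX INT64_C(0x7fffffff)

/* Add a lifetime to 'now' without going past EXPIRY_MAX. */
int64_t bounded_expiry(int64_t now, uint32_t lifetime_secs)
{
    int64_t expiry = now + (int64_t)lifetime_secs;
    return expiry > EXPIRY_MAX ? EXPIRY_MAX : expiry;
}

/* rsi (init) cache: short, fixed lifetime. */
int64_t rsi_expiry(int64_t now)
{
    return bounded_expiry(now, RSI_TTL_SECS);
}

/* rsc (context) cache: expires when the context itself expires. */
int64_t rsc_expiry(int64_t now, uint32_t ctx_lifetime_secs)
{
    return bounded_expiry(now, ctx_lifetime_secs);
}
```

With these helpers a context that has 36000 seconds left is cached for exactly that long, instead of effectively forever.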

Comment 5 RHEL Program Management 2010-08-09 19:05:00 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 7 Chao Ye 2011-06-15 07:22:05 UTC
Confirmed that the patch is in nfs-utils CVS.

Comment 8 yanfu,wang 2011-06-16 04:00:57 UTC
Hi Steve,
It seems the issue is not fixed in the 5.7 nfs-utils package; please check my test steps below.
First, I could reproduce it on RHEL5.3:
KDC: amd-ma78gm-01.rhts.eng.bos.redhat.com
# cat /etc/krb5.conf 
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = RHTS.ENG.BOS.REDHAT.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 forwardable = yes

[realms]
 RHTS.ENG.BOS.REDHAT.COM = {
  kdc = amd-ma78gm-01.rhts.eng.bos.redhat.com:88
  admin_server = amd-ma78gm-01.rhts.eng.bos.redhat.com:749
  default_domain = rhts.eng.bos.redhat.com 
 }

[domain_realm]
 .rhts.eng.bos.redhat.com = RHTS.ENG.BOS.REDHAT.COM 
 rhts.eng.bos.redhat.com = RHTS.ENG.BOS.REDHAT.COM 

[appdefaults]
 pam = {
   debug = false
   ticket_lifetime = 36000
   renew_lifetime = 36000
   forwardable = true
   krb4_convert = false
 }

# kadmin.local 
kadmin.local:  listprincs 
K/M.BOS.REDHAT.COM
foo.BOS.REDHAT.COM
kadmin/admin.BOS.REDHAT.COM
kadmin/amd-ma78gm-01.rhts.eng.bos.redhat.com.BOS.REDHAT.COM
kadmin/changepw.BOS.REDHAT.COM
kadmin/history.BOS.REDHAT.COM
krbtgt/RHTS.ENG.BOS.REDHAT.COM.BOS.REDHAT.COM
nfs/hp-dx2200-01.rhts.eng.bos.redhat.com.BOS.REDHAT.COM
nfs/nec-em24-3.rhts.eng.bos.redhat.com.BOS.REDHAT.COM
root/admin.BOS.REDHAT.COM


nfs server: nec-em24-3.rhts.eng.bos.redhat.com
/etc/sysconfig/nfs:
SECURE_NFS="yes"
RPCGSSDARGS="-vvv"
RPCSVCGSSDARGS="-vvv"

# cat /etc/exports 
/export      gss/krb5(sync,rw,fsid=0,insecure,no_subtree_check)

# klist -ke
Keytab name: FILE:/etc/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   3 nfs/nec-em24-3.rhts.eng.bos.redhat.com.BOS.REDHAT.COM (DES cbc mode with CRC-32) 

# setenforce 0
# /etc/init.d/nfs start


nfs client: hp-dx2200-01.rhts.eng.bos.redhat.com
/etc/sysconfig/nfs:
SECURE_NFS="yes"

[root@hp-dx2200-01 ~]# klist -ke
Keytab name: FILE:/etc/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   3 nfs/hp-dx2200-01.rhts.eng.bos.redhat.com.BOS.REDHAT.COM (DES cbc mode with CRC-32) 

[root@hp-dx2200-01 ~]# /etc/init.d/rpcidmapd start
[root@hp-dx2200-01 ~]# /etc/init.d/rpcgssd start


Add a user foo on both the client and the server with the same uidnumber and gidnumber and add a principal for the user in the KDC. 
# id foo
uid=501(foo) gid=501(foo) groups=501(foo) context=root:system_r:unconfined_t:SystemLow-SystemHigh

Create one directory on the NFS share: directory foo with ownership root.foo and mode set to 2775.

[root@hp-dx2200-01 ~]# mount -t nfs4 -o sec=krb5 nec-em24-3.rhts.eng.bos.redhat.com:/ /mnt
[root@hp-dx2200-01 ~]# cat /proc/mounts 
...
nec-em24-3.rhts.eng.bos.redhat.com:/ /mnt nfs4 rw,vers=4,rsize=32768,wsize=32768,hard,intr,proto=tcp,timeo=600,retrans=3,sec=krb5,addr=nec-em24-3.rhts.eng.bos.redhat.com 0 0

Verify that the user foo has write and read access for /mnt/foo.

Add a new group bar, and add the user as a member of this group. Verify that the user foo is a member of group bar by running id -a as foo. Make sure that these details are available on both the NFS server and the client.
# groupadd bar
# usermod -G bar foo
# id -a foo
uid=501(foo) gid=501(foo) groups=501(foo),502(bar) context=root:system_r:unconfined_t:SystemLow-SystemHigh

Create directory bar with ownership root.bar and mode set to 2775
[root@nec-em24-3 export]# ls -ltr
total 16
drwxrwsr-x 2 root foo 4096 Jun 15 05:09 foo
drwxrwsr-x 2 root bar 4096 Jun 15 05:09 bar

Re-login as foo to re-authenticate on client:
[foo@hp-dx2200-01 mnt]$ exit
logout
[root@hp-dx2200-01 ~]# su - foo
[foo@hp-dx2200-01 ~]$ kdestroy 
[foo@hp-dx2200-01 ~]$ kinit
[root@hp-dx2200-01 ~]# ls -l /mnt
drwxrwsr-x 2 root bar    4096 Jun 15 05:45 bar
drwxrwsr-x 2 root foo    4096 Jun 15 05:44 foo
[foo@hp-dx2200-01 ~]$ touch /mnt/foo/testfile
[foo@hp-dx2200-01 ~]$ touch /mnt/bar/testfile
touch: cannot touch `/mnt/bar/testfile': Permission denied

Then use the workaround (remount the share) and create the file successfully:
[foo@hp-dx2200-01 ~]$ exit
logout
[root@hp-dx2200-01 ~]# umount /mnt
[root@hp-dx2200-01 ~]# mount -t nfs4 -o sec=krb5 nec-em24-3.rhts.eng.bos.redhat.com:/ /mnt
[root@hp-dx2200-01 ~]# su - foo
[foo@hp-dx2200-01 ~]$ kdestroy 
[foo@hp-dx2200-01 ~]$ kinit
Password for foo.BOS.REDHAT.COM: 
[foo@hp-dx2200-01 ~]$ touch /mnt/bar/testfile


Tried to verify on RHEL5.7-Server-20110608.1 using the same test steps:
[root@athlon4 ~]# uname -a
Linux athlon4.rhts.eng.bos.redhat.com 2.6.18-266.el5 #1 SMP Tue Jun 7 16:44:57 EDT 2011 i686 athlon i386 GNU/Linux
[root@athlon4 ~]# rpm -qa|grep nfs-utils
nfs-utils-1.0.9-53.el5
nfs-utils-lib-1.0.8-7.6.el5
# cat /proc/mounts 
...
hp-bl495cg5-01.rhts.bos.redhat.com:/ on /mnt type nfs4 (rw,sec=krb5,addr=10.16.66.109)

[root@athlon4 ~]# su - foo
[foo@athlon4 ~]$ ls -l /mnt
ls: /mnt: Permission denied
[foo@athlon4 ~]$ kinit
Password for foo.BOS.REDHAT.COM: 
[foo@athlon4 ~]$ ls -l /mnt
total 8
drwxrwsr-x 2 root foo 4096 Jun 15 22:48 foo
[foo@athlon4 ~]$ touch /mnt/foo/file1
[foo@athlon4 ~]$ ls -lR /mnt/
/mnt/:
total 8
drwxrwsr-x 2 root foo 4096 Jun 15 22:49 foo

/mnt/foo:
total 4
-rw-rw-r-- 1 foo foo 0 Jun 15 22:49 file1

[root@hp-bl495cg5-01 export]# groupadd bar
[root@hp-bl495cg5-01 export]# usermod -G bar foo
[root@hp-bl495cg5-01 export]# id foo
uid=504(foo) gid=504(foo) groups=504(foo),505(bar) context=root:system_r:unconfined_t:SystemLow-SystemHigh
[root@hp-bl495cg5-01 export]# mkdir bar
[root@hp-bl495cg5-01 export]# chown root.bar bar
[root@hp-bl495cg5-01 export]# chmod 2775 bar
[root@hp-bl495cg5-01 export]# ls -ltr
drwxrwsr-x 2 root foo 4096 06-15 22:49 foo
drwxrwsr-x 2 root bar 4096 06-15 22:53 bar

[root@athlon4 ~]# groupadd bar
[root@athlon4 ~]# usermod -G bar foo
[root@athlon4 ~]# id foo
uid=504(foo) gid=504(foo) groups=504(foo),505(bar) context=root:system_r:unconfined_t:SystemLow-SystemHigh
[root@athlon4 ~]# su - foo
[foo@athlon4 ~]$ ls -l /mnt
total 16
drwxrwsr-x 2 root bar 4096 Jun 15 22:53 bar
drwxrwsr-x 2 root foo 4096 Jun 15 22:49 foo
[foo@athlon4 ~]$ touch /mnt/bar/file1
touch: cannot touch `/mnt/bar/file1': Permission denied
[foo@athlon4 ~]$ kdestroy 
[foo@athlon4 ~]$ kinit 
Password for foo.BOS.REDHAT.COM: 
[foo@athlon4 ~]$ touch /mnt/bar/file1
touch: cannot touch `/mnt/bar/file1': Permission denied

Note: the same permission denied message is still shown. After using the workaround (umount and mount the NFS directory again), touching the file again succeeds. So the issue does not seem to be fixed.

Comment 9 RHEL Program Management 2011-06-21 05:58:29 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7 and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 12 Sachin Prabhu 2012-01-13 11:37:58 UTC
The patch in
https://bugzilla.redhat.com/show_bug.cgi?id=507341#c1
will be insufficient for this case. 

In the original case, the context was set to never expire. This was obviously wrong. The patch changes this to a saner expiry value, namely the time at which the context will expire on the NFS server.

However, this is still far too long for the test case, where the client simply logs out and then logs in again within a few seconds. In such a case, very little time has elapsed and the context cached on the NFS server is still valid. This means the original UID/GID will be used.

It looks like the only way to work around this would be to set a much lower expiry time instead of relying on the context expiry time.

Comment 13 Sachin Prabhu 2012-01-13 12:08:49 UTC
A timeout parameter for rpc.svcgssd to force a shorter timeout value similar to that used in rpc.gssd may be a good compromise in this case.

Comment 16 Sachin Prabhu 2012-01-17 11:50:09 UTC
http://thread.gmane.org/gmane.linux.nfs/46234

Comment 18 Sachin Prabhu 2012-01-18 12:14:52 UTC
The problem is described well by Bruce in the upstream list.
http://thread.gmane.org/gmane.linux.nfs/46234

The problem is that the rpc.gssd daemon on the client caches the client credential. This cached credential cannot be destroyed by userland tools such as kdestroy/kinit. In such cases, the client continues to use the user's original credential, which in turn results in the server using the cached credentials on its end.

There is another workaround which involves backporting the following patch
--
commit 1e1c7be98749fff054beec4bf67b436b58f6edac
Author: Lukas Hejtmanek <xhejtman.cz>
Date:   Tue Jul 15 10:07:45 2008 -0400

    The default expiration of kernel gss contexts is the expiration
    of the Kerberos ticket used in its creation.  (For contexts
    created using the Kerberos mechanism.)  Thus kdestroy has
    no effect in nullifying the kernel context.
    
    This patch adds -t <timeout> option to rpc.gssd so that the client's
    administrator may specify a timeout for expiration of contexts in kernel.
    After this timeout, rpc.gssd is consulted to create a new context.
    
    By default, timeout is 0 (i.e., no timeout at all) which follows the
    previous behavior.
    
    Signed-off-by: Lukas Hejtmanek <xhejtman.cz>
    Signed-off-by: Kevin Coffman <kwc.edu>
    Signed-off-by: Steve Dickson <steved>
--

The administrator sets a smaller timeout on the usage of a credential by an NFS client. In the event of the user's credentials changing, the user logs back in and uses kinit to obtain a new credential on the client. Following the expiry of the credential stored in the cache, the client-side program rpc.gssd will use the latest version of the credentials, which in turn will trigger an update of the stored credentials on the NFS server end.
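
As a rough model of this behaviour (an assumption about the semantics, not the patch itself): with a timeout of 0 the kernel context lives as long as the ticket, otherwise it is assumed to be renegotiated after at most timeout seconds. context_lifetime() and picks_up_new_credential() are hypothetical helpers written only for this illustration.

```c
#include <stdbool.h>

/* Seconds a kernel context stays usable before rpc.gssd is consulted
 * again. timeout == 0 means "no timeout": fall back to the ticket
 * lifetime. Capping the result at the ticket lifetime is an assumption
 * of this sketch. */
unsigned int context_lifetime(unsigned int ticket_lifetime,
                              unsigned int timeout)
{
    if (timeout == 0 || timeout > ticket_lifetime)
        return ticket_lifetime;
    return timeout;
}

/* True if a context created at t=0 has expired after 'elapsed' seconds,
 * so that the next access would use the credential from the latest
 * kinit. */
bool picks_up_new_credential(unsigned int ticket_lifetime,
                             unsigned int timeout,
                             unsigned int elapsed)
{
    return elapsed >= context_lifetime(ticket_lifetime, timeout);
}
```

For example, with a 24-hour ticket (86400 seconds) and no timeout, a context created ten seconds ago is still reused; with a timeout of 5 seconds it has already expired and the new credential would be picked up.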

Comment 19 Sachin Prabhu 2012-01-18 12:50:40 UTC
Created attachment 556022 [details]
RHEL 5: Add timeout option to client side rpc.gssd utility.

Backport of the patch mentioned in c#18. This allows the administrator to specify a timeout on the cache on the client side.

In this case, after adding a secondary group to the user, the user is expected to reinitialise the kerberos ticket with kdestroy and kinit. On cache expiry on the client, rpc.gssd will start using the new kerberos credential. This will force the NFS server to use the new credentials too, thereby also refreshing the user's secondary group list.



gssd: Allow administrator to specify cached context expiration time

Backport of upstream commit 1e1c7be98749fff054beec4bf67b436b58f6edac (Lukas Hejtmanek, Tue Jul 15 2008), quoted in full in comment 18.

Signed-off-by: Sachin Prabhu <sprabhu>

Comment 23 yanfu,wang 2012-08-10 08:02:39 UTC
Verified patch nfs-utils-1.0.9-svcgssd-sanetime.patch in nfs-utils-1.0.9-64.el5 using the steps from comment #8 and the rpc.gssd -t timeout option:
# ps -ef|grep rpc.gssd
root      4461     1  0 03:41 ?        00:00:00 rpc.gssd -t 5
[root@dell-pem605-01 ~]# su - foo
[foo@dell-pem605-01 ~]$ touch /mnt/bar/111
touch: cannot touch `/mnt/bar/111': Permission denied

Now modify user foo to add additional group bar.
On client, run touch again:
[foo@dell-pem605-01 ~]$ touch /mnt/bar/111
[foo@dell-pem605-01 ~]$ ls -l /mnt/bar/
total 4
-rw-rw-r-- 1 foo bar 0 Aug 10 03:44 111

Comment 25 errata-xmlrpc 2013-01-08 07:33:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0068.html