Bug 751353 - nfs-utils-1.2.3-7.el6_1.1 breaks rpc.gssd
Summary: nfs-utils-1.2.3-7.el6_1.1 breaks rpc.gssd
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: nfs-utils
Version: 6.4
Hardware: x86_64
OS: Linux
medium
high
Target Milestone: rc
: ---
Assignee: Steve Dickson
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Duplicates: 753841
Depends On:
Blocks:
Reported: 2011-11-04 13:41 UTC by Rob Henderson
Modified: 2018-11-26 18:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-16 14:22:14 UTC
Target Upstream Version:



Description Rob Henderson 2011-11-04 13:41:56 UTC
Description of problem:

The introduction of nfs-utils-1.2.3-7.el6_1.1 has broken our nfsv4+krb5 configuration.  We are now seeing rpc.gssd seg fault frequently and, even when running, fail to work.


Version-Release number of selected component (if applicable):

Our config has worked reliably with nfs-utils-1.2.3-7.el6 but broke immediately when nfs-utils-1.2.3-7.el6_1.1 was installed.  Backing out to the old version, or just replacing the new rpc.gssd with the older one, fixes the problem.


How reproducible:

Always

Steps to Reproduce:
1. install nfs-utils-1.2.3-7.el6_1.1
2. restart rpc.gssd
3. log in and try to access a kerberized nfsv4 mount

  
Actual results:

We are seeing two errors in the logs.  First, it segfaults very frequently:

Nov  2 17:43:37 chickadee kernel: rpc.gssd[1287]: segfault at 1 ip 00007fbe310677ae sp 00007fff935b4358 error 4 in libgssglue.so.1.0.0[7fbe31064000+9000]

Secondly, it just plain doesn't work, even when it is running, and generates this error:

Nov  3 16:47:33 chickadee rpc.gssd[1309]: ERROR: GSS-API: error in gss_set_allowable_enctypes(): GSS_S_NO_CRED (No credentials were supplied, or the credentials were unavailable or inaccessible) - Unknown error



Expected results:

To work as it did with nfs-utils-1.2.3-7.el6

Additional info:

The gss_set_allowable_enctypes error above makes me suspect that rpc.gssd is somehow not able to grok our krb5.conf, which includes the following (which we need in order to properly talk to our ADS server):

[libdefaults]
        default_tkt_enctypes = des-cbc-crc aes256-cts-hmac-sha1-96 arcfour-hmac
        allow_weak_crypto = 1

Also, I've only tested this on x86_64 systems since all our rhel6 systems are 64 bit.  So, I have no information about whether the problem is there on a 32bit system.

Comment 2 Steve Dickson 2011-11-04 18:03:18 UTC
Could you please do a "debuginfo-install nfs-utils"?

Next run rpc.gssd from gdb by doing:
   # gdb /usr/sbin/rpc.gssd
   (gdb) run -f -v

Then, in another window, do the mount.

When rpc.gssd crashes, gdb will return to the prompt. At
that point please get a backtrace by typing:
   (gdb) bt

please post that backtrace...

Comment 3 Rob Henderson 2011-11-04 19:10:28 UTC
Hey Steve, I've repeated this test a few times and am seeing two different behaviors.  Sometimes it hits case 1 and sometimes case 2, seemingly at random.  But the actual crash seems the same in both cases, with the only difference being whether the gss_set_allowable_enctypes error pops up or not.

case 1
====

(gdb) run -f -v
Starting program: /usr/sbin/rpc.gssd -f -v
[Thread debugging using libthread_db enabled]
beginning poll

Program received signal SIG37, Real-time event 37.
0x00007ffff67620e8 in __poll (fds=0x7ffff82060e0, nfds=256, timeout=500) at ../sysdeps/unix/sysv/linux/poll.c:83
83        return INLINE_SYSCALL (poll, 3, CHECK_N (fds, nfds), nfds, timeout);
(gdb) bt
#0  0x00007ffff67620e8 in __poll (fds=0x7ffff82060e0, nfds=256, timeout=500) at ../sysdeps/unix/sysv/linux/poll.c:83
#1  0x00007ffff7ff41a8 in gssd_run () at gssd_main_loop.c:224
#2  0x00007ffff7ff3ebe in main (argc=<value optimized out>, argv=<value optimized out>) at gssd.c:187
(gdb)


case 2
====

(gdb) run -f -v
Starting program: /usr/sbin/rpc.gssd -f -v
[Thread debugging using libthread_db enabled]
beginning poll
handling gssd upcall (/var/lib/nfs/rpc_pipefs/nfs/clnt480)
handling krb5 upcall (/var/lib/nfs/rpc_pipefs/nfs/clnt480)
ERROR: GSS-API: error in gss_set_allowable_enctypes(): GSS_S_NO_CRED (No credentials were supplied, or the credentials were unavailable or inaccessible) - Unknown error
WARNING: Failed while limiting krb5 encryption types for user with uid 0
ERROR: GSS-API: error in gss_set_allowable_enctypes(): GSS_S_NO_CRED (No credentials were supplied, or the credentials were unavailable or inaccessible) - Unknown error
WARNING: Failed while limiting krb5 encryption types for user with uid 0
WARNING: Failed to create machine krb5 context with any credentials cache for server newt.soic.indiana.edu
doing error downcall

Program received signal SIG37, Real-time event 37.
0x00007ffff67620e8 in __poll (fds=0x7ffff82060e0, nfds=256, timeout=500) at ../sysdeps/unix/sysv/linux/poll.c:83
83        return INLINE_SYSCALL (poll, 3, CHECK_N (fds, nfds), nfds, timeout);
(gdb) bt
#0  0x00007ffff67620e8 in __poll (fds=0x7ffff82060e0, nfds=256, timeout=500) at ../sysdeps/unix/sysv/linux/poll.c:83
#1  0x00007ffff7ff41a8 in gssd_run () at gssd_main_loop.c:224
#2  0x00007ffff7ff3ebe in main (argc=<value optimized out>, argv=<value optimized out>) at gssd.c:187
(gdb)
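
Note that both sessions above stopped on SIG37, a real-time signal, rather than on SIGSEGV, so the backtraces show the idle poll loop and not the crash itself. A minimal sketch of a batch gdb run that passes SIG37 through and only stops on a real fault could look like the following. It assumes SIG37 is a signal rpc.gssd uses for its own notifications and is safe to pass; the write_gdb_cmds helper and the temporary file are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Write a gdb command file that does not stop on SIG37 and prints a
# backtrace once the program actually faults.
# (write_gdb_cmds is an illustrative helper, not part of nfs-utils.)
write_gdb_cmds() {
    cat > "$1" <<'EOF'
handle SIG37 nostop noprint pass
run
bt
EOF
}

cmds=$(mktemp)
write_gdb_cmds "$cmds"
cat "$cmds"

# To use it (as root, with the nfs-utils debuginfo installed):
#   gdb -batch -x "$cmds" --args /usr/sbin/rpc.gssd -f -v
```

With this command file, gdb should keep running through the SIG37 events and only return a backtrace when rpc.gssd stops for another reason, such as the segfault seen in the kernel log.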

Comment 4 Steve Dickson 2011-11-07 15:16:09 UTC
It's a bit odd that rpc.gssd is dying in poll... I wonder if it's stepping on memory somewhere due to that error...

Adding our Kerberos guy to hopefully add some insight as to why the error is happening... 

The only difference between 1.2.3-7.el6 and 1.2.3-7.el6_1.1 is
a patch that fixes a parsing problem in the unmount code. So
this probably does exist in 1.2.3-7.el6...

What krb5 version (rpm -q krb5) is installed?
Also, could you possibly share what's in your /etc/krb5.keytab
(klist -ke)?

Comment 5 Rob Henderson 2011-11-07 15:49:11 UTC
Here's the versions of krb5*

# rpm -qa krb5\*
krb5-appl-clients-1.0.1-2.el6_1.1.x86_64
krb5-pkinit-openssl-1.9-9.el6_1.2.x86_64
krb5-server-1.9-9.el6_1.2.x86_64
krb5-appl-servers-1.0.1-2.el6_1.1.x86_64
krb5-auth-dialog-0.13-3.el6.x86_64
krb5-devel-1.9-9.el6_1.2.x86_64
krb5-workstation-1.9-9.el6_1.2.x86_64
krb5-libs-1.9-9.el6_1.2.i686
krb5-server-ldap-1.9-9.el6_1.2.x86_64
krb5-libs-1.9-9.el6_1.2.x86_64

And, here's what's in the keytab:

# klist -ke
Keytab name: WRFILE:/etc/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   3 nfs/robwilco.cs.indiana.edu.EDU (des-cbc-crc) 
   3 nfs/robwilco.cs.indiana.edu.EDU (des-cbc-md5) 
   3 nfs/robwilco.cs.indiana.edu.EDU (arcfour-hmac) 
   3 nfs/robwilco.cs.indiana.edu.EDU (aes256-cts-hmac-sha1-96) 
   3 nfs/robwilco.cs.indiana.edu.EDU (aes128-cts-hmac-sha1-96) 

And, as noted earlier, I have this in the krb5.conf

[libdefaults]
        default_tkt_enctypes = des-cbc-crc aes256-cts-hmac-sha1-96 arcfour-hmac
        allow_weak_crypto = 1

I will just note that this somewhat unusual krb5 config is the only setup that seems to work with the campus ADS server. 

The problem is perfectly repeatable with the rpc.gssd from 1.2.3-7.el6_1.1 and all I have to do to make it work and break is just swap back and forth between the rpc.gssd executable from 1.2.3-7.el6 and 1.2.3-7.el6_1.1.  So, there sure seems to be something different with the new rpc.gssd that is tickling a bug somewhere.

Comment 6 RHEL Program Management 2011-11-11 06:47:07 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 7 Rob Henderson 2011-11-14 17:48:41 UTC
I just rebuilt the previous (working) 1.2.3-7.el6 nfs-utils rpm on a fully patched system and that fails in the exact same way as the newly released 1.2.3-7.el6_1.1.  So, rather than being a problem introduced in 1.2.3-7.el6_1.1 perhaps some other rpm has changed that is affecting how the nfs-utils rpm is built that is the culprit.  ???
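
One way to narrow down which other package changed is to list everything installed or updated since the last known-good date. The sketch below is only an illustration: the installed_since helper is hypothetical, and the cutoff date is an assumed placeholder that would need to be replaced with the actual date the setup last worked.

```shell
#!/bin/sh
# Print packages whose install time (epoch seconds, field 2) is at or after
# the cutoff given as $1. Input on stdin: "<package> <installtime>" per line.
# (installed_since is an illustrative helper name.)
installed_since() {
    awk -v cutoff="$1" '$2 + 0 >= cutoff + 0 { print $1 }'
}

# Only query the RPM database where rpm is available.
if command -v rpm >/dev/null 2>&1; then
    cutoff=$(date -d '2011-10-01' +%s)   # assumed cutoff date; GNU date syntax
    rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH} %{INSTALLTIME}\n' |
        installed_since "$cutoff"
fi
```

Any Kerberos or GSS-API related library in the resulting list would be a natural first candidate to compare between the working and failing systems.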

Comment 9 Richard Smits 2011-12-01 09:30:06 UTC
We have the exact same issue here.
- nfs-utils-1.2.3-7.el6_1.1 fails.
- nfs-utils-1.2.3-7.el6 works.
The messages we are getting in the log are:
---
Nov 30 10:56:37 server kernel: rpc.gssd[7053]: segfault at 1 ip 00007fa00b1217ae sp 00007fff65984c98 error 4 in libgssglue.so.1.0.0[7fa00b11e000+9000]
---
There are also encryption-related errors:
---
KDC has no support for encryption type
---
It is very easy to reproduce here. We have also some crash reports if you are interested.
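
The two kernel segfault lines in this report (the one in the description and the one in this comment) can be compared by subtracting the library's mapping base, shown in brackets, from the instruction pointer. The sketch below does that arithmetic; fault_offset is a hypothetical helper name. As general background, "segfault at 1" is the faulting address and "error 4" on x86 indicates a user-mode read of an unmapped page, which would be consistent with dereferencing a near-NULL pointer, though the thread does not confirm the root cause.

```shell
#!/bin/sh
# Offset of the faulting instruction inside the mapped library:
# instruction pointer minus the mapping base printed in brackets.
# (fault_offset is an illustrative helper name.)
fault_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

fault_offset 0x00007fbe310677ae 0x7fbe31064000   # log line from the description
fault_offset 0x00007fa00b1217ae 0x7fa00b11e000   # log line from this comment
```

Both calls print 0x37ae, so the two sites appear to be crashing at the same instruction in libgssglue.so.1.0.0. With the matching debuginfo installed, that offset could probably be mapped to a source line with a tool such as addr2line.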

Comment 10 Remigiusz Górecki 2011-12-08 18:17:45 UTC
Hi

I have the same problem. I don't know why I didn't find Rob's report, so I filed my own bug: https://bugzilla.redhat.com/show_bug.cgi?id=753841

I think we have the same problem.

Unfortunately, as far as I can see, there isn't any newer package in Red Hat 6.2.

Comment 11 Steve Dickson 2011-12-13 23:52:24 UTC
Would it be possible to make a core available so I can poke around in it?

Also, everyone who is seeing the problem is using a Windows AD as the KDC, correct?

Comment 12 Steve Dickson 2011-12-13 23:58:52 UTC
*** Bug 753841 has been marked as a duplicate of this bug. ***

Comment 13 Richard Smits 2011-12-14 07:40:38 UTC
Yes, we are also using Windows AD as the KDC.
nfs-utils-1.2.3-7.el6_1.1 was not working correctly on Red Hat 6.1.
nfs-utils-1.2.3-15.el6.x86_64 on Red Hat 6.2 seems to be working fine, so the issue seems to be resolved? I have not looked at the changelog yet.

Comment 14 Remigiusz Górecki 2011-12-14 07:51:40 UTC
No, I don't use Windows AD as the KDC. The KDC is running on Red Hat 6.1 - krb5-server-1.9-9.el6_1.2.
It's nice to hear that nfs-utils-1.2.3-15.el6.x86_64 works fine. I'll check it today.

Comment 15 Steve Dickson 2011-12-14 13:22:04 UTC
This bz seems to be similar: 
    https://bugzilla.redhat.com/show_bug.cgi?id=765960

Comment 16 Rob Henderson 2011-12-14 13:34:02 UTC
FYI, we are using Windows AD as the KDC and nfs-utils-1.2.3-15.el6 has resolved the problem for us as well.  Thanks!

Comment 17 Steve Dickson 2011-12-16 14:22:14 UTC
It sounds like the problem is fixed in the -15 release... So I am going
to close this...

