Bug 786463
Summary: | nfs mount hangs when kerberos ticket expires | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jonathan Underwood <jonathan.underwood> |
Component: | kernel | Assignee: | Scott Mayhew <smayhew> |
kernel sub component: | NFS | QA Contact: | JianHong Yin <jiyin> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | unspecified | ||
Priority: | unspecified | CC: | andros, dpal, dros, dwysocha, eguan, jlayton, johnh, kevin.fu, pablo.iranzo, rmainz, rvdwees, rwheeler, smayhew, ssorce, urkedal |
Version: | 6.1 | Keywords: | Reopened |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | kernel-2.6.32-489.el6 | Doc Type: | Bug Fix |
Last Closed: | 2014-10-14 05:08:18 UTC | Type: | --- |
Description
Jonathan Underwood
2012-02-01 14:10:34 UTC
Any comment on this? It seems to be a rather serious security bug?

(In reply to comment #2)
> Any comment on this? It seems to be a rather serious security bug?

We will be looking into this at this year's Connectathon. We are hoping that the new sssd daemon (part of the IPA package) can be configured to renew tickets for long running process and cron jobs.

sssd renewing tickets certainly improves matters, but only so far I think. I'm running against an AD domain, giving us kerberos tickets with a 7 days max renewable lifetime. You've surely still got the same problem at the end of those 7 days?

(In reply to comment #4)
> I'm running against an AD domain, giving us kerberos tickets with a 7 days max
> renewable lifetime. You've surely still got the same problem at the end of
> those 7 days?

Point... Leaving NFS aside... Is there a way today for one entity to get a new kerberos ticket for another entity?

(In reply to comment #5)
> Point... Leaving NFS aside... Is there a way today for one entity to get a new
> kerberos ticket for another entity?

I certainly don't know of any way that could happen. sssd does have a mode whereby it can actually stash your password in the kernel keyring for the case where you authenticate to a cached credential offline so that it can get a ticket later when it does come online. Possibly that can be extended to allow infinite renewal of kerberos tickets? The option is called krb5_store_password_if_offline if that's of any interest. There'd clearly be some mild security concerns, but possibly it'd be acceptable in some cases?

There's two aspects to this bug:
1) Automatic renewal of Kerberos tickets
2) That an expired ticket causes a DOS for all other users (and also fills the log with messages eventually crashing the machine).

Certainly (1) will mitigate (2). But I still think (2) needs solving in the near future, independent of what sssd does to sort out (1).
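A minimal sketch of aspect (1), roughly what a renewal daemon does on a timer: call `kinit -R` at a fixed interval until the KDC refuses. The function name and the default interval are illustrative assumptions, not taken from sssd or any other tool. The loop necessarily ends once the ticket has expired or its renewable lifetime is exhausted, which is the point where aspect (2) takes over.

```shell
#!/bin/sh
# Sketch only: keep renewing the current TGT until renewal is refused.
# renew_until_failure and its 3600 s default are hypothetical.
renew_until_failure() {
    interval=${1:-3600}
    # kinit -R asks the KDC to renew the existing ticket-granting ticket.
    # It fails once the ticket has expired or is past its "renew until" time.
    while kinit -R; do
        sleep "$interval"
    done
    echo "ticket can no longer be renewed; a fresh login is required" >&2
    return 1
}
```

With a 7-day maximum renewable lifetime, as in the AD case described above, this loop returns after at most 7 days whatever interval is chosen.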
[Aside: the kinit (k5init, krenew) package is helpful for starting a long running process and renewing tickets while it runs].

Should this be moved to the kernel component? This DOS is totally killing our machines.

We believe the solution to this will be to move over to the IPA information management service. In the service there is a daemon called sssd that will automatically go renew Kerberos tickets. So the solution will be at the user level.

(In reply to comment #9)
> We believe the solution to this will be to move over to the IPA information
> management service. In the service there is a daemon called sssd that will
> automatically go renew Kerberos tickets. So the solution will be at the user
> level.

I am already using sssd (1.5.1 as shipped with rhel 6.2) which already has this functionality (via the krb5_renew_interval option), but the point is: when a ticket fails to renew through sssd or otherwise, rpcgssd starts consuming large amounts of CPU, the kernel fills the logs with error messages, and no other user can mount their home directories (presumably because rpcgssd has gone into some sort of loop). Result: DOS.

Any update on this - this is a really serious problem here (when already using sssd) - an expired kerberos ticket shouldn't bring down a box - this does need to be fixed kernel side, it is a kernel bug.

(In reply to comment #9)
> We believe the solution to this will be to move over to the IPA information
> management service. In the service there is a daemon called sssd that will
> automatically go renew Kerberos tickets. So the solution will be at the user
> level.

This is *not* a solution, as it's merely reducing your exposure to the problem. Renewal is not infinite, so you're going to hit a point where you can no longer renew and then hit this DOS. With a lot of users on a system you're going to be hitting this rather regularly. I assume the bug will manifest itself at this point.
You just can't blame this on the presence of a ticket, the system needs to cope with it.

Someone just pointed me to a daemon called krenew (kstart-3.16.1.el6). They claim it works well for renewing user credentials... Unfortunately I have not had any cycles to look into it... yet...

(In reply to comment #13)
> Someone just pointed me to a daemon called krenew (kstart-3.16.1.el6). They claim it
> works well for renewing user credentials... Unfortunately I have not had any
> cycles to look into it... yet...

Does this problem occur when a ticket fully expires (i.e. it has reached the end of its renewable lifetime)? If so, no amount of renewing can fix this serious bug.

(In reply to comment #13)
> Someone just pointed me to a daemon called krenew (kstart-3.16.1.el6). They claim it
> works well for renewing user credentials... Unfortunately I have not had any
> cycles to look into it... yet...

Steve, with respect, you're missing the point. The bug here is not with the ticket expiring - that happens. krenew, k5init, sssd etc are userspace ways of renewing the ticket which work for as long as the ticket is renewable. The bug is that when ticket expiration happens, which is a legit thing to happen, it causes a DOS for all other users. That DOS is the bug.

(In reply to comment #15)
> (In reply to comment #13)
> > Someone just pointed me to a daemon called krenew (kstart-3.16.1.el6). They claim it
> > works well for renewing user credentials... Unfortunately I have not had any
> > cycles to look into it... yet...
>
> Steve, with respect, you're missing the point. The bug here is not with the
> ticket expiring - that happens. krenew, k5init, sssd etc are userspace ways of
> renewing the ticket which work for as long as the ticket is renewable. The bug
> is that when ticket expiration happens, which is a legit thing to happen, it
> causes a DOS for all other users. That DOS is the bug.

100% agree.
If I leave my machine logged in and go away for a week, my ticket *will* expire whatever any daemon tries to do to renew it. If the result of that is that nobody can use the shared machine I'm logged in to at the time (which in my case may have 100 users on it) that's not going to go down well. Any mention of renewal is not solving the problem.

(In reply to comment #15)
> (In reply to comment #13)
> > Someone just pointed me to a daemon called krenew (kstart-3.16.1.el6). They claim it
> > works well for renewing user credentials... Unfortunately I have not had any
> > cycles to look into it... yet...
>
> Steve, with respect, you're missing the point. The bug here is not with the
> ticket expiring - that happens. krenew, k5init, sssd etc are userspace ways of
> renewing the ticket which work for as long as the ticket is renewable. The bug
> is that when ticket expiration happens, which is a legit thing to happen, it
> causes a DOS for all other users. That DOS is the bug.

No... I do understand the point... Once the ticket is completely expired there is no way to grant another ticket that can be renewed by the assorted user level daemons. Let's open this up to a wider audience... Andy, Simo any thoughts?

Steve, you certainly need to gracefully handle a case where user credentials are expired. The problem is that it is difficult to handle this properly. We may need a notification mechanism that allows user space to tell the kernel to stop asking for user X and another mechanism by which a login process (either a pam module or sssd) can tell the kernel it can now spam user space again with requests for user X. The reason we need a notification mechanism is that you need to allow access to NFS immediately after login for NFS mounted home dirs, so a time-based negative cache won't work. How to implement this notification mechanism? I do not know at this stage.

Ah another thought, whatever we do it shouldn't be NFS specific as cifs.ko will have exactly the same issue.
Adding Jeff so he can chime in on this as well.

Any progress on this bug? Just tried on latest RHEL 6.5 with kerberos NFS auto home directory, when the user's ticket expires, that user can't login to the server (home nfs mount hung).

Right now I just use the following quick and dirty hourly cron to clean up any expired ticket cache, at least this will allow the user to login again and acquire a new ticket.

for i in `ls /tmp/krb5cc_*`; do KRB5CCNAME=$i klist -l |grep Expired |awk '{print $2}'|cut -d: -f2 ; done | grep krb5cc |xargs rm -f

Can anyone comment if the expired tickets can be cleared from /tmp by a more reliable watch daemon, will this be a viable solution?

(In reply to kfu from comment #22)
> Any progress on this bug? Just tried on latest RHEL 6.5 with kerberos NFS
> auto home directory, when the user's ticket expires, that user can't login
> to the server (home nfs mount hung).
>
> Right now I just use the following quick and dirty hourly cron to clean up any
> expired
> ticket cache, at least this will allow the user to login again and acquire a
> new ticket.
>
> for i in `ls /tmp/krb5cc_*`; do KRB5CCNAME=$i klist -l |grep Expired |awk
> '{print $2}'|cut -d: -f2 ; done | grep krb5cc |xargs rm -f
>
> Can anyone comment if the expired tickets can be cleared from /tmp by a more
> reliable watch daemon, will this be a viable solution?

Well yes and no... From NFS stand point no, when a ticket expires NFS will still hang. But using the sssd daemon from the ipa-client package, which renews users' tickets automatically, does avoid the problem.

(In reply to Steve Dickson from comment #23)
>
> Well yes and no... From NFS stand point no, when a ticket expires NFS will
> still hang. But using the sssd daemon from the ipa-client package, which
> renews users' tickets automatically, does avoid the problem.

Avoid, or just extend from being a 12 hour to a 7 day issue?

(In reply to John Hodrien from comment #24)
> (In reply to Steve Dickson from comment #23)
> >
> > Well yes and no... From NFS stand point no, when a ticket expires NFS will
> > still hang. But using the sssd daemon from the ipa-client package, which
> > renews users' tickets automatically, does avoid the problem.
>
> Avoid, or just extend from being a 12 hour to a 7 day issue?

The sssd daemon, part of the ipa-client package, will renew tickets automatically for users that are created by ipa user-add. See slides 17 through 19 in http://people.redhat.com/steved/Summits/Summit13/Summit_Handout13.pdf

(In reply to Steve Dickson from comment #25)
>
> The sssd daemon, part of the ipa-client package, will renew tickets
> automatically for users that are created by ipa user-add.
> See slides 17 through 19 in
> http://people.redhat.com/steved/Summits/Summit13/Summit_Handout13.pdf

Yes, up to the maximum renewal time of the ticket (as shown by klist), which for Active Directory [probably the most common case], is not more than 7 days.

I'm not sure how much further we can go with this bz since nfs-utils itself does not have a utility to renew tickets. So I'm going to close this as CANTFIX.

(In reply to Steve Dickson from comment #27)
> I'm not sure how much further we can go with this bz since
> nfs-utils itself does not have a utility to renew tickets.
> So I'm going to close this as CANTFIX.

This has nothing to do with not being able to renew, and everything to do with the behaviour being crappy when it does (since ticket expiration is just a fact of life). But if the kernel / nfs-utils can't cope with an expired ticket, then yes, we're stuffed.

For the case of user auto mount (nfs4 sec=krb5p) in an IPA domain, if we can clean up expired tickets (after the renewable life is exhausted) in /tmp/krb5cc_* on the client, at least it should allow the user to log in again and acquire a new ticket, thus regaining the nfs4 home auto mount.
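The cleanup idea discussed above could be sketched a little more defensively than the one-liner. This version assumes MIT Kerberos `klist`, whose `-s` flag prints nothing and exits non-zero when a cache is expired or unreadable, and FILE caches named `/tmp/krb5cc_*`; both function names are hypothetical. It would have to run as root to read other users' caches, and it only clears the stale cache so that the next login can obtain a fresh ticket. It does not address the hang itself.

```shell
#!/bin/sh
# Sketch only: list and remove expired Kerberos FILE caches.
# Assumes MIT klist; list_expired_caches and remove_expired_caches
# are hypothetical helper names.

# Print the path of every expired credential cache in a directory.
list_expired_caches() {
    dir=${1:-/tmp}
    for cache in "$dir"/krb5cc_*; do
        # Skip an unmatched glob, directories, and symlinks.
        [ -f "$cache" ] && [ ! -L "$cache" ] || continue
        # klist -s is silent and exits non-zero for an expired
        # or unreadable cache.
        if ! klist -s "FILE:$cache"; then
            printf '%s\n' "$cache"
        fi
    done
}

# Remove expired caches; pass -n as the second argument for a dry run.
remove_expired_caches() {
    list_expired_caches "$1" | while IFS= read -r cache; do
        if [ "${2:-}" = "-n" ]; then
            printf 'would remove %s\n' "$cache"
        else
            rm -f -- "$cache"
        fi
    done
}
```

An hourly cron job could call `remove_expired_caches /tmp`, ideally after a trial with `remove_expired_caches /tmp -n` to see what would be deleted.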
For those "nfs4 -o sec=krb5p" mounts in /etc/fstab, the above won't help to regain the nfs mount, but my understanding is to get a keytab of nfs/client.fqdn and put it on the nfs server's keytab to avoid ticket renewal. Are the above thoughts correct?

In latest Fedora/RHEL versions there is a component called GSS-proxy that is created to solve among others this problem too. If given a keytab it can be configured to renew the ticket on behalf of the user indefinitely as it can use constrained delegation if this is required/configured.

https://ssimo.org/slides/devconf-2013-gss-proxy.pdf
http://fedoraproject.org/wiki/Features/gss-proxy
http://k5wiki.kerberos.org/wiki/Projects/ProxyGSSAPI
https://fedorahosted.org/gss-proxy/

(In reply to Dmitri Pal from comment #30)
> In latest Fedora/RHEL versions there is a component called GSS-proxy that is
> created to solve among others this problem too. If given a keytab it can be
> configured to renew the ticket on behalf of the user indefinitely as it can
> use constrained delegation if this is required/configured.

If given a user keytab, sure. But that's a decidedly atypical case. If it's still the case that it behaves as the original reporter describes, none of these solutions address the problem, and you're still in a pretty grim place.

I think just revisit this with RHEL7 and go on from there.

(In reply to John Hodrien from comment #31)
> (In reply to Dmitri Pal from comment #30)
> > In latest Fedora/RHEL versions there is a component called GSS-proxy that is
> > created to solve among others this problem too. If given a keytab it can be
> > configured to renew the ticket on behalf of the user indefinitely as it can
> > use constrained delegation if this is required/configured.
>
> If given a user keytab, sure. But that's a decidedly atypical case. If
> it's still the case that it behaves as the original reporter describes, none
> of these solutions address the problem, and you're still in a pretty grim
> place.
I think just revisit this with RHEL7 and go on from there.

You can use a user keytab. For sure that would work. But I was talking about a different use case. You can have a keytab issued for GSS proxy. GSS proxy can be configured to do constrained delegation (subject to server side policy enforcement). This means that GSS proxy can be told to use s4u2self + s4u2proxy. What happens is that it will first acquire a ticket on behalf of the user for itself (s4u2self) and then use it to acquire ticket for NFS server (s4u2proxy). If GSS proxy is configured to do this (impersonate user) and KDC policies (if any) allow this to happen the user ticket will be acquired on demand when needed solving the issue of ticket expiration. Please try RHEL7.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

Patch(es) available on kernel-2.6.32-489.el6

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1392.html