Bug 853558 - [sssd[krb5_child[PID]]]: Credential cache directory /run/user/UID/ccdir does not exist
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: sssd
Version: 18
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Assigned To: Jakub Hrozek
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Duplicates: 890062
Reported: 2012-08-31 16:41 EDT by John Florian
Modified: 2013-02-09 06:23 EST (History)
Doc Type: Bug Fix
Last Closed: 2013-02-09 06:23:36 EST
Type: Bug


Attachments
sssd_$DOMAIN log at debug_level 8 (138.05 KB, text/plain)
2012-08-31 16:41 EDT, John Florian
krb5_child.log messages resulting from running "sudo date" (2.80 KB, text/plain)
2012-09-11 11:38 EDT, John Florian

Description John Florian 2012-08-31 16:41:22 EDT
Created attachment 608698
sssd_$DOMAIN log at debug_level 8

Description of problem:
sudo is failing in Fedora 18 (development) with identities via LDAP and authentication via Kerberos.

Version-Release number of selected component (if applicable):
# rpm -qa | egrep 'krb5|systemd|sssd'
systemd-libs-188-3.fc18.i686
systemd-188-3.fc18.i686
systemd-sysv-188-3.fc18.i686
sssd-1.9.0-19.fc18.beta6.i686
pam_krb5-2.3.14-3.fc18.i686
sssd-client-1.9.0-19.fc18.beta6.i686
krb5-libs-1.10.2-7.fc18.i686

How reproducible:
always

Actual results:
User POV:
sudo date
[sudo] password for my_user_name: 
Sorry, try again.
[sudo] password for my_user_name: 
sudo: 1 incorrect password attempt

Log POV:
==> /var/log/messages <==
Aug 31 13:39:29 test-host [sssd[krb5_child[10593]]]: Credential cache directory /run/user/my_uid/ccdir does not exist

==> /var/log/secure <==
Aug 31 13:39:29 test-host sudo: pam_sss(sudo:auth): system info: [Credential cache directory /run/user/my_uid/ccdir does not exist]

No AVCs reported.

See attachment.
Comment 1 Nalin Dahyabhai 2012-09-04 17:39:06 EDT
krb5 1.10 doesn't create the directory for applications.  Could SSSD be getting tripped up by that?

The Kerberos libraries will start to create the final component of the path, if necessary, in krb5 1.11, but krb5 1.11 is slated to land during the F19 cycle, so I would recommend against depending on it.

There are also some leftovers in the directory after krb5_cc_destroy() is used to remove it, but that's probably less of an issue here.
Comment 2 Dmitri Pal 2012-09-06 17:42:05 EDT
Upstream ticket:
https://fedorahosted.org/sssd/ticket/1512
Comment 3 Jakub Hrozek 2012-09-10 17:00:56 EDT
(In reply to comment #1)
> krb5 1.10 doesn't create the directory for applications.  Could SSSD be
> getting tripped up by that?
> 
> The Kerberos libraries will start to create the final component of the path,
> if necessary, in krb5 1.11, but krb5 1.11 is slated to land during the F19
> cycle, so I would recommend against depending on it.
> 

The SSSD would attempt to create the last directory component of the dircache if it doesn't exist (and only the last component, we don't pretend we're mkdir -p)

> There are also some leftovers in the directory after krb5_cc_destroy() is
> used to remove it, but that's probably less of an issue here.
Comment 4 Jakub Hrozek 2012-09-10 17:31:00 EDT
You are running sudo you I suspect you were able to log in fine with that user. Can you paste the output of "klist" when logged in?

I assume the login was performed offline when the LDAP server was down. This debug message indicates that it was:
(Fri Aug 31 16:18:33 2012) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]

What is the value of cache_credentials in your sssd.conf? Does the same error happen even if the remote server is up?

Can you also attach the file /var/log/sssd/krb5_child.log ?
Comment 5 Jakub Hrozek 2012-09-10 17:32:21 EDT
(In reply to comment #4)
> You are running sudo you I suspect you were able to log in fine with that
> user. Can you paste the output of "klist" when logged in?
> 

Err, sorry, my English skills have apparently reached a new low. The sentence should have read "You are running sudo so I suspect you were able to log in fine with as that user".
Comment 6 Nalin Dahyabhai 2012-09-10 18:34:19 EDT
(In reply to comment #3)
> (In reply to comment #1)
> > krb5 1.10 doesn't create the directory for applications.  Could SSSD be
> > getting tripped up by that?
> > 
> > The Kerberos libraries will start to create the final component of the path,
> > if necessary, in krb5 1.11, but krb5 1.11 is slated to land during the F19
> > cycle, so I would recommend against depending on it.
> 
> The SSSD would attempt to create the last directory component of the
> dircache if it doesn't exist (and only the last component, we don't pretend
> we're mkdir -p)

Good, that's in line with what 1.11 will be doing.
Comment 7 Jakub Hrozek 2012-09-11 05:19:02 EDT
(In reply to comment #6)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > krb5 1.10 doesn't create the directory for applications.  Could SSSD be
> > > getting tripped up by that?
> > > 
> > > The Kerberos libraries will start to create the final component of the path,
> > > if necessary, in krb5 1.11, but krb5 1.11 is slated to land during the F19
> > > cycle, so I would recommend against depending on it.
> > 
> > The SSSD would attempt to create the last directory component of the
> > dircache if it doesn't exist (and only the last component, we don't pretend
> > we're mkdir -p)
> 
> Good, that's in line with what 1.11 will be doing.

OK, I checked the SSSD code again after reading the discussion in #848228 and I was wrong about calling mkdir. We *do* create the whole directory, we just attempt to be clever:
   1) the SSSD expands the template specified by krb5_ccname_template
      and krb5_ccachedir to get an absolute path name
   2) then the SSSD process (running as root) creates the path up to the
      last component
   3) the last component is created when the ccache is primed in a separate
      process running with user's credentials mostly to avoid TOCTOU
      vulnerabilities between creating the directory and chowning it.

We've been creating the directory also in order to get around the problem of systemd creating the directory too late during session, whereas we need the ccache to be stored in a directory during the auth phase.
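
To illustrate the three steps above, here is a minimal Python sketch of the split between "create the parents" and "create the last component". It is not SSSD code: the function names, the 0755/0700 modes, and the UID 1000 in the example path are assumptions made for illustration, and the privilege switch between the two steps is only described in comments rather than performed.

```python
import os
import tempfile


def create_parent_path(ccache_dir):
    """Step 2: create every component except the last one.

    In the flow described above this part would run as root.
    """
    parent = os.path.dirname(ccache_dir.rstrip("/"))
    os.makedirs(parent, mode=0o755, exist_ok=True)
    return parent


def create_last_component(ccache_dir):
    """Step 3: create only the final directory.

    In the flow described above this part would run with the user's
    credentials, so the directory belongs to the user from the start and
    no chown is needed afterwards. os.mkdir() does not create missing
    parents, so this raises FileNotFoundError if step 2 was skipped.
    """
    try:
        os.mkdir(ccache_dir, 0o700)
    except FileExistsError:
        # An existing directory is fine; anything else in the way is not.
        if not os.path.isdir(ccache_dir):
            raise


if __name__ == "__main__":
    # Use a scratch directory instead of the real /run/user tree.
    with tempfile.TemporaryDirectory() as root:
        ccache_dir = os.path.join(root, "run", "user", "1000", "ccdir")
        create_parent_path(ccache_dir)
        create_last_component(ccache_dir)
        print(os.path.isdir(ccache_dir))
```

Keeping the last mkdir in the unprivileged step is what avoids the window between creating a directory as root and chowning it to the user.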
Comment 8 John Florian 2012-09-11 11:37:10 EDT
(In reply to comment #4)
> You are running sudo you I suspect you were able to log in fine with that
> user. Can you paste the output of "klist" when logged in?
[translated ...  :-)]
> "You are running sudo so I suspect you were able to log in fine with as that user".

Yes, my normal login (via ssh) worked/works just fine.
 
> I assume the login was performed offline when the LDAP server was down. This
> debug message indicates that it was:
> (Fri Aug 31 16:18:33 2012) [sssd[be[default]]] [sdap_process_result]
> (0x0040): ldap_result error: [Can't contact LDAP server]

I have to agree with your assessment, but would be surprised by that.  However, I cannot confirm/deny it at this point.  I can say that this LDAP server is a rather critical corporate component so there should have been lots of complaining if it had been down.  Also, I would have required the same server for my ssh login only moments earlier and for identity purposes with my home directory and given the ridiculous amount of crap^Wcustomization in my .bashrc I can tell you that my user login experience is frightful when $HOME is fouled up by LDAP problems -- none of which I recall in this case.
 
> What is the value of cache_credentials in your sssd.conf?

Hmmm, not what I seem to recall:
# grep cache_credentials /etc/sssd/sssd.conf 
cache_credentials = True

I would also say this has been true for the life of the host too, since I have puppet managing all of this and those resources are under git and I've not changed the master sssd.conf since 2011-12-05.

Given that caching is enabled, maybe there was a short-term problem with the LDAP server, or it was being administered with a ham-fisted approach.  There are many admin-capable folks in my team and inter-personnel communications could be much better so it's really impossible for me to say for certain.

> Does the same error happen even if the remote server is up?

I'm not seeing it now, so I'd say no.

> Can you also attach the file /var/log/sssd/krb5_child.log ?

Coming right up.
Comment 9 John Florian 2012-09-11 11:38:28 EDT
Created attachment 611834
krb5_child.log messages resulting from running "sudo date"
Comment 10 John Florian 2012-09-11 11:49:10 EDT
Just applied lots of updates (after posting the above messages) and sudo seems to be working for me now.  Packages that look involved are:

# rpm -qa --last | awk '/sss|sudo|pam.*Tue 11 Sep 2012/' 
sudo-1.8.6-1.fc18.i686                        Tue 11 Sep 2012 11:43:55 AM EDT
libsss_sudo-1.9.0-21.fc18.beta7.i686          Tue 11 Sep 2012 11:43:35 AM EDT
sssd-1.9.0-21.fc18.beta7.i686                 Tue 11 Sep 2012 11:40:47 AM EDT
libsss_idmap-1.9.0-21.fc18.beta7.i686         Tue 11 Sep 2012 11:40:46 AM EDT
sssd-client-1.9.0-21.fc18.beta7.i686          Tue 11 Sep 2012 11:40:41 AM EDT

Thanks guys!
Comment 11 Nalin Dahyabhai 2012-09-13 11:24:02 EDT
(In reply to comment #7)
> OK, I checked the SSSD code again after reading the discussion in #848228
> and I was wrong about calling mkdir. We *do* create the whole directory, we
> just attempt to be clever:
>    1) the SSSD expands the template specified by krb5_ccname_template
>       and krb5_ccachedir to get an absolute path name
>    2) then the SSSD process (running as root) creates the path up to the
>       last component
>    3) the last component is created when the ccache is primed in a separate
>       process running with user's credentials mostly to avoid TOCTOU
>       vulnerabilities between creating the directory and chowning it.
> 
> We've been creating the directory also in order to get around the problem of
> systemd creating the directory too late during session, whereas we need the
> ccache to be stored in a directory during the auth phase.

I've resigned myself to having to do the same thing in pam_krb5.
Comment 12 Jakub Hrozek 2012-12-06 14:25:07 EST
There has been no activity here for quite some time, the setup seems to work now for the original reporter, and the corresponding upstream ticket has been closed as WORKSFORME as well.
Comment 13 Jason Tibbitts 2013-01-23 14:59:41 EST
I'm running into this as well, and after an extensive debugging session on IRC, sgallagh asked me to reopen this ticket.  I can trivially reproduce this and would be happy to provide any information you request.
Comment 14 Jason Tibbitts 2013-01-23 15:01:27 EST
I'm using stock F18 with recent updates:

systemd-197-1.fc18.1.x86_64
sssd-1.9.3-1.fc18.x86_64
pam_krb5-2.4.1-1.fc18.x86_64
krb5-libs-1.10.3-5.fc18.x86_64
Comment 15 Stephen Gallagher 2013-01-23 15:11:05 EST
I haven't been able to figure out why the directory isn't available in the first place for Jason, but I can trivially reproduce the effects as follows:

1) Log in to the system after a fresh boot. (This part works for me just fine).
2) rm -Rf /run/user/$UID/krb5cc
3) Attempt to 'su - <username>'

Expected results:
The krb5cc directory should be recreated by SSSD and the user should be logged in.

Actual results:
(Wed Jan 23 14:27:46 2013) [[sssd[krb5_child[9809]]]] [create_ccache_in_dir] (0x0040): 495: [-1765328189][Credential cache directory /run/user/13041/krb5cc does not exist]

It looks like the problem is that, if the user's ccache name has the form DIR::/path/to/ccfile (note the two colons, and that the path ends with the file rather than the directory), SSSD assumes that the directory MUST already exist. It then jumps to krb5_cc_resolve(), which returns KRB5_FCC_NOFILE, and we fail.
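
For anyone decoding similar log lines, here is a rough Python sketch (not part of SSSD) that pulls the fields out of a krb5_child debug line like the one quoted under "Actual results". The field names are my own labels, and treating the number before the error code (495 in that line) as a source line number is an assumption. The numeric code -1765328189 in that line is presumably the KRB5_FCC_NOFILE returned by krb5_cc_resolve().

```python
import re

# Matches the tail of an SSSD debug line:
#   [function] (0xLEVEL): NNN: [code][message]
DEBUG_TAIL = re.compile(
    r"\[(?P<function>\w+)\] "
    r"\((?P<level>0x[0-9a-fA-F]+)\): "
    r"(?P<number>\d+): "
    r"\[(?P<code>-?\d+)\]\[(?P<message>.*)\]$"
)


def parse_krb5_child_line(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = DEBUG_TAIL.search(line.rstrip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["level"] = int(fields["level"], 16)    # debug level bitmask
    fields["number"] = int(fields["number"])      # assumed: source line number
    fields["code"] = int(fields["code"])          # krb5 error code
    return fields


if __name__ == "__main__":
    sample = (
        "(Wed Jan 23 14:27:46 2013) [[sssd[krb5_child[9809]]]] "
        "[create_ccache_in_dir] (0x0040): 495: [-1765328189]"
        "[Credential cache directory /run/user/13041/krb5cc does not exist]"
    )
    print(parse_krb5_child_line(sample))
```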

So there are three things contributing here:
1) Something has apparently removed this directory from Jason's running system. I have no idea what caused this.
2) The LDB cache is storing the DIR::/path/to/ccfile version of the cache instead of DIR:/path/to/cachedir. I'm not sure if that's intentional.
3) If the DIR:: (two-colon) version of the ccache is passed to create_ccache_in_dir(), we attempt to reuse the file in a nonexistent directory and fail.


My recommendation here is that we should check for KRB5_FCC_NOFILE as a return code from krb5_cc_resolve() and switch to attempting to create the directory. This probably means linking to libpath_utils from ding-libs and taking advantage of the get_basename() routine so we can truncate the file at the end of the DIR:: path.
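
The recommendation above can be sketched in Python as follows. This is only an illustration of the idea, not the actual fix, which would live in the C code around krb5_cc_resolve(). The function names are hypothetical, the file name "tkt" in the usage example is made up, and os.path.dirname() stands in for whatever libpath_utils routine ends up stripping the file name from the DIR:: path.

```python
import os
import tempfile


def ccache_directory(ccname):
    """Return the directory behind a DIR-type ccache name.

    "DIR:/path/to/cachedir"  names the directory itself.
    "DIR::/path/to/ccfile"   names one file inside the directory.
    """
    if ccname.startswith("DIR::"):
        return os.path.dirname(ccname[len("DIR::"):])
    if ccname.startswith("DIR:"):
        return ccname[len("DIR:"):].rstrip("/")
    raise ValueError("not a DIR ccache name: %r" % ccname)


def ensure_ccache_directory(ccname):
    """Recreate the ccache directory if it has gone missing.

    Only the last component is created, so the parent must already exist.
    Returns True if the directory had to be created.
    """
    directory = ccache_directory(ccname)
    try:
        os.mkdir(directory, 0o700)
    except FileExistsError:
        if not os.path.isdir(directory):
            raise
        return False
    return True


if __name__ == "__main__":
    # Scratch directory standing in for /run/user/$UID.
    with tempfile.TemporaryDirectory() as runtime_dir:
        name = "DIR::" + os.path.join(runtime_dir, "krb5cc", "tkt")
        print(ensure_ccache_directory(name))  # directory was missing
        print(ensure_ccache_directory(name))  # directory now exists
```

Both forms of the name resolve to the same directory, so the caller can fall back to creating it instead of failing when the cached DIR:: name points into a directory that has been removed.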
Comment 16 Stephen Gallagher 2013-01-23 15:17:15 EST
Jason noted on IRC that it appears that he suffers this issue after he has logged out of all sessions of his user on the system. That's believable, as I think systemd now purges /run/user/$UID when the last session is terminated.

This just makes it even more urgent to ensure that we can recreate this cache.
Comment 17 Jakub Hrozek 2013-01-24 00:30:21 EST
*** Bug 890062 has been marked as a duplicate of this bug. ***
Comment 18 Jakub Hrozek 2013-01-24 00:31:33 EST
Thank you for the extensive information in comments #15 and #16. That should be enough for a fix.

I've reopened upstream #1512 as well.
Comment 19 Jason Tibbitts 2013-01-24 14:01:16 EST
Just a note; Stephen asked that I try after a reboot, and I went ahead and tried it.  Unfortunately it seems to be random; on some boots the problem appears and on some it doesn't.  I'm at a complete loss as to why.  /run is on tmpfs so it can't be holding any state.
Comment 20 Fedora Update System 2013-01-30 09:15:26 EST
sssd-1.9.4-2.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/sssd-1.9.4-2.fc18
Comment 21 Fedora Update System 2013-02-01 12:11:11 EST
Package sssd-1.9.4-2.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sssd-1.9.4-2.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-1795/sssd-1.9.4-2.fc18
then log in and leave karma (feedback).
Comment 22 Fedora Update System 2013-02-09 06:23:38 EST
sssd-1.9.4-2.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
