Bug 756428 - sssd.service does not depend on time-sync.target, breaks krb5 and such
Summary: sssd.service does not depend on time-sync.target, breaks krb5 and such
Alias: None
Product: Fedora
Classification: Fedora
Component: sssd
Version: 16
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Stephen Gallagher
QA Contact: Fedora Extras Quality Assurance
Depends On:
Reported: 2011-11-23 14:34 UTC by Sandro Mathys
Modified: 2020-05-02 16:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2012-03-05 21:23:48 UTC
Type: ---

Attachments
secure log of what's described in comment #6 (2.13 KB, text/plain)
2011-11-24 08:13 UTC, Sandro Mathys
syslog of what's described in comment #6 (1.24 KB, text/plain)
2011-11-24 08:13 UTC, Sandro Mathys
systemctl status sssd.service output as described in comment #6 (1.07 KB, application/octet-stream)
2011-11-24 08:14 UTC, Sandro Mathys

System ID Private Priority Status Summary Last Updated
Github SSSD sssd issues 2138 0 None None None 2020-05-02 16:41:20 UTC

Description Sandro Mathys 2011-11-23 14:34:43 UTC
Description of problem:
sssd.service should depend on time-sync.target to make sure the system has the correct time. Otherwise krb5 and probably other authentication mechanisms won't work.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Configure sssd to use krb5 authentication
2. Make sure your hwclock is off by an hour or so
3. Configure ntpd/ntpdate or chronyd/chrony-wait
4. Reboot
5. Try to log in
Actual results:
Authentication fails

Expected results:
Authentication works

Additional info:
/lib/systemd/system/sssd.service needs to be changed to read:
After=syslog.target time-sync.target
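
For illustration only, a minimal Python sketch that checks whether a unit's [Unit] section orders it after time-sync.target. The two unit texts are trimmed stand-ins, not the real contents of sssd.service: the "assumed current" one is inferred from the proposed change above, and the helper name ordered_after is made up for this example.

```python
import configparser

def ordered_after(unit_text: str, target: str) -> bool:
    """Return True if the After= list in [Unit] names `target`."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case; systemd keys are case-sensitive
    parser.read_string(unit_text)
    after = parser.get("Unit", "After", fallback="")
    return target in after.split()

# Assumption: what the shipped unit presumably contains today.
ASSUMED_CURRENT = """\
[Unit]
After=syslog.target
"""

# The line proposed in this report.
PROPOSED = """\
[Unit]
After=syslog.target time-sync.target
"""

print(ordered_after(ASSUMED_CURRENT, "time-sync.target"))
print(ordered_after(PROPOSED, "time-sync.target"))
```

Two caveats: After= only orders startup, it does not by itself pull time-sync.target in, and configparser keeps just the last After= line where systemd would merge repeated ones, so this check is only meaningful for simple unit files.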

Comment 1 Stephen Gallagher 2011-11-23 14:45:43 UTC
This is a bit of a chicken and egg problem. time-sync.target (IIRC) is dependent on the network target. SSSD has to be available to handle requests before the network comes up. SSSD is capable of handling this because it has built-in caching (unlike nss_ldap which would in many cases hang for minutes during startup).

What you should be dealing with here is a race-condition. If you try to log in during the period where the network is up, but ntp is not yet started, then SSSD will contact the KDC and get an error about clock skew. But once ntp adjusts the time, logins should work correctly. This race-condition should be *very* small.

If you're seeing this always fail, then there's a bug in NTP/chrony, not SSSD.
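
To make the clock-skew error concrete, here is a rough Python sketch of the comparison a KDC effectively applies. It assumes the MIT krb5 default tolerance of 300 seconds (the clockskew setting in krb5.conf); the function name and the sample timestamps are illustrative and do not come from the reporter's logs.

```python
from datetime import datetime, timedelta

# Assumption: MIT krb5 default clockskew of 300 seconds.
DEFAULT_CLOCKSKEW = timedelta(seconds=300)

def within_skew(client_time, kdc_time, tolerance=DEFAULT_CLOCKSKEW):
    """Return True if the two clocks differ by no more than `tolerance`."""
    return abs(client_time - kdc_time) <= tolerance

kdc = datetime(2011, 11, 23, 14, 0, 0)       # illustrative KDC clock
before_sync = kdc + timedelta(hours=1)       # hwclock off by an hour
after_sync = kdc + timedelta(seconds=2)      # after ntpd/chronyd stepped the clock

print(within_skew(before_sync, kdc))
print(within_skew(after_sync, kdc))
```

With an hour of drift the request falls far outside the tolerance, so anything SSSD attempts against the KDC before the clock is corrected would be rejected.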

Comment 2 Simo Sorce 2011-11-23 15:00:14 UTC
Stephen, what you say is right, but then we need to patch sssd to use offline authentication if we get a clock skew error. Failing authentication in this case is not the expected outcome.

I think we should open a ticket in sssd's trac if we fail authentication on clock skew.

Comment 3 Sandro Mathys 2011-11-23 15:05:51 UTC
chrony-wait.service does work, i.e. the time is corrected within a couple of seconds. The system's time is correct when I try to log in, but authentication will fail.

As soon as I restart sssd, authentication works.

Reproducible, always.

Now tell me where this is a chrony bug and not a SSSD bug.

Comment 4 Stephen Gallagher 2011-11-23 15:13:05 UTC
Upstream ticket:

Comment 5 Stephen Gallagher 2011-11-23 15:28:24 UTC
If the system time is correct, but SSSD fails then something else is going on here.

Are you using cached credentials? Also, are you using GSSAPI to connect to the LDAP provider (or using the IPA provider)?

What shows up in /var/log/secure when this happens?

Comment 6 Sandro Mathys 2011-11-24 08:12:31 UTC
We don't cache credentials as the user homes are on a CIFS share so login in offline state is useless in this case anyway.

Yes, we are using GSSAPI to connect to the ldap provider.

I'll attach the secure log, the syslog (grep'ed for sss and chrony) and the output of two 'systemctl status sssd.service'.

I did the following:
1) reboot
2) login as (remote user) sandroma -> fail (3 times, just to make sure)
3) login as (local user) root
4) systemctl status sssd.service
5) systemctl restart sssd.service
6) login as sandroma -> success
7) systemctl status sssd.service

If you put the information of those logs into one timeline, you'll clearly see chrony has synced the time well ahead of the user trying to log in. Oh, in case it matters somehow: I logged in on a tty, even though gdm was started.
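
One possible way to build such a timeline, sketched in Python. It assumes classic syslog timestamps ("Nov 24 09:00:05", no year, English month names), so a year has to be supplied by hand. The sample lines are placeholders invented for this example, not excerpts from the attached logs, and the function names are hypothetical.

```python
from datetime import datetime

def parse_stamp(line, year=2011):
    """Parse the leading 15-character syslog timestamp; the year is assumed."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def merge_timeline(*logs):
    """Merge (name, lines) pairs into one list sorted by timestamp."""
    tagged = [(parse_stamp(line), name, line)
              for name, lines in logs
              for line in lines]
    return sorted(tagged)

# Placeholder lines, invented for illustration only.
syslog_lines = ["Nov 24 09:00:05 host svc-time: placeholder: clock synchronised"]
secure_lines = ["Nov 24 09:01:10 host svc-login: placeholder: login failed"]

timeline = merge_timeline(("secure", secure_lines), ("syslog", syslog_lines))
for stamp, name, line in timeline:
    print(f"{name:7} {line}")
```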

Comment 7 Sandro Mathys 2011-11-24 08:13:31 UTC
Created attachment 535807
secure log of what's described in comment #6

Comment 8 Sandro Mathys 2011-11-24 08:13:52 UTC
Created attachment 535808
syslog of what's described in comment #6

Comment 9 Sandro Mathys 2011-11-24 08:14:31 UTC
Created attachment 535809
systemctl status sssd.service output as described in comment #6

Comment 10 Stephen Gallagher 2011-11-28 12:30:01 UTC
Sorry for the long delay. It was a holiday in the US.

Sandro, could you set
debug_level = 9

in the [domain/<DOMAINNAME>] section of /etc/sssd/sssd.conf and then reboot to rerun the failing test?

Then please attach /var/log/sssd/sssd_<DOMAINNAME>.log and /var/log/sssd/ldap_child.log to this ticket (sanitized if needed).

Also, could you please try one more thing with your test? Please try waiting five minutes after the first failure before attempting a second login. I have a hunch that what may be happening is this:

SSSD comes up before NTP starts.
Something asks SSSD for a lookup before NTP has started.
SSSD tries to connect to LDAP and is denied the GSSAPI bind, so SSSD goes into "offline mode" to answer requests from the cache for two minutes.
The system finishes booting and you try to log in before those two minutes are complete. Since you don't have cached credentials, SSSD has to return failure (since from its perspective, there's no way to validate you until we return to online mode).
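
The sequence above can be sketched as a toy model in Python. This is not SSSD's code: the class, its methods, and the timestamps are invented for illustration, and the two-minute offline window is simply the figure mentioned in the hypothesis.

```python
OFFLINE_WINDOW = 120  # seconds; the "two minutes" from the hypothesis

class ToyBackend:
    """Toy stand-in for a backend that goes offline after a failed bind."""

    def __init__(self, has_cached_credentials):
        self.has_cached_credentials = has_cached_credentials
        self.offline_until = 0.0

    def lookup(self, now, clock_synced):
        """Return True if the request could be served online."""
        if now < self.offline_until:
            return False  # still inside the offline window, no retry yet
        if not clock_synced:
            # GSSAPI bind rejected (clock skew): go offline for a while.
            self.offline_until = now + OFFLINE_WINDOW
            return False
        return True

    def login(self, now, clock_synced):
        if self.lookup(now, clock_synced):
            return "online"
        return "cached" if self.has_cached_credentials else "fail"

backend = ToyBackend(has_cached_credentials=False)
backend.lookup(now=5, clock_synced=False)          # early lookup, before NTP
early = backend.login(now=60, clock_synced=True)   # clock fixed, still offline
late = backend.login(now=300, clock_synced=True)   # offline window has expired
print(early, late)
```

In this model the login at t=60 fails even though the clock is already correct, and the one at t=300 succeeds, which is the pattern the five-minute wait is meant to confirm or rule out.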

I should point out that in this configuration, the cached credentials would still be valuable, as you are technically "online" in a network sense, but "offline" from SSSD's perspective.

Anyway, I want to rule out whether this is a timing issue or somehow we're entering an offline state from which we will never return (which is a serious issue).

Thanks for your help sorting this out.
