Bug 1015308 - Cannot log in offline with LDAP account
Cannot log in offline with LDAP account
Status: CLOSED EOL
Product: Fedora
Classification: Fedora
Component: sssd (Show other bugs)
19
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Jakub Hrozek
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-10-03 17:37 EDT by Jason Elwell
Modified: 2015-01-19 09:04 EST (History)
11 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-12-30 10:50:36 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
My sssd config file (1.48 KB, text/plain)
2013-10-03 17:37 EDT, Jason Elwell
no flags Details

  None (edit)
Description Jason Elwell 2013-10-03 17:37:08 EDT
Created attachment 807299 [details]
My sssd config file

Description of problem:
Cannot log in with LDAP account through GDM or TTY if not connected to LAN.

Version-Release number of selected component (if applicable):
sssd-proxy-1.11.1-1.fc19.x86_64
sssd-ipa-1.11.1-1.fc19.x86_64
sssd-common-1.11.1-1.fc19.x86_64
sssd-ad-1.11.1-1.fc19.x86_64
sssd-client-1.11.1-1.fc19.i686
sssd-client-1.11.1-1.fc19.x86_64
sssd-krb5-common-1.11.1-1.fc19.x86_64
sssd-1.11.1-1.fc19.x86_64
sssd-ldap-1.11.1-1.fc19.x86_64
python-sssdconfig-1.11.1-1.fc19.noarch
sssd-common-pac-1.11.1-1.fc19.x86_64
sssd-krb5-1.11.1-1.fc19.x86_64


How reproducible:
By virtue of not having connectivity to LDAP server, login fails.

Steps to Reproduce:
1.Log in with LDAP account while on lan or VPN. 
2.Reboot.
3.Unplug network cable or otherwise go offline.
4.Try to use LDAP account.
5.See fail.

Actual results:
Login failure

Expected results:
Login success

Additional info:
I use LDAP user info and Kerberos auth.  If I log into my VPN as a local user, and ssh user@localhost, I can then continue with using my LDAP account.  It is more than possible that I am stupid, but I have RTFM, and I'm still flummoxed.  Let me know what you need, if anything.
Comment 1 Dmitri Pal 2013-10-03 18:55:24 EDT
Seems like same as https://bugzilla.redhat.com/show_bug.cgi?id=1015089
Can you please verify that it is in fact the same issue?
Comment 2 Stephen Gallagher 2013-10-03 20:24:44 EDT
(In reply to Dmitri Pal from comment #1)
> Seems like same as https://bugzilla.redhat.com/show_bug.cgi?id=1015089
> Can you please verify that it is in fact the same issue?

No, I don't think the symptoms look alike at all, actually.


Jason, first of all: I assume you have
cache_credentials = True
in the [domain/DOMAINNAME] section of /etc/sssd/sssd.conf?

It sounds to me like something is failing during offline authentication, which could be any of a number of things. The first thing to check would be /var/log/secure at the time of the authentication attempt. Look for lines involving "pam_sss" and include them here.

If that line says something like

pam_sss(sshd:auth): received for user username: 7 (Authentication failure)

Then this probably means it didn't accept the password.


If you see

pam_sss(sshd:auth): received for user username: 4 (System Error)

then it means there's a bug in SSSD. In this case, please add the line
debug_level = 7

to both the [pam] and [domain/DOMAINNAME] sections of sssd.conf, restart SSSD and perform the login attempt again, then attach /var/log/sssd/sssd_pam.log and /var/log/sssd/sssd_DOMAINNAME.log to this ticket as a private attachment.
Comment 3 Jason Elwell 2013-10-04 14:41:02 EDT
(In reply to Stephen Gallagher from comment #2)
> (In reply to Dmitri Pal from comment #1)
> > Seems like same as https://bugzilla.redhat.com/show_bug.cgi?id=1015089
> > Can you please verify that it is in fact the same issue?
> 
> No, I don't think the symptoms look alike at all, actually.
> 
> 
> Jason, first of all: I assume you have
> cache_credentials = True
> in the [domain/DOMAINNAME] section of /etc/sssd/sssd.conf?
> 

Correct!

> It sounds to me like something is failing during offline authentication,
> which could be any of a number of things. The first thing to check would be
> /var/log/secure at the time of the authentication attempt. Look for lines
> involving "pam_sss" and include them here.
> 
> If that line says something like
> 
> pam_sss(sshd:auth): received for user username: 7 (Authentication failure)
> 
> Then this probably means it didn't accept the password.

Have a few of these. :(
> 
> 
> If you see
> 
> pam_sss(sshd:auth): received for user username: 4 (System Error)
> 
> then it means there's a bug in SSSD. In this case, please add the line
> debug_level = 7

I see NONE of these. but an "empty" auth failure is logged.
> 
> to both the [pam] and [domain/DOMAINNAME] sections of sssd.conf, restart
> SSSD and perform the login attempt again, then attach
> /var/log/sssd/sssd_pam.log and /var/log/sssd/sssd_DOMAINNAME.log to this
> ticket as a private attachment.

I didn't follow your directions to the T, __but I will if this yields nothing or at your behest__.  Here is what I came up with, sanitized.  Its as if it cant figure out that I'm offline. The file /var/log/sssd/sssd_pam.log is completely empty.  Logging was already at 7/0x1000.

REPRODUCED LIKE THIS:
restart sssd while on VPN
close VPN connection
ssh LDAPUSER@localhost (even _while_ logged in as LDAPUSER)
Fails with: Permission denied, please try again.

restart sssd while NOT on VPN
login succeeds
However, cannot be reproduced immediately once login succeeds.  Have not found "sweet spot".

/var/log/secure (grep failure /var/log/secure| grep "Oct  4"|grep -v 7\ \()
Oct  4 06:08:32 cosi001n gdm-password]: pam_unix(gdm-password:auth): authentication failure; logname=(unknown) uid=0 euid=0 tty=:0 ruser= rhost=  user=LDAPUSER
Oct  4 11:14:45 cosi001n sshd[11187]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:16:53 cosi001n sshd[11239]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:16:53 cosi001n sshd[11239]: pam_sss(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost user=LDAPUSER
Oct  4 11:17:51 cosi001n sshd[11289]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:20:17 cosi001n sshd[11718]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:27:49 cosi001n sshd[12243]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:29:36 cosi001n sshd[12475]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:30:12 cosi001n sshd[12596]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:31:51 cosi001n sshd[12662]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 11:33:53 cosi001n sshd[12903]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 12:07:51 cosi001n sshd[15197]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER
Oct  4 12:57:22 cosi001n sshd[18910]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost  user=LDAPUSER

/var/log/sssd/sssd_DOMAIN
### boot/service startup:
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [confdb_get_domain_internal] (0x1000): pwd_expiration_warning is -1
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [create_socket_symlink] (0x1000): Symlinking the dbus path /var/lib/sss/pipes/private/sbus-dp_domain.com.870 to a link /var/lib/sss/pipes/private/sbus-dp_domain.com
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Loading backend [ldap] with path [/usr/lib64/sssd/libsss_ldap.so].
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Loading backend [krb5] with path [/usr/lib64/sssd/libsss_krb5.so].
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Backend [krb5] already loaded.
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Backend [ldap] already loaded.
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Backend [ldap] already loaded.
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Backend [ldap] already loaded.
(Fri Oct  4 06:08:25 2013) [sssd[be[domain.com]]] [load_backend_module] (0x1000): Backend [ldap] already loaded.
### loops through these lines MANY times, where serverX is one of the servers I have listed in my config. (already attached)
(Fri Oct  4 06:08:31 2013) [sssd[be[domain.com]]] [get_port_status] (0x1000): Port status of port 389 for server 'serverX.domain.com' is 'neutral'
(Fri Oct  4 06:08:31 2013) [sssd[be[domain.com]]] [get_server_status] (0x1000): Status of server 'serverX.domain.com' is 'name not resolved'
(Fri Oct  4 06:08:31 2013) [sssd[be[domain.com]]] [be_resolve_server_process] (0x1000): Trying with the next one!
(Fri Oct  4 06:08:31 2013) [sssd[be[domain.com]]] [get_server_status] (0x1000): Status of server 'serverX.domain.com' is 'not working'
(Fri Oct  4 06:08:32 2013) [sssd[be[domain.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: 5
### Ends? with:
(Fri Oct  4 06:08:32 2013) [sssd[be[domain.com]]] [check_ccache_re] (0x1000): Ccache directory name [/run/user/####] does not contain illegal patterns.
(Fri Oct  4 06:08:32 2013) [sssd[be[domain.com]]] [child_sig_handler] (0x1000): Waiting for child [2082].
(Fri Oct  4 06:08:32 2013) [sssd[be[domain.com]]] [parse_krb5_child_response] (0x1000): child response [0][3][37].
(Fri Oct  4 06:08:33 2013) [sssd[be[domain.com]]] [check_wait_queue] (0x1000): Wait queue for user [LDAPUSER] is empty.
Comment 4 Jakub Hrozek 2013-10-05 16:14:00 EDT
Jason, what are the permissions on /run/user/#### ?
Comment 5 Jason Elwell 2013-10-07 07:21:29 EDT
(In reply to Jakub Hrozek from comment #4)/####
> Jason, what are the permissions on /run/user ?

$ ll /run/user
total 0
drwx------ 2 root    root            40 Oct  7 06:01 0
drwx------ 8 LDAPUSER 	 LDAPGRP 180 Oct  7 06:05 2746

$ id
uid=2746(LDAPUSER) gid=1000(LDAPGRP) groups=1000(LDAPGRP),987(vboxusers)
Comment 6 Jakub Hrozek 2013-10-07 07:41:30 EDT
This line from the logs:

(Fri Oct  4 06:08:32 2013) [sssd[be[domain.com]]] [parse_krb5_child_response] (0x1000): child response [0][3][37].

Says the krb5_child was able to prep the ccache for an offline login. Is there any chance you could provide logs from both the domain section and the pam section with a higher debug_level?
Comment 7 Jason Elwell 2013-10-07 08:14:47 EDT
(In reply to Jakub Hrozek from comment #6)
> Says the krb5_child was able to prep the ccache for an offline login. Is
> there any chance you could provide logs from both the domain section and the
> pam section with a higher debug_level?

No problem!  I should be in a position to see this error tomorrow morning.  I will test after noon, my time.  First, I will crank up the logging to all sections to 0x4000, unless you'd like me to specify a higher number.
Comment 8 Stephen Gallagher 2013-10-07 08:23:40 EDT
0x4000 is not actually a higher number. The log levels (when specified in hex) are tricky to understand. They present a mask so we can turn some levels on and some off. 0x4000 would *only* show the low-level tracing.

It would be much easier to specify the decimal version, which amounts to "this level and all that come before it".

I think you probably want
debug_level = 7

(8 and 9 are pretty noisy)
Comment 9 Jason Elwell 2013-10-07 08:51:44 EDT
Ok, will do.  I was going by the debug_level section of the man page sssd.conf(5).  I am changing my config now.
Comment 10 Jason Elwell 2013-10-08 09:30:05 EDT
I didn't see the issue this morning as expected.  I will try to reproduce the issue later today.
Comment 11 Jason Elwell 2013-10-09 08:14:01 EDT
It happened again this morning, not long after first log in.  I will attach logs in just a moment.  Here's what I was doing:
0. removed bonding module and rebuilt initramfs 
X. (could this logon issue be caused by a kernel upgrade?)
1. powered off
2. undock (network on dock)
3. powered on
4. log into GDM (x4)
5. no success
6. docked 
7. no success, first time
8. logged in successfully
Comment 12 Jason Elwell 2013-10-09 08:28:27 EDT
Created attachment 809883 [details]
Pam log from Oct 09

Please see failures at 07:05 local time.
Comment 13 Jason Elwell 2013-10-09 08:30:58 EDT
Can this bug or at least attachments be hidden?  I will upload the other file once that happens.
Comment 15 Stephen Gallagher 2013-10-09 08:51:44 EDT
Jason, when uploading the attachment, there is a checkbox at the bottom just above submit called "Privacy". This restricts visibility of the attachment to the private_comments group. (Essentially Red Hat engineers only).
Comment 16 Jason Elwell 2013-10-09 08:57:28 EDT
Created attachment 809899 [details]
SSSD domain log file

Please see 7:05 local time frame for errors.
Comment 17 Jason Elwell 2013-10-09 08:58:25 EDT
(In reply to Stephen Gallagher from comment #15)
> Jason, when uploading the attachment, there is a checkbox at the bottom just
> above submit called "Privacy". This restricts visibility of the attachment
> to the private_comments group. (Essentially Red Hat engineers only).

Stephen,  don't have that option when uploading.  Would you mark the second one for me?
Comment 19 Jakub Hrozek 2013-10-09 10:58:33 EDT
Jason, I assume it's Active Directory on the server side? 

Can you put:
ldap_referrals = False

to the [domain] section and restart the SSSD? It's probably not the issue that's been hitting you but should be set anyway..I've seen a number of referrals followed in the sssd logs and disabling referral chasing is the recommended thing to do.
Comment 20 Jakub Hrozek 2013-10-09 11:06:03 EDT
There is definitely something fishy going on. There is one authentication request at 05:52:13 that succeeds:

(Wed Oct  9 05:52:13 2013) [sssd[pam]] [sysdb_cache_auth] (0x0100): Hashes do match!

And then another one for the same user:
(Wed Oct  9 07:05:18 2013) [sssd[pam]] [sysdb_cache_auth] (0x0100): Cached credentials not available.

I'm just speculating so far, but I wonder if a search on the backend side could have succeeded but returned nothing so that we'd remove the user entry? I can't think of another reason the attribute would get lost.
Comment 21 Jason Elwell 2013-10-09 12:16:53 EDT
We are using AD for LDAP.  I set the referrals section (we have that in or RHEL environment, from which i copied).  

For background, I had been on the network when I started my laptop at the 5:52.  At 7:05 I had unplugged, it failed and then plugged back in.  Yesterday, everything had worked perfectly.  So, I know its working, but there is something keeping it from working consistently.  I had put in my password in wrongly one time.  Could that have triggered that error?  Or, is that responding to the pam_unix failure?


Oct  9 05:52:03 cosi001n gdm-password]: pam_unix(gdm-password:auth): authentication failure; logname=(unknown) uid=0 euid=0 tty=:0 ruser= rhost=  user=LDAPUSER
Oct  9 05:52:03 cosi001n gdm-password]: pam_sss(gdm-password:auth): authentication failure; logname=(unknown) uid=0 euid=0 tty=:0 ruser= rhost= user=LDAPUSER
Oct  9 05:52:03 cosi001n gdm-password]: pam_sss(gdm-password:auth): received for user LDAPUSER: 7 (Authentication failure)
Oct  9 05:52:13 cosi001n gdm-password]: pam_unix(gdm-password:auth): authentication failure; logname=(unknown) uid=0 euid=0 tty=:0 ruser= rhost=  user=LDAPUSER
Oct  9 05:52:13 cosi001n gdm-password]: pam_sss(gdm-password:auth): authentication success; logname=(unknown) uid=0 euid=0 tty=:0 ruser= rhost= user=LDAPUSER
Oct  9 05:52:13 cosi001n gdm-password]: pam_unix(gdm-password:session): session opened for user LDAPUSER by (unknown)(uid=0)
Oct  9 05:52:14 cosi001n gdm-launch-environment]: pam_unix(gdm-launch-environment:session): session closed for user gdm
Comment 22 Jason Elwell 2013-10-15 08:02:41 EDT
I logged on this morning with my account thinking my problem was solved.  However, I couldn't even log in until I was on my VPN and had restarted the sssd service.  I am about to upload all of the logs to you.
Comment 23 Jason Elwell 2013-10-15 08:08:17 EDT
Created attachment 812457 [details]
log files from 10-15

Files are tar gz'ed.  Please, once again, mark as private.
Comment 25 Jason Elwell 2013-10-25 13:02:51 EDT
I'm still experiencing the issue.  Is there anything more I can provide for troubleshooting?
Comment 26 Lukas Slebodnik 2014-12-30 14:03:03 EST
Jason,
do you have the same problem with fedora 20?
Comment 27 Jason Elwell 2015-01-19 09:04:02 EST
I have had no problems like this with 20 nor 21.  Everything is working great.

Note You need to log in before you can comment on or make changes to this bug.