Bug 1605781 - gdm smart card authentication does not work shortly after disconnecting from network.
Summary: gdm smart card authentication does not work shortly after disconnecting from ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: sssd
Version: 31
Hardware: All
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Sumit Bose
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-07-20 16:57 UTC by Orion Poplawski
Modified: 2020-07-30 18:56 UTC (History)
9 users (show)

Fixed In Version: sssd-2.3.1-2.fc32
Clone Of:
Environment:
Last Closed: 2020-07-30 18:56:41 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
domain log (48.48 KB, text/plain)
2019-05-07 17:16 UTC, Orion Poplawski
no flags Details
pam log (28.95 KB, text/plain)
2019-05-07 17:17 UTC, Orion Poplawski
no flags Details
domain log with debug = 10 (37.29 KB, application/gzip)
2019-05-14 16:38 UTC, Orion Poplawski
no flags Details
sssd logs with debug_level 9 (256.80 KB, application/gzip)
2019-11-04 18:46 UTC, Orion Poplawski
no flags Details
test build (9.95 MB, application/gzip)
2020-06-08 04:46 UTC, Sumit Bose
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github SSSD sssd pull 5196 0 None closed ipa: add failover to subdomain override lookups 2021-01-06 08:40:01 UTC

Description Orion Poplawski 2018-07-20 16:57:53 UTC
Description of problem:

gdm smart card authentication does not appear to work when off-line.  It is working on EL7.5.

Version-Release number of selected component (if applicable):
sssd-1.16.2-4.fc28.2.x86_64 - custom version with override_homedir fix applied.
gdm-3.28.2-1.fc28.x86_64


How reproducible:
Pretty consistently shortly after disconnecting the network.  I have seen it work though.

Steps to Reproduce:
1. disconnect network
2. try to authenticate with smart card
3.

Actual results:
authentication fails

Expected results:
authentication works

Additional info:

The actual authentication works:
Jul 20 10:17:16 bld-pc1.cora.nwra.com gdm-smartcard][17304]: pam_sss(gdm-smartcard:auth): authentication success; logname= uid=0 euid=0 tty=/dev/tty1 ruser= rhost= user=orion.com

But authorization does not:
Jul 20 10:17:16 bld-pc1.cora.nwra.com gdm-smartcard][17304]: GdmSessionWorker: determining if authenticated user (password required:0) is authorized to session
Jul 20 10:17:29 bld-pc1.cora.nwra.com gdm-smartcard][17304]: GdmSessionWorker: user is not authorized to log in: Authentication service cannot retrieve authentication info

(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_dp_send_req] (0x0100): Sending request with the following data:
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): command: SSS_PAM_AUTHENTICATE
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): domain: ad.nwra.com
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): user: orion.com
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): service: gdm-smartcard
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): tty: /dev/tty1
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): ruser: not set
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): rhost: not set
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): authtok type: 4
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): newauthtok type: 0
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): priv: 1
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): cli_pid: 17304
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_print_data] (0x0100): logon name: orion.com
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [sbus_add_timeout] (0x2000): 0x562ae90cd8e0
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_dom_forwarder] (0x0100): pam_dp_send_req returned 0
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [sbus_remove_timeout] (0x2000): 0x562ae90cd8e0
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_dp_process_reply] (0x0200): received: [9 (Authentication service cannot retrieve authentication info)][ad.nwra.com]
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_reply] (0x0200): pam_reply called with result [9]: Authentication service cannot retrieve authentication info.
(Fri Jul 20 10:17:15 2018) [sssd[pam]] [pam_reply] (0x0040): Backend cannot handle Smartcard authentication, trying local Smartcard authentication.


(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [fo_resolve_service_send] (0x0020): No available servers for service 'IPA'
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: [5]: Input/output error
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [be_mark_dom_offline] (0x1000): Marking subdomain ad.nwra.com offline
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [be_mark_subdom_offline] (0x1000): Marking subdomain ad.nwra.com as inactive
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [child_handler_setup] (0x2000): Setting up signal handler up for pid [17536]
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [child_handler_setup] (0x2000): Signal handler set up for pid [17536]
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [write_pipe_handler] (0x0400): All data has been sent!
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [read_pipe_handler] (0x0400): EOF received, client finished
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [parse_krb5_child_response] (0x1000): child response [0][3][36].
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [fo_set_port_status] (0x0100): Marking port 0 of server 'europa.nwra.com' as 'working'
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [set_server_common_status] (0x0100): Marking server 'europa.nwra.com' as 'working'
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [fo_set_port_status] (0x0400): Marking port 0 of duplicate server 'europa.nwra.com' as 'working'
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [sysdb_set_entry_attr] (0x0200): Entry [name=orion.com,cn=users,cn=ad.nwra.com,cn=sysdb] has set [ts_cache] attrs.
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [krb5_auth_done] (0x0100): Backend is marked offline, retry later!
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [check_wait_queue] (0x1000): Wait queue for user [orion.com] is empty.
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [krb5_auth_queue_done] (0x1000): krb5_auth_queue request [0x559b44e43150] done.
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [dp_req_done] (0x0400): DP Request [PAM Authenticate #6]: Request handler finished [0]: Success
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [_dp_req_recv] (0x0400): DP Request [PAM Authenticate #6]: Receiving request data.
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [dp_req_destructor] (0x0400): DP Request [PAM Authenticate #6]: Request removed.
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [dp_req_destructor] (0x0400): Number of active DP request: 0
(Fri Jul 20 10:17:15 2018) [sssd[be[nwra.com]]] [dp_pam_reply] (0x1000): DP Request [PAM Authenticate #6]: Sending result [9][ad.nwra.com]


(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [krb5_auth_done] (0x0100): Backend is marked offline, retry later!
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [check_wait_queue] (0x1000): Wait queue for user [orion.com] is empty.
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [krb5_auth_queue_done] (0x1000): krb5_auth_queue request [0x559b44ebd5e0] done.
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [dp_req_done] (0x0400): DP Request [PAM Preauth #11]: Request handler finished [0]: Success
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [_dp_req_recv] (0x0400): DP Request [PAM Preauth #11]: Receiving request data.
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [dp_req_destructor] (0x0400): DP Request [PAM Preauth #11]: Request removed.
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [dp_req_destructor] (0x0400): Number of active DP request: 1
(Fri Jul 20 10:17:38 2018) [sssd[be[nwra.com]]] [dp_pam_reply] (0x1000): DP Request [PAM Preauth #11]: Sending result [9][ad.nwra.com]


I don't know if gdm is supposed to better handle the pam error, or sssd should be quicker about switching to offline authentication and not returning result code 9 in the first place.

Comment 1 Orion Poplawski 2018-07-20 17:02:51 UTC
Once sssd fully goes into offline mode, I'm able to log in.

Comment 2 Sumit Bose 2018-08-07 15:52:21 UTC
The result code '9' is not send directly to gdm but it is returned by the SSSD backend to the PAM responder to indicate the offline state and the PAM responder then tries local Smartcard authentication which looks successful.

Are there no other messages in the domain log between '10:17:15' and '10:17:38'? I would expect a SSS_PAM_ACCT_MGMT request in the PAM responder and domain logs after the successful authentication to check is the user is authorized.

Comment 3 Orion Poplawski 2018-09-10 22:03:06 UTC
sssd logs sent privately.

Comment 4 Orion Poplawski 2018-10-10 17:51:58 UTC
From private discussion, it appears that the pam handler is hitting the 60s client_idle_timeout (60s) because the backend has a long ldap timeout:

(Mon Sep 10 15:51:15 2018) [sssd[be[nwra.com]]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 14
(Mon Sep 10 15:51:15 2018) [sssd[be[nwra.com]]] [sdap_op_add] (0x2000): New operation 14 timeout 60

It also brought up the question as to why the ldap_search_ext timeout is the ldap_enumeration_search_timeout (60s) rather than ldap_search_timeout (6s).

I tried dropping ldap_enumeration_search_timeout to 6s, but now either authentication fails with:

Oct 10 11:09:06 bld-pc1.cora.nwra.com gdm-password][12346]: pam_sss(gdm-password:auth): received for user orion: 10 (User not known to the underlying authentication module)

or access fails with:

Oct 10 11:17:22 bld-pc1.cora.nwra.com gdm-password][12839]: pam_sss(gdm-password:auth): authentication success; logname= uid=0 euid=0 tty=/dev/tty1 ruser= rhost= user=orion
Oct 10 11:17:28 bld-pc1.cora.nwra.com gdm-password][12839]: pam_sss(gdm-password:account): Access denied for user orion: 4 (System error)

For the latter:

It appears that the AD domain gets marked as down, but not the IPA domain:

(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'IPA'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [get_server_status] (0x1000): Status of server 'europa.nwra.com' is 'working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [get_port_status] (0x1000): Port status of port 0 for server 'europa.nwra.com' is 'not working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [get_port_status] (0x0080): SSSD is unable to complete the full connection request, this internal status does not necessarily indicate network port issues.
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [get_server_status] (0x1000): Status of server 'ipa-boulder2.nwra.com' is 'not working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [get_port_status] (0x1000): Port status of port 0 for server '(no name)' is 'not working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [get_port_status] (0x0080): SSSD is unable to complete the full connection request, this internal status does not necessarily indicate network port issues.
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [fo_resolve_service_send] (0x0020): No available servers for service 'IPA'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: [5]: Input/output error
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [be_mark_dom_offline] (0x1000): Marking subdomain ad.nwra.com offline
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [be_mark_subdom_offline] (0x1000): Marking subdomain ad.nwra.com as inactive

(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [fo_set_port_status] (0x0100): Marking port 0 of server 'europa.nwra.com' as 'working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [set_server_common_status] (0x0100): Marking server 'europa.nwra.com' as 'working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [fo_set_port_status] (0x0400): Marking port 0 of duplicate server 'europa.nwra.com' as 'working'
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [sysdb_set_entry_attr] (0x0200): Entry [name=orion.com,cn=users,cn=ad.nwra.com,cn=sysdb] has set [ts_cache] attrs.
(Wed Oct 10 11:17:21 2018) [sssd[be[nwra.com]]] [krb5_auth_done] (0x0100): Backend is marked offline, retry later!



Now comes the access request:

(Wed Oct 10 11:17:22 2018) [sssd[pam]] [pam_print_data] (0x0100): command: SSS_PAM_ACCT_MGMT
(Wed Oct 10 11:17:22 2018) [sssd[pam]] [pam_print_data] (0x0100): domain: ad.nwra.com

(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sss_domain_get_state] (0x1000): Domain nwra.com is Active
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sss_domain_get_state] (0x1000): Domain ad.nwra.com is Inactive
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_access_send] (0x0400): Performing access check for user [orion.com]
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_account_expired_rhds] (0x0400): Performing RHDS access check for user [orion.com]
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_account_expired] (0x0400): IPA access control succeeded, checking AD access control
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_account_expired_ad] (0x0400): Performing AD access check for user [orion.com]
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_print_server] (0x2000): Searching <europa.nwra.com>:389
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(objectClass=ipaHost)(fqdn=bld-pc1.cora.nwra.com))][cn=accounts,dc=nwra,dc=com].
...
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 20
(Wed Oct 10 11:17:22 2018) [sssd[be[nwra.com]]] [sdap_op_add] (0x2000): New operation 20 timeout 6

(Wed Oct 10 11:17:22 2018) [sssd[pam]] [pam_dom_forwarder] (0x0100): pam_dp_send_req returned 0
(Wed Oct 10 11:17:26 2018) [sssd[pam]] [pam_initgr_cache_remove] (0x2000): [orion] removed from PAM initgroup cache
(Wed Oct 10 11:17:27 2018) [sssd[pam]] [pam_initgr_cache_remove] (0x2000): [orion] removed from PAM initgroup cache

this now times out in 6s:
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [sdap_op_timeout] (0x1000): Issuing timeout for 20
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [sdap_op_destructor] (0x1000): Abandoning operation 20
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [generic_ext_search_handler] (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [ipa_pam_access_handler_done] (0x0020): Unable to fetch HBAC rules [110]: Connection timed out
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [dp_req_done] (0x0400): DP Request [PAM Account #8]: Request handler finished [0]: Success
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [_dp_req_recv] (0x0400): DP Request [PAM Account #8]: Receiving request data.
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [dp_req_destructor] (0x0400): DP Request [PAM Account #8]: Request removed.
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [dp_req_destructor] (0x0400): Number of active DP request: 0

But we return system error:
(Wed Oct 10 11:17:28 2018) [sssd[be[nwra.com]]] [dp_pam_reply] (0x1000): DP Request [PAM Account #8]: Sending result [4][ad.nwra.com]

(Wed Oct 10 11:17:28 2018) [sssd[pam]] [sbus_remove_timeout] (0x2000): 0x55a557d37f00
(Wed Oct 10 11:17:28 2018) [sssd[pam]] [pam_dp_process_reply] (0x0200): received: [4 (System error)][ad.nwra.com]
(Wed Oct 10 11:17:28 2018) [sssd[pam]] [pam_reply] (0x0200): pam_reply called with result [4]: System error.

I tried a couple more times but they all failed.  europa.nwra.com stays marked as 'working'.  It seems to stay "resolved"?

(Wed Oct 10 11:17:51 2018) [sssd[be[nwra.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 6 seconds
(Wed Oct 10 11:17:51 2018) [sssd[be[nwra.com]]] [get_server_status] (0x1000): Status of server 'europa.nwra.com' is 'working'
(Wed Oct 10 11:17:51 2018) [sssd[be[nwra.com]]] [be_resolve_server_process] (0x1000): Saving the first resolved server
(Wed Oct 10 11:17:51 2018) [sssd[be[nwra.com]]] [be_resolve_server_process] (0x0200): Found address for server europa.nwra.com: [X.X.X.X] TTL 86400

and the sdap_account_expired_ad check that queries it and times out does not appear to affect its status.

Comment 5 Orion Poplawski 2018-10-10 17:56:14 UTC
As an aside - I see sssd reacting to network links coming up, but not going down.  I would think that if there were now active network links you could assume off-line status.

Comment 6 Sumit Bose 2018-10-12 09:29:05 UTC
Are the logs from above taken when the system was not connected to the network or was the system still connected?

If it was not connected, did you run an authentication for the same user while the system was online to make sure all needed data for the user are available in the cache?

bye,
Sumit

Comment 7 Orion Poplawski 2018-10-14 18:05:12 UTC
Network cable had been removed.  User had logged in and back out prior to make sure cache was fresh.

Comment 8 Ben Cotton 2019-05-02 19:50:20 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Lukas Slebodnik 2019-05-03 08:18:49 UTC
Does it work for you on f29 ?

Comment 10 Orion Poplawski 2019-05-06 21:34:05 UTC
No.  Again authentication is okay but authorization fails:

May 06 15:22:42 ares.cora.nwra.com gdm-password][9478]: pam_sss(gdm-password:account): Access denied for user orion: 4 (System error)
May 06 15:22:42 ares.cora.nwra.com gdm-password][9478]: GdmSessionWorker: user is not authorized to log in: System error

(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_dp_send_req] (0x0100): Sending request with the following data:
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): command: SSS_PAM_ACCT_MGMT
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): domain: ad.nwra.com
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): user: orion.com
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): service: gdm-password
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): tty: /dev/tty1
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): ruser: not set
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): rhost: not set
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): authtok type: 0
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): newauthtok type: 0
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): priv: 1
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): cli_pid: 9478
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): logon name: orion
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): flags: 0
(Mon May  6 15:21:42 2019) [sssd[pam]] [pam_dom_forwarder] (0x0100): pam_dp_send_req returned 0
(Mon May  6 15:22:42 2019) [sssd[pam]] [pam_dp_send_req_done] (0x0200): received: [4 (System error)][ad.nwra.com]
(Mon May  6 15:22:42 2019) [sssd[pam]] [pam_reply] (0x0200): pam_reply called with result [4]: System error.

(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [dp_pam_handler_send] (0x0100): Got request with the following data
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): command: SSS_PAM_ACCT_MGMT
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): domain: ad.nwra.com
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): user: orion.com
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): service: gdm-password
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): tty: /dev/tty1
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): ruser:
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): rhost:
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): authtok type: 0
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): newauthtok type: 0
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): priv: 1
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): cli_pid: 9478
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): logon name: not set
(Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100): flags: 0
(Mon May  6 15:22:42 2019) [sssd[be[nwra.com]]] [generic_ext_search_handler] (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
(Mon May  6 15:22:42 2019) [sssd[be[nwra.com]]] [ipa_pam_access_handler_done] (0x0020): Unable to fetch HBAC rules [110]: Connection timed out
(Mon May  6 15:22:44 2019) [sssd[be[nwra.com]]] [dp_get_account_info_send] (0x0200): Got request for [0x14][BE_REQ_BY_CERT][cert=MIIIBT...

Comment 11 Sumit Bose 2019-05-07 05:25:38 UTC
(In reply to Orion Poplawski from comment #10)
> No.  Again authentication is okay but authorization fails:
> 
> May 06 15:22:42 ares.cora.nwra.com gdm-password][9478]:
> pam_sss(gdm-password:account): Access denied for user orion: 4 (System error)
> May 06 15:22:42 ares.cora.nwra.com gdm-password][9478]: GdmSessionWorker:
> user is not authorized to log in: System error
> 
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_dp_send_req] (0x0100): Sending
> request with the following data:
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): command:
> SSS_PAM_ACCT_MGMT
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): domain:
> ad.nwra.com
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): user:
> orion.com
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): service:
> gdm-password
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): tty:
> /dev/tty1
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): ruser: not
> set
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): rhost: not
> set
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): authtok
> type: 0
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): newauthtok
> type: 0
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): priv: 1
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): cli_pid:
> 9478
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): logon
> name: orion
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_print_data] (0x0100): flags: 0
> (Mon May  6 15:21:42 2019) [sssd[pam]] [pam_dom_forwarder] (0x0100):
> pam_dp_send_req returned 0
> (Mon May  6 15:22:42 2019) [sssd[pam]] [pam_dp_send_req_done] (0x0200):
> received: [4 (System error)][ad.nwra.com]
> (Mon May  6 15:22:42 2019) [sssd[pam]] [pam_reply] (0x0200): pam_reply
> called with result [4]: System error.
> 
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [dp_pam_handler_send]
> (0x0100): Got request with the following data
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> command: SSS_PAM_ACCT_MGMT
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> domain: ad.nwra.com
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> user: orion.com
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> service: gdm-password
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> tty: /dev/tty1
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> ruser:
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> rhost:
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> authtok type: 0
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> newauthtok type: 0
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> priv: 1
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> cli_pid: 9478
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> logon name: not set
> (Mon May  6 15:21:42 2019) [sssd[be[nwra.com]]] [pam_print_data] (0x0100):
> flags: 0
> (Mon May  6 15:22:42 2019) [sssd[be[nwra.com]]] [generic_ext_search_handler]
> (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
> (Mon May  6 15:22:42 2019) [sssd[be[nwra.com]]]
> [ipa_pam_access_handler_done] (0x0020): Unable to fetch HBAC rules [110]:
> Connection timed out
> (Mon May  6 15:22:44 2019) [sssd[be[nwra.com]]] [dp_get_account_info_send]
> (0x0200): Got request for [0x14][BE_REQ_BY_CERT][cert=MIIIBT...

Hi,

looks like in this code patch SSSD does not switch into offline mode properly. I'll investigate.

bye,
Sumit

Comment 12 Sumit Bose 2019-05-07 07:40:03 UTC
Hi,

would it be possible to attach the pam and backend logs which cover the time of the SSS_PAM_PREAUTH and SSS_PAM_AUTHENTICATE steps as well? I would like to see if and how the backend switches into offline mode during this steps.

In the access control step it is expected that the backend is either online and the authentication was done online as well or that the system was offline and e.g. the authentication step already switched the backend into offline mode.

bye,
Sumit

Comment 13 Orion Poplawski 2019-05-07 17:16:58 UTC
Created attachment 1565310 [details]
domain log

Comment 14 Orion Poplawski 2019-05-07 17:17:28 UTC
Created attachment 1565311 [details]
pam log

Comment 15 Sumit Bose 2019-05-10 10:29:20 UTC
Hi,

can you repeat the test with debug_level=10 in the [domain/...] section and resend the logs?

bye,
Sumit

Comment 16 Orion Poplawski 2019-05-14 16:38:27 UTC
Created attachment 1568551 [details]
domain log with debug = 10

retried with debug=10

Comment 17 Ben Cotton 2019-10-31 19:20:42 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 18 Orion Poplawski 2019-11-04 18:43:29 UTC
Still present with sssd-2.2.2-3.fc31.x86_64

Comment 19 Orion Poplawski 2019-11-04 18:46:05 UTC
Created attachment 1632713 [details]
sssd logs with debug_level 9

Comment 20 Sumit Bose 2020-06-05 12:28:33 UTC
Hi,

thank you for your patience. I think I found the reason for the issue and it should be fixed by https://github.com/SSSD/sssd/pull/5196. If you let me know which version of SSSD you are currently using I can prepare a test build.

bye,
Sumit

Comment 21 Orion Poplawski 2020-06-05 17:30:10 UTC
Nice.  Test machine is running latest F31 - so: 2.2.3-13.fc31

Thanks!

Comment 22 Sumit Bose 2020-06-08 04:46:22 UTC
Created attachment 1695975 [details]
test build

Hi,

please find attached a tar-ball with a test-build based on sssd-2.2.3-13.

If there are still issues please attach new logs files.

bye,
Sumit

Comment 23 Orion Poplawski 2020-06-08 16:49:48 UTC
Sumit - LGTM.  Thanks!

Comment 24 Sumit Bose 2020-06-09 04:59:23 UTC
(In reply to Orion Poplawski from comment #23)
> Sumit - LGTM.  Thanks!

Thanks for the feedback.

bye,
Sumit

Comment 25 Pavel Březina 2020-06-18 10:21:27 UTC
Pushed PR: https://github.com/SSSD/sssd/pull/5196

* `master`
    * df632eec450791559a4a7644f241964397c10ff9 - ipa: add failover to subdomain override lookups
* `sssd-1-16`
    * 510f154b02f2c059aeb0c1a28f3a5f83ceca662c - ipa: add failover to subdomain override lookups

Comment 26 Fedora Update System 2020-07-27 14:07:12 UTC
FEDORA-2020-63a418c824 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-63a418c824

Comment 27 Fedora Update System 2020-07-28 15:20:04 UTC
FEDORA-2020-63a418c824 has been pushed to the Fedora 32 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-63a418c824`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-63a418c824

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 28 Fedora Update System 2020-07-30 18:56:41 UTC
FEDORA-2020-63a418c824 has been pushed to the Fedora 32 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.