Bug 1306707

Summary: Need better debug message when krb5_child returns an unhandled error, leading to a System Error PAM code
Product: Red Hat Enterprise Linux 7 Reporter: gagriogi
Component: sssdAssignee: SSSD Maintainers <sssd-maint>
Status: CLOSED ERRATA QA Contact: shridhar <sgadekar>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.3CC: agk, grajaiya, jhrozek, lslebodn, mkosek, mzidek, pbrezina, sgoveas, sssd-maint
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: sssd-1.15.2-33.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 08:58:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description gagriogi 2016-02-11 16:00:55 UTC
Description of problem:

======= Cloning from upstream ticket: #2955 =============

We need better handling of PAM system error 4.
Sometimes it needs to be redirected to another error code, other times it needs better/more explanation along with any other associated error code.

I have marked this as RFE but have not followed up with the official RFE template since it does not come directly from a customer, it's just something that feels a need of enhancement to allow sysadmins to understand better the nature of the issue they troubleshoot.


Version-Release number of selected component (if applicable):
sssd-1.12.4-47.el6.x86_64


-------------------------------------------------------
                  Example: [anonymized with $variables]
-------------------------------------------------------

$ cat etc/sssd/sssd.conf 
[sssd]
domains = $mydomain
config_file_version = 2
services = nss, pam, ssh, autofs
default_domain_suffix = $mydomain
debug_level = 9

[domain/$mydomain]
ad_domain = $mydomain
krb5_realm = $MYREALM
realmd_tags = manages-system joined-with-adcli
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u
#access_provider = simple
access_provider = ad 
ignore_group_members = True
#simple_allow_groups = $mydomain\est-level3-cloudadmins-openstack, $mydomain\est-level3-serveradmins-unix, $mydomain\est-level3-cloudadmins
debug_level = 9


$ cat etc/krb5.conf 
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log

[libdefaults]
default_realm = $MYREALM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
canonicalize = true

[realms]
$MYREALM = {
  kdc = $MYSERVER6.$mydomain
  admin_server = $MYSERVER6.$mydomain
 }

[domain_realm]
$mydomain = $MYREALM
 .$mydomain = $MYREALM


domain log:
---
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): domain: $mydomain
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): user: 55236
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): service: sshd
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): tty: ssh
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): ruser:
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): rhost: localhost
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): authtok type: 1
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): newauthtok type: 0
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): priv: 1
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): cli_pid: 10854
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [pam_print_data] (0x0100): logon name: not set
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [krb5_pam_handler] (0x1000): Wait queue of user [55236] is empty, running request immediately.
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [ldb] (0x4000): Added timed event "ltdb_callback": 0x138b180
 
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [ldb] (0x4000): Added timed event "ltdb_timeout": 0x13a14a0
 
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [ldb] (0x4000): Running timer event 0x138b180 "ltdb_callback"
 
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [ldb] (0x4000): Destroying timer event 0x13a14a0 "ltdb_timeout"
 
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [ldb] (0x4000): Ending timer event 0x138b180 "ltdb_callback"
 
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [krb5_auth_send] (0x0100): Home directory for user [55236] not known.
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [krb5_auth_prepare_ccache_name] (0x1000): No ccache file for user [55236] found.
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD'
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [get_server_status] (0x1000): Status of server '$$myserver.$mydomain' is 'name resolved'
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 6 seconds
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [resolve_srv_send] (0x0200): The status of SRV lookup is resolved
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [get_server_status] (0x1000): Status of server '$$myserver.$mydomain' is 'name resolved'
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [be_resolve_server_process] (0x1000): Saving the first resolved server
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [be_resolve_server_process] (0x0200): Found address for server $$myserver.$mydomain: [$myip] TTL 3102
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [child_handler_setup] (0x2000): Setting up signal handler up for pid [10867]
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [child_handler_setup] (0x2000): Signal handler set up for pid [10867]
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [write_pipe_handler] (0x0400): All data has been sent!
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [child_sig_handler] (0x1000): Waiting for child [10867].
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [child_sig_handler] (0x0100): child [10867] finished successfully.
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [read_pipe_handler] (0x0400): EOF received, client finished
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [parse_krb5_child_response] (0x1000): child response [1432158209][6][8].
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [check_wait_queue] (0x1000): Wait queue for user [55236] is empty.
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 4, <NULL>) [Success]
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [be_pam_handler_callback] (0x0100): Sending result [4][$mydomain]
(Wed Feb 10 12:49:50 2016) [sssd[be[$mydomain]]] [be_pam_handler_callback] (0x0100): Sent result [4][$mydomain]
---

krb5_child.log:
---
(Wed Feb 10 12:49:50 2016) [[sssd[krb5_child[10867]]]] [get_and_save_tgt] (0x0020): 1007: [-1765328377][Server not found in Kerberos database]
(Wed Feb 10 12:49:50 2016) [[sssd[krb5_child[10867]]]] [map_krb5_error] (0x0020): 1069: [-1765328377][Server not found in Kerberos database]
(Wed Feb 10 12:49:50 2016) [[sssd[krb5_child[10867]]]] [k5c_send_data] (0x0200): Received error code 1432158209
---

This was in fact invalid ccache credentials file (did not exist) but do we need a system error 4 for this or we can handle it separately/differently? And could we get more info on the kerberos child response? perhaps if it was more descriptive it could make more sense.

Apologies if this has been reported again, I could not find any relevant bugs in my search.

Thank you,
George - GSS

Comment 2 Jakub Hrozek 2016-02-16 15:21:56 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/2955

Comment 7 Jakub Hrozek 2017-01-13 08:25:56 UTC
master:
    b4dd0867c48e74703f8f7afbf9d5634238dffc1c 
sssd-1-14:
    eb730befa09cbb69ee61bdedd14fc2c8bf8a3434 
sssd-1-13:
    3ed51ca2740981bdce4c5efe965c15d5f707b220

Comment 11 shridhar 2017-05-18 10:24:41 UTC
As per commit, the additional information will be added in logs.

master:  b4dd0867c48e74703f8f7afbf9d5634238dffc1c

+        DEBUG(SSSDBG_IMPORTANT_INFO,
+              "The krb5_child process returned an error. Please inspect the "
+              "krb5_child.log file or the journal for more information\n");



I'm testing with sssd-1.15.2-30.el7.x86_64  and with modification as per #10, I don't see proposed information in the log files (/var/log/messages, secure, sssd_*)

With 700 permission on tmp, user authentication fails with system_error_4. Following messages are observed in various files:

in /var/log/messages:
May 18 05:59:23 shr-r7-permanent [sssd[krb5_child[24206]]]: Credentials cache permissions incorrect (filename: /tmp/user_616401111)
May 18 05:59:27 shr-r7-permanent [sssd[krb5_child[24215]]]: Credentials cache permissions incorrect (filename: /tmp/user_616401111)
May 18 05:59:29 shr-r7-permanent [sssd[krb5_child[24216]]]: Credentials cache permissions incorrect (filename: /tmp/user_616401111)

In /var/log/secure
May 18 05:59:27 shr-r7-permanent sshd[24203]: pam_sss(sshd:auth): received for user fu1: 4 (System error)
May 18 05:59:27 shr-r7-permanent sshd[24203]: Failed password for fu1 from ::1 port 33876 ssh2
May 18 05:59:


In /var/log/sssd_sssd16.qe
(Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [parse_krb5_child_response] (0x0020): message too short.
(Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [krb5_auth_done] (0x0040): Could not parse child response [22]: Invalid argument
(Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [check_wait_queue] (0x1000): Wait queue for user [fu1] is empty.
(Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [krb5_auth_queue_done] (0x0040): krb5_auth_recv failed with: 22
(Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [dp_req_done] (0x0400): DP Request [PAM Authenticate #3]: Request handler finished [0]: Success
(Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [_dp_req_recv] (0x0400): DP Request [PAM Authenticate #3]: Receiving request data.

In /var/log/sssd/krb5_child.log
(Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [create_ccache_dir] (0x0010): Check the ownership and permissions of krb5_ccachedir: [/tmp]
.
(Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [k5c_precreate_ccache] (0x0040): ccache creation failed.
(Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [k5c_ccache_setup] (0x0040): Cannot precreate ccache
(Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [privileged_krb5_setup] (0x0020): k5c_ccache_setup failed.
(Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [main] (0x0020): privileged_krb5_setup failed.
(Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [main] (0x0020): krb5_child failed!



* Which is the log file that should print information as per commit?
Or
* Is additional logs in /var/log/message sufficient?

Comment 12 Jakub Hrozek 2017-05-18 10:34:13 UTC
Thank you for testing, this case is then not handled well. Please switch the bug to assigned, the fix will be super-easy.

Comment 13 shridhar 2017-05-18 10:37:02 UTC
(In reply to Jakub Hrozek from comment #12)
> Thank you for testing, this case is then not handled well. Please switch the
> bug to assigned, the fix will be super-easy.

switching bug to assigned.

Comment 14 Lukas Slebodnik 2017-05-18 11:11:11 UTC
(In reply to shridhar from comment #11)
> As per commit, the additional information will be added in logs.
> 
> master:  b4dd0867c48e74703f8f7afbf9d5634238dffc1c
> 
> +        DEBUG(SSSDBG_IMPORTANT_INFO,
> +              "The krb5_child process returned an error. Please inspect the
> "
> +              "krb5_child.log file or the journal for more information\n");
> 
> 
> 
> I'm testing with sssd-1.15.2-30.el7.x86_64  and with modification as per
> #10, I don't see proposed information in the log files (/var/log/messages,
> secure, sssd_*)
> 
> With 700 permission on tmp, user authentication fails with system_error_4.
> Following messages are observed in various files:
> 
> in /var/log/messages:
> May 18 05:59:23 shr-r7-permanent [sssd[krb5_child[24206]]]: Credentials
> cache permissions incorrect (filename: /tmp/user_616401111)
> May 18 05:59:27 shr-r7-permanent [sssd[krb5_child[24215]]]: Credentials
> cache permissions incorrect (filename: /tmp/user_616401111)
> May 18 05:59:29 shr-r7-permanent [sssd[krb5_child[24216]]]: Credentials
> cache permissions incorrect (filename: /tmp/user_616401111)
> 
> In /var/log/secure
> May 18 05:59:27 shr-r7-permanent sshd[24203]: pam_sss(sshd:auth): received
> for user fu1: 4 (System error)
> May 18 05:59:27 shr-r7-permanent sshd[24203]: Failed password for
> fu1 from ::1 port 33876 ssh2
> May 18 05:59:
> 
> 
> In /var/log/sssd_sssd16.qe
> (Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [parse_krb5_child_response]
> (0x0020): message too short.
> (Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [krb5_auth_done] (0x0040):
> Could not parse child response [22]: Invalid argument
> (Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [check_wait_queue]
> (0x1000): Wait queue for user [fu1] is empty.
> (Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [krb5_auth_queue_done]
> (0x0040): krb5_auth_recv failed with: 22
> (Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [dp_req_done] (0x0400): DP
> Request [PAM Authenticate #3]: Request handler finished [0]: Success
> (Thu May 18 05:06:31 2017) [sssd[be[sssd16.qe]]] [_dp_req_recv] (0x0400): DP
> Request [PAM Authenticate #3]: Receiving request data.
> 
> In /var/log/sssd/krb5_child.log
> (Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [create_ccache_dir]
> (0x0010): Check the ownership and permissions of krb5_ccachedir: [/tmp]
> .
> (Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]]
> [k5c_precreate_ccache] (0x0040): ccache creation failed.
> (Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [k5c_ccache_setup]
> (0x0040): Cannot precreate ccache
> (Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]]
> [privileged_krb5_setup] (0x0020): k5c_ccache_setup failed.
> (Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [main] (0x0020):
> privileged_krb5_setup failed.
> (Thu May 18 05:06:31 2017) [[sssd[krb5_child[23515]]]] [main] (0x0020):
> krb5_child failed!
> 
> 
> 
> * Which is the log file that should print information as per commit?
> Or
> * Is additional logs in /var/log/message sufficient?

Error message will be in /var/log/sssd/sssd_${domain}.log
But you would need to increase debug_level to 2,
which is not enabled by default.

Comment 15 Jakub Hrozek 2017-05-22 08:01:33 UTC
https://github.com/SSSD/sssd/pull/276

Comment 16 Jakub Hrozek 2017-05-23 09:22:52 UTC
This use-case should be solved with the following commit:
7410f735b64937e0c2401c09b5cffc9c78b11849

Comment 18 shridhar 2017-05-25 15:22:23 UTC
verified with 

[root@shr-r7-permanent ~]# rpm -q sssd
sssd-1.15.2-33.el7.x86_64

root@shr-r7-permanent ~]# cat /etc/sssd/sssd.conf 

[sssd]
domains = sssd16.qe
config_file_version = 2
services = nss, pam

[domain/sssd16.qe]
ad_domain = sssd16.qe
krb5_realm = SSSD16.QE
realmd_tags = manages-system joined-with-adcli 
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
#ldap_sasl_authid = SHR-R7-PERMANEN$
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
access_provider = ad
ldap_user_search_base =  CN=Users,DC=sssd16,DC=qe
ldap_group_search_base =  CN=Users,DC=sssd16,DC=qe
debug_level = 2
 
[root@shr-r7-permanent ~]# service sssd stop ; rm -rf /var/lib/sss/db/* ; rm -rf /var/log/sssd/* ; service sssd start
Redirecting to /bin/systemctl stop  sssd.service
Redirecting to /bin/systemctl start  sssd.service
[root@shr-r7-permanent ~]# ll -d /tmp/
drwx------. 13 root root 4096 May 25 11:19 /tmp/
[root@shr-r7-permanent ~]# ssh -l fu1 localhost
fu1@localhost's password: 
Permission denied, please try again.
fu1@localhost's password: 

[root@shr-r7-permanent ~]# egrep 'krb5_auth_done' /var/log/sssd/sssd_sssd16.qe.log 
(Thu May 25 11:20:03 2017) [sssd[be[sssd16.qe]]] [krb5_auth_done] (0x0040): The krb5_child process returned an error. Please inspect the krb5_child.log file or the journal for more information
(Thu May 25 11:20:03 2017) [sssd[be[sssd16.qe]]] [krb5_auth_done] (0x0040): Could not parse child response [22]: Invalid argument

Comment 19 errata-xmlrpc 2017-08-01 08:58:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2294