Bug 1516266

Summary: Give a more detailed debug and system-log message if krb5_init_context() failed
Product: Red Hat Enterprise Linux 7 Reporter: Andrey Bondarenko <abondare>
Component: sssdAssignee: Sumit Bose <sbose>
Status: CLOSED ERRATA QA Contact: sssd-qe <sssd-qe>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.4CC: fidencio, grajaiya, jhrozek, lslebodn, mkosek, mzidek, pbrezina, sbose, sgadekar, sgoveas, tscherf
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: sssd-1.16.0-22.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 10:41:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs and configs none

Description Andrey Bondarenko 2017-11-22 11:41:29 UTC
Created attachment 1357406 [details]
logs and configs

Description of problem:

The 'includedir; directive makes sssd fail to start

Version-Release number of selected component (if applicable):

python-sss-murmur-1.15.2-50.el7_4.6.x86_64
sssd-client-1.15.2-50.el7_4.6.x86_64
sssd-1.15.2-50.el7_4.6.x86_64
sssd-krb5-common-1.15.2-50.el7_4.6.x86_64
sssd-proxy-1.15.2-50.el7_4.6.x86_64
libsss_nss_idmap-1.15.2-50.el7_4.6.x86_64
libsss_certmap-1.15.2-50.el7_4.6.x86_64
sssd-ipa-1.15.2-50.el7_4.6.x86_64
sssd-common-1.15.2-50.el7_4.6.x86_64
sssd-common-pac-1.15.2-50.el7_4.6.x86_64
sssd-ad-1.15.2-50.el7_4.6.x86_64
libsss_idmap-1.15.2-50.el7_4.6.x86_64
python-sssdconfig-1.15.2-50.el7_4.6.noarch
sssd-krb5-1.15.2-50.el7_4.6.x86_64
libsss_autofs-1.15.2-50.el7_4.6.x86_64
sssd-ldap-1.15.2-50.el7_4.6.x86_64
libsss_sudo-1.15.2-50.el7_4.6.x86_64
sssd-polkit-rules-1.15.2-50.el7_4.6.x86_64
krb5-libs-1.15.1-8.el7.x86_64
pam_krb5-2.4.8-6.el7.x86_64
krb5-libs-1.15.1-8.el7.i686
krb5-workstation-1.15.1-8.el7.x86_64


How reproducible:

include 'includedir' into krb5.conf, restart sssd


Actual results:

sssd fails

Expected results:

sssd works
Additional info:

Comment 2 Fabiano Fidêncio 2017-11-22 11:48:12 UTC
From a quick look in the logs, I can see that pac responder is not running, thus krb5_child fails to start ...

Could you check whether sssd-common-pac is installed? If not, could you install and give it a try again (and, if possible, update new logs with PAC installed).

Comment 3 Andrey Bondarenko 2017-11-22 12:28:44 UTC
It's here:


┌[root@slackware-13]─[~]
└─22/11 13:27:29-> rpm -ql sssd-common-pac
/usr/libexec/sssd/sssd_pac
/usr/share/licenses/sssd-common-pac-1.15.2
/usr/share/licenses/sssd-common-pac-1.15.2/COPYING

┌[root@slackware-13]─[~]
└─22/11 13:27:35-> rpm -qa|grep sssd-common-pac
sssd-common-pac-1.15.2-50.el7_4.6.x86_64

Comment 4 Sumit Bose 2017-11-22 12:41:34 UTC
@Fabiano, krb5_child does not depend on the PAC responder.

@Andrey, does /var/lib/sss/pubconf/krb5.include.d/ exists and has 

drwxr-xr-x.  2 root root system_u:object_r:sssd_public_t:s0     6 25. Okt 10:14 krb5.include.d

permissions? It should be part of sssd-krb5-common package.

Comment 5 Andrey Bondarenko 2017-11-22 12:59:23 UTC
Yes, they do. I have 7.4 with mininal changes:

┌[root@slackware-13]─[~]
└─22/11 13:57:42-> ls -Z /var/lib/sss/pubconf/krb5.include.d/ 
-rw-r--r--. root root system_u:object_r:sssd_public_t:s0 domain_realm_ipa_redhat_com
-rw-r--r--. root root system_u:object_r:sssd_public_t:s0 krb5_libdefaults
-rw-r--r--. root root system_u:object_r:sssd_public_t:s0 localauth_plugin



┌[root@slackware-13]─[~]
└─22/11 13:57:48-> ls -Z /var/lib/sss/pubconf/
-rw-r--r--. root root system_u:object_r:sssd_public_t:s0 kdcinfo.REDHAT.COM
-rw-r--r--. root root system_u:object_r:sssd_public_t:s0 kpasswdinfo.REDHAT.COM
drwxr-xr-x. sssd sssd system_u:object_r:sssd_public_t:s0 krb5.include.d

Comment 6 Sumit Bose 2017-11-22 14:13:09 UTC
Can you check if 'klist' or 'kinit' have issues as well reading krb5.conf? If yes, can you run it with strace like:

    strace -s 512 -o /tmp/krb5.strace -f klist

Comment 7 Andrey Bondarenko 2017-11-22 14:34:19 UTC
┌[root@slackware-13]─[~]
└─22/11 15:33:29-> klist
klist: Included profile directory could not be read while initializing krb5

Comment 9 Sumit Bose 2017-11-22 17:07:51 UTC
There are 2 white-space character at the end of

includedir /var/lib/sss/pubconf/krb5.include.d/
                                               ^^

Please remove them and try again.

Comment 10 Andrey Bondarenko 2017-11-27 11:08:31 UTC
Yes, that resolved the problem. May be the error message should be more visible?

Comment 11 Sumit Bose 2017-11-27 12:44:46 UTC
Would something like:

[root@f25 ~]# systemctl restart sssd
Job for sssd.service failed because the control process exited with error code.
See "systemctl status sssd.service" and "journalctl -xe" for details.
[root@f25 ~]# systemctl status sssd
● sssd.service - System Security Services Daemon
   Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/sssd.service.d
           └─core.conf, journal.conf
   Active: failed (Result: exit-code) since Mo 2017-11-27 13:43:01 CET; 3s ago
  Process: 9915 ExecStart=/usr/sbin/sssd -i ${DEBUG_LOGGER} (code=exited, status=1/FAILURE)
 Main PID: 9915 (code=exited, status=1/FAILURE)

Nov 27 13:43:00 f25.fed.devel sssd[pam][9928]: Starting up
Nov 27 13:43:00 f25.fed.devel sssd[pac][9930]: Starting up
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Starting up
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to init kerberos context [Included profile directory could not be read]
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to read keytab [default]: Ungültige Adresse
Nov 27 13:43:01 f25.fed.devel sssd[9915]: Exiting the SSSD. Could not restart critical service [ipaf26.devel].
Nov 27 13:43:01 f25.fed.devel systemd[1]: sssd.service: Main process exited, code=exited, status=1/FAILURE
Nov 27 13:43:01 f25.fed.devel systemd[1]: Failed to start System Security Services Daemon.
Nov 27 13:43:01 f25.fed.devel systemd[1]: sssd.service: Unit entered failed state.
Nov 27 13:43:01 f25.fed.devel systemd[1]: sssd.service: Failed with result 'exit-code'.
[root@f25 ~]# 


work for you?

Comment 12 Andrey Bondarenko 2017-11-27 13:19:58 UTC
Totally fine.

Comment 13 Sumit Bose 2017-11-27 15:03:44 UTC
Upstream ticket:
https://pagure.io/SSSD/sssd/issue/3586

Comment 14 Lukas Slebodnik 2017-11-28 13:59:56 UTC
master:
* 2c10819750a8d920ab755eba1278e6e20e684e93

Comment 16 shridhar 2018-07-06 10:26:05 UTC
[root@cypher ~]# rpm -q sssd
sssd-1.16.2-4.el7.x86_64

[root@cypher ~]# cat /etc/sssd/sssd.conf

        [sssd]
        config_file_version = 2
        sbus_timeout = 30
        services = nss, pam
        domains = LDAP
        user = sssd

        [domain/LDAP]
        id_provider = ldap
        ldap_uri = ldap://ldap.example.com
        ldap_search_base = dc=example,dc=com
        auth_provider = krb5
        krb5_server = kerberos.example.com, kerberos.example.com
        krb5_realm = EXAMPLE.COM
        cache_credentials = True
        krb5_store_password_if_offline = True
        debug_level = 8

 ]# cat /etc/krb5.conf


    [logging]
    default = FILE:/var/log/krb5libs.log

    [libdefaults]
    default_realm = EXAMPLE.COM
    dns_lookup_realm = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false
    default_ccache_name = KEYRING:persistent:%{uid}

    [realms]
    EXAMPLE.COM = {
    }

    [domain_realm]
     examle.com = EXAMPLE.COM
    .example = EXAMPLE.COM



[root@cypher ~]# systemctl restart sssd
Job for sssd.service failed because the control process exited with error code. See "systemctl status sssd.service" and "journalctl -xe" for details.

[root@cypher ~]# systemctl status sssd
● sssd.service - System Security Services Daemon
Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
Active: activating (start) since Mon 2018-07-02 08:39:55 EDT; 7s ago
Main PID: 26727 (sssd)
CGroup: /system.slice/sssd.service
└─26727 /usr/sbin/sssd -i --logger=files

Jul 02 08:39:57 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26736]: Failed to read keytab [default]: Bad address
Jul 02 08:39:57 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Starting up
Jul 02 08:39:58 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Failed to init Kerberos context [Included profile direct...ead]
Jul 02 08:39:58 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Failed to read keytab [default]: Bad address
Jul 02 08:40:00 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26762]: Starting up
Jul 02 08:40:00 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26762]: Failed to init Kerberos context [Included profile direct...ead]
Jul 02 08:40:00 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26762]: Failed to read keytab [default]: Bad address
Jul 02 08:40:02 cypher.idmqe.lab.eng.bos.redhat.com sssd[nss][26779]: Starting up
Jul 02 08:40:02 cypher.idmqe.lab.eng.bos.redhat.com sssd[pam][26780]: Starting up
Jul 02 08:40:02 cypher.idmqe.lab.eng.bos.redhat.com sssd[pam][26782]: Starting up
Hint: Some lines were ellipsized, use -l to show in full.

Comment 17 shridhar 2018-07-09 08:43:56 UTC
Adding missing line of krb5.conf

[root@cypher ~]# rpm -q sssd
sssd-1.16.2-4.el7.x86_64

[root@cypher ~]# cat /etc/sssd/sssd.conf

        [sssd]
        config_file_version = 2
        sbus_timeout = 30
        services = nss, pam
        domains = LDAP
        user = sssd

        [domain/LDAP]
        id_provider = ldap
        ldap_uri = ldap://ldap.example.com
        ldap_search_base = dc=example,dc=com
        auth_provider = krb5
        krb5_server = kerberos.example.com, kerberos.example.com
        krb5_realm = EXAMPLE.COM
        cache_credentials = True
        krb5_store_password_if_offline = True
        debug_level = 8

 ]# cat /etc/krb5.conf
    includedir /var/lib/sss/pubconf/krb5.include.d/  

    [logging]
    default = FILE:/var/log/krb5libs.log

    [libdefaults]
    default_realm = EXAMPLE.COM
    dns_lookup_realm = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false
    default_ccache_name = KEYRING:persistent:%{uid}

    [realms]
    EXAMPLE.COM = {
    }

    [domain_realm]
     examle.com = EXAMPLE.COM
    .example = EXAMPLE.COM



[root@cypher ~]# systemctl restart sssd
Job for sssd.service failed because the control process exited with error code. See "systemctl status sssd.service" and "journalctl -xe" for details.

[root@cypher ~]# systemctl status sssd
● sssd.service - System Security Services Daemon
Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
Active: activating (start) since Mon 2018-07-02 08:39:55 EDT; 7s ago
Main PID: 26727 (sssd)
CGroup: /system.slice/sssd.service
└─26727 /usr/sbin/sssd -i --logger=files

Jul 02 08:39:57 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26736]: Failed to read keytab [default]: Bad address
Jul 02 08:39:57 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Starting up
Jul 02 08:39:58 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Failed to init Kerberos context [Included profile direct...ead]
Jul 02 08:39:58 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Failed to read keytab [default]: Bad address
Jul 02 08:40:00 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26762]: Starting up
Jul 02 08:40:00 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26762]: Failed to init Kerberos context [Included profile direct...ead]
Jul 02 08:40:00 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26762]: Failed to read keytab [default]: Bad address
Jul 02 08:40:02 cypher.idmqe.lab.eng.bos.redhat.com sssd[nss][26779]: Starting up
Jul 02 08:40:02 cypher.idmqe.lab.eng.bos.redhat.com sssd[pam][26780]: Starting up
Jul 02 08:40:02 cypher.idmqe.lab.eng.bos.redhat.com sssd[pam][26782]: Starting up
Hint: Some lines were ellipsized, use -l to show in full.

Comment 18 shridhar 2018-07-23 13:11:01 UTC
Should 'systemctl status sssd.service' contain following log:
<snip>
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to init kerberos context [Included profile directory could not be read]
</snip>

Comment 19 Jakub Hrozek 2018-07-23 18:53:47 UTC
(In reply to shridhar from comment #18)
> Should 'systemctl status sssd.service' contain following log:
> <snip>
> Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to init
> kerberos context [Included profile directory could not be read]
> </snip>

(I'm trying to keep this moving while Sumit is on PTO)

As long as this is not an error during normal operation but something you triggered during your test (which seems to be the case according to comment #17) then yes, this is even sufficient to mark the bug verified I think as the error message matches the code that had been changed.

Comment 20 shridhar 2018-07-23 19:26:42 UTC
As per the comment #11 it seems that systemctl status sssd.service would contain following lines:
<snip>

[root@f25 ~]# systemctl status sssd
● sssd.service - System Security Services Daemon
   Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/sssd.service.d
           └─core.conf, journal.conf
   Active: failed (Result: exit-code) since Mo 2017-11-27 13:43:01 CET; 3s ago
  Process: 9915 ExecStart=/usr/sbin/sssd -i ${DEBUG_LOGGER} (code=exited, status=1/FAILURE)
 Main PID: 9915 (code=exited, status=1/FAILURE)

Nov 27 13:43:00 f25.fed.devel sssd[pam][9928]: Starting up
Nov 27 13:43:00 f25.fed.devel sssd[pac][9930]: Starting up
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Starting up
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to init kerberos context [Included profile directory could not be read]
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to read keytab [default]: Ungültige Adresse
Nov 27 13:43:01 f25.fed.devel sssd[9915]: Exiting the SSSD. Could not restart critical service [ipaf26.devel].
Nov 27 13:43:01 f25.fed.devel systemd[1]: sssd.service: Main process exited, code=exited, status=1/FAILURE
Nov 27 13:43:01 f25.fed.devel systemd[1]: Failed to start System Security Services Daemon.
Nov 27 13:43:01 f25.fed.devel systemd[1]: sssd.service: Unit entered failed state.
Nov 27 13:43:01 f25.fed.devel systemd[1]: sssd.service: Failed with result 'exit-code'.
[root@f25 ~]# 

</snip>

Jakub, I'm not able to see following specific log output from 'systemctl status sssd.service'.

<snip>
Nov 27 13:43:01 f25.fed.devel sssd[be[ipaf26.devel]][9931]: Failed to init
kerberos context [Included profile directory could not be read]
</snip>

I'm able to see above log from 'journalctl -xe'. Currently automation is stuck at this specific point whether above log is expected from 'systemctl status sssd.service' or not.

Comment 21 Jakub Hrozek 2018-07-23 19:55:51 UTC
According to comment #17 the status output did contain:
Jul 02 08:39:58 cypher.idmqe.lab.eng.bos.redhat.com sssd[be[sssd2016.com]][26737]: Failed to init Kerberos context [Included profile direct...ead]

Which is what you're looking for, right?

Comment 22 Steeve Goveas 2018-07-24 07:57:03 UTC
Jakub, the issue is that we get the expected error message with systemctl status command while testing in beaker, but the same grep fails when testing in openstack (downstream CI). The code has been modified to check the logs from journalctl instead, but do you know why this difference?

[ 11:42:06 ] :: [  BEGIN   ] :: Starting SSSD should fail :: actually running 'systemctl restart sssd.service'
 Job for sssd.service failed because the control process exited with error code. See "systemctl status sssd.service" and "journalctl -xe" for details.
[ 11:42:13 ] :: [   PASS   ] :: Starting SSSD should fail (Expected 1, got 1) 
[ 11:42:13 ] :: [  BEGIN   ] :: Running 'systemctl status sssd.service'
 ? sssd.service - System Security Services Daemon
    Loaded: loaded (/usr/lib/systemd/system/sssd.service; disabled; vendor preset: disabled)
    Active: failed (Result: exit-code) since Mon 2018-07-23 11:42:13 EDT; 48ms ago
   Process: 5642 ExecStart=/usr/sbin/sssd -i ${DEBUG_LOGGER} (code=exited, status=1/FAILURE)
  Main PID: 5642 (code=exited, status=1/FAILURE)
 Jul 23 11:42:06 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[5642]: Starting up
 Jul 23 11:42:07 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5643]: Starting up
 Jul 23 11:42:07 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5644]: Starting up
 Jul 23 11:42:09 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5645]: Starting up
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5651]: Starting up
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[5642]: Exiting the SSSD. Could not restart critical service [LDAP].
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com systemd[1]: sssd.service: main process exited, code=exited, status=1/FAILURE
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com systemd[1]: Failed to start System Security Services Daemon.
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com systemd[1]: Unit sssd.service entered failed state.
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com systemd[1]: sssd.service failed.
[ 11:42:13 ] :: [   FAIL   ] :: Command 'systemctl status sssd.service' (Expected 0, got 3) 
[ 11:42:13 ] :: [  BEGIN   ] :: Running 'journalctl -xe | grep -i 'Failed to init Kerberos context''
 Jul 23 11:42:07 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5643]: Failed to init Kerberos context [Included profile directory could not be read]
 Jul 23 11:42:07 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5644]: Failed to init Kerberos context [Included profile directory could not be read]
 Jul 23 11:42:09 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5645]: Failed to init Kerberos context [Included profile directory could not be read]
 Jul 23 11:42:13 host-8-252-208.host.centralci.eng.rdu2.redhat.com sssd[be[LDAP]][5651]: Failed to init Kerberos context [Included profile directory could not be read]
[ 11:42:13 ] :: [   PASS   ] :: Command 'journalctl -xe | grep -i 'Failed to init Kerberos context'' (Expected 0, got 0)

Comment 23 Jakub Hrozek 2018-07-24 08:36:45 UTC
Hmm, I don't know. Is the systemd version the same?

Comment 24 Steeve Goveas 2018-07-25 08:27:24 UTC
Dint check the version. CI pulls the latest available packages for 7.6. I think it will be more reliable to use journalctl for the logs.

Comment 25 Jakub Hrozek 2018-07-25 08:47:51 UTC
(In reply to Steeve Goveas from comment #24)
> Dint check the version. CI pulls the latest available packages for 7.6. I
> think it will be more reliable to use journalctl for the logs.

Perhaps. Maybe this would be worth checking with the systemd developers. From sssd's point of view, the error is sent to syslog(3), where the message then ends is up to systemd/journald. I also thought the critical error messages would end up in systemctl status. What sssd does code-wise is:

sss_log(SSS_LOG_CRIT, "Failed to init kerberos context [%s]\n", msg);

and sss_log calls either journal_send() or syslog().

Comment 26 Sumit Bose 2018-08-06 17:23:56 UTC
(In reply to Steeve Goveas from comment #24)
> Dint check the version. CI pulls the latest available packages for 7.6. I
> think it will be more reliable to use journalctl for the logs.

Is SSSD configured in both cases with the same logger option? It looks like '--logger=files' is used is the message is found by systemctl status. Does to Openstack setup use the same config?

Comment 31 errata-xmlrpc 2018-10-30 10:41:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3158