Description of problem: During lab testing, mass registration of IPA clients roughly on the same time leads to regular CPU load spikes on IPA servers, the load spikes are caused by all/most clients refreshing their LDAP connections (ldap_connection_expire_timeout) every 15 minutes. Version-Release number of selected component (if applicable): sssd-1.16.0-19 How reproducible: Always Steps to Reproduce: 1. Setup large IPA environment (10K+ Clients) with clients enrolled roughly at the same time. Actual results: IPA CPU load spikes every 15 minutes, even when the environment is idle. Expected results: LDAP Connection refresh interval can be randomized, by for example adding a sssd config parameter: ~~~ ldap_connection_timeout_factor = 120 ~~~ And a random value between 1-120 seconds will be added to ldap_connection_expire_timeout. Additional info: ldap_connection_expire_timeout (integer) Specifies a timeout (in seconds) that a connection to an LDAP server will be maintained. After this time, the connection will be re-established. If used in parallel with SASL/GSSAPI, the sooner of the two values (this value vs. the TGT lifetime) will be used. Default: 900 (15 minutes)
Upstream ticket: https://pagure.io/SSSD/sssd/issue/3630
Upstream PR https://github.com/SSSD/sssd/pull/951
Master: bd201746f8cf0e95615b3e98868555451b5e66b8
Sorry, moving back, to POST. Did not realize we are still waiting for the exception on this one.
sssd-krb5-common-2.2.3-16.el8.x86_64 sssd-winbind-idmap-2.2.3-16.el8.x86_64 sssd-ldap-2.2.3-16.el8.x86_64 sssd-kcm-2.2.3-16.el8.x86_64 sssd-common-pac-2.2.3-16.el8.x86_64 sssd-2.2.3-16.el8.x86_64 sssd-nfs-idmap-2.2.3-16.el8.x86_64 python3-sssdconfig-2.2.3-16.el8.noarch sssd-common-2.2.3-16.el8.x86_64 sssd-ad-2.2.3-16.el8.x86_64 sssd-proxy-2.2.3-16.el8.x86_64 sssd-tools-2.2.3-16.el8.x86_64 sssd-client-2.2.3-16.el8.x86_64 sssd-krb5-2.2.3-16.el8.x86_64 sssd-dbus-2.2.3-16.el8.x86_64 sssd-ipa-2.2.3-16.el8.x86_64 1. Setup 389-ds 2. Setup kerberos server 3. Configure client sssd.conf as given below: [sssd] config_file_version = 2 services = nss, pam domains = example1 [domain/example1] ldap_search_base = dc=example,dc=test id_provider = ldap auth_provider = krb5 ldap_user_home_directory = /home/%u ldap_uri = ldaps://vm-10-0-153-149.hosted.upshift.rdu2.redhat.com ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem use_fully_qualified_names = False debug_level = 9 ldap_sasl_mech = GSSAPI krb5_realm = EXAMPLE.TEST krb5_server = vm-10-0-153-149.hosted.upshift.rdu2.redhat.com 5. Run getent passwd and query users at every 20 second delay: [root@vm-10-0-154-50 ~]# sss_cache -E;getent passwd foo8@example1 foo8:*:14583108:14564100:foo8 User::/bin/bash 6. Run netstat to check the established connection status to ldap server tcp 0 0 192.168.122.238:60436 192.168.122.5:636 ESTABLISHED 7730/sssd_be tcp 0 0 192.168.122.238:60436 192.168.122.5:636 ESTABLISHED 7730/sssd_be tcp 0 0 192.168.122.238:60436 192.168.122.5:636 ESTABLISHED 7730/sssd_be tcp 0 0 192.168.122.238:60436 192.168.122.5:636 ESTABLISHED 7730/sssd_be tcp 0 0 192.168.122.238:60436 192.168.122.5:636 ESTABLISHED 7730/sssd_be 7. Run the lookup query after 20 second delay + sss_cache -E + getent passwd foo1@example1 foo1:*:14583101:14564100:foo1 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo2@example1 foo2:*:14583102:14564100:foo2 User::/bin/bash + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo3@example1 foo3:*:14583103:14564100:foo3 User::/bin/bash + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo4@example1 foo4:*:14583104:14564100:foo4 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo5@example1 foo5:*:14583105:14564100:foo5 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo6@example1 foo6:*:14583106:14564100:foo6 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo7@example1 foo7:*:14583107:14564100:foo7 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo8@example1 foo8:*:14583108:14564100:foo8 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo9@example1 foo9:*:14583109:14564100:foo9 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo10@example1 + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 + sss_cache -E + getent passwd foo11@example1 + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60438 192.168.122.5:636 ESTABLISHED 7730/sssd_be + sleep 20 As we can see from the above when ldap_connection_expire_timeout and ldap_connection_expire_offset are not set the client port of the tcp connection is same. set the below in sssd.conf in domain section ldap_connection_expire_timeout = 15 ldap_connection_expire_offset = 5 Restart sssd + sss_cache -E + getent passwd foo1@example1 foo1:*:14583101:14564100:foo1 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60442 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo2@example1 foo2:*:14583102:14564100:foo2 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60446 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo3@example1 foo3:*:14583103:14564100:foo3 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60448 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo4@example1 foo4:*:14583104:14564100:foo4 User::/bin/bash + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60450 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo5@example1 foo5:*:14583105:14564100:foo5 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60452 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo6@example1 foo6:*:14583106:14564100:foo6 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60454 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo7@example1 foo7:*:14583107:14564100:foo7 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60456 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo8@example1 foo8:*:14583108:14564100:foo8 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60458 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo9@example1 foo9:*:14583109:14564100:foo9 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60460 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo10@example1 + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60462 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 + sss_cache -E + getent passwd foo11@example1 + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60464 192.168.122.5:636 ESTABLISHED 8002/sssd_be + sleep 20 As we can see the client port of the tcp connection changes with every lookup.
Using below configuration: [sssd] config_file_version = 2 services = nss, pam domains = example1 [domain/example1] ldap_search_base = dc=example,dc=test id_provider = ldap auth_provider = krb5 ldap_user_home_directory = /home/%u ldap_uri = ldaps://vm-10-0-153-149.hosted.upshift.rdu2.redhat.com ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem use_fully_qualified_names = False debug_level = 9 ldap_sasl_mech = GSSAPI krb5_realm = EXAMPLE.TEST krb5_server = vm-10-0-153-149.hosted.upshift.rdu2.redhat.com ldap_connection_expire_timeout = 15 ldap_connection_expire_offset = 5 Using below script for i in `seq 1 10`;do sss_cache -E;getent passwd foo$i@example1 netstat -plantu | grep sss sleep 17 done + for i in `seq 1 10` + sss_cache -E + getent passwd foo1@example1 foo1:*:14583101:14564100:foo1 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60886 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo2@example1 foo2:*:14583102:14564100:foo2 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60888 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo3@example1 foo3:*:14583103:14564100:foo3 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60890 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo4@example1 foo4:*:14583104:14564100:foo4 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60892 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo5@example1 foo5:*:14583105:14564100:foo5 User::/bin/bash + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60894 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo6@example1 foo6:*:14583106:14564100:foo6 User::/bin/bash + grep sss + netstat -plantu tcp 0 0 192.168.122.238:60896 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo7@example1 foo7:*:14583107:14564100:foo7 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60898 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo8@example1 foo8:*:14583108:14564100:foo8 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60900 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo9@example1 foo9:*:14583109:14564100:foo9 User::/bin/bash + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60902 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17 + for i in `seq 1 10` + sss_cache -E + getent passwd foo10@example1 + netstat -plantu + grep sss tcp 0 0 192.168.122.238:60904 192.168.122.5:636 ESTABLISHED 9730/sssd_be + sleep 17
Tomas, should re-connection time be randomized at every re-connect or once during startup?
yes, re-connection is also randomized. To test I would suggest to set ldap_connection_expire_timeout = 2 ldap_connection_expire_offset = 15 and watch whether re-connection occurs somewhere between 2 and 17s. Anyway the randomized expire timeout is also written in log when connecting... DEBUG(SSSDBG_CONF_SETTINGS, "expire timeout is %d\n", expire_timeout); Might be easier and more reliable to look in the log.
(In reply to Tomas Halman from comment #21) > Anyway the randomized expire timeout is also written in log when > connecting... > > DEBUG(SSSDBG_CONF_SETTINGS, "expire timeout is %d\n", expire_timeout); > > Might be easier and more reliable to look in the log. No, that's wrong. One can't assert that sw behaves as expected relying on logs only :)
Following script was used to check the expire connection timeout #!/bin/bash USER_TO_RESOLVE=foo1@example1 get_addr_port() { netstat -plantu | awk '/ESTABLISHED.+sss/{ print $4 }' } trigger_ldap() { sss_cache -E; id $USER_TO_RESOLVE >/dev/null 2>&1 echo -n "." } trigger_ldap port=`get_addr_port` for a in `seq 1 10`; do SECONDS=0 while [ "$port" = "`get_addr_port`" ] ; do sleep 1 trigger_ldap done echo "" port=`get_addr_port` echo $SECONDS done [root@vm-idm-037 ~]# sh check_sss_timeout.sh .. 2 .............................................................. 85 . 9 . 3 . 8 . 6 . 6 . 5 . 6 . 2 From the logs the expire timeout is logged: (Tue Feb 25 22:24:59 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649702 (Tue Feb 25 22:25:01 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] (0x4000): releasing expired cached connection (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736102] (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 13 (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649716 (Tue Feb 25 22:25:08 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736110] (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 9 (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649720 (Tue Feb 25 22:25:13 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] (0x4000): releasing expired cached connection (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736114] (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 11 (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649725 (Tue Feb 25 22:25:17 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736119] (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 11 (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649730 (Tue Feb 25 22:25:22 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736125] (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 13 (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649738 (Tue Feb 25 22:25:30 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736131] (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 12 (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649743 (Tue Feb 25 22:25:35 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736136] (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 10 (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649746 (Tue Feb 25 22:25:40 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] (0x4000): releasing expired cached connection (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736203] (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 10 (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649813 (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736206] (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 14 (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649820 (Tue Feb 25 22:26:52 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(In reply to Niranjan Mallapadi Raghavender from comment #23) > Following script was used to check the expire connection timeout > > #!/bin/bash > > USER_TO_RESOLVE=foo1@example1 > > get_addr_port() { > netstat -plantu | awk '/ESTABLISHED.+sss/{ print $4 }' > } > > trigger_ldap() { > sss_cache -E; id $USER_TO_RESOLVE >/dev/null 2>&1 > echo -n "." > } > > trigger_ldap > port=`get_addr_port` > for a in `seq 1 10`; do > SECONDS=0 > while [ "$port" = "`get_addr_port`" ] ; do > sleep 1 > trigger_ldap > done > echo "" > port=`get_addr_port` > echo $SECONDS > done > What config settings were used: ldap_connection_expire_timeout = 2 ldap_connection_expire_offset = 15 ? > > > [root@vm-idm-037 ~]# sh check_sss_timeout.sh > .. > 2 > .............................................................. > 85 85 doesn't fit (2+15) limit. > . > 9 > . > 3 > . > 8 > . > 6 > . > 6 > . > 5 > . > 6 > . > 2 > > From the logs the expire timeout is logged: > Do those logs correspond above script output? Values logged here (13, 9, 11, 11, 13, 12, 10 ,10, 14) do not match value output by script (2, 85, 9, 3, 8, 6, 6, 5, 6, 2). > > > (Tue Feb 25 22:24:59 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649702 > (Tue Feb 25 22:25:01 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] > (0x4000): releasing expired cached connection > (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736102] > (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 13 > (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649716 > (Tue Feb 25 22:25:08 2020) [sssd[be[example1]]] > [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, > releasing it > (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736110] > (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 9 > (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649720 > (Tue Feb 25 22:25:13 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] > (0x4000): releasing expired cached connection > (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736114] > (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 11 > (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649725 > (Tue Feb 25 22:25:17 2020) [sssd[be[example1]]] > [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, > releasing it > (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736119] > (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 11 > (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649730 > (Tue Feb 25 22:25:22 2020) [sssd[be[example1]]] > [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, > releasing it > (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736125] > (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 13 > (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649738 > (Tue Feb 25 22:25:30 2020) [sssd[be[example1]]] > [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, > releasing it > (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736131] > (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 12 > (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649743 > (Tue Feb 25 22:25:35 2020) [sssd[be[example1]]] > [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, > releasing it > (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736136] > (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 10 > (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649746 > (Tue Feb 25 22:25:40 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] > (0x4000): releasing expired cached connection > (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736203] > (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 10 > (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649813 > (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] > (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], > expired on [1582736206] > (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x0100): expire timeout is 14 > (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step] > (0x1000): the connection will expire at 1582649820 > (Tue Feb 25 22:26:52 2020) [sssd[be[example1]]] > [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, > releasing it
Retesting this with below configuration and version: python3-sssdconfig-2.2.3-18.el8.noarch sssd-nfs-idmap-2.2.3-18.el8.x86_64 sssd-winbind-idmap-2.2.3-18.el8.x86_64 sssd-common-2.2.3-18.el8.x86_64 sssd-dbus-2.2.3-18.el8.x86_64 sssd-krb5-common-2.2.3-18.el8.x86_64 sssd-krb5-2.2.3-18.el8.x86_64 sssd-ad-2.2.3-18.el8.x86_64 sssd-2.2.3-18.el8.x86_64 sssd-tools-2.2.3-18.el8.x86_64 sssd-common-pac-2.2.3-18.el8.x86_64 sssd-client-2.2.3-18.el8.x86_64 sssd-ldap-2.2.3-18.el8.x86_64 sssd-kcm-2.2.3-18.el8.x86_64 sssd-proxy-2.2.3-18.el8.x86_64 sssd-ipa-2.2.3-18.el8.x86_64 cat /etc/sssd/sssd.conf [sssd] config_file_version = 2 services = nss, pam domains = example1 [domain/example1] ldap_search_base = dc=example,dc=test id_provider = ldap auth_provider = krb5 ldap_user_home_directory = /home/%u ldap_uri = ldaps://kvm-06-guest31.hv2.lab.eng.bos.redhat.com ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem use_fully_qualified_names = False debug_level = 9 ldap_sasl_mech = GSSAPI krb5_realm = EXAMPLE.TEST krb5_server = kvm-06-guest31.hv2.lab.eng.bos.redhat.com ldap_connection_expire_timeout = 60 ldap_connection_expire_offset = 10 From another terminal we are querying user foo1 every two minutes: sss_cache -E; getent passwd foo1@example1 the following was noted in the logs after some time: [root@kvm-02-guest09 ~]# grep expire /var/log/sssd/sssd_example1.log (Wed Mar 4 13:54:00 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value (Wed Mar 4 13:54:00 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 900 (Wed Mar 4 13:54:00 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 0 (Wed Mar 4 13:54:00 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire (Wed Mar 4 13:54:00 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires (Wed Mar 4 13:54:01 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900 (Wed Mar 4 13:54:01 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583327341 (Wed Mar 4 13:55:01 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value (Wed Mar 4 13:55:01 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 900 (Wed Mar 4 13:55:01 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 0 (Wed Mar 4 13:55:01 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire (Wed Mar 4 13:55:01 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires (Wed Mar 4 13:55:12 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583412912] (Wed Mar 4 13:55:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900 (Wed Mar 4 13:55:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583327412 (Wed Mar 4 14:10:04 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:10:12 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583413812] (Wed Mar 4 14:10:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900 (Wed Mar 4 14:10:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328312 (Wed Mar 4 14:18:14 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value (Wed Mar 4 14:18:14 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 60 (Wed Mar 4 14:18:14 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 10 (Wed Mar 4 14:18:14 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire (Wed Mar 4 14:18:14 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires (Wed Mar 4 14:18:25 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414305] (Wed Mar 4 14:18:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 63 (Wed Mar 4 14:18:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583327968 (Wed Mar 4 14:19:20 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:23:46 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value (Wed Mar 4 14:23:46 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 60 (Wed Mar 4 14:23:46 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 10 (Wed Mar 4 14:23:46 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire (Wed Mar 4 14:23:46 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires (Wed Mar 4 14:23:57 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414637] (Wed Mar 4 14:23:57 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 63 (Wed Mar 4 14:23:57 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328300 (Wed Mar 4 14:24:52 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:27:05 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value (Wed Mar 4 14:27:05 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 60 (Wed Mar 4 14:27:05 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 10 (Wed Mar 4 14:27:05 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire (Wed Mar 4 14:27:05 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires (Wed Mar 4 14:27:16 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414836] (Wed Mar 4 14:27:16 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 67 (Wed Mar 4 14:27:16 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328503 (Wed Mar 4 14:28:15 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:29:33 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414973] (Wed Mar 4 14:29:33 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 69 (Wed Mar 4 14:29:33 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328642 (Wed Mar 4 14:30:34 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:31:34 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583415094] (Wed Mar 4 14:31:34 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 67 (Wed Mar 4 14:31:34 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328761 (Wed Mar 4 14:32:33 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:33:36 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583415216] (Wed Mar 4 14:33:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 70 (Wed Mar 4 14:33:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328886 (Wed Mar 4 14:34:38 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it (Wed Mar 4 14:35:38 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583415336] (Wed Mar 4 14:35:38 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 62 (Wed Mar 4 14:35:38 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583329000 (Wed Mar 4 14:36:32 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
Based on the results in comment #comment 34 , I would say with expire timeout as 60 and offset as 10 , the results seem to be consistent, where the expire timeout is at-least not below 52 seconds considering the default ldap_opt_timeout is 8 seconds. Marking it as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:1863