Bug 1583592

Summary: [RFE] Add configurable randomness to SSSD ldap connection timeout
Product: Red Hat Enterprise Linux 8 Reporter: Ahmed Nazmy <anazmy>
Component: sssdAssignee: Tomas Halman <thalman>
Status: CLOSED ERRATA QA Contact: sssd-qe <sssd-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.0CC: atikhono, dpal, grajaiya, hartsjc, jhrozek, jwboyer, kbanerje, lslebodn, mkosek, mniranja, mzidek, pbrezina, sgoveas, thalman, therman, tscherf
Target Milestone: pre-dev-freezeKeywords: FutureFeature
Target Release: 8.2   
Hardware: All   
OS: Linux   
Whiteboard: sync-to-jira
Fixed In Version: sssd-2.2.3-10.el8 Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-28 16:55:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Ahmed Nazmy 2018-05-29 10:17:33 UTC
Description of problem:

During lab testing, mass registration of IPA clients roughly on the same time leads to regular CPU load spikes on IPA servers, the load spikes are caused by all/most clients refreshing their LDAP connections (ldap_connection_expire_timeout) every 15 minutes.

Version-Release number of selected component (if applicable):
sssd-1.16.0-19

How reproducible:
Always

Steps to Reproduce:
1. Setup large IPA environment (10K+ Clients) with clients enrolled roughly at the same time.

Actual results:
IPA CPU load spikes every 15 minutes, even when the environment is idle.

Expected results:

LDAP Connection refresh interval can be randomized, by for example adding a sssd config parameter: 
~~~
ldap_connection_timeout_factor = 120 
~~~
And a random value between 1-120 seconds will be added to ldap_connection_expire_timeout.

Additional info:

ldap_connection_expire_timeout (integer) 
  Specifies a timeout (in seconds) that a connection to an LDAP server will be maintained. After this time, the connection will be re-established. If used in parallel with SASL/GSSAPI, the sooner of the two values (this value vs. the TGT lifetime) will be used.
Default: 900 (15 minutes)

Comment 2 Jakub Hrozek 2018-06-07 03:29:24 UTC
Upstream ticket:
https://pagure.io/SSSD/sssd/issue/3630

Comment 6 Tomas Halman 2019-12-02 10:28:14 UTC
Upstream PR https://github.com/SSSD/sssd/pull/951

Comment 7 Alexey Tikhonov 2019-12-05 10:39:46 UTC
Master: bd201746f8cf0e95615b3e98868555451b5e66b8

Comment 12 Michal Zidek 2019-12-15 20:45:26 UTC
Sorry, moving back, to POST. Did not realize we are still waiting for the exception on this one.

Comment 18 Niranjan Mallapadi Raghavender 2020-02-24 11:31:28 UTC
sssd-krb5-common-2.2.3-16.el8.x86_64
sssd-winbind-idmap-2.2.3-16.el8.x86_64
sssd-ldap-2.2.3-16.el8.x86_64
sssd-kcm-2.2.3-16.el8.x86_64
sssd-common-pac-2.2.3-16.el8.x86_64
sssd-2.2.3-16.el8.x86_64
sssd-nfs-idmap-2.2.3-16.el8.x86_64
python3-sssdconfig-2.2.3-16.el8.noarch
sssd-common-2.2.3-16.el8.x86_64
sssd-ad-2.2.3-16.el8.x86_64
sssd-proxy-2.2.3-16.el8.x86_64
sssd-tools-2.2.3-16.el8.x86_64
sssd-client-2.2.3-16.el8.x86_64
sssd-krb5-2.2.3-16.el8.x86_64
sssd-dbus-2.2.3-16.el8.x86_64
sssd-ipa-2.2.3-16.el8.x86_64

1. Setup 389-ds 
2. Setup kerberos server
3. Configure client sssd.conf as given below:

[sssd]
config_file_version = 2
services = nss, pam
domains = example1

[domain/example1]
ldap_search_base = dc=example,dc=test
id_provider = ldap
auth_provider = krb5
ldap_user_home_directory = /home/%u
ldap_uri = ldaps://vm-10-0-153-149.hosted.upshift.rdu2.redhat.com
ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem
use_fully_qualified_names = False
debug_level = 9
ldap_sasl_mech = GSSAPI
krb5_realm = EXAMPLE.TEST
krb5_server = vm-10-0-153-149.hosted.upshift.rdu2.redhat.com

5. Run getent passwd and query users at every 20 second delay:


[root@vm-10-0-154-50 ~]# sss_cache -E;getent passwd foo8@example1
foo8:*:14583108:14564100:foo8 User::/bin/bash


6. Run netstat to check the established connection status to ldap server
tcp        0      0 192.168.122.238:60436   192.168.122.5:636       ESTABLISHED 7730/sssd_be
tcp        0      0 192.168.122.238:60436   192.168.122.5:636       ESTABLISHED 7730/sssd_be
tcp        0      0 192.168.122.238:60436   192.168.122.5:636       ESTABLISHED 7730/sssd_be
tcp        0      0 192.168.122.238:60436   192.168.122.5:636       ESTABLISHED 7730/sssd_be
tcp        0      0 192.168.122.238:60436   192.168.122.5:636       ESTABLISHED 7730/sssd_be

7. Run the lookup query after 20 second delay

+ sss_cache -E
+ getent passwd foo1@example1
foo1:*:14583101:14564100:foo1 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo2@example1
foo2:*:14583102:14564100:foo2 User::/bin/bash
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo3@example1
foo3:*:14583103:14564100:foo3 User::/bin/bash
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo4@example1
foo4:*:14583104:14564100:foo4 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo5@example1
foo5:*:14583105:14564100:foo5 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo6@example1
foo6:*:14583106:14564100:foo6 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo7@example1
foo7:*:14583107:14564100:foo7 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo8@example1
foo8:*:14583108:14564100:foo8 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo9@example1
foo9:*:14583109:14564100:foo9 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo10@example1
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo11@example1
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60438   192.168.122.5:636       ESTABLISHED 7730/sssd_be
+ sleep 20

As we can see from the above when ldap_connection_expire_timeout and ldap_connection_expire_offset are not set the 
client port of the tcp connection is same. 

set the below in sssd.conf in domain section 

ldap_connection_expire_timeout = 15
ldap_connection_expire_offset = 5

Restart sssd

+ sss_cache -E
+ getent passwd foo1@example1
foo1:*:14583101:14564100:foo1 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60442   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo2@example1
foo2:*:14583102:14564100:foo2 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60446   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo3@example1
foo3:*:14583103:14564100:foo3 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60448   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo4@example1
foo4:*:14583104:14564100:foo4 User::/bin/bash
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60450   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo5@example1
foo5:*:14583105:14564100:foo5 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60452   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo6@example1
foo6:*:14583106:14564100:foo6 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60454   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo7@example1
foo7:*:14583107:14564100:foo7 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60456   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo8@example1
foo8:*:14583108:14564100:foo8 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60458   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo9@example1
foo9:*:14583109:14564100:foo9 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60460   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo10@example1
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60462   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20
+ sss_cache -E
+ getent passwd foo11@example1
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60464   192.168.122.5:636       ESTABLISHED 8002/sssd_be
+ sleep 20


As we can see the client port of the tcp connection changes with every lookup.

Comment 19 Niranjan Mallapadi Raghavender 2020-02-25 03:43:07 UTC
Using below configuration:

[sssd]
config_file_version = 2
services = nss, pam
domains = example1

[domain/example1]
ldap_search_base = dc=example,dc=test
id_provider = ldap
auth_provider = krb5
ldap_user_home_directory = /home/%u
ldap_uri = ldaps://vm-10-0-153-149.hosted.upshift.rdu2.redhat.com
ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem
use_fully_qualified_names = False
debug_level = 9
ldap_sasl_mech = GSSAPI
krb5_realm = EXAMPLE.TEST
krb5_server = vm-10-0-153-149.hosted.upshift.rdu2.redhat.com
ldap_connection_expire_timeout = 15
ldap_connection_expire_offset = 5


Using below script

for i in `seq 1 10`;do
sss_cache -E;getent passwd foo$i@example1
netstat -plantu  | grep sss
sleep 17
done

+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo1@example1
foo1:*:14583101:14564100:foo1 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60886   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo2@example1
foo2:*:14583102:14564100:foo2 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60888   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo3@example1
foo3:*:14583103:14564100:foo3 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60890   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo4@example1
foo4:*:14583104:14564100:foo4 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60892   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo5@example1
foo5:*:14583105:14564100:foo5 User::/bin/bash
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60894   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo6@example1
foo6:*:14583106:14564100:foo6 User::/bin/bash
+ grep sss
+ netstat -plantu
tcp        0      0 192.168.122.238:60896   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo7@example1
foo7:*:14583107:14564100:foo7 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60898   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo8@example1
foo8:*:14583108:14564100:foo8 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60900   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo9@example1
foo9:*:14583109:14564100:foo9 User::/bin/bash
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60902   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17
+ for i in `seq 1 10`
+ sss_cache -E
+ getent passwd foo10@example1
+ netstat -plantu
+ grep sss
tcp        0      0 192.168.122.238:60904   192.168.122.5:636       ESTABLISHED 9730/sssd_be
+ sleep 17

Comment 20 Alexey Tikhonov 2020-02-25 14:07:16 UTC
 Tomas, should re-connection time be randomized at every re-connect or once during startup?

Comment 21 Tomas Halman 2020-02-25 14:45:09 UTC
yes, re-connection is also randomized.

To test I would suggest to set

    ldap_connection_expire_timeout = 2
    ldap_connection_expire_offset = 15 

and watch whether re-connection occurs somewhere between 2 and 17s.

Anyway the randomized expire timeout is also written in log when connecting...

    DEBUG(SSSDBG_CONF_SETTINGS, "expire timeout is %d\n", expire_timeout);

Might be easier and more reliable to look in the log.

Comment 22 Alexey Tikhonov 2020-02-25 15:04:54 UTC
(In reply to Tomas Halman from comment #21)
> Anyway the randomized expire timeout is also written in log when
> connecting...
> 
>     DEBUG(SSSDBG_CONF_SETTINGS, "expire timeout is %d\n", expire_timeout);
> 
> Might be easier and more reliable to look in the log.

No, that's wrong. One can't assert that sw behaves as expected relying on logs only :)

Comment 23 Niranjan Mallapadi Raghavender 2020-02-25 17:02:41 UTC
Following script was used to check the expire connection timeout

#!/bin/bash

USER_TO_RESOLVE=foo1@example1

get_addr_port() {
    netstat -plantu  | awk '/ESTABLISHED.+sss/{ print $4 }'
}

trigger_ldap() {
    sss_cache -E; id $USER_TO_RESOLVE >/dev/null 2>&1
    echo -n "."
}

trigger_ldap
port=`get_addr_port`
for a in `seq 1 10`; do
    SECONDS=0
    while [ "$port" = "`get_addr_port`" ] ; do
        sleep 1
        trigger_ldap
    done
    echo ""
    port=`get_addr_port`
    echo $SECONDS
done



[root@vm-idm-037 ~]# sh  check_sss_timeout.sh
..
2
..............................................................
85
.
9
.
3
.
8
.
6
.
6
.
5
.
6
.
2

From the logs the expire timeout is logged:



(Tue Feb 25 22:24:59 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649702
(Tue Feb 25 22:25:01 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] (0x4000): releasing expired cached connection
(Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736102]
(Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 13
(Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649716
(Tue Feb 25 22:25:08 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736110]
(Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 9
(Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649720
(Tue Feb 25 22:25:13 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] (0x4000): releasing expired cached connection
(Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736114]
(Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 11
(Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649725
(Tue Feb 25 22:25:17 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736119]
(Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 11
(Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649730
(Tue Feb 25 22:25:22 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736125]
(Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 13
(Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649738
(Tue Feb 25 22:25:30 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736131]
(Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 12
(Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649743
(Tue Feb 25 22:25:35 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736136]
(Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 10
(Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649746
(Tue Feb 25 22:25:40 2020) [sssd[be[example1]]] [sdap_id_op_connect_step] (0x4000): releasing expired cached connection
(Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736203]
(Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 10
(Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649813
(Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1582736206]
(Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 14
(Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1582649820
(Tue Feb 25 22:26:52 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it

Comment 24 Alexey Tikhonov 2020-02-26 09:48:22 UTC
(In reply to Niranjan Mallapadi Raghavender from comment #23)
> Following script was used to check the expire connection timeout
> 
> #!/bin/bash
> 
> USER_TO_RESOLVE=foo1@example1
> 
> get_addr_port() {
>     netstat -plantu  | awk '/ESTABLISHED.+sss/{ print $4 }'
> }
> 
> trigger_ldap() {
>     sss_cache -E; id $USER_TO_RESOLVE >/dev/null 2>&1
>     echo -n "."
> }
> 
> trigger_ldap
> port=`get_addr_port`
> for a in `seq 1 10`; do
>     SECONDS=0
>     while [ "$port" = "`get_addr_port`" ] ; do
>         sleep 1
>         trigger_ldap
>     done
>     echo ""
>     port=`get_addr_port`
>     echo $SECONDS
> done
> 


What config settings were used:
    ldap_connection_expire_timeout = 2
    ldap_connection_expire_offset = 15 
?


> 
> 
> [root@vm-idm-037 ~]# sh  check_sss_timeout.sh
> ..
> 2
> ..............................................................
> 85

85 doesn't fit (2+15) limit.

> .
> 9
> .
> 3
> .
> 8
> .
> 6
> .
> 6
> .
> 5
> .
> 6
> .
> 2
> 
> From the logs the expire timeout is logged:
> 


Do those logs correspond above script output?
Values logged here (13, 9, 11, 11, 13, 12, 10 ,10, 14) do not match value output by script (2, 85, 9, 3, 8, 6, 6, 5, 6, 2).

> 
> 
> (Tue Feb 25 22:24:59 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649702
> (Tue Feb 25 22:25:01 2020) [sssd[be[example1]]] [sdap_id_op_connect_step]
> (0x4000): releasing expired cached connection
> (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736102]
> (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 13
> (Tue Feb 25 22:25:03 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649716
> (Tue Feb 25 22:25:08 2020) [sssd[be[example1]]]
> [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
> releasing it
> (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736110]
> (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 9
> (Tue Feb 25 22:25:11 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649720
> (Tue Feb 25 22:25:13 2020) [sssd[be[example1]]] [sdap_id_op_connect_step]
> (0x4000): releasing expired cached connection
> (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736114]
> (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 11
> (Tue Feb 25 22:25:14 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649725
> (Tue Feb 25 22:25:17 2020) [sssd[be[example1]]]
> [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
> releasing it
> (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736119]
> (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 11
> (Tue Feb 25 22:25:19 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649730
> (Tue Feb 25 22:25:22 2020) [sssd[be[example1]]]
> [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
> releasing it
> (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736125]
> (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 13
> (Tue Feb 25 22:25:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649738
> (Tue Feb 25 22:25:30 2020) [sssd[be[example1]]]
> [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
> releasing it
> (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736131]
> (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 12
> (Tue Feb 25 22:25:31 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649743
> (Tue Feb 25 22:25:35 2020) [sssd[be[example1]]]
> [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
> releasing it
> (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736136]
> (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 10
> (Tue Feb 25 22:25:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649746
> (Tue Feb 25 22:25:40 2020) [sssd[be[example1]]] [sdap_id_op_connect_step]
> (0x4000): releasing expired cached connection
> (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736203]
> (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 10
> (Tue Feb 25 22:26:43 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649813
> (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_get_tgt_recv]
> (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST],
> expired on [1582736206]
> (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x0100): expire timeout is 14
> (Tue Feb 25 22:26:46 2020) [sssd[be[example1]]] [sdap_cli_auth_step]
> (0x1000): the connection will expire at 1582649820
> (Tue Feb 25 22:26:52 2020) [sssd[be[example1]]]
> [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
> releasing it

Comment 34 Niranjan Mallapadi Raghavender 2020-03-04 13:38:09 UTC
Retesting this with below configuration and version:
python3-sssdconfig-2.2.3-18.el8.noarch
sssd-nfs-idmap-2.2.3-18.el8.x86_64
sssd-winbind-idmap-2.2.3-18.el8.x86_64
sssd-common-2.2.3-18.el8.x86_64
sssd-dbus-2.2.3-18.el8.x86_64
sssd-krb5-common-2.2.3-18.el8.x86_64
sssd-krb5-2.2.3-18.el8.x86_64
sssd-ad-2.2.3-18.el8.x86_64
sssd-2.2.3-18.el8.x86_64
sssd-tools-2.2.3-18.el8.x86_64
sssd-common-pac-2.2.3-18.el8.x86_64
sssd-client-2.2.3-18.el8.x86_64
sssd-ldap-2.2.3-18.el8.x86_64
sssd-kcm-2.2.3-18.el8.x86_64
sssd-proxy-2.2.3-18.el8.x86_64
sssd-ipa-2.2.3-18.el8.x86_64

cat /etc/sssd/sssd.conf
[sssd]
config_file_version = 2
services = nss, pam
domains = example1

[domain/example1]
ldap_search_base = dc=example,dc=test
id_provider = ldap
auth_provider = krb5
ldap_user_home_directory = /home/%u
ldap_uri = ldaps://kvm-06-guest31.hv2.lab.eng.bos.redhat.com
ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem
use_fully_qualified_names = False
debug_level = 9
ldap_sasl_mech = GSSAPI
krb5_realm = EXAMPLE.TEST
krb5_server = kvm-06-guest31.hv2.lab.eng.bos.redhat.com
ldap_connection_expire_timeout = 60
ldap_connection_expire_offset = 10


From another terminal we are querying user foo1 every two minutes:
sss_cache -E; getent passwd foo1@example1

the following was noted in the logs after some time:

[root@kvm-02-guest09 ~]# grep expire /var/log/sssd/sssd_example1.log
(Wed Mar  4 13:54:00 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value
(Wed Mar  4 13:54:00 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 900
(Wed Mar  4 13:54:00 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 0
(Wed Mar  4 13:54:00 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire
(Wed Mar  4 13:54:00 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires
(Wed Mar  4 13:54:01 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900
(Wed Mar  4 13:54:01 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583327341
(Wed Mar  4 13:55:01 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value
(Wed Mar  4 13:55:01 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 900
(Wed Mar  4 13:55:01 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 0
(Wed Mar  4 13:55:01 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire
(Wed Mar  4 13:55:01 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires
(Wed Mar  4 13:55:12 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583412912]
(Wed Mar  4 13:55:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900
(Wed Mar  4 13:55:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583327412
(Wed Mar  4 14:10:04 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:10:12 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583413812]
(Wed Mar  4 14:10:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900
(Wed Mar  4 14:10:12 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328312
(Wed Mar  4 14:18:14 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value
(Wed Mar  4 14:18:14 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 60
(Wed Mar  4 14:18:14 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 10
(Wed Mar  4 14:18:14 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire
(Wed Mar  4 14:18:14 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires
(Wed Mar  4 14:18:25 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414305]
(Wed Mar  4 14:18:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 63
(Wed Mar  4 14:18:25 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583327968
(Wed Mar  4 14:19:20 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:23:46 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value
(Wed Mar  4 14:23:46 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 60
(Wed Mar  4 14:23:46 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 10
(Wed Mar  4 14:23:46 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire
(Wed Mar  4 14:23:46 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires
(Wed Mar  4 14:23:57 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414637]
(Wed Mar  4 14:23:57 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 63
(Wed Mar  4 14:23:57 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328300
(Wed Mar  4 14:24:52 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:27:05 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_account_expire_policy has no value
(Wed Mar  4 14:27:05 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_timeout has value 60
(Wed Mar  4 14:27:05 2020) [sssd[be[example1]]] [dp_get_options] (0x0400): Option ldap_connection_expire_offset has value 10
(Wed Mar  4 14:27:05 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_shadow_expire has value shadowExpire
(Wed Mar  4 14:27:05 2020) [sssd[be[example1]]] [sdap_get_map] (0x0400): Option ldap_user_ad_account_expires has value accountExpires
(Wed Mar  4 14:27:16 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414836]
(Wed Mar  4 14:27:16 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 67
(Wed Mar  4 14:27:16 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328503
(Wed Mar  4 14:28:15 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:29:33 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583414973]
(Wed Mar  4 14:29:33 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 69
(Wed Mar  4 14:29:33 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328642
(Wed Mar  4 14:30:34 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:31:34 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583415094]
(Wed Mar  4 14:31:34 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 67
(Wed Mar  4 14:31:34 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328761
(Wed Mar  4 14:32:33 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:33:36 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583415216]
(Wed Mar  4 14:33:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 70
(Wed Mar  4 14:33:36 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583328886
(Wed Mar  4 14:34:38 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it
(Wed Mar  4 14:35:38 2020) [sssd[be[example1]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_EXAMPLE.TEST], expired on [1583415336]
(Wed Mar  4 14:35:38 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x0100): expire timeout is 62
(Wed Mar  4 14:35:38 2020) [sssd[be[example1]]] [sdap_cli_auth_step] (0x1000): the connection will expire at 1583329000
(Wed Mar  4 14:36:32 2020) [sssd[be[example1]]] [sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire, releasing it

Comment 37 Niranjan Mallapadi Raghavender 2020-03-04 17:18:22 UTC
Based on the results in comment #comment 34 , I would say with expire timeout as 60 and offset as 10 , 
the results seem to be consistent, where the expire timeout is at-least not  below 52 seconds considering
the default ldap_opt_timeout is 8 seconds.  Marking it as verified.

Comment 39 errata-xmlrpc 2020-04-28 16:55:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1863