Bug 1851112
Summary: | LDAP bind can fail due to unconfigurable DNS server timeouts that inhibit SSSD failover [rhel-7.9.z] | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | David Ward <david.ward> |
Component: | sssd | Assignee: | SSSD Maintainers <sssd-maint> |
Status: | CLOSED ERRATA | QA Contact: | Dan Lavu <dlavu> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.9 | CC: | atikhono, dlavu, g63it, grajaiya, jhrozek, jreznik, lslebodn, mzidek, pbrezina, sgoveas, thalman, tscherf |
Target Milestone: | rc | Keywords: | Triaged, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | sync-to-jira qetodo | ||
Fixed In Version: | sssd-1.16.5-10.el7_9.6 | Doc Type: | If docs needed, set a value |
Doc Text: |
If this bug requires documentation, please select an appropriate Doc Type value.sssd-1.16.5-10.el7_9.5.x86_64
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2020-12-15 11:22:17 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
David Ward
2020-06-25 15:54:24 UTC
* `sssd-1-16` * 2c7c69485883ab0f408cf2f8cdcc4617462d68ec - failover: add dns_resolver_server_timeout option * 9b1f6f88b1245bc8737a4b2616299a8427d822e0 - man: fix description of dns_resolver_timeout * f768f73903bf0c3d0d1b4361ebff6b111a918470 - man: fix description of dns_resolver_op_timeout (In reply to David Ward from comment #0) > > The following upstream commits from the sssd-1-16 branch [2], which are very > simple, can be added directly to the RHEL 7 sssd package to resolve this: > - c317452 (config: add dns_resolver_op_timeout to option list) Can't find this ^^ hash. Did you mean https://github.com/SSSD/sssd/commit/7cb6a9f70dd65c88e0f9be97c79702566ab37afb ? This one was included with bz 1719718 > - f768f73 (man: fix description of dns_resolver_op_timeout) > - 9b1f6f8 (man: fix description of dns_resolver_timeout) > - 2c7c694 (failover: add dns_resolver_server_timeout option) (In reply to Alexey Tikhonov from comment #2) > > - c317452 (config: add dns_resolver_op_timeout to option list) > > Can't find this ^^ hash. > Did you mean > https://github.com/SSSD/sssd/commit/7cb6a9f70dd65c88e0f9be97c79702566ab37afb > ? This one was included with bz 1719718 Yes, my apologies. Please disregard that commit since it's covered in RHBZ#1719718. Hi David, could you please open a customer case to back up this request? This would increase approval chances. Verified against sssd-ad-1.16.5-10.el7_9.6.x86_64 [root@rhel79 x86_64]# cat /etc/sssd/sssd.conf [sssd] domains = example.com config_file_version = 2 services = nss, pam [domain/example.com] debug_level = 9 ad_domain = example.com krb5_realm = EXAMPLE.COM realmd_tags = manages-system joined-with-samba cache_credentials = True id_provider = ad krb5_store_password_if_offline = True default_shell = /bin/bash ldap_id_mapping = True use_fully_qualified_names = True fallback_homedir = /home/%u@%d access_provider = ad dns_resolver_server_timeout = 300 dns_resolver_op_timeout = 5 The sssd-ad-1.16.5-10.el7_9.5.x86_64 complained about the new parameters (2020-12-07 9:03:50:673718): [sssd] [sss_ini_call_validators] (0x0020): [rule/allowed_domain_options]: Attribute 'dns_resolver_server_timeout' is not allowed in section 'domain/example.com'. Check for typos. ad1 is offline, logs containing dns lookups. (2020-12-07 10:14:55): [be[example.com]] [sdap_connect_host_resolv_done] (0x0400): Connecting to ldap://ad2.example.com:389 (2020-12-07 10:14:55): [be[example.com]] [sssd_async_socket_init_send] (0x4000): Using file descriptor [25] for the connection. (2020-12-07 10:14:55): [be[example.com]] [sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting (2020-12-07 10:14:55): [be[example.com]] [sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to [ldap://ad2.example.com:389/??base] with fd [25]. (2020-12-07 10:14:55): [be[example.com]] [sdap_connect_host_done] (0x0400): Successful connection to ldap://ad2.example.com:389 (2020-12-07 10:14:55): [be[example.com]] [sdap_print_server] (0x2000): Searching 192.168.75.5:389 (2020-12-07 10:14:55): [be[example.com]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(DnsDomain=example.com)(NtVer=\14\00\00\00))][]. (2020-12-07 10:14:55): [be[example.com]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [netlogon] (2020-12-07 10:14:55): [be[example.com]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 1 (2020-12-07 10:14:55): [be[example.com]] [sdap_op_add] (0x2000): New operation 1 timeout 6 (2020-12-07 10:14:55): [be[example.com]] [sdap_process_result] (0x2000): Trace: sh[0x55f2720bc9c0], connected[1], ops[0x55f2720b8f90], ldap[0x55f2720c02e0] (2020-12-07 10:14:55): [be[example.com]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_ENTRY] (2020-12-07 10:14:55): [be[example.com]] [sdap_parse_entry] (0x1000): OriginalDN: []. (2020-12-07 10:14:55): [be[example.com]] [sdap_parse_range] (0x2000): No sub-attributes for [netlogon] (2020-12-07 10:14:55): [be[example.com]] [sdap_process_result] (0x2000): Trace: sh[0x55f2720bc9c0], connected[1], ops[0x55f2720b8f90], ldap[0x55f2720c02e0] (2020-12-07 10:14:55): [be[example.com]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT] (2020-12-07 10:14:55): [be[example.com]] [sdap_get_generic_op_finished] (0x0400): Search result: Success(0), no errmsg set (2020-12-07 10:14:55): [be[example.com]] [sdap_op_destructor] (0x2000): Operation 1 finished (2020-12-07 10:14:55): [be[example.com]] [sdap_handle_release] (0x2000): Trace: sh[0x55f2720bc9c0], connected[1], ops[(nil)], ldap[0x55f2720c02e0], destructor_lock[0], release_memory[0] (2020-12-07 10:14:55): [be[example.com]] [remove_connection_callback] (0x4000): Successfully removed connection callback. (2020-12-07 10:14:55): [be[example.com]] [ad_get_client_site_done] (0x0400): Found site: Default-First-Site-Name (2020-12-07 10:14:55): [be[example.com]] [ad_get_client_site_done] (0x0400): Found forest: example.com (2020-12-07 10:14:55): [be[example.com]] [ad_srv_plugin_site_done] (0x0400): About to discover primary and backup servers (2020-12-07 10:14:55): [be[example.com]] [fo_discover_servers_send] (0x0400): Looking up primary servers (2020-12-07 10:14:55): [be[example.com]] [resolv_discover_srv_next_domain] (0x0400): SRV resolution of service 'ldap'. Will use DNS discovery domain 'Default-First-Site-Name._sites.example.com' (2020-12-07 10:14:55): [be[example.com]] [resolv_getsrv_send] (0x0100): Trying to resolve SRV record of '_ldap._tcp.Default-First-Site-Name._sites.example.com' (2020-12-07 10:14:55): [be[example.com]] [schedule_request_timeout] (0x2000): Scheduling a timeout of 5 seconds (2020-12-07 10:14:55): [be[example.com]] [schedule_timeout_watcher] (0x2000): Scheduling DNS timeout watcher (2020-12-07 10:14:56): [be[example.com]] [check_fd_timeouts] (0x4000): Checking for DNS timeouts (2020-12-07 10:14:56): [be[example.com]] [unschedule_timeout_watcher] (0x4000): Unscheduling DNS timeout watcher (2020-12-07 10:14:56): [be[example.com]] [resolv_getsrv_done] (0x1000): Using TTL [600] (2020-12-07 10:14:56): [be[example.com]] [request_watch_destructor] (0x0400): Deleting request watch (2020-12-07 10:14:56): [be[example.com]] [fo_discover_srv_done] (0x0400): Got answer. Processing... (2020-12-07 10:14:56): [be[example.com]] [fo_discover_srv_done] (0x0400): Got 3 servers (2020-12-07 10:14:56): [be[example.com]] [fo_discover_servers_primary_done] (0x0400): Looking up backup servers (2020-12-07 10:14:56): [be[example.com]] [resolv_discover_srv_next_domain] (0x0400): SRV resolution of service 'ldap'. Will use DNS discovery domain 'example.com' (2020-12-07 10:14:56): [be[example.com]] [resolv_getsrv_send] (0x0100): Trying to resolve SRV record of '_ldap._tcp.example.com' (2020-12-07 10:14:56): [be[example.com]] [schedule_request_timeout] (0x2000): Scheduling a timeout of 5 seconds (2020-12-07 10:14:56): [be[example.com]] [schedule_timeout_watcher] (0x2000): Scheduling DNS timeout watcher (2020-12-07 10:14:57): [be[example.com]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached (2020-12-07 10:14:57): [be[example.com]] [request_watch_destructor] (0x0400): Deleting request watch (2020-12-07 10:14:57): [be[example.com]] [be_resolve_server_done] (0x1000): Server resolution failed: [14]: Bad address (2020-12-07 10:14:57): [be[example.com]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error]) (2020-12-07 10:14:57): [be[example.com]] [be_mark_offline] (0x2000): Going offline! (2020-12-07 10:14:57): [be[example.com]] [be_mark_offline] (0x2000): Initialize check_if_online_ptask. (2020-12-07 10:14:57): [be[example.com]] [be_ptask_create] (0x0400): Periodic task [Check if online (periodic)] was created (2020-12-07 10:14:57): [be[example.com]] [be_ptask_schedule] (0x0400): Task [Check if online (periodic)]: scheduling task 69 seconds from now [1607354166] (2020-12-07 10:14:57): [be[example.com]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks. (2020-12-07 10:14:57): [be[example.com]] [sdap_id_op_connect_done] (0x4000): notify offline to op #1 (2020-12-07 10:14:57): [be[example.com]] [ad_subdomains_refresh_connect_done] (0x0020): Unable to connect to LDAP [11]: Resource temporarily unavailable (2020-12-07 10:14:57): [be[example.com]] [ad_subdomains_refresh_connect_done] (0x0080): No AD server is available, cannot get the subdomain list while offline (2020-12-07 10:14:57): [be[example.com]] [be_ptask_done] (0x0040): Task [Subdomains Refresh]: failed with [1432158212]: SSSD is offline (2020-12-07 10:14:57): [be[example.com]] [be_ptask_schedule] (0x0400): Task [Subdomains Refresh]: scheduling task 14400 seconds from now [1607368497] (2020-12-07 10:14:57): [be[example.com]] [sdap_id_op_connect_done] (0x4000): notify offline to op #2 (2020-12-07 10:14:57): [be[example.com]] [ad_subdomains_refresh_connect_done] (0x0020): Unable to connect to LDAP [11]: Resource temporarily unavailable (2020-12-07 10:14:57): [be[example.com]] [ad_subdomains_refresh_connect_done] (0x0080): No AD server is available, cannot get the subdomain list while offline (2020-12-07 10:14:57): [be[example.com]] [dp_req_done] (0x0400): DP Request [Subdomains #0]: Request handler finished [0]: Success (2020-12-07 10:14:57): [be[example.com]] [_dp_req_recv] (0x0400): DP Request [Subdomains #0]: Receiving request data. (2020-12-07 10:14:57): [be[example.com]] [dp_req_reply_list_success] (0x0400): DP Request [Subdomains #0]: Finished. Success. (2020-12-07 10:14:57): [be[example.com]] [dp_req_reply_std] (0x1000): DP Request [Subdomains #0]: Returning [Provider is Offline]: 1,1432158212,Offline (2020-12-07 10:14:57): [be[example.com]] [dp_req_reply_std] (0x1000): DP Request [Subdomains #0]: Returning [Provider is Offline]: 1,1432158212,Offline (2020-12-07 10:14:57): [be[example.com]] [dp_table_value_destructor] (0x0400): Removing [8:7:0000:<ALL>] from reply table (2020-12-07 10:14:57): [be[example.com]] [dp_req_destructor] (0x0400): DP Request [Subdomains #0]: Request removed. (2020-12-07 10:14:57): [be[example.com]] [dp_req_destructor] (0x0400): Number of active DP request: 0 (2020-12-07 10:14:57): [be[example.com]] [sdap_id_release_conn_data] (0x4000): releasing unused connection Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (sssd bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5459 |