Bug 1365218 - SSSD does not fail over to next GC
Summary: SSSD does not fail over to next GC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: SSSD Maintainers
QA Contact: Steeve Goveas
URL:
Whiteboard:
Depends On:
Blocks: 1269194 1365846
Reported: 2016-08-08 16:17 UTC by German Parente
Modified: 2020-05-04 10:57 UTC (History)

Fixed In Version: sssd-1.13.3-32.el6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-21 09:57:13 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | SSSD sssd issues 4051 | 0 | None | None | None | 2020-05-04 10:57:29 UTC
Red Hat Product Errata | RHBA-2017:0632 | 0 | normal | SHIPPED_LIVE | sssd bug fix and enhancement update | 2017-03-21 12:30:13 UTC

Description German Parente 2016-08-08 16:17:13 UTC
Description of problem:

This is the RHEL 6 version of bz 1318996.

Please see the details there.

I am asking to have the fix backported to the RHEL 6 branch, since it is just a one-line commit plus two debug messages.


How reproducible: difficult to reproduce. The customer has reproduced it and applied a test rpm.


Additional info: the upstream bug is https://fedorahosted.org/sssd/ticket/3010

Comment 5 Jakub Hrozek 2016-09-22 08:04:14 UTC
sssd-1-13:
 * 6ff40ad8f0d604620210c9680ef8b1f9ed1e0417
 * 9218ad4a750b46c8fc89a3a30c9a8411e620141d

Comment 7 Dan Lavu 2017-01-12 18:53:12 UTC
This is failing against sssd-1.13.3-53.el6.x86_64

(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [ad_get_dc_servers_done] (0x0400): Found 2 domain controllers in domain sssdad2012r2.com
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [ad_srv_plugin_dcs_done] (0x0400): About to locate suitable site
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [sdap_connect_host_send] (0x0400): Resolving host bsod2-bdc.sssdad2012r2.com
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_is_address] (0x4000): [bsod2-bdc.sssdad2012r2.com] does not look like an IP address
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_step] (0x2000): Querying files
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_files_send] (0x0100): Trying to resolve A record of 'bsod2-bdc.sssdad2012r2.com' in files
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_step] (0x2000): Querying files
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_files_send] (0x0100): Trying to resolve AAAA record of 'bsod2-bdc.sssdad2012r2.com' in files
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_next] (0x0200): No more address families to retry
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_step] (0x2000): Querying DNS
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_dns_query] (0x0100): Trying to resolve A record of 'bsod2-bdc.sssdad2012r2.com' in DNS
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [schedule_request_timeout] (0x2000): Scheduling a timeout of 6 seconds
(Thu Jan 12 16:46:02 2017) [sssd[be[sssdad2012r2.com]]] [schedule_timeout_watcher] (0x2000): Scheduling DNS timeout watcher
(Thu Jan 12 16:46:03 2017) [sssd[be[sssdad2012r2.com]]] [check_fd_timeouts] (0x4000): Checking for DNS timeouts
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [check_fd_timeouts] (0x4000): Checking for DNS timeouts
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [unschedule_timeout_watcher] (0x4000): Unscheduling DNS timeout watcher
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_dns_parse] (0x1000): Parsing an A reply
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [request_watch_destructor] (0x0400): Deleting request watch
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [sdap_connect_host_resolv_done] (0x0400): Connecting to ldap://bsod2-bdc.sssdad2012r2.com:389
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [sss_ldap_init_send] (0x4000): Using file descriptor [23] for LDAP connection.
(Thu Jan 12 16:46:04 2017) [sssd[be[sssdad2012r2.com]]] [sss_ldap_init_send] (0x0400): Setting 6 seconds timeout for connecting

<---- SNIP ---->

(Thu Jan 12 16:46:16 2017) [sssd[be[sssdad2012r2.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: 14
(Thu Jan 12 16:46:16 2017) [sssd[be[sssdad2012r2.com]]] [check_online_callback] (0x0100): Backend returned: (1, 0, <NULL>) [Provider is Offline]
(Thu Jan 12 16:46:16 2017) [sssd[be[sssdad2012r2.com]]] [fo_reset_services] (0x1000): Resetting all servers in all services

Comment 10 Pavel Březina 2017-01-13 11:59:05 UTC
We reached a service resolution timeout. https://fedorahosted.org/sssd/ticket/3217

Try setting:
dns_resolver_timeout = 3
ldap_opt_timeout = 9

Comment 11 Dan Lavu 2017-01-13 13:21:53 UTC
Sadly, no luck. 

After blocking the primary DC with

iptables -A INPUT -s $PRIMARY_DC -j DROP ; iptables -A OUTPUT -s $PRIMARY_DC -j DROP

the same timeout occurs against the secondary DC.

sssd.conf
===============
[sssd]
config_file_version = 2
services = nss, pam
domains = sssdad2012r2.com

[nss]
default_shell = /bin/bash

[domain/sssdad2012r2.com]
debug_level = 0xFFF0
ad_enable_gc = true
id_provider = ad
cache_credentials = True
krb5_store_password_if_offline = True
use_fully_qualified_names = True
fallback_homedir = /home/%d/%u
dns_resolver_timeout = 3
ldap_opt_timeout = 9


logs
===============
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [ad_get_dc_servers_done] (0x0400): Found 2 domain controllers in domain sssdad2012r2.com
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [ad_srv_plugin_dcs_done] (0x0400): About to locate suitable site
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [sdap_connect_host_send] (0x0400): Resolving host bsod2.sssdad2012r2.com
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_is_address] (0x4000): [bsod2.sssdad2012r2.com] does not look like an IP address
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_step] (0x2000): Querying files
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_files_send] (0x0100): Trying to resolve A record of 'bsod2.sssdad2012r2.com' in files
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_step] (0x2000): Querying files
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_files_send] (0x0100): Trying to resolve AAAA record of 'bsod2.sssdad2012r2.com' in files
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_next] (0x0200): No more address families to retry
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_step] (0x2000): Querying DNS
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [resolv_gethostbyname_dns_query] (0x0100): Trying to resolve A record of 'bsod2.sssdad2012r2.com' in DNS
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [schedule_request_timeout] (0x2000): Scheduling a timeout of 6 seconds
(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] [schedule_timeout_watcher] (0x2000): Scheduling DNS timeout watcher
(Fri Jan 13 11:18:15 2017) [sssd[be[sssdad2012r2.com]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Fri Jan 13 11:18:15 2017) [sssd[be[sssdad2012r2.com]]] [request_watch_destructor] (0x0400): Deleting request watch
(Fri Jan 13 11:18:15 2017) [sssd[be[sssdad2012r2.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: 14
(Fri Jan 13 11:18:15 2017) [sssd[be[sssdad2012r2.com]]] [check_online_callback] (0x0100): Backend returned: (1, 0, <NULL>) [Provider is Offline]
(Fri Jan 13 11:18:15 2017) [sssd[be[sssdad2012r2.com]]] [check_fd_timeouts] (0x4000): Checking for DNS timeouts
(Fri Jan 13 11:18:16 2017) [sssd[be[sssdad2012r2.com]]] [check_fd_timeouts] (0x4000): Checking for DNS timeouts
(Fri Jan 13 11:18:16 2017) [sssd[be[sssdad2012r2.com]]] [unschedule_timeout_watcher] (0x4000): Unscheduling DNS timeout watcher
(Fri Jan 13 11:18:18 2017) [sssd[be[sssdad2012r2.com]]] [be_ptask_execute] (0x0400): Back end is offline
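
For anyone reading logs like the ones above, a small script can make the timing easier to see. The following is only a minimal sketch in Python, not an SSSD tool: the regular expression is inferred from the line layout quoted in this bug, and the helper names (parse_ts, parse_line, seconds_between) are hypothetical. It measures how long after the host resolution started the "Service resolving timeout reached" message was logged.

```python
import re
from datetime import datetime

# Layout of the debug lines quoted in this bug (inferred, not an official spec):
#   (timestamp) [process] [function] (level): message
LINE_RE = re.compile(
    r"^\((?P<ts>[^)]+)\) \[(?P<proc>.+?)\] \[(?P<func>\w+)\] "
    r"\((?P<level>0x[0-9a-fA-F]+)\): (?P<msg>.*)$"
)

MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_ts(text):
    """Parse 'Fri Jan 13 11:18:14 2017' without relying on the locale."""
    _weekday, month, day, clock, year = text.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "time": parse_ts(match.group("ts")),
        "func": match.group("func"),
        "level": int(match.group("level"), 16),
        "msg": match.group("msg"),
    }


def seconds_between(lines, start_func, end_func):
    """Seconds from the first start_func line to the next end_func line."""
    start = None
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if start is None:
            if record["func"] == start_func:
                start = record["time"]
        elif record["func"] == end_func:
            return (record["time"] - start).total_seconds()
    return None


if __name__ == "__main__":
    # Two lines copied from the log in this comment.
    sample = [
        "(Fri Jan 13 11:18:14 2017) [sssd[be[sssdad2012r2.com]]] "
        "[sdap_connect_host_send] (0x0400): Resolving host bsod2.sssdad2012r2.com",
        "(Fri Jan 13 11:18:15 2017) [sssd[be[sssdad2012r2.com]]] "
        "[fo_resolve_service_timeout] (0x0080): Service resolving timeout reached",
    ]
    print(seconds_between(sample, "sdap_connect_host_send",
                          "fo_resolve_service_timeout"))  # prints 1.0
```

Run against the full log, the same call shows the timeout firing about one second after resolution of the second host began. The log timestamps only have one-second resolution, so the result is approximate.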

Comment 12 Lukas Slebodnik 2017-01-13 15:09:00 UTC
(In reply to Dan Lavu from comment #11)
> Sadly, no luck. 
> 
> After issuing 
> 
> iptables -A INPUT -s $PRIMARY_DC -j DROP ; iptables -A OUTPUT -s $PRIMARY_DC
> -j DROP
> 
> causes the timeout issue to the secondary_dc
> 
Do you hit a timeout issue with REJECT instead of DROP?

Comment 13 Dan Lavu 2017-01-13 17:52:11 UTC
Yes, there is no difference between REJECT/DROP.

Comment 22 Dan Lavu 2017-02-16 16:10:53 UTC
This is now verified against sssd-1.13.3-56.el6.x86_64

This was a client misconfiguration: only the primary name server was listed in /etc/resolv.conf, so the client was unable to look up the IP address of the secondary GC when attempting to fail over.
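
The misconfiguration described above can be spotted before a failover test. This is a minimal sketch in Python, assuming the usual "nameserver ADDRESS" line format of /etc/resolv.conf. The function names are hypothetical and the addresses in the example are documentation placeholders, not the ones from this test setup. It only counts distinct nameserver entries; it does not check that any of them can actually resolve the secondary GC.

```python
def nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Lines starting with '#' or ';' are comments in resolv.conf.
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def has_fallback_resolver(resolv_conf_text):
    """True if at least two distinct name servers are configured."""
    return len(set(nameservers(resolv_conf_text))) >= 2


if __name__ == "__main__":
    # Hypothetical file contents: a single resolver, as in the failing setup.
    single = "search example.com\nnameserver 192.0.2.10\n"
    # Hypothetical fixed contents: a second resolver has been added.
    fixed = single + "nameserver 192.0.2.11\n"
    print(has_fallback_resolver(single))  # prints False
    print(has_fallback_resolver(fixed))   # prints True
```

To check a real client, pass the contents of /etc/resolv.conf to has_fallback_resolver instead of the sample strings.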

Comment 24 errata-xmlrpc 2017-03-21 09:57:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0632.html

