Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/sssd/ticket/2904
When using sssd on CentOS 6.6 with the AD backend against a Samba 4.2 AD domain, sssd does not handle a rare failure condition; when the SRV records point at a DC but the A record for that domain controller is missing. sssd_be periodically crashes, it restarts a couple times but generally does not recover:
Dec 16 09:44:05 host kernel: sssd_be[107682]: segfault at 0 ip 00007fd12c5e018b sp 00007fffba8db420 error 4 in libsss_ad.so[7fd12c5ca000+20000]
Dec 16 09:44:05 host abrtd: Directory 'ccpp-2015-12-16-09:44:05-107682' creation detected
Dec 16 09:44:05 host abrt[107687]: Saved core dump of pid 107682 (/usr/libexec/sssd/sssd_be) to /var/spool/abrt/ccpp-2015-12-16-09:44:05-107682 (1978368 bytes)
The SIGSEGV appears to happen in sss_ldap_init_send(), src/util/sss_ldap.c:331.
Getting into this condition is rare - it's a Samba bug that I'm working on separately. The situation could probably be replicated by poisoning DNS though. My expected behavior would be to give up on this DC, try any other DCs in the Site, then try other DCs in other Sites.
I have ABRT crashes and cores / backtraces from GDB.
Verified against sssd-client-1.14.0-42.el7.x86_64 , though it's sanity *only* since we are testing against MS AD instead of Samba AD.
Removed A records for NS servers in zone file.
[root@dell-per230-02 ~]# nslookup bsod2.sssdad2012r2.com
Server: 10.12.0.159
Address: 10.12.0.159#53
*** Can't find bsod2.sssdad2012r2.com: No answer
[root@dell-per230-02 ~]# rm -rf /var/lib/sss/db/*
[root@dell-per230-02 ~]# service sssd restart
[root@dell-per230-02 ~]# id Administrator
uid=1196000500(administrator) gid=1196000513(domain users) groups=1196000513(domain users),1196000519(enterprise admins),1196000518(schema admins),1196000512(domain admins),1196000520(group policy creator owners),1196000572(denied rodc password replication group)
Ran for a few hours and SSSD did not crash.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHEA-2016-2476.html