Bug 872683

Summary: sssd_be segfaults with enumeration enabled and anonymous LDAP access disabled
Product: Red Hat Enterprise Linux 6 Reporter: Aron Parsons <parsonsa>
Component: sssd Assignee: Jakub Hrozek <jhrozek>
Status: CLOSED ERRATA QA Contact: Kaushik Banerjee <kbanerje>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.3 CC: grajaiya, jgalipea, pbrezina
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sssd-1.9.2-8.el6 Doc Type: Bug Fix
Doc Text:
Cause: When anonymous bind was disabled and enumeration was enabled, SSSD accessed an invalid array element during enumeration because the array was not NULL-terminated. Consequence: The sssd_be process crashed. Fix: The array is now NULL-terminated. Result: The sssd_be process no longer crashes during enumeration when anonymous bind is disabled.
Last Closed: 2013-02-21 09:39:37 UTC Type: Bug
Attachments:
Description         Flags
core dump           none
sssd.conf           none
updated sssd.conf   none
backtrace           none
sssd log            none

Description Aron Parsons 2012-11-02 17:55:45 UTC
Created attachment 637183
core dump

Description of problem:
sssd_be[5730]: segfault at 81 ip 00007f88567a7b51 sp 00007fff607fa508 error 4 in libc-2.12.so[7f8856727000+189000]

sssd_be segfaults under a combination of non-default settings (all of which are sane and justified).  Most of the time it respawns and operation continues, but sometimes it does not respawn, which breaks sssd completely.

- anonymous access disabled on IPA servers
- _srv_ records not used on client
- enumeration enabled in sssd
- debug_level set to 0x0150 to log all failures

Version-Release number of selected component (if applicable):
sssd-1.8.0-32.el6.x86_64

How reproducible:
always

Steps to Reproduce:
server side:
- 2-node IPA install (vanilla, all defaults)
  - ipa-server-install ...
  - ipa-replica-prepare ...
  - disable anonymous access on both servers (ipactl stop; sed -i '/nsslapd-allow-anonymous-access/s/: on/: off/g' /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif; ipactl start)

client side:
- join IPA domain
- enable enumeration (enumerate = True in [domain] section)
- disable _srv_ records in ipa_server line
- set debug_level = 0x0150 in [domain] section
- clear caches and restart sssd to force enumeration (service sssd stop; rm -rf /var/lib/sss/db/*; service sssd start)
  
Actual results:
sssd_be segfaults multiple times and sometimes does not respawn, breaking sssd completely

Expected results:
sssd_be does not segfault

Additional info:
Based on additional debugging in the original environment (not this reproducer) and on the sssd logs, this appears to be a race condition between the failing anonymous access to the rootDSE and the attempt to determine whether LDAP referrals are enabled.  Setting 'ldap_referrals = False' in sssd.conf prevents the segfault.

Comment 1 Aron Parsons 2012-11-02 17:57:09 UTC
Created attachment 637184
sssd.conf

Comment 2 Aron Parsons 2012-11-02 17:58:12 UTC
Created attachment 637186
updated sssd.conf

Uncomment the correct debug_level line to reproduce accurately.

Comment 4 Jakub Hrozek 2012-11-04 22:47:07 UTC
Hello Aron.

Thank you for submitting the bug report. You mentioned you had logs available. Would you mind attaching them? Feel free to sanitize sensitive data before attaching the logs.

Unfortunately, the core file doesn't load correctly for me. Is it possible to also attach a backtrace?

You'd need to install the sssd-debuginfo package, then load the core dump in gdb:
# gdb -c coredump /usr/libexec/sssd/sssd_be

and then type bt full to get the backtrace.

Thank you!

Comment 5 Jakub Hrozek 2012-11-05 10:00:12 UTC
Pavel, in the meantime, can you try and reproduce the bug while I'm travelling?

Comment 6 Aron Parsons 2012-11-05 16:21:25 UTC
Jakub,
I'll get the backtrace and logs over to you tonight; I don't have my system with the reproducer VMs with me right now.  I easily reproduced in a test environment after noticing the segfaults in production, so hopefully you guys can as well.

Comment 7 Aron Parsons 2012-11-05 23:07:11 UTC
Created attachment 638953
backtrace

Comment 8 Aron Parsons 2012-11-05 23:07:25 UTC
Created attachment 638954
sssd log

Comment 9 Jakub Hrozek 2012-11-06 23:31:42 UTC
I was able to reproduce with 6.3. This valgrind output pretty much matches Aron's backtrace:

==3181== Process terminating with default action of signal 11 (SIGSEGV)
==3181==  Access not within mapped region at address 0x110
==3181==    at 0x851C389: vfprintf (vfprintf.c:1597)
==3181==    by 0x851DF1F: buffered_vfprintf (vfprintf.c:2262)
==3181==    by 0x8518F4D: vfprintf (vfprintf.c:1291)
==3181==    by 0x85D3E56: __vfprintf_chk (vfprintf_chk.c:35)
==3181==    by 0x449C94: debug_fn (stdio2.h:128)
==3181==    by 0xFE6A939: sdap_get_generic_ext_step (sdap_async.c:1163)
==3181==    by 0xFE6D190: sdap_get_generic_ext_send.clone.0 (sdap_async.c:1127)
==3181==    by 0xFE6D2F1: sdap_get_generic_send (sdap_async.c:1402)
==3181==    by 0xFE99DC5: sdap_get_services_next_base (sdap_async_services.c:143)
==3181==    by 0xFE9A0A3: sdap_get_services_send (sdap_async_services.c:112)
==3181==    by 0xFE9A410: enum_services_send (sdap_async_services.c:578)
==3181==    by 0xFE4BA07: ldap_id_enum_groups_done (ldap_id_enum.c:393)


So far I've been unable to reproduce with 6.4. I need to check which commit fixed the bug (if it was in fact fixed).

Comment 10 Jakub Hrozek 2012-11-06 23:42:39 UTC
OK, the attrs array was not properly NULL-terminated in 6.3, but it is in 6.4. 
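
To illustrate this class of bug, here is a minimal sketch in C. It is not taken from the SSSD source; the helper names build_attr_list and count_attrs and the attribute names are hypothetical. A consumer that walks an attribute list until it reaches NULL, as a debug-logging loop would, depends entirely on the producer having reserved and set the terminating slot. If the terminator is missing, the loop reads past the end of the array and can hand a garbage pointer to a printf-style function, which is consistent with the vfprintf frames in the trace from comment 9.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not SSSD code): copy `count` attribute names into a
 * newly allocated array that has one extra slot for the NULL sentinel.
 * Returns NULL on allocation failure; the caller frees the array. */
static const char **build_attr_list(const char *const *names, size_t count)
{
    const char **attrs = malloc((count + 1) * sizeof(*attrs));
    if (attrs == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        attrs[i] = names[i];
    }
    /* Explicit terminator: consumers stop only when they see NULL. */
    attrs[count] = NULL;
    return attrs;
}

/* Hypothetical consumer: counts entries by walking until the NULL sentinel,
 * the same way a loop that logs each attribute name would. */
static size_t count_attrs(const char *const *attrs)
{
    size_t n = 0;
    while (attrs[n] != NULL) {
        n++;
    }
    return n;
}
```

With three names passed in, count_attrs() on the returned array yields 3. Without the extra slot and the attrs[count] = NULL assignment, the loop would continue into whatever memory follows the array.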

Kaushik, would you like to keep this bug open for book-keeping purposes and include it in the errata? The fix is already in.

Comment 11 Kaushik Banerjee 2012-11-07 04:41:08 UTC
(In reply to comment #10)
> Kaushik, would you like to keep this bug open for book-keeping purposes and
> include it in the errata? The fix is already in.

Yes. I would like this bug to be included in the 6.4 errata.

Comment 12 Jakub Hrozek 2012-11-07 08:18:45 UTC
OK, then ack please :-)

The reproducer is fairly simple: just enable enumeration and sssd_be will eventually fail.

Comment 13 Jakub Hrozek 2012-11-11 21:50:46 UTC
The Fixed In Version field is probably not completely correct; I think the bug was simply fixed during 1.9 development. But sssd-1.9.2-8.el6 would do as well, I think.

Comment 15 Kaushik Banerjee 2013-01-02 09:13:10 UTC
Verified in version 1.9.2-41

Output from beaker automation run:
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: rfc2307_018 bz872683 disable anonymous bind on server and enumerate=true
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Stopping sssd: [  OK  ]
Starting sssd: [  OK  ]
[  OK  ]
:: [03:31:02] ::  Sleeping for 5 seconds
testuser1:*:11111:11111:Test user 1:/home/testuser1:
:: [   PASS   ] :: Running 'getent passwd | grep testuser1'
spawn ssh -q -l testuser1 localhost echo 'login successful'
testuser1@localhost's password: 
Could not chdir to home directory /home/testuser1: No such file or directory
login successful
:: [   PASS   ] :: Authentication successful, as expected
:: [   PASS   ] :: Running 'auth_success testuser1 Secret123'
:: [   PASS   ] :: File '/var/log/messages' should not contain 'segfault'
'47e19d4f-5358-4955-8f22-077ddebf93ef'
rfc2307-018-bz872683-disable-anonymous-bind-on-server-and-enumerate-true result: PASS

Comment 16 errata-xmlrpc 2013-02-21 09:39:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0508.html