Bug 804783 - [abrt] Segfault during LDAP 'services' lookup
Summary: [abrt] Segfault during LDAP 'services' lookup
Status: CLOSED DUPLICATE of bug 805566
Alias: None
Product: Fedora
Classification: Fedora
Component: sssd
Version: 16
Hardware: x86_64
OS: Unspecified
Target Milestone: ---
Assignee: Stephen Gallagher
QA Contact: Fedora Extras Quality Assurance
Whiteboard: abrt_hash:d719b9efcbdd52df88a87eb9a8b...
Depends On:
Reported: 2012-03-19 18:47 UTC by James Cape
Modified: 2018-09-19 23:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2012-03-30 15:22:01 UTC
Type: ---

Attachments
File: dso_list (6.02 KB, text/plain)
2012-03-19 18:47 UTC, James Cape
File: maps (27.49 KB, text/plain)
2012-03-19 18:47 UTC, James Cape
File: backtrace (24.39 KB, text/plain)
2012-03-19 18:47 UTC, James Cape

Description James Cape 2012-03-19 18:47:26 UTC
libreport version: 2.0.8
abrt_version:   2.0.7
backtrace_rating: 4
cmdline:        /usr/libexec/sssd/sssd_be --domain eladian.com --debug-to-files
crash_function: __strlen_sse2_pminub
executable:     /usr/libexec/sssd/sssd_be
kernel:         3.2.7-1.fc16.x86_64
pid:            25436
pwd:            /
reason:         Process /usr/libexec/sssd/sssd_be was killed by signal 11 (SIGSEGV)
time:           Mon 19 Mar 2012 01:09:50 PM EDT
uid:            0
username:       root

backtrace:      Text file, 24973 bytes
dso_list:       Text file, 6167 bytes
maps:           Text file, 28146 bytes



:Mar 19 13:09:28 jamescape kernel: [1621561.543262] sssd_be[25280]: segfault at 70 ip 00007fa86ffeb8e5 sp 00007fffb8dff1b8 error 4 in libc-2.14.90.so[7fa86fe90000+1ad000]
:Mar 19 13:09:29 jamescape abrt[25383]: Saved core dump of pid 25280 (/usr/libexec/sssd/sssd_be) to /var/spool/abrt/ccpp-2012-03-19-13:09:28-25280 (21028864 bytes)
:Mar 19 13:09:40 jamescape kernel: [1621573.102579] sssd_be[25390]: segfault at 70 ip 00007f260775a8e5 sp 00007fff433cd398 error 4 in libc-2.14.90.so[7f26075ff000+1ad000]
:Mar 19 13:09:40 jamescape abrt[25435]: Not dumping repeating crash in '/usr/libexec/sssd/sssd_be'
:Mar 19 13:09:50 jamescape kernel: [1621583.407887] sssd_be[25436]: segfault at 70 ip 00007f95e65d48e5 sp 00007fff304546d8 error 4 in libc-2.14.90.so[7f95e6479000+1ad000]
:Mar 19 13:09:51 jamescape abrt[25548]: Saved core dump of pid 25436 (/usr/libexec/sssd/sssd_be) to /var/spool/abrt/ccpp-2012-03-19-13:09:50-25436 (21811200 bytes)
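
Editorial note on the kernel lines above: the three instruction pointers differ, but each crash maps libc at a different base address, so the raw values are not directly comparable. Subtracting the mapping base from the instruction pointer gives an offset inside libc that can be compared across crashes. The following is a minimal sketch, assuming the usual x86_64 "segfault at ... ip ... in lib[base+size]" layout seen in this report; the function and variable names are illustrative only.

```python
import re

# Kernel log lines from the report, trimmed to the part after the timestamp.
LOG_LINES = [
    "sssd_be[25280]: segfault at 70 ip 00007fa86ffeb8e5 sp 00007fffb8dff1b8 "
    "error 4 in libc-2.14.90.so[7fa86fe90000+1ad000]",
    "sssd_be[25390]: segfault at 70 ip 00007f260775a8e5 sp 00007fff433cd398 "
    "error 4 in libc-2.14.90.so[7f26075ff000+1ad000]",
    "sssd_be[25436]: segfault at 70 ip 00007f95e65d48e5 sp 00007fff304546d8 "
    "error 4 in libc-2.14.90.so[7f95e6479000+1ad000]",
]

# Pattern for the segfault message format used in the lines above.
SEGFAULT_RE = re.compile(
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return the key fields of one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),  # address the process tried to access
        "lib": m["lib"],
        "offset": ip - base,               # position of the faulting code in the library
    }


if __name__ == "__main__":
    for line in LOG_LINES:
        info = parse_segfault(line)
        print(info["pid"], info["lib"], hex(info["offset"]), hex(info["fault_addr"]))
```

All three lines yield the same libc offset (0x15b8e5) and the same faulting address (0x70), so the crashes are very likely the same fault. An access at such a low address is typical of reading through a NULL or near-NULL pointer, which would fit the reported crash function __strlen_sse2_pminub, though only the attached backtrace can confirm that.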

Comment 1 James Cape 2012-03-19 18:47:30 UTC
Created attachment 571207 [details]
File: dso_list

Comment 2 James Cape 2012-03-19 18:47:32 UTC
Created attachment 571208 [details]
File: maps

Comment 3 James Cape 2012-03-19 18:47:33 UTC
Created attachment 571209 [details]
File: backtrace

Comment 4 Stephen Gallagher 2012-03-19 20:23:57 UTC
Thanks for the bug report. Can you reproduce the issue? It looks from the backtrace that it's occurring while processing lookups for the NSS 'services' map, so probably it's being invoked implicitly by another process.

If possible, could you add "debug_level = 6" to the [domain/DOMAINNAME] section of /etc/sssd/sssd.conf (substituting DOMAINNAME as appropriate), restart SSSD with 'systemctl restart sssd.service', reproduce the issue, and then attach /var/log/sssd/sssd_DOMAINNAME.log to this BZ (either sanitized or as a private attachment)?

That would help us track this down faster.
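
Editorial note: the configuration change requested above can also be scripted. The following is a minimal sketch, assuming sssd.conf is plain INI that Python's configparser can read; the sample configuration and the helper name set_domain_debug_level are hypothetical and not taken from the reporter's system. configparser drops comments when writing, so hand-editing is preferable for a commented file.

```python
import configparser
import io

# Hypothetical minimal sssd.conf used only for illustration.
SAMPLE_CONF = """\
[sssd]
services = nss, pam
domains = example.com

[domain/example.com]
id_provider = ldap
"""


def set_domain_debug_level(conf_text, domain, level=6):
    """Return conf_text with debug_level set in the [domain/<domain>] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)

    section = f"domain/{domain}"
    if not parser.has_section(section):
        raise KeyError(f"no [{section}] section in configuration")

    parser.set(section, "debug_level", str(level))

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(set_domain_debug_level(SAMPLE_CONF, "example.com", 6))
```

After writing the result back to /etc/sssd/sssd.conf, SSSD still needs the restart mentioned above for the new level to take effect.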

Comment 5 James Cape 2012-03-20 17:20:01 UTC
Yes, this crash has taken out three of the four workstations our users have installed it on so far. We've reverted to 1.6.4 and are now pinning the package so users don't accidentally update to 1.8.1.

Yesterday we were able to help it survive a bit longer if we put debug_level=0x7777 and timeout=1 in the domain section in the configs, but that didn't work on the workstation which was updated today.

As these are people's actual workstations it's more important that they're working right now, but I'll attempt to re-update/reproduce in a minute.

Comment 6 James Cape 2012-03-20 17:32:03 UTC
On my own system, I can reproduce this bug (sssd_be is restarted several times, then fails to come back on the last attempt, typically within 60 seconds) without debugging. With debug_level = 6, it's been fairly solid.

Comment 7 Stephen Gallagher 2012-03-20 17:33:45 UTC
For the machines that are failing, you can work around the problem by removing the 'sss' from the 'services:' line of /etc/nsswitch.conf. That should allow them to continue running SSSD 1.8.1 without hitting this issue.
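
Editorial note: the workaround above amounts to a one-line edit of /etc/nsswitch.conf. The following is a minimal sketch of that edit as a text transformation; the sample file contents and the helper name drop_sss_from_services are hypothetical. It assumes the 'services:' line lists plain source names only and does not handle bracketed action items such as [NOTFOUND=return] or trailing comments.

```python
# Hypothetical nsswitch.conf excerpt used only for illustration.
SAMPLE_NSSWITCH = """\
passwd:     files sss
group:      files sss
services:   files sss
"""


def drop_sss_from_services(nsswitch_text):
    """Return nsswitch_text with the 'sss' source removed from the services line."""
    result = []
    for line in nsswitch_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("services:"):
            sources = stripped[len("services:"):].split()
            kept = [src for src in sources if src != "sss"]
            line = "services:   " + " ".join(kept)
        result.append(line)
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    print(drop_sss_from_services(SAMPLE_NSSWITCH), end="")
```

Only the services database is touched; passwd and group lookups continue to go through SSSD.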

Comment 8 Stephen Gallagher 2012-03-20 17:35:54 UTC
(In reply to comment #6)
> On my own system, I can reproduce this bug (sssd_be is restarted several times,
> then fails to come back on the last attempt, typically within 60 seconds)

Yeah, if we detect the backend crashing that often, we stop trying to restart it. Strange behavior though. I'm guessing you must have some service in your environment that's querying the NSS 'services' map constantly.
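
Editorial note: the "stop restarting after repeated crashes" behavior described above is a common supervisor pattern. The following is a generic sketch of such a limit, not SSSD's actual monitor code; the class name RestartLimiter and the thresholds (max_restarts, window_seconds) are invented for illustration.

```python
from collections import deque


class RestartLimiter:
    """Allow a restart only if recent crashes stay within a limit."""

    def __init__(self, max_restarts=2, window_seconds=60.0):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._crashes = deque()  # timestamps of recent crashes, oldest first

    def should_restart(self, now):
        """Record a crash at time `now` (seconds) and say whether to restart."""
        # Forget crashes that fall outside the sliding window.
        while self._crashes and now - self._crashes[0] > self.window_seconds:
            self._crashes.popleft()
        self._crashes.append(now)
        return len(self._crashes) <= self.max_restarts


if __name__ == "__main__":
    limiter = RestartLimiter(max_restarts=2, window_seconds=60.0)
    # Crash times in seconds relative to the first crash (illustrative values).
    for t in (0.0, 12.0, 22.0):
        print(t, limiter.should_restart(t))
```

With these invented thresholds the first two crashes trigger a restart and the third does not, which mirrors the reported pattern of several restarts followed by the backend staying down.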

> without debugging. With debug_level = 6, it's been fairly solid.

Could you examine those logs and see if you see a lot of repeated requests for the same information (specifically service information)? It's possible we have a race condition that the debug logs are hiding.

Comment 9 James Cape 2012-03-20 18:14:59 UTC
(This just hit workstation #5)

None of the machines have sss in the services line, and the only repeated requests I see in the logs at debug_level = 6 are what looks like the initial user/group load.

Comment 10 Stephen Gallagher 2012-03-20 19:27:22 UTC
Ah, damn. I just noticed that this is happening during enumeration, not lookup. So we probably have an enumeration bug. I'll try to reproduce this. Sorry for the confusion.

I'll dive into this first thing tomorrow.

Comment 11 Fedora Update System 2012-03-21 11:45:10 UTC
sssd-1.8.1-8.fc16 has been submitted as an update for Fedora 16.

Comment 12 James Cape 2012-03-21 15:36:18 UTC
Still crashing; see bug 805566.

Comment 13 Fedora Update System 2012-03-22 01:56:07 UTC
Package sssd-1.8.1-8.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sssd-1.8.1-8.fc16'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).

Comment 14 abrt-bot 2012-03-30 15:22:01 UTC
Backtrace analysis found this bug to be similar to bug #805566, closing as duplicate.

This comment is automatically generated.

*** This bug has been marked as a duplicate of bug 805566 ***
