Bug 804783

Summary: [abrt] Segfault during LDAP 'services' lookup
Product: [Fedora] Fedora
Reporter: James Cape <jamescape777>
Component: sssd
Assignee: Stephen Gallagher <sgallagh>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 16
CC: jhrozek, sbose, sgallagh, ssorce
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: abrt_hash:d719b9efcbdd52df88a87eb9a8be0d63eebf1c22
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-03-30 15:22:01 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description        Flags
File: dso_list     none
File: maps         none
File: backtrace    none

Description James Cape 2012-03-19 18:47:26 UTC
libreport version: 2.0.8
abrt_version:   2.0.7
backtrace_rating: 4
cmdline:        /usr/libexec/sssd/sssd_be --domain eladian.com --debug-to-files
crash_function: __strlen_sse2_pminub
executable:     /usr/libexec/sssd/sssd_be
kernel:         3.2.7-1.fc16.x86_64
pid:            25436
pwd:            /
reason:         Process /usr/libexec/sssd/sssd_be was killed by signal 11 (SIGSEGV)
time:           Mon 19 Mar 2012 01:09:50 PM EDT
uid:            0
username:       root

backtrace:      Text file, 24973 bytes
dso_list:       Text file, 6167 bytes
maps:           Text file, 28146 bytes

build_ids:
:28579c89bda7f08dace218884d7a5561f3f5f89d
:b06c064de933351790864300dddec868b2a6401e
:ccd23fb93a7924ddd27ea9e7f054fa5309c92f32
:5083c24b4fd7dbf3cac9cf4c15876b0dfdceb4bb
:bb2eda4e80e107a3874f5cf7612166a1d81cd08b
:34e8b3cee73a6d0b8ac17c69ba4b1170724db64c
:ef95ed2b74ff19753827117d4e556f38b33e74b5
:065f24e263b3a0d4316bfeedf855df8c2c35c03a
:59292332bd88fce31f7557e60ff08d44ce254ec8
:c49811439aeff6168ae82e7cf808a8c03b1c9e48
:b010321c92dc1b5abf0007da423b778542817883
:552354979eebf3d6c27346691a5b520ccf8f4e75
:0ce9819eb7014de7e72eec40979dd5e4d6566dca
:1507cec972ed5cc28e0afd7e646afe0133930db5
:b1532c8fcd1888fb7dc99186b3b8ec6875c72fa4
:9181b91e3dd3b2786bf09199d2f89ec2a27d3652
:77ca13b377dc2850e7bde1c91e160c0f19421fb1
:638db099ff5fb986d1a092629d1bb1dca5be4904
:e5429e0905bedecc534b057783916ba43e06a66e
:d0b5f1b1804ecdb5af0fe6d86e1117b7f377d47b
:2b09b8bb44cdbcce3b7e6a786adb0c0ffb4a27b9
:2639b5594fe4d7897db63ecd1ebb9681438317a8
:bb5c9b5cca6d04252b061a2f840ab4ddfda9e733
:83cb4627bed5c756d4e6f23badf8a6b4b042f4e6
:a0b2903547454e47fec356c1fa1a69a7593893c5
:c95e8094ddf00e9e03f98e4bd796b7c60a50d919
:e2e67b3ae2579e1667b73ad385f61552931024e9
:b9746e6e639b060dcd2809a4c46655ee77838bf1
:942f7573daf3ab10375f201214e8d466fccec6fb
:8ea3a1b50dfbd33351111dac86cb8d6b6a73976f
:3ed9e0f5bbcd8d9d5b77f7dd4b562ec83d7ea767
:ea44661a40777d6be2f79058c161d3bb4962891a
:3903219097bcd7fef6702a8fc24e50484083e23e
:c314e9cb57367719196b6f665fe6773a2d08add6
:4b4285058f7f6b39b7e7c45df83fd36c6833bfe4
:6502dd4813f98137c70e6e05cc43828f5c2263c1
:b52ecaab000fad35642f76aaf4396f4e7ae01c45
:dacf32a9a2f2b5077ba944e7b835cca6f637f78f
:186430a109712d3f99c968fdb8d897a7aca2eb77
:2ae4bcf5a249f300d9879500800c546ed525a6dc
:20efc30fde6abc23cfe693739cff23d67ee9d333
:53b01cfc73e8eba8c6af3feefd3f4378439692b9
:789c958162e98ef211c3bebd162ab587eb9ece93
:0a49b6fd90960659888f0be3bb576495f9d41959
:be507c791e34415e8f42f0e6030c889b2895cf9a
:0bd515c1c778bb4b8f870a900d1543da39d6b662
:14251252341032537663aeb61fc56edf2330e584
:51e3df2e35f101b59291fb3c73f3341bbfc055d3
:c2692dcb73a5877a23c9b3943bb4603ab19168de
:e2d68a0ee0872365dcc8c6640b7cd7b94c8276f2
:82df68f406427a9efe3b2e41cec4374c1faf8dab
:3691cff3e332d944d8ad00a816c50216dab79a34
:b9d3a4213a482d034bebc7dc1ed2901a734a894d
:e814ef8432b7dec42a0b8ec12b9abd9fc7f57b40
:a28b15796672928bb44b262c88052ae601bf5955
:8822ec50a7ca08984b34af0b3096794a5085c473
:caeeab6968f72c5f76b50b0d94ee93db61784288
:5ac56dbb5866c475e1b0c3e4ceeb390b0f7508c0
:1bf947579687b870375bdb51b4eef277f76f9280
:22dcc84f73e86def63c643428320fff70a44a679
:6cfd35e0ead3e8ae8b000a42efe467c28762a08a
:36fc045d11fc55f1b455247938e651666e2c127e
:5cb5f8da286abd58aeab3bd6676e661d52ec2b5d
:1a212c7f1515542b310ba92f6109efc9b5bf2b6e
:ee2f04900ae1f07517d91eba300ef385fccab1b8
:89017ab0b75e533e4294f74ee8f7db3daceea536
:b824b7802995d5bec090260e699812e417068227
:f1e8182609d9eae40a8ddbbb72c9346166014f1a
:96be920f9f506a5bdcb7c4d7a8565b4f0791fc61

environ:
:BOOT_IMAGE=/vmlinuz-3.2.7-1.fc16.x86_64
:PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
:SYSFONT=latarcyrheb-sun16
:LANG=en_US.UTF-8
:PWD=/
:KEYTABLE=us
:_SSS_LOOPS=NO
:KRB5RCACHEDIR=/var/cache/krb5rcache

var_log_messages:
:Mar 19 13:09:28 jamescape kernel: [1621561.543262] sssd_be[25280]: segfault at 70 ip 00007fa86ffeb8e5 sp 00007fffb8dff1b8 error 4 in libc-2.14.90.so[7fa86fe90000+1ad000]
:Mar 19 13:09:29 jamescape abrt[25383]: Saved core dump of pid 25280 (/usr/libexec/sssd/sssd_be) to /var/spool/abrt/ccpp-2012-03-19-13:09:28-25280 (21028864 bytes)
:Mar 19 13:09:40 jamescape kernel: [1621573.102579] sssd_be[25390]: segfault at 70 ip 00007f260775a8e5 sp 00007fff433cd398 error 4 in libc-2.14.90.so[7f26075ff000+1ad000]
:Mar 19 13:09:40 jamescape abrt[25435]: Not dumping repeating crash in '/usr/libexec/sssd/sssd_be'
:Mar 19 13:09:50 jamescape kernel: [1621583.407887] sssd_be[25436]: segfault at 70 ip 00007f95e65d48e5 sp 00007fff304546d8 error 4 in libc-2.14.90.so[7f95e6479000+1ad000]
:Mar 19 13:09:51 jamescape abrt[25548]: Saved core dump of pid 25436 (/usr/libexec/sssd/sssd_be) to /var/spool/abrt/ccpp-2012-03-19-13:09:50-25436 (21811200 bytes)
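
Each kernel line above gives the faulting instruction pointer and the load address of libc-2.14.90.so, so the crash offset inside libc can be checked by subtraction. The following is a minimal sketch and not part of the original report. The regular expression and the helper name `libc_offset` are assumptions made for illustration, and the pattern is fitted only to the line format quoted here.

```python
import re

# Pattern fitted to the kernel segfault lines quoted in this report
# (an assumption; other kernels may format these messages differently).
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def libc_offset(line):
    """Return (library, ip offset within the mapping, faulting address)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    addr = int(match.group("addr"), 16)
    return match.group("lib"), ip - base, addr


# The three segfault lines from var_log_messages, copied verbatim.
lines = [
    "Mar 19 13:09:28 jamescape kernel: [1621561.543262] sssd_be[25280]: "
    "segfault at 70 ip 00007fa86ffeb8e5 sp 00007fffb8dff1b8 error 4 "
    "in libc-2.14.90.so[7fa86fe90000+1ad000]",
    "Mar 19 13:09:40 jamescape kernel: [1621573.102579] sssd_be[25390]: "
    "segfault at 70 ip 00007f260775a8e5 sp 00007fff433cd398 error 4 "
    "in libc-2.14.90.so[7f26075ff000+1ad000]",
    "Mar 19 13:09:50 jamescape kernel: [1621583.407887] sssd_be[25436]: "
    "segfault at 70 ip 00007f95e65d48e5 sp 00007fff304546d8 error 4 "
    "in libc-2.14.90.so[7f95e6479000+1ad000]",
]

results = [libc_offset(line) for line in lines]
for lib, offset, addr in results:
    print(f"{lib}: ip offset {offset:#x}, faulting address {addr:#x}")
```

All three crashes resolve to the same offset, 0x15b8e5, inside libc, with the same faulting address, 0x70. On x86, error code 4 denotes a user-mode read of an unmapped page. A repeated read at such a low address suggests, though it does not prove, that the reported crash_function `__strlen_sse2_pminub` was handed a pointer derived from NULL.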

Comment 1 James Cape 2012-03-19 18:47:30 UTC
Created attachment 571207 [details]
File: dso_list

Comment 2 James Cape 2012-03-19 18:47:32 UTC
Created attachment 571208 [details]
File: maps

Comment 3 James Cape 2012-03-19 18:47:33 UTC
Created attachment 571209 [details]
File: backtrace

Comment 4 Stephen Gallagher 2012-03-19 20:23:57 UTC
Thanks for the bug report. Can you reproduce the issue? From the backtrace, it appears to occur while processing lookups for the NSS 'services' map, so it is probably being invoked implicitly by another process.

If possible, could you add "debug_level = 6" to the [domain/DOMAINNAME] section of /etc/sssd/sssd.conf (substituting DOMAINNAME as appropriate), restart SSSD with 'systemctl restart sssd.service', reproduce the issue, and then attach /var/log/sssd/sssd_DOMAINNAME.log to this BZ (either sanitized or as a private attachment)?

That would help us track this down faster.
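
For illustration, the requested change can be sketched with Python's standard `configparser`. This is a hedged example, not an official SSSD tool. The sample file contents are hypothetical, with only the domain name taken from the `--domain` value in the crash report, and the helper name `set_debug_level` is an assumption.

```python
import configparser
import io

# Hypothetical sssd.conf contents for illustration only; the domain name
# mirrors the --domain argument shown in the crash report's cmdline.
SAMPLE_CONF = """\
[sssd]
services = nss, pam
domains = eladian.com

[domain/eladian.com]
id_provider = ldap
"""


def set_debug_level(conf_text, domain, level):
    """Return conf_text with debug_level set in the [domain/<domain>] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = f"domain/{domain}"
    if not parser.has_section(section):
        raise KeyError(f"no [{section}] section in configuration")
    parser.set(section, "debug_level", str(level))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = set_debug_level(SAMPLE_CONF, "eladian.com", 6)
print(updated)
```

Note that `configparser` does not preserve comments when it rewrites a file, so on a real system it may be safer to edit /etc/sssd/sssd.conf by hand. After the change, SSSD still needs to be restarted with 'systemctl restart sssd.service' as described above.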

Comment 5 James Cape 2012-03-20 17:20:01 UTC
Yes, this crash has taken out three of the four workstations on which our users have installed it so far. We've reverted to 1.6.4 and are in the process of pinning the package now so users don't accidentally update to 1.8.1.

Yesterday we were able to help it survive a bit longer if we put debug_level=0x7777 and timeout=1 in the domain section in the configs, but that didn't work on the workstation which was updated today.

As these are people's actual workstations it's more important that they're working right now, but I'll attempt to re-update/reproduce in a minute.

Comment 6 James Cape 2012-03-20 17:32:03 UTC
On my own system, I can reproduce this bug (sssd_be is restarted several times, then fails to come back on the last attempt, typically within 60 seconds) without debugging. With debug_level = 6, it's been fairly solid.

Comment 7 Stephen Gallagher 2012-03-20 17:33:45 UTC
For the machines that are failing, you can work around the problem by removing the 'sss' from the 'services:' line of /etc/nsswitch.conf. That should allow them to continue running SSSD 1.8.1 without hitting this issue.
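
As a rough sketch of this workaround, the following Python function removes the `sss` source from the `services:` line of nsswitch.conf text. The function name `drop_sss_from_services` and the sample contents are assumptions for illustration. The sketch also assumes the line carries no bracketed action items such as `[NOTFOUND=return]`, which it does not handle.

```python
def drop_sss_from_services(nsswitch_text):
    """Return nsswitch.conf text with 'sss' removed from the services line.

    Assumes the services line has no bracketed action specifications;
    commented-out lines and all other databases are left untouched.
    """
    out = []
    for line in nsswitch_text.splitlines():
        if line.strip().startswith("services:"):
            key, _, sources = line.partition(":")
            kept = [src for src in sources.split() if src != "sss"]
            line = f"{key}: {' '.join(kept)}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical nsswitch.conf excerpt for illustration.
sample = (
    "passwd:     files sss\n"
    "group:      files sss\n"
    "services:   files sss\n"
)

result = drop_sss_from_services(sample)
print(result)
```

The function works on text rather than on /etc/nsswitch.conf directly, so the result can be reviewed before anything is written back. Comment 9 later reports that the affected machines did not have `sss` on this line, so the workaround did not apply in this case.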

Comment 8 Stephen Gallagher 2012-03-20 17:35:54 UTC
(In reply to comment #6)
> On my own system, I can reproduce this bug (sssd_be is restarted several times,
> then fails to come back on the last attempt, typically within 60 seconds)

Yeah, if we detect the backend crashing that often, we stop trying to restart it. Strange behavior though. I'm guessing you must have some service in your environment that's querying the NSS 'service' map constantly.

> without debugging. With debug_level = 6, it's been fairly solid.

Could you examine those logs and see if you see a lot of repeated requests for the same information (specifically service information)? It's possible we have a race condition that the debug logs are hiding.

Comment 9 James Cape 2012-03-20 18:14:59 UTC
(This just hit workstation #5)

None of the machines have sss in the services line, and the only repeated requests I see in the logs for debug_level = 6 are what looks like the initial user/group load.

Comment 10 Stephen Gallagher 2012-03-20 19:27:22 UTC
Ah, damn. I just noticed that this is happening during enumeration, not lookup. So we probably have an enumeration bug. I'll try to reproduce this. Sorry for the confusion.

I'll dive into this first thing tomorrow.

Comment 11 Fedora Update System 2012-03-21 11:45:10 UTC
sssd-1.8.1-8.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/sssd-1.8.1-8.fc16

Comment 12 James Cape 2012-03-21 15:36:18 UTC
Still crashing; see bug #805566.

Comment 13 Fedora Update System 2012-03-22 01:56:07 UTC
Package sssd-1.8.1-8.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sssd-1.8.1-8.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-4404/sssd-1.8.1-8.fc16
then log in and leave karma (feedback).

Comment 14 abrt-bot 2012-03-30 15:22:01 UTC
Backtrace analysis found this bug to be similar to bug #805566, closing as duplicate.

This comment is automatically generated.

*** This bug has been marked as a duplicate of bug 805566 ***