Bug 1118541 - Floating point exception using ldap
Summary: Floating point exception using ldap
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.5
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: rc
: ---
Assignee: Jakub Hrozek
QA Contact: Kaushik Banerjee
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-11 02:38 UTC by vcommarieu
Modified: 2019-07-11 08:03 UTC (History)

Fixed In Version: sssd-1.11.6-12.el6
Doc Type: Bug Fix
Doc Text:
Cause: A race condition in the initialization of the fast memory cache in the SSSD client libraries. Consequence: A multi-threaded application could be terminated with SIGSEGV or SIGFPE, depending on the conditions met. Fix: Locks were added around the initialization code, making it thread-safe. Result: The race condition was removed.
Clone Of:
Environment:
Last Closed: 2014-10-14 04:49:01 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1375 normal SHIPPED_LIVE sssd bug fix and enhancement update 2014-10-14 01:06:25 UTC
Red Hat Knowledge Base (Solution) 1142153 None None None Never

Description vcommarieu 2014-07-11 02:38:36 UTC
Description of problem:
Floating point exception with core generated
dmesg:
a1[13223] trap divide error ip:7fedf417d06b sp:7fedef7fbca8 error:0 in libnss_sss.so.2[7fedf4178000+7000]
a1[13219] trap divide error ip:7fedf417d06b sp:7fedf5d8dca8 error:0 in libnss_sss.so.2[7fedf4178000+7000]

We are using LDAP/SSSD authentication against a Windows 2012 Active Directory.
The crash happens only for LDAP users, not for root.


Version-Release number of selected component (if applicable):
uname -a: 2.6.39-400.214.6.el6uek.x86_64

How reproducible:
This simple program reproduces the crash:

#include <pwd.h>
#include <unistd.h>
#include <pthread.h>

// Thread body: look up the passwd entry of the calling user via NSS.
void *tr(void *) {
        struct passwd pwd;
        char buf[8192];
        struct passwd *res;

        getpwuid_r(getuid(), &pwd, buf, sizeof(buf), &res);
        return NULL;
}

#define NTH 100
pthread_t t[NTH];
int main()
{
        int i;
        for (i=0; i<NTH; ++i) {
                pthread_create(&t[i], NULL, tr, NULL);
        }
        for (i=0; i<NTH; ++i) {
                pthread_join(t[i], NULL);
        }
        return 0;
}


--------------


$ g++ -lpthread -o a1 a1.cpp

$ ./a1

$ ./a1

$ ./a1

$ ./a1

$ ./a1

$ ./a1
Floating point exception

$ ./a1
Segmentation fault

$ ./a1
Floating point exception

$ ./a1

$ ./a1

$ ./a1
Floating point exception

$ ./a1

$ ./a1

$ ./a1
Floating point exception

$ ./a1

$ ./a1

Comment 2 Lukas Slebodnik 2014-07-11 08:30:09 UTC
I was able to reproduce the problem with "Segmentation fault". There is a race condition in the initialisation of the fast memory cache in multi-threaded applications.

Could you install debug symbols ("debuginfo-install sssd"), compile the example program and provide a backtrace for the "Floating point exception"? If there are different backtraces, can you provide all of them?
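
For readers unfamiliar with this class of bug, here is a minimal sketch of lock-protected lazy initialization. It is not SSSD code: the names demo_cache_slot, g_slots and g_initialized and the table size 64 are invented for illustration. It also shows one plausible way, assumed here and not confirmed by a backtrace in this report, that such a race could surface as a "divide error": if a thread used the slot count before initialization had stored it, the modulo below would divide by zero, which raises SIGFPE on x86.

```cpp
#include <pthread.h>
#include <stddef.h>

// Illustrative stand-in for a lazily initialized lookup table (not SSSD code).
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static bool g_initialized = false;
static size_t g_slots = 0;  // stays 0 until initialization has finished

// Hypothetical helper: initialize the table on first use and return the slot
// index for `key`. The mutex ensures that no thread can read g_slots while
// another thread is still in the middle of setting it up.
size_t demo_cache_slot(size_t key) {
    pthread_mutex_lock(&g_lock);
    if (!g_initialized) {
        g_slots = 64;          // arbitrary size chosen for this sketch
        g_initialized = true;
    }
    size_t slots = g_slots;    // copy while still holding the lock
    pthread_mutex_unlock(&g_lock);

    // Without the lock, a thread could reach this line with slots == 0,
    // and an integer modulo by zero traps as a divide error.
    return key % slots;
}
```

Any number of threads can call demo_cache_slot concurrently; only the first caller performs the setup, and later callers see a fully initialized table.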

Comment 3 Lukas Slebodnik 2014-07-11 08:32:51 UTC
One more time.
Could you:
  * install debug symbols for sssd "debuginfo-install sssd"
  * compile example program with debug information "g++ -g  ..."
  * provide backtrace for "Floating point exception"

Comment 4 Jakub Hrozek 2014-07-11 08:48:45 UTC
Thanks a lot for the detailed bug report, much appreciated! Cloning upstream..

Comment 5 Lukas Slebodnik 2014-07-11 08:51:24 UTC
There is a workaround for your problem: getpwuid_r should be called for the first time in the main thread, before any other threads are created. The fast memory cache will then be initialized properly and reused by the other threads.
sssd-client uses two different kinds of fast memory cache: the first for password file entries and the second for group file entries. To be safe, you should call both getpwuid and getgrgid in the main thread.
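
The workaround described above could be applied to the reproducer roughly as follows. This is a sketch only: the helper names warm_up_nss_caches and run_workers are made up for this example, and the lookups use the current user's UID and GID simply because those are always available.

```cpp
#include <grp.h>
#include <pthread.h>
#include <pwd.h>
#include <unistd.h>

// Worker: the same lookup as in the reproducer.
static void *lookup_worker(void *) {
    struct passwd pwd;
    struct passwd *res = NULL;
    char buf[8192];
    getpwuid_r(getuid(), &pwd, buf, sizeof(buf), &res);
    return NULL;
}

// Hypothetical helper: do one passwd and one group lookup while the process
// is still single-threaded, so both client-side caches are set up by the
// main thread. The results are deliberately ignored.
void warm_up_nss_caches() {
    (void)getpwuid(getuid());
    (void)getgrgid(getgid());
}

// Hypothetical helper: warm up the caches, then start `count` workers
// (capped at 100, as in the reproducer). Returns how many were joined.
int run_workers(int count) {
    enum { MAX_THREADS = 100 };
    pthread_t threads[MAX_THREADS];
    if (count > MAX_THREADS) count = MAX_THREADS;

    warm_up_nss_caches();  // must run before the first pthread_create

    int started = 0;
    for (int i = 0; i < count; ++i) {
        if (pthread_create(&threads[started], NULL, lookup_worker, NULL) == 0)
            ++started;
    }
    int joined = 0;
    for (int i = 0; i < started; ++i) {
        if (pthread_join(threads[i], NULL) == 0)
            ++joined;
    }
    return joined;
}
```

Calling run_workers(100) from main corresponds to the original test program with the warm-up added. The non-reentrant getpwuid and getgrgid are acceptable in the warm-up because no other thread exists yet at that point.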

I would still be interested in a backtrace for the "Floating point exception".

Comment 6 Jakub Hrozek 2014-07-11 08:55:06 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/2380

Comment 7 Jakub Hrozek 2014-07-23 19:17:57 UTC
    master:
        0d22416f94dff7756091e983518ed3684cc9597a
        9d876108620931e0941a115adf60bfd8d67459d9 
    sssd-1-11:
        423acb81f83bc8d2954746f370c640fb426b773a
        7b8b213f8babc718d977b5ef5b260e62f6300388

Comment 9 Brad Hubbard 2014-07-25 10:37:27 UTC
*** Bug 1123291 has been marked as a duplicate of this bug. ***

Comment 19 Jakub Hrozek 2014-08-22 14:42:40 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/2380

Comment 20 Jakub Hrozek 2014-08-22 14:43:35 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/2409

Comment 21 Nirupama Karandikar 2014-09-01 11:44:48 UTC
Tested with sssd-1.11.6-23.1.el6.x86_64

1. Configured sssd to authenticate against 389 DS.

2. Compiled the same program using the following command.

# g++ -lpthread -o a1 a1.cpp

3. Ran "a1" at least 20 times and was not able to see the crash.

# ./a1

Comment 27 Lukas Slebodnik 2014-09-18 12:49:33 UTC
We got a different coredump, which seemed to be related to this BZ.

After a long analysis, I found out that it is a different race condition in the same function. It can happen when the sssd cache is invalidated, e.g. with the command sss_cache -U (-G -E).

If it is a critical bug which causes a lot of problems, I can provide a workaround that disables the fast memory cache in the client code. The fast memory cache will still be used by sssd on the server side, which is single-threaded.

Comment 28 Jakub Hrozek 2014-09-18 15:09:55 UTC
(In reply to Lukas Slebodnik from comment #27)
> We got a different coredump, which seemed to be related to this BZ.
> 

This means we need another bugzilla. Brad, can you file one so we can link the support case with the new bugzilla? Thank you

> After long analysis, I found out that it is different race condition in the
> same function. It can happen when sssd cache is invalidated e.g. with
> command sss_cache -U (-G -E). 
> 

Thanks for the analysis.

> If it is a critical bug which cause lot of problems I can provide workaround
> how to disable fast memory cache in client code. The fast memory cache will
> be still used by sssd in server side, which is single threaded.

Just to make sure -- disabling the fastcache would only have performance impact on identity lookups. There is no functional effect.

Comment 32 errata-xmlrpc 2014-10-14 04:49:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1375.html

