Bug 1118541
| Summary: | Floating point exception using ldap | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | vcommarieu |
| Component: | sssd | Assignee: | Jakub Hrozek <jhrozek> |
| Status: | CLOSED ERRATA | QA Contact: | Kaushik Banerjee <kbanerje> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.5 | CC: | bhubbard, dpal, grajaiya, jgalipea, lslebodn, mkosek, nkarandi, pbrezina, preichl, vcommarieu |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | sssd-1.11.6-12.el6 | Doc Type: | Bug Fix |
| Doc Text: |
Cause: A race condition in the initialization of the fast memory cache in the SSSD client libraries.
Consequence: A multi-threaded application could be signaled with SIGSEGV or SIGFPE, depending on the conditions met.
Fix: Locks were added around the initialization code, making it thread-safe.
Result: The race condition was removed.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-10-14 04:49:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
I was able to reproduce the problem with "Segmentation fault". There is a race condition in multi-threaded applications in the initialization of the fast memory cache. Could you install debug symbols ("debuginfo-install sssd"), compile the example program and provide a backtrace for "Floating point exception"? If there are different backtraces, can you provide all of them?

One more time. Could you:
* install debug symbols for sssd "debuginfo-install sssd"
* compile the example program with debug information "g++ -g ..."
* provide a backtrace for "Floating point exception"

Thanks a lot for the detailed bug report, much appreciated! Cloning upstream.

There is a workaround for your problem. getpwuid_r should be called for the first time in the main thread, before creating other threads. That way the fast memory cache will be initialized properly and reused by the other threads. sssd-client uses two different kinds of fast memory cache: the first one for password file entries and the second one for group file entries. You should call getpwuid and getgrgid in the main thread to be safe. I would still be interested in a backtrace for "Floating point exception".

Upstream ticket: https://fedorahosted.org/sssd/ticket/2380

master:
0d22416f94dff7756091e983518ed3684cc9597a
9d876108620931e0941a115adf60bfd8d67459d9
sssd-1-11:
423acb81f83bc8d2954746f370c640fb426b773a
7b8b213f8babc718d977b5ef5b260e62f6300388

*** Bug 1123291 has been marked as a duplicate of this bug. ***

Upstream ticket: https://fedorahosted.org/sssd/ticket/2409

Tested with sssd-1.11.6-23.1.el6.x86_64

1. Configured sssd to authenticate against 389 DS.
2. Compiled the same program using the following command:
   # g++ -lpthread -o a1 a1.cpp
3. Ran "a1" at least 20 times and was not able to see the crash.
   # ./a1

We got a different coredump, which seemed to be related to this BZ. After a long analysis, I found out that it is a different race condition in the same function. It can happen when the sssd cache is invalidated, e.g. with the command sss_cache -U (-G -E). If it is a critical bug which causes a lot of problems, I can provide a workaround for disabling the fast memory cache in client code. The fast memory cache will still be used by sssd on the server side, which is single-threaded.

(In reply to Lukas Slebodnik from comment #27)
> We got a different coredump, which seemed to be related to this BZ.

This means we need another bugzilla. Brad, can you file one so we can link the support case with the new bugzilla? Thank you.

> After long analysis, I found out that it is different race condition in the
> same function. It can happen when sssd cache is invalidated e.g. with
> command sss_cache -U (-G -E).

Thanks for the analysis.

> If it is a critical bug which cause lot of problems I can provide workaround
> how to disable fast memory cache in client code. The fast memory cache will
> be still used by sssd in server side, which is single threaded.

Just to make sure: disabling the fastcache would only have a performance impact on identity lookups. There is no functional effect.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2014-1375.html

Description of problem:
Floating point exception with core generated.

dmesg:

    a1[13223] trap divide error ip:7fedf417d06b sp:7fedef7fbca8 error:0
    a1[13219] trap divide error ip:7fedf417d06b sp:7fedf5d8dca8 error:0 in libnss_sss.so.2[7fedf4178000+7000]
    in libnss_sss.so.2[7fedf4178000+7000]

We are using ldap/sssd authentication with Windows 2012 Active Directory. It happens only for ldap users, not for root.

Version-Release number of selected component (if applicable):
uname -a: 2.6.39-400.214.6.el6uek.x86_64

How reproducible:
This simple program reproduces the crash:

    #include <pwd.h>
    #include <unistd.h>
    #include <pthread.h>

    // Thread body: look up the passwd entry of the calling user.
    void *tr(void *)
    {
        struct passwd pwd;
        char buf[8192];
        struct passwd *res;
        getpwuid_r(getuid(), &pwd, buf, sizeof(buf), &res);
        return NULL;  // tr() is declared to return void *
    }

    #define NTH 100
    pthread_t t[NTH];

    int main()
    {
        int i;
        // Start all threads at once so that their first lookups overlap.
        for (i=0; i<NTH; ++i) {
            pthread_create(&t[i], NULL, tr, NULL);
        }
        for (i=0; i<NTH; ++i) {
            pthread_join(t[i], NULL);
        }
        return 0;
    }

--------------

    $ g++ -lpthread -o a1 a1.cpp
    $ ./a1
    $ ./a1
    $ ./a1
    $ ./a1
    $ ./a1
    $ ./a1
    Floating point exception
    $ ./a1
    Segmentation fault
    $ ./a1
    Floating point exception
    $ ./a1
    $ ./a1
    $ ./a1
    Floating point exception
    $ ./a1
    $ ./a1
    $ ./a1
    Floating point exception
    $ ./a1
    $ ./a1