Bug 906398
Summary: | sssd_be crashes sometimes | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Kaushik Banerjee <kbanerje> | ||||||||
Component: | sssd | Assignee: | Jakub Hrozek <jhrozek> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | Kaushik Banerjee <kbanerje> | ||||||||
Severity: | high | Docs Contact: | |||||||||
Priority: | urgent | ||||||||||
Version: | 6.4 | CC: | chhudson, dpal, grajaiya, jgalipea, jwest, lslebodn, mkosek, nkarandi, pbrezina, tlavigne | ||||||||
Target Milestone: | rc | Keywords: | Regression, ZStream | ||||||||
Target Release: | 6.5 | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | sssd-1.9.2-89.el6 | Doc Type: | Bug Fix | ||||||||
Doc Text: |
Cause: The group processing code used a get_attribute call that, when a nonexistent attribute was requested, allocated an empty attribute and reallocated the existing attribute array. The reallocation could invalidate pointers that already pointed into the array.
Consequence: If a group contained no members at all, the array was reallocated and the existing pointers were invalidated, resulting in a crash.
Fix: A different get_attribute variant is now used, one that returns ENOENT instead of creating an empty attribute.
Result: Requesting an empty group no longer crashes sssd.
|
Story Points: | --- | ||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2013-11-21 22:14:06 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | |||||||||||
Bug Blocks: | 956136 | ||||||||||
Attachments: |
|
Created attachment 691000 [details]
gzipped coredump

Upstream ticket: https://fedorahosted.org/sssd/ticket/1799

Created attachment 701952 [details]
gzipped sssd_domain.log
I managed to get the domain logs at the time the crash occurred. Hope this helps.

From the domain logs I attached in comment #5, it seems the number of group members is shown as a negative value, and sssd_be crashes just after the following lines in the log:

(Fri Feb 22 14:13:59 2013) [sssd[be[LDAP]]] [sdap_process_ghost_members] (0x0400): The group has 0 members
(Fri Feb 22 14:13:59 2013) [sssd[be[LDAP]]] [sdap_process_ghost_members] (0x0400): Group has -401273744 members
(Fri Feb 22 14:13:59 2013) [sssd[be[LDAP]]] [sdap_save_group] (0x0400): Storing info for group Group1

I think I finally reproduced locally:

==12658== Invalid read of size 8
==12658==    at 0x12BF81A2: sdap_process_ghost_members (sdap_async_groups.c:366)
==12658==    by 0x12BFAB8A: sdap_save_group (sdap_async_groups.c:592)
==12658==    by 0x12BFC21C: sdap_save_groups (sdap_async_groups.c:782)
==12658==    by 0x12C01FED: sdap_get_groups_process (sdap_async_groups.c:1687)
==12658==    by 0x12BEBFAA: sdap_get_generic_done (sdap_async.c:1558)
==12658==    by 0x12BEB780: sdap_get_generic_ext_done (sdap_async.c:1449)
==12658==    by 0x12BE4677: sdap_process_message (sdap_async.c:366)
==12658==    by 0x12BE3BBF: sdap_process_result (sdap_async.c:209)
==12658==    by 0x12BE3211: sdap_ldap_next_result (sdap_async.c:159)
==12658==    by 0x54DDD3F: tevent_common_loop_timer_delay (in /usr/lib64/libtevent.so.0.9.17)
==12658==    by 0x54DD3EB: ??? (in /usr/lib64/libtevent.so.0.9.17)
==12658==    by 0x54DA05F: _tevent_loop_once (in /usr/lib64/libtevent.so.0.9.17)
==12658==  Address 0x1560bb38 is 312 bytes inside a block of size 320 free'd
==12658==    at 0x4C2AA2E: realloc (vg_replace_malloc.c:662)
==12658==    by 0x56EB10E: _talloc_realloc (in /usr/lib64/libtalloc.so.2.0.8)
==12658==    by 0x5264EB6: sysdb_attrs_get_el_ext (sysdb.c:319)
==12658==    by 0x5265004: sysdb_attrs_get_el (sysdb.c:347)
==12658==    by 0x12BF77FD: sdap_process_ghost_members (sdap_async_groups.c:326)
==12658==    by 0x12BFAB8A: sdap_save_group (sdap_async_groups.c:592)
==12658==    by 0x12BFC21C: sdap_save_groups (sdap_async_groups.c:782)
==12658==    by 0x12C01FED: sdap_get_groups_process (sdap_async_groups.c:1687)
==12658==    by 0x12BEBFAA: sdap_get_generic_done (sdap_async.c:1558)
==12658==    by 0x12BEB780: sdap_get_generic_ext_done (sdap_async.c:1449)
==12658==    by 0x12BE4677: sdap_process_message (sdap_async.c:366)
==12658==    by 0x12BE3BBF: sdap_process_result (sdap_async.c:209)
==12658==
==12658== Invalid read of size 4
==12658==    at 0x12BF81B8: sdap_process_ghost_members (sdap_async_groups.c:367)
==12658==    by 0x12BFAB8A: sdap_save_group (sdap_async_groups.c:592)
==12658==    by 0x12BFC21C: sdap_save_groups (sdap_async_groups.c:782)
==12658==    by 0x12C01FED: sdap_get_groups_process (sdap_async_groups.c:1687)
==12658==    by 0x12BEBFAA: sdap_get_generic_done (sdap_async.c:1558)
==12658==    by 0x12BEB780: sdap_get_generic_ext_done (sdap_async.c:1449)
==12658==    by 0x12BE4677: sdap_process_message (sdap_async.c:366)
==12658==    by 0x12BE3BBF: sdap_process_result (sdap_async.c:209)
==12658==    by 0x12BE3211: sdap_ldap_next_result (sdap_async.c:159)
==12658==    by 0x54DDD3F: tevent_common_loop_timer_delay (in /usr/lib64/libtevent.so.0.9.17)
==12658==    by 0x54DD3EB: ??? (in /usr/lib64/libtevent.so.0.9.17)
==12658==    by 0x54DA05F: _tevent_loop_once (in /usr/lib64/libtevent.so.0.9.17)
==12658==  Address 0x1560bb30 is 304 bytes inside a block of size 320 free'd
==12658==    at 0x4C2AA2E: realloc (vg_replace_malloc.c:662)
==12658==    by 0x56EB10E: _talloc_realloc (in /usr/lib64/libtalloc.so.2.0.8)
==12658==    by 0x5264EB6: sysdb_attrs_get_el_ext (sysdb.c:319)
==12658==    by 0x5265004: sysdb_attrs_get_el (sysdb.c:347)
==12658==    by 0x12BF77FD: sdap_process_ghost_members (sdap_async_groups.c:326)
==12658==    by 0x12BFAB8A: sdap_save_group (sdap_async_groups.c:592)
==12658==    by 0x12BFC21C: sdap_save_groups (sdap_async_groups.c:782)
==12658==    by 0x12C01FED: sdap_get_groups_process (sdap_async_groups.c:1687)
==12658==    by 0x12BEBFAA: sdap_get_generic_done (sdap_async.c:1558)
==12658==    by 0x12BEB780: sdap_get_generic_ext_done (sdap_async.c:1449)
==12658==    by 0x12BE4677: sdap_process_message (sdap_async.c:366)
==12658==    by 0x12BE3BBF: sdap_process_result (sdap_async.c:209)
==12658==

Jakub, since you were able to reproduce and fix the issue, can you share the reproducer steps with us?

(In reply to comment #12)
> Jakub, since you were able to reproduce and fix the issue, can you share the
> reproducer steps with us?

Basically, I set shorter enum_cache_timeout and ldap_enumeration_refresh_timeout values to force enumeration to run more frequently. I think the fact that I had an empty group on my LDAP server also played a role.

The crash is no longer seen in automation runs using build 1.9.2-123.

Verified SanityOnly

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1680.html
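
For readers unfamiliar with this failure mode, here is a minimal standalone C sketch of the pattern the Doc Text and the valgrind trace describe: a get-or-create lookup that reallocates the backing array on a miss, next to a lookup-only variant that returns ENOENT. It does not use the real sysdb or talloc API. The types `attr_el` and `attr_set` and the functions `attr_find`, `attr_get_or_create` and `count_values` are hypothetical names made up for illustration, and plain `realloc` stands in for the talloc reallocation seen in the trace.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for an attribute element and its container;
 * these are NOT the real SSSD structures. */
struct attr_el {
    const char *name;
    int num_values;
};

struct attr_set {
    struct attr_el *els;   /* heap array that may move on realloc */
    int num;
};

/* Lookup only: returns ENOENT on a miss and never touches the array,
 * so pointers obtained earlier stay valid. */
int attr_find(struct attr_set *set, const char *name, struct attr_el **out)
{
    for (int i = 0; i < set->num; i++) {
        if (strcmp(set->els[i].name, name) == 0) {
            *out = &set->els[i];
            return 0;
        }
    }
    return ENOENT;
}

/* Get-or-create: on a miss the array is grown with realloc. After this
 * call, every pointer previously handed out by attr_find or by this
 * function may dangle, because the array may have moved. */
int attr_get_or_create(struct attr_set *set, const char *name,
                       struct attr_el **out)
{
    assert(set != NULL && name != NULL && out != NULL);

    if (attr_find(set, name, out) == 0) {
        return 0;
    }

    struct attr_el *grown =
        realloc(set->els, ((size_t)set->num + 1) * sizeof(*grown));
    if (grown == NULL) {
        return ENOMEM;
    }
    set->els = grown;
    set->els[set->num].name = name;
    set->els[set->num].num_values = 0;
    *out = &set->els[set->num];
    set->num++;
    return 0;
}

/* Safe pattern: read two attributes with the lookup-only variant and
 * treat a missing attribute as having zero values. Holding `a` while
 * fetching `b` is fine here because attr_find never reallocates. */
int count_values(struct attr_set *set, const char *first, const char *second)
{
    struct attr_el *a = NULL;
    struct attr_el *b = NULL;
    int total = 0;

    if (attr_find(set, first, &a) == 0) {
        total += a->num_values;
    }
    if (attr_find(set, second, &b) == 0) {
        total += b->num_values;
    }
    return total;
}
```

Had `count_values` fetched `second` with `attr_get_or_create` while still holding `a`, a miss on `second` could move the array and leave `a` pointing into freed memory, which is the kind of invalid read valgrind reports above. The other common remedy is to re-fetch every pointer after any call that may reallocate.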

Created attachment 690999 [details]
Backtrace of the crash

Description of problem:
sssd_be crashes sometimes.

Version-Release number of selected component (if applicable):
1.9.2-82

How reproducible:
Can't reproduce. The crash sometimes appears during the sssd automation runs.

Steps to Reproduce:
1. None. But, from the timings of the crash, I know that "enumerate=true" and "ldap_schema=rfc2307bis" were set in sssd.conf when sssd_be crashed.
2.

Actual results:
sssd_be crashes. Will attach the backtrace and coredump.

Expected results:

Additional info: