Bug 692251
| Summary: | SSSD consumes GBs of RAM, possible memory leak |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | libtdb |
| Version: | 6.1 |
| Hardware: | x86_64 |
| OS: | Linux |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | unspecified |
| Reporter: | Trond H. Amundsen <trondham> |
| Assignee: | Stephen Gallagher <sgallagh> |
| QA Contact: | Chandrasekar Kannan <ckannan> |
| CC: | benl, dpal, georgios, grajaiya, jgalipea, kevinu, nsoman, prc, ssorce |
| Target Milestone: | rc |
| Fixed In Version: | libtdb-1.2.1-3.el6 |
| Doc Type: | Bug Fix |
| Doc Text: | When processing very large database updates, tdb occasionally allocated unnecessarily large amounts of memory, resulting in an out-of-memory (OOM) situation. This was caused by libtdb's tdb_expand() function growing the database by enough space to store an extra hundred records of the size of the largest record it currently held. With this update, this behavior has been removed and tdb no longer allocates unnecessarily large amounts of memory. |
| Clones: | 693785, 694146 (view as bug list) |
| Bug Blocks: | 693785, 694146, 694149, 694217 |
| Last Closed: | 2011-05-19 14:30:54 UTC |
Created attachment 488891 [details]
Log file created with 'debug_level = 9'
This is a log from when this happened. This particular server only has 1GB of memory, so it ended with an OOM panic.
I'm marking this bug as security sensitive, since the attached log includes lots of user and group names.

This bug should not be marked as security-sensitive. Security team, please clear the security-sensitive box.

I have moved the log to private.

The logs don't seem suspect (no obvious errors). I do however note that there are some ENORMOUS groups here. My best guess right now is that we're holding onto too much memory during the processing of the groups (I think we may not be freeing anything until the request is fully complete). We probably need to investigate what we can free earlier. I'd guess that the user is getting into this state whenever multiple very large groups have an ongoing request simultaneously.

I did some additional testing just to see how much memory was allocated if the process was allowed to grow. Reproducing this issue on a server with 72GB memory, I was able to bring sssd_be to >70GB memory usage before I had to kill it off to avoid a panic. This happened when I did 'id user' on a user who is a member of several large groups. I'm not familiar with the sssd code, but I'm thinking that this points toward some sort of race condition while processing the cache.

I have tried to mitigate the problem by using 'filter_groups' to filter out groups with many members (>2k). This did not seem to help, but I have to test it some more to be 100% sure.

Created attachment 489330 [details]
Valgrind analysis of sssd when the issue occurs
I managed to run sssd through valgrind and capture the output. In this particular case, I ran 'id user' on a user who is not a member of many large groups, so the memory usage stopped at 4.2GB before sssd_be finished. Since the process was not killed by the oom-killer, valgrind was able to complete its analysis.
The command to start sssd through valgrind was:
valgrind -v --log-file=/root/valgrind-sssd-%p.log --show-reachable=yes --trace-children=yes --leak-check=full /usr/sbin/sssd -f -D
I hope this helps.
One thing I have noticed is that when the memory is allocated and sssd_be is able to finish up what it is doing (i.e. not killed by oom-killer), the memory is not freed. In the case above, the memory usage stayed at 4.2GB even after sssd_be was finished processing.
-trond
Created attachment 489379 [details]
Valgrind analysis of sssd when the issue occurs (with debuginfo)
I have re-run sssd through valgrind with all relevant debuginfo packages installed, so symbols are included. Valgrind/sssd was started in the same way as described in my previous comment.
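(For reference, one way to pull in those symbols on RHEL 6 is yum-utils' debuginfo-install; the package list below is only a guess at what "all relevant debuginfo packages" covered in this setup:

debuginfo-install sssd libldb libtalloc libtdb glibc
)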
Created attachment 489396 [details]
perf.data captured with 'perf record -p <pid of sssd_be>'
Created attachment 489407 [details]
perf report created with 'perf report --stdio -n -v'
I have played around with this during the weekend and have made a few observations of what happens when the issue occurs.
Using 'perf top -z -p <pid of sssd_be>' I was able to determine what sssd_be does when memory usage grows through the roof. In every test I have made, when the issue occurs the function ldb_dn_explode() is heavily used.
Here is a snapshot of 'perf top' when the issue is in progress:
-------------------------------------------------------------------------------
PerfTop: 4607 irqs/sec kernel:14.1% exact: 0.0% [1000Hz cpu-clock-msecs], (target_pid: 26904)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _________________________________ __________________
1565.00 17.0% _int_malloc libc-2.12.so
789.00 8.6% _int_free libc-2.12.so
758.00 8.2% ldb_dn_explode libldb.so.0.9.10
681.00 7.4% _talloc_free_internal libtalloc.so.2.0.1
497.00 5.4% memcpy libc-2.12.so
492.00 5.3% __GI___libc_malloc libc-2.12.so
387.00 4.2% talloc_strdup libtalloc.so.2.0.1
357.00 3.9% native_set_pte_at [kernel.kallsyms]
304.00 3.3% clear_page_c [kernel.kallsyms]
291.00 3.2% __GI___strcasecmp_l_ssse3 libc-2.12.so
246.00 2.7% malloc_consolidate libc-2.12.so
173.00 1.9% unmap_vmas [kernel.kallsyms]
172.00 1.9% __strlen_sse2 libc-2.12.so
155.00 1.7% ldb_schema_attribute_by_name_inte libldb.so.0.9.10
136.00 1.5% ltdb_unpack_data libldb.so.0.9.10
Using tcpdump I have found that during the memory growth, sssd does not communicate with the LDAP server. Iotop shows no disk activity (unless logging is turned on).
Using 'debug_level = 9', the only thing it writes to the log during the memory allocation is lots and lots of this:
(Sun Apr 3 16:35:37 2011) [sssd[be[default]]] [ldb] (9): tevent: Added timed event "ltdb_timeout": 0x91069030
(Sun Apr 3 16:35:37 2011) [sssd[be[default]]] [ldb] (9): tevent: Destroying timer event 0x9106eb20 "ltdb_timeout"
(Sun Apr 3 16:35:37 2011) [sssd[be[default]]] [ldb] (9): tevent: Ending timer event 0x9106ed40 "ltdb_callback"
The last lines written to the log _before_ memory starts growing are:
(Sun Apr 3 16:35:01 2011) [sssd[be[default]]] [sdap_get_groups_done] (9): Groups remaining: 0
(Sun Apr 3 16:35:01 2011) [sssd[be[default]]] [sdap_get_groups_done] (9): All groups processed
(Sun Apr 3 16:35:01 2011) [sssd[be[default]]] [ldb] (9): start ldb transaction (nesting: 1)
(Sun Apr 3 16:35:01 2011) [sssd[be[default]]] [sdap_save_group] (7): Adding original DN [cn=campus,cn=filegroups,cn=system,dc=uio,dc=no] to attributes of [campus].
(Sun Apr 3 16:35:01 2011) [sssd[be[default]]] [sdap_save_group] (7): Original USN value is not available for [campus].
(Sun Apr 3 16:35:01 2011) [sssd[be[default]]] [sdap_save_group] (6): Storing info for group campus
One last interesting bit:
1. I trigger the issue by running 'id user'
2. sssd_be goes berserk and starts allocating memory
3. I kill off sssd_be with 'kill -9'
4. The command 'id user' then returns with correct information(!)
Hopefully these observations are helpful in arriving at a diagnosis.
PS. What I don't understand is that sssd works fine and performs well for an hour or two, if the cache is removed before sssd is started. If sssd is started with a pre-existing cache, I'm able to trigger the issue at once.
(In reply to comment #11)
> PS. What I don't understand is that sssd works fine and performs well for an
> hour or two, if the cache is removed before sssd is started. If sssd is started
> with a pre-existing cache, I'm able to trigger the issue at once.

I second that question. Same behavior observed exactly.

GM

Many thanks to Simo Sorce, who was able to track down the source of this problem today. The problem is actually in libtdb, not in SSSD itself.

When libtdb calls tdb_expand() to increase its size, it does so by creating space to store an extra hundred records of the size of the largest record it currently has. In SSSD's case, we're dealing with records that can reach dozens of megabytes, so a single tdb_expand() can suddenly grow the database to take up gigabytes of space.

There are two parts to this issue. One part involves some bad realloc() logic in libtdb. This bug will track that fix. I will clone this bug back to SSSD to resolve the other issue (which involves the SSSD's memberOf plugin).
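To put rough numbers on the growth heuristic just described (illustration only, not the actual libtdb source; the 20 MB record size is an assumption based on the "dozens of megabytes" figure above):

/* illustration only: old-style growth heuristic, not real libtdb code */
#include <stdio.h>

int main(void)
{
    size_t largest_record = 20UL * 1024 * 1024;  /* assume a ~20 MB group record */
    size_t reserved = 100 * largest_record;      /* room for 100 more such records */

    printf("one expansion reserves ~%zu MB\n", reserved / (1024 * 1024));
    /* prints ~2000 MB, so a handful of expansions during one large cache
       update is enough to reach the multi-GB usage reported in this bug */
    return 0;
}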
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
When processing very large database updates, tdb occasionally allocated unnecessarily large amounts of memory, resulting in an out-of-memory (OOM) situation. This was caused by libtdb's tdb_expand() function growing the database by enough space to store an extra hundred records of the size of the largest record it currently held. With this update, this behavior has been removed and tdb no longer allocates unnecessarily large amounts of memory.
Simo provided a tool (attached here) to see the issue. He provided it with directions and expectations:

=========
the attached program can be used as a reproducer.

The test is easy:
- run once with the old tdb package, note the size of the file test.tdb
- run again with the new tdb package: see that the size of test.tdb is an order of magnitude smaller.

If you want to keep both files around as proof, you can also set the TEST_DB_PATH variable to put the test file in different directories without having to mv them after each test. (You can also run the binary in different dirs, as by default it puts the file in the dir you run the binary from.)

Compile it with:

gcc -o tdb_expand_test -ltdb tdb_expand_test.c
===========

Compiled, then ran the program with the old and new libtdb.

Using libtdb-1.2.1-2.el6.x86_64, the size was:

test.tdb  143933440

Using libtdb-1.2.1-3.el6.x86_64, the size was:

test.tdb  15482880

Created attachment 495785 [details]
test tool to reproduce this issue
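The attached tdb_expand_test.c is not reproduced in this report. Purely as a sketch of what such a reproducer can look like (the structure below is an assumption, not Simo's actual program), one can store a single very large record followed by many small ones and then compare the size of test.tdb between libtdb versions:

/* sketch of a possible reproducer, assumed structure -- not the attached tdb_expand_test.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <tdb.h>

int main(void)
{
    struct tdb_context *tdb;
    TDB_DATA key, data;
    size_t big_size = 4UL * 1024 * 1024;          /* one ~4 MB record, like a huge group entry */
    unsigned char *big = calloc(1, big_size);
    char kbuf[32], vbuf[64];
    int i;

    tdb = tdb_open("test.tdb", 0, TDB_DEFAULT, O_RDWR | O_CREAT, 0600);
    if (tdb == NULL || big == NULL) {
        perror("setup");
        return 1;
    }

    /* store the large record first so it becomes the "largest record" seen */
    key.dptr = (unsigned char *)"big";  key.dsize = 3;
    data.dptr = big;                    data.dsize = big_size;
    tdb_store(tdb, key, data, TDB_REPLACE);

    /* then store many small records; with the old expansion logic every
       tdb_expand() reserves space relative to the largest record, so the
       file grows far faster than the data actually stored */
    for (i = 0; i < 100000; i++) {
        snprintf(kbuf, sizeof(kbuf), "key-%d", i);
        snprintf(vbuf, sizeof(vbuf), "value-%d", i);
        key.dptr = (unsigned char *)kbuf;   key.dsize = strlen(kbuf);
        data.dptr = (unsigned char *)vbuf;  data.dsize = strlen(vbuf);
        tdb_store(tdb, key, data, TDB_REPLACE);
    }

    free(big);
    tdb_close(tdb);
    return 0;   /* now compare the size of test.tdb across libtdb versions */
}

Such a sketch would be compiled and run the same way as the attached tool, comparing the size of test.tdb before and after updating libtdb.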
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0808.html
Created attachment 488890 [details]
Sanitized sssd.conf

We're having an issue with SSSD, where sssd_be suddenly goes insane and allocates several GBs of RAM very quickly, until it is killed by the OOM killer. If I stop sssd, remove the cache files and start sssd again, everything is OK for a while (typically a couple of hours).

The way I manage to trigger this is as follows:

1. Stop sssd, remove cache files, start sssd. This way we have a clean system to begin with.
2. Run 'id <username>' on a few users and/or 'ls -l' on files that have different owners/groups. sssd will build its cache.
3. Wait a few hours.
4. Repeat step 2; sssd_be goes crazy.

This doesn't always work, but usually I'm able to make sssd run amok. After it has happened once, running 'id <someuser>' is usually enough to trigger the issue again, until the cache files are removed and sssd is restarted manually. From what I can tell, when this erratic behaviour happens sssd doesn't even contact the LDAP server. There seems to be something in the cache that triggers it.

Some details about the environment:

* LDAP used as both id and auth provider
* LDAP schema: rfc2307
* ~70k users, ~9k groups
* Some groups are very large, ~25k members

I have tested this on RHEL6.0 (sssd-1.2.1-28.el6_0.4.x86_64), RHEL6.1-beta (sssd-1.5.1-14.el6.x86_64) and the latest beta packages (sssd-1.5.1-23.el6.x86_64), with seemingly the same behaviour.
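A rough transcript of those reproduction steps might look like the following (the cache path is the RHEL 6 default under /var/lib/sss/db/, the domain name "default" is taken from the logs earlier in this report, and the user and directory names are placeholders):

service sssd stop
rm -f /var/lib/sss/db/cache_default.ldb    # remove the cached entries for the "default" domain
service sssd start
id someuser                                # prime the cache
ls -l /some/shared/dir                     # more lookups with mixed owners/groups
# ...wait a few hours...
id someuser                                # now sssd_be starts allocating GBs of RAM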