Bug 1546064
Summary: | sssd_nss consumes more memory until restarted or machine swaps [rhel-7.4.z] | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Oneata Mircea Teodor <toneata> | ||||||||
Component: | sssd | Assignee: | SSSD Maintainers <sssd-maint> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | sssd-qe <sssd-qe> | ||||||||
Severity: | urgent | Docs Contact: | Lucie Vařáková <lmanasko> | ||||||||
Priority: | urgent | ||||||||||
Version: | 7.3 | CC: | aheverle, andrewg.gray, atolani, ddas, ekeck, enewland, fidencio, gparente, grajaiya, imntreal, jhrozek, jvilicic, jwooten, knweiss, lmanasko, lslebodn, mkosek, mmilgram, mniranja, mupadhye, mzidek, pbrezina, pdhamdhe, pkulkarn, serge.savard, sgoveas, sssd-maint, striker, tscherf, vmishra | ||||||||
Target Milestone: | rc | Keywords: | ZStream | ||||||||
Target Release: | --- | ||||||||||
Hardware: | All | ||||||||||
OS: | Linux | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | sssd-1.15.2-50.el7_4.12 | Doc Type: | Bug Fix | ||||||||
Doc Text: |
Previously, due to a wrong memory hierarchy in the network security service (NSS) responder, the memory that was needed for printing a specific name was allocated to a permanent memory context. As a consequence, the memory usage of the sssd_nss service kept growing. This
update allocates the memory to a correct operation context and the
memory usage of sssd_nss no longer grows.
|
Story Points: | --- | ||||||||
Clone Of: | 1482231 | Environment: | |||||||||
Last Closed: | 2018-04-17 16:35:29 UTC | Type: | --- | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | 1482231 | ||||||||||
Bug Blocks: | |||||||||||
Attachments: |
|
Description
Oneata Mircea Teodor
2018-02-16 09:01:58 UTC
Versions: ========= sssd-common-1.15.2-50.el7_4.13.x86_64 sssd-ad-1.15.2-50.el7_4.13.x86_64 sssd-1.15.2-50.el7_4.13.x86_64 libsss_nss_idmap-1.15.2-50.el7_4.11.x86_64 sssd-client-1.15.2-50.el7_4.13.x86_64 libsss_sudo-1.15.2-50.el7_4.13.x86_64 libsss_autofs-1.15.2-50.el7_4.13.x86_64 sssd-krb5-common-1.15.2-50.el7_4.13.x86_64 sssd-ipa-1.15.2-50.el7_4.13.x86_64 sssd-ldap-1.15.2-50.el7_4.13.x86_64 sssd-proxy-1.15.2-50.el7_4.13.x86_64 sssd-dbus-1.15.2-50.el7_4.13.x86_64 libsss_certmap-1.15.2-50.el7_4.11.x86_64 libsss_idmap-1.15.2-50.el7_4.13.x86_64 python-sssdconfig-1.15.2-50.el7_4.13.noarch sssd-common-pac-1.15.2-50.el7_4.13.x86_64 sssd-krb5-1.15.2-50.el7_4.13.x86_64 libsss_simpleifp-1.15.2-50.el7_4.13.x86_64 Configuration: ============= 1. Configure sssd to authenticate against ldap (389-ds) [sssd] domains = EXAMPLE.TEST config_file_version = 2 services = nss, pam, ifp [domain/EXAMPLE.TEST] enumerate = false id_provider = ldap ldap_uri = ldap://auto-hv-01-guest01.idmqe.lab.eng.bos.redhat.com ldap_search_base = dc=example,dc=test ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem auth_provider = ldap krb5_server = auto-hv-01-guest01.idmqe.lab.eng.bos.redhat.com krb5_kpasswd = auto-hv-01-guest01.idmqe.lab.eng.bos.redhat.com krb5_realm = EXAMPLE.TEST debug_level = 9 cache_credentials = True krb5_store_password_if_offline = True [nss] command = valgrind --leak-check=full --trace-children=yes --show-reachable=yes \ --track-origins=yes --read-var-info=yes --tool=memcheck --num-callers=50 -v \ --time-stamp=yes --log-file=/tmp/valgrind.%p.log /usr/libexec/sssd/sssd_nss debug_level = 9 1. Start sssd with the above options 2. Run the below script for an hour NAME_PREFIX=`for i in {1..245}; do printf "x"; done` for i in {1..1000}; do for j in {1..500}; do getent passwd -s sss "$NAME_PREFIX$j" done for j in {1..500}; do getent passwd -s sss "$NAME_PREFIX$j" done sleep 15 done Leak summary from ==00:01:17:03.719 9618== LEAK SUMMARY: ==00:01:17:03.719 9618== definitely lost: 0 bytes in 0 blocks ==00:01:17:03.719 9618== indirectly lost: 0 bytes in 0 blocks ==00:01:17:03.719 9618== possibly lost: 5,016 bytes in 18 blocks ==00:01:17:03.719 9618== still reachable: 321,866 bytes in 444 blocks ==00:01:17:03.719 9618== suppressed: 0 bytes in 0 blocks ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== 1 errors in context 1 of 13: ==00:01:17:03.719 9618== Conditional jump or move depends on uninitialised value(s) ==00:01:17:03.719 9618== at 0x529C087: become_user (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x52911AC: server_setup (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x11002F: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== by 0x9628C04: (below main) (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== Uninitialised value was created by a stack allocation ==00:01:17:03.719 9618== at 0x10FB48: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== 1 errors in context 2 of 13: ==00:01:17:03.719 9618== Syscall param chown(group) contains uninitialised byte(s) ==00:01:17:03.719 9618== at 0x96F1807: chown (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== by 0x8BDBB82: chown_debug_file (debug.c:335) ==00:01:17:03.719 9618== by 0x529118D: server_setup (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x11002F: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== by 0x9628C04: (below main) (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== Uninitialised value was created by a stack allocation ==00:01:17:03.719 9618== at 0x10FB48: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== 1 errors in context 3 of 13: ==00:01:17:03.719 9618== Syscall param chown(owner) contains uninitialised byte(s) ==00:01:17:03.719 9618== at 0x96F1807: chown (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== by 0x8BDBB82: chown_debug_file (debug.c:335) ==00:01:17:03.719 9618== by 0x529118D: server_setup (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x11002F: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== by 0x9628C04: (below main) (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== Uninitialised value was created by a stack allocation ==00:01:17:03.719 9618== at 0x10FB48: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0) Created attachment 1417565 [details]
valgrind logs
Versions: ========= Reproduced the issue on below version: ===================================== sssd-krb5-1.15.2-50.el7_4.11.x86_64 sssd-debuginfo-1.15.2-50.el7_4.11.x86_64 sssd-common-pac-1.15.2-50.el7_4.11.x86_64 sssd-client-1.15.2-50.el7_4.11.x86_64 sssd-ad-1.15.2-50.el7_4.11.x86_64 sssd-dbus-1.15.2-50.el7_4.11.x86_64 sssd-common-1.15.2-50.el7_4.11.x86_64 sssd-ldap-1.15.2-50.el7_4.11.x86_64 sssd-ipa-1.15.2-50.el7_4.11.x86_64 sssd-krb5-common-1.15.2-50.el7_4.11.x86_64 sssd-1.15.2-50.el7_4.11.x86_64 sssd-proxy-1.15.2-50.el7_4.11.x86_64 Configuration: Configuration: ============= 1. Configure sssd to authenticate against ldap (389-ds) [sssd] domains = EXAMPLE.TEST config_file_version = 2 services = nss, pam, ifp [domain/EXAMPLE.TEST] enumerate = false id_provider = ldap ldap_uri = ldap://auto-hv-01-guest01.idmqe.lab.eng.bos.redhat.com ldap_search_base = dc=example,dc=test ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem auth_provider = ldap krb5_server = auto-hv-01-guest01.idmqe.lab.eng.bos.redhat.com krb5_kpasswd = auto-hv-01-guest01.idmqe.lab.eng.bos.redhat.com krb5_realm = EXAMPLE.TEST debug_level = 9 cache_credentials = True krb5_store_password_if_offline = True [nss] command = valgrind --leak-check=full --trace-children=yes --show-reachable=yes \ --track-origins=yes --read-var-info=yes --tool=memcheck --num-callers=50 -v \ --time-stamp=yes --log-file=/tmp/valgrind.%p.log /usr/libexec/sssd/sssd_nss debug_level = 9 1. Start sssd with the above options 2. Run the below script for an hour NAME_PREFIX=`for i in {1..245}; do printf "x"; done` for i in {1..1000}; do for j in {1..500}; do getent passwd -s sss "$NAME_PREFIX$j" done for j in {1..500}; do getent passwd -s sss "$NAME_PREFIX$j" done sleep 15 done Leak are seen in responder(sized_output_name): ==00:00:03:21.560 16565== 517,176 bytes in 1,500 blocks are definitely lost in loss record 357 of 358 ==00:00:03:21.560 16565== at 0x4C29BE3: malloc (vg_replace_malloc.c:299) ==00:00:03:21.560 16565== by 0x8FF4EAB: __talloc_with_prefix (talloc.c:698) ==00:00:03:21.560 16565== by 0x8FF4EAB: __talloc (talloc.c:739) ==00:00:03:21.560 16565== by 0x8FF4EAB: __talloc_strlendup (talloc.c:2369) ==00:00:03:21.560 16565== by 0x8FF4EAB: talloc_strdup (talloc.c:2385) ==00:00:03:21.560 16565== by 0x52934F9: sss_output_name (usertools.c:808) ==00:00:03:21.560 16565== by 0x5293690: sss_output_fqname (usertools.c:863) ==00:00:03:21.560 16565== by 0x121249: sized_output_name (responder_common.c:1708) ==00:00:03:21.560 16565== by 0x113836: memcache_delete_entry (nss_get_object.c:112) ==00:00:03:21.560 16565== by 0x113C06: nss_get_object_done (nss_get_object.c:245) ==00:00:03:21.560 16565== by 0x8DE5291: _tevent_req_error (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x1277CE: cache_req_done (cache_req.c:1058) ==00:00:03:21.560 16565== by 0x8DE5291: _tevent_req_error (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x126B4E: cache_req_search_domains_done (cache_req.c:608) ==00:00:03:21.560 16565== by 0x8DE5291: _tevent_req_error (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x1218EB: sss_dp_internal_get_done (responder_dp.c:899) ==00:00:03:21.560 16565== by 0x5B14861: ??? (in /usr/lib64/libdbus-1.so.3.7.4) ==00:00:03:21.560 16565== by 0x5B17B50: dbus_connection_dispatch (in /usr/lib64/libdbus-1.so.3.7.4) ==00:00:03:21.560 16565== by 0x5286C51: sbus_dispatch (sssd_dbus_connection.c:93) ==00:00:03:21.560 16565== by 0x8DE8C96: tevent_common_loop_timer_delay (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x8DE9CA8: ??? (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x8DE82A6: ??? (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x8DE40CC: _tevent_loop_once (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x8DE42FA: tevent_common_loop_wait (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x8DE8246: ??? (in /usr/lib64/libtevent.so.0.9.31) ==00:00:03:21.560 16565== by 0x5291C72: server_loop (server.c:718) ==00:00:03:21.560 16565== by 0x11009C: main (nsssrv.c:560) Updated to below versions sssd-common-1.15.2-50.el7_4.13.x86_64 sssd-ad-1.15.2-50.el7_4.13.x86_64 sssd-1.15.2-50.el7_4.13.x86_64 sssd-client-1.15.2-50.el7_4.13.x86_64 libsss_sudo-1.15.2-50.el7_4.13.x86_64 libsss_autofs-1.15.2-50.el7_4.13.x86_64 sssd-krb5-common-1.15.2-50.el7_4.13.x86_64 sssd-ipa-1.15.2-50.el7_4.13.x86_64 sssd-ldap-1.15.2-50.el7_4.13.x86_64 sssd-proxy-1.15.2-50.el7_4.13.x86_64 sssd-dbus-1.15.2-50.el7_4.13.x86_64 libsss_idmap-1.15.2-50.el7_4.13.x86_64 python-sssdconfig-1.15.2-50.el7_4.13.noarch sssd-common-pac-1.15.2-50.el7_4.13.x86_64 sssd-krb5-1.15.2-50.el7_4.13.x86_64 libsss_simpleifp-1.15.2-50.el7_4.13.x86_64 Leak summary from ==00:01:17:03.719 9618== LEAK SUMMARY: ==00:01:17:03.719 9618== definitely lost: 0 bytes in 0 blocks ==00:01:17:03.719 9618== indirectly lost: 0 bytes in 0 blocks ==00:01:17:03.719 9618== possibly lost: 5,016 bytes in 18 blocks ==00:01:17:03.719 9618== still reachable: 321,866 bytes in 444 blocks ==00:01:17:03.719 9618== suppressed: 0 bytes in 0 blocks ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== 1 errors in context 1 of 13: ==00:01:17:03.719 9618== Conditional jump or move depends on uninitialised value(s) ==00:01:17:03.719 9618== at 0x529C087: become_user (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x52911AC: server_setup (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x11002F: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== by 0x9628C04: (below main) (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== Uninitialised value was created by a stack allocation ==00:01:17:03.719 9618== at 0x10FB48: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== 1 errors in context 2 of 13: ==00:01:17:03.719 9618== Syscall param chown(group) contains uninitialised byte(s) ==00:01:17:03.719 9618== at 0x96F1807: chown (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== by 0x8BDBB82: chown_debug_file (debug.c:335) ==00:01:17:03.719 9618== by 0x529118D: server_setup (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x11002F: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== by 0x9628C04: (below main) (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== Uninitialised value was created by a stack allocation ==00:01:17:03.719 9618== at 0x10FB48: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== 1 errors in context 3 of 13: ==00:01:17:03.719 9618== Syscall param chown(owner) contains uninitialised byte(s) ==00:01:17:03.719 9618== at 0x96F1807: chown (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== by 0x8BDBB82: chown_debug_file (debug.c:335) ==00:01:17:03.719 9618== by 0x529118D: server_setup (in /usr/lib64/sssd/libsss_util.so) ==00:01:17:03.719 9618== by 0x11002F: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== by 0x9628C04: (below main) (in /usr/lib64/libc-2.17.so) ==00:01:17:03.719 9618== Uninitialised value was created by a stack allocation ==00:01:17:03.719 9618== at 0x10FB48: ??? (in /usr/libexec/sssd/sssd_nss) ==00:01:17:03.719 9618== ==00:01:17:03.719 9618== ERROR SUMMARY: 13 errors from 13 contexts (suppressed: 0 from 0) The valgrind output doesn't have any leaks corresponding to sized_output_name function from respond Created attachment 1417680 [details]
valgrind output from unpatched version
Created attachment 1417681 [details]
valgrind output from patched version
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1144 |