Bug 1321995

Summary: memory leak in gnome-settings-daemon
Product: Red Hat Enterprise Linux 7
Reporter: Joe Wright <jwright>
Component: gnome-settings-daemon
Assignee: Rui Matos <rmatos>
Status: CLOSED WONTFIX
QA Contact: Desktop QE <desktop-qa-list>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.2
CC: cww, jwright, tpelka
Target Milestone: rc
Keywords: Desktop, Performance
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-15 22:46:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
- memory usage chart over time (flags: none)
- valgrind data (flags: none)

Description Joe Wright 2016-03-29 14:22:22 UTC
Created attachment 1141277
memory usage chart over time

Description of problem:
- memory leak observed for gnome-settings-daemon

Version-Release number of selected component (if applicable):
- current release

How reproducible:
- run gnome

Steps to Reproduce:
1.
2.
3.

Actual results:
- slow memory leak

Expected results:


Additional info:
- See attachments showing slow memory usage increase

Comment 2 Bastien Nocera 2016-04-05 09:59:16 UTC
After installing the debuginfo packages for gnome-settings-daemon with debuginfo-install, run gnome-settings-daemon under valgrind:
valgrind /usr/libexec/gnome-settings-daemon --replace

Let it run for a little while (judging from the logs, about 20 minutes should be enough to capture part of the leak), then run:
/usr/libexec/gnome-settings-daemon --replace

This makes the original gnome-settings-daemon (the one running under valgrind) stop, so that valgrind reports all the leaked memory.
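
The two steps could be scripted roughly as follows. This is only a sketch: the `--leak-check=full`, `--num-callers` and `--log-file` valgrind options are additions that the comment does not mention, and the function names and the log path are hypothetical.

```python
import subprocess
import time

DAEMON = "/usr/libexec/gnome-settings-daemon"


def valgrind_command(log_file):
    """Build the valgrind invocation; the extra options are assumptions."""
    return [
        "valgrind",
        "--leak-check=full",       # report each leaked block with a stack trace
        "--num-callers=30",        # record deeper stacks than the default
        "--log-file=" + log_file,  # keep the report out of the session log
        DAEMON,
        "--replace",
    ]


def capture_leaks(log_file, minutes=20):
    """Run the daemon under valgrind, then replace it to trigger the report."""
    traced = subprocess.Popen(valgrind_command(log_file))
    time.sleep(minutes * 60)  # give the leak time to accumulate
    # Starting a second instance with --replace makes the traced one exit.
    fresh = subprocess.Popen([DAEMON, "--replace"])
    traced.wait()  # valgrind writes its leak summary when the process exits
    return fresh
```

Calling `capture_leaks("/tmp/gsd-valgrind.log")` from within the desktop session would leave the report in that file and the untraced daemon running.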

Comment 3 Joe Wright 2016-05-31 19:26:42 UTC
Created attachment 1163353
valgrind data

see attached

Comment 4 Bastien Nocera 2016-06-03 14:10:46 UTC
These two:

==13846== 159,170 (32,736 direct, 126,434 indirect) bytes in 22 blocks are definitely lost in loss record 14,329 of 14,331
==13846==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==13846==    by 0x77D225E: g_malloc (gmem.c:97)
==13846==    by 0x77BC632: g_hash_table_get_keys_as_array (ghash.c:1703)
==13846==    by 0x66C6D26: g_settings_schema_source_list_schemas (gsettingsschema.c:803)
==13846==    by 0x4049E1: is_schema (gnome-settings-manager.c:208)
==13846==    by 0x4049E1: _load_file (gnome-settings-manager.c:261)
==13846==    by 0x4049E1: _load_dir (gnome-settings-manager.c:314)
==13846==    by 0x4049E1: _load_all (gnome-settings-manager.c:330)
==13846==    by 0x4049E1: gnome_settings_manager_start (gnome-settings-manager.c:424)
==13846==    by 0x403F4A: start_settings_manager (main.c:149)
==13846==    by 0x403F4A: name_acquired_handler (main.c:304)
==13846==    by 0x66E5365: do_call (gdbusnameowning.c:215)
==13846==    by 0x66E5597: request_name_cb (gdbusnameowning.c:327)
==13846==    by 0x6681F46: g_simple_async_result_complete (gsimpleasyncresult.c:763)
==13846==    by 0x66DD321: g_dbus_connection_call_done (gdbusconnection.c:5497)
==13846==    by 0x6681F46: g_simple_async_result_complete (gsimpleasyncresult.c:763)
==13846==    by 0x6681FA8: complete_in_idle_cb (gsimpleasyncresult.c:775)

and

==16821== 15,334 (2,640 direct, 12,694 indirect) bytes in 22 blocks are definitely lost in loss record 11,293 of 11,321
==16821==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16821==    by 0x77D225E: g_malloc (gmem.c:97)
==16821==    by 0x77BC632: g_hash_table_get_keys_as_array (ghash.c:1703)
==16821==    by 0x66C6D4A: g_settings_schema_source_list_schemas (gsettingsschema.c:809)
==16821==    by 0x4049E1: is_schema (gnome-settings-manager.c:208)
==16821==    by 0x4049E1: _load_file (gnome-settings-manager.c:261)
==16821==    by 0x4049E1: _load_dir (gnome-settings-manager.c:314)
==16821==    by 0x4049E1: _load_all (gnome-settings-manager.c:330)
==16821==    by 0x4049E1: gnome_settings_manager_start (gnome-settings-manager.c:424)
==16821==    by 0x403F4A: start_settings_manager (main.c:149)
==16821==    by 0x403F4A: name_acquired_handler (main.c:304)
==16821==    by 0x66E5365: do_call (gdbusnameowning.c:215)
==16821==    by 0x66E5597: request_name_cb (gdbusnameowning.c:327)
==16821==    by 0x6681F46: g_simple_async_result_complete (gsimpleasyncresult.c:763)
==16821==    by 0x66DD321: g_dbus_connection_call_done (gdbusconnection.c:5497)
==16821==    by 0x6681F46: g_simple_async_result_complete (gsimpleasyncresult.c:763)
==16821==    by 0x6681FA8: complete_in_idle_cb (gsimpleasyncresult.c:775)

are https://bugzilla.gnome.org/show_bug.cgi?id=754681, but they're one-time leaks (at startup), so not our problem.

These look like leaks inside the CUPS library:

==13846== 8,368 bytes in 1 blocks are definitely lost in loss record 14,257 of 14,331
==13846==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==13846==    by 0x233508C5: cupsFileOpenFd (file.c:1220)
==13846==    by 0x23350B92: cupsFileOpen (file.c:1170)
==13846==    by 0x233785A3: _cupsSetDefaults (usersys.c:836)
==13846==    by 0x233787BA: cupsServer (usersys.c:183)
==13846==    by 0x23126556: gsd_print_notifications_manager_start_idle (gsd-print-notifications-manager.c:1324)
==13846==    by 0x77CC799: g_main_dispatch (gmain.c:3109)
==13846==    by 0x77CC799: g_main_context_dispatch (gmain.c:3708)
==13846==    by 0x77CCAE7: g_main_context_iterate.isra.24 (gmain.c:3779)
==13846==    by 0x77CCDB9: g_main_loop_run (gmain.c:3973)
==13846==    by 0x5A8A044: gtk_main (gtkmain.c:1207)
==13846==    by 0x4037C0: main (main.c:427)
==13846== 
==13846== 8,368 bytes in 1 blocks are definitely lost in loss record 14,258 of 14,331
==13846==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==13846==    by 0x233508C5: cupsFileOpenFd (file.c:1220)
==13846==    by 0x23350B92: cupsFileOpen (file.c:1170)
==13846==    by 0x233785A3: _cupsSetDefaults (usersys.c:836)
==13846==    by 0x23378779: cupsEncryption (usersys.c:100)
==13846==    by 0x23371D37: _cupsConnect (request.c:1034)
==13846==    by 0x23372BC6: cupsDoIORequest (request.c:156)
==13846==    by 0x233480F2: _cupsGetDests (dest.c:1467)
==13846==    by 0x2334925B: cupsGetDests2 (dest.c:1688)
==13846==    by 0x231265A4: gsd_print_notifications_manager_start_idle (gsd-print-notifications-manager.c:1325)
==13846==    by 0x77CC799: g_main_dispatch (gmain.c:3109)
==13846==    by 0x77CC799: g_main_context_dispatch (gmain.c:3708)
==13846==    by 0x77CCAE7: g_main_context_iterate.isra.24 (gmain.c:3779)
==13846== 
==13846== 8,368 bytes in 1 blocks are definitely lost in loss record 14,259 of 14,331
==13846==    at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==13846==    by 0x233508C5: cupsFileOpenFd (file.c:1220)
==13846==    by 0x23350B92: cupsFileOpen (file.c:1170)
==13846==    by 0x233785A3: _cupsSetDefaults (usersys.c:836)
==13846==    by 0x23378779: cupsEncryption (usersys.c:100)
==13846==    by 0x231248DF: renew_subscription (gsd-print-notifications-manager.c:1076)
==13846==    by 0x231265D4: renew_subscription_timeout_enable (gsd-print-notifications-manager.c:1192)
==13846==    by 0x231265D4: gsd_print_notifications_manager_start_idle (gsd-print-notifications-manager.c:1328)
==13846==    by 0x77CC799: g_main_dispatch (gmain.c:3109)
==13846==    by 0x77CC799: g_main_context_dispatch (gmain.c:3708)
==13846==    by 0x77CCAE7: g_main_context_iterate.isra.24 (gmain.c:3779)
==13846==    by 0x77CCDB9: g_main_loop_run (gmain.c:3973)
==13846==    by 0x5A8A044: gtk_main (gtkmain.c:1207)
==13846==    by 0x4037C0: main (main.c:427)

Please file a separate bug against CUPS for those. If the problem is in the gnome-settings-daemon plugin, the bug will get reassigned.

This one looks like a bug in gnome-desktop:

==13846== 4,064 (3,840 direct, 224 indirect) bytes in 48 blocks are definitely lost in loss record 14,166 of 14,331
==13846==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==13846==    by 0x77D225E: g_malloc (gmem.c:97)
==13846==    by 0x77E845D: g_slice_alloc (gslice.c:1007)
==13846==    by 0x7802402: g_variant_iter_new (gvariant.c:2912)
==13846==    by 0x780430B: g_variant_valist_get_nnp (gvariant.c:4734)
==13846==    by 0x780430B: g_variant_valist_get_leaf (gvariant.c:4913)
==13846==    by 0x780430B: g_variant_valist_get (gvariant.c:5094)
==13846==    by 0x78041C2: g_variant_valist_get (gvariant.c:5129)
==13846==    by 0x7805199: g_variant_get_va (gvariant.c:5356)
==13846==    by 0x7805581: g_variant_get (gvariant.c:5303)
==13846==    by 0x56774A4: crtc_initialize (gnome-rr.c:1825)
==13846==    by 0x56774A4: fill_screen_info_from_resources (gnome-rr.c:399)
==13846==    by 0x56774A4: fill_out_screen_info (gnome-rr.c:439)
==13846==    by 0x56774A4: screen_info_new (gnome-rr.c:456)
==13846==    by 0x5677BBE: on_proxy_acquired (gnome-rr.c:624)
==13846==    by 0x6681F46: g_simple_async_result_complete (gsimpleasyncresult.c:763)
==13846==    by 0x6681FA8: complete_in_idle_cb (gsimpleasyncresult.c:775)

Again, please file a separate bug against the gnome-desktop3 package.

There are also a fair number of gdbus and dbus-glib calls without cleanups, but there's not enough data to find the root causes.
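
With more than 14,000 loss records in a log like this one, a rough way to narrow things down is to total the "definitely lost" bytes per originating function. The following is a minimal sketch, not a tool used in this report: the function name `lost_bytes_by_frame` and the idea of matching frames by a source-file prefix are assumptions, and the regular expressions only cover the record layout quoted above.

```python
import re
from collections import Counter

# Header of a "definitely lost" record, with or without the direct/indirect split.
HEADER = re.compile(
    r"^==\d+== ([\d,]+) (?:\([\d,]+ direct, [\d,]+ indirect\) )?"
    r"bytes in ([\d,]+) blocks are definitely lost"
)
# One stack frame: "at"/"by", an address, the function, then "(file:line)" or "(in lib)".
FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \((.*)\)$")


def lost_bytes_by_frame(lines, source_prefix):
    """Sum definitely-lost bytes per innermost frame whose source file starts with source_prefix."""
    totals = Counter()
    pending = None  # size of the current record, until a matching frame is found
    for line in lines:
        header = HEADER.match(line)
        if header:
            pending = int(header.group(1).replace(",", ""))
            continue
        frame = FRAME.match(line)
        if frame and pending is not None and frame.group(2).startswith(source_prefix):
            totals[frame.group(1)] += pending
            pending = None  # count each record once, at its innermost matching frame
    return totals
```

For example, `lost_bytes_by_frame(open("valgrind.log"), "gsd-").most_common(10)` would list the ten gnome-settings-daemon plugin functions with the most leaked bytes beneath them (the log file name is a placeholder).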

I don't think the leaking function, whichever it is, had time to run in your tests. You might want to try disabling the "housekeeping" plugin with:
gsettings set org.gnome.settings-daemon.plugins.housekeeping active false
and see whether you can reproduce the problem.
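
To judge whether the problem still reproduces with the plugin disabled, the daemon's resident memory could be sampled at intervals and compared with the earlier chart. This is a minimal sketch assuming a Linux `/proc` filesystem; the function names are hypothetical and nothing like this was used in the report.

```python
import time


def parse_vmrss_kib(status_text):
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def sample_rss(pid, samples, interval_s):
    """Collect (timestamp, rss_kib) pairs for the given process."""
    readings = []
    for _ in range(samples):
        with open("/proc/%d/status" % pid) as status:
            readings.append((time.time(), parse_vmrss_kib(status.read())))
        time.sleep(interval_s)
    return readings
```

A steadily rising series from `sample_rss` with the plugin enabled, and a flat one with it disabled, would point at the housekeeping plugin.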