Description of problem:
=======================
glusterd appears to be leaking memory on an IDLE setup (possibly a regression).

I created a 2x2 volume on a 6-node setup (one brick each on 4 nodes, no bricks on the remaining 2) and kept the setup idle without any I/O. Resource information nevertheless shows glusterd memory consumption growing steadily.

glusterd leak: resident memory grows by about 3 MB every 2 hrs (as per top); over 16 hrs it has leaked about 22 MB.
Overall used memory (from free -h) grows by 20-25 MB every 2 hrs, i.e. about 200 MB in 16 hrs.

===== Every 2 hrs =====

############################### distrep Fri Dec 22 08:04:30 EST 2017 ###############################
## LOOP 3 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        514M        3.8G        145M        3.3G        6.5G
Swap:          2.0G          0B        2.0G

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  14328   4276 S   0.0  0.2   0:05.14 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:00.57 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:00.36 glusterfs
###############################

############################### distrep Fri Dec 22 10:04:35 EST 2017 ###############################
## LOOP 15 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        534M        3.8G        162M        3.3G        6.4G
Swap:          2.0G          0B        2.0G

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  17184   4276 S   0.0  0.2   0:16.61 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:02.25 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:01.02 glusterfs
###############################

############################### distrep Fri Dec 22 12:04:41 EST 2017 ###############################
## LOOP 27 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        556M        3.7G        186M        3.4G        6.4G
Swap:          2.0G          0B        2.0G

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  19828   4276 S   0.0  0.2   0:28.58 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:04.12 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:01.76 glusterfs
###############################

############################### distrep Fri Dec 22 14:04:47 EST 2017 ###############################
## LOOP 39 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        579M        3.7G        202M        3.4G        6.4G
Swap:          2.0G          0B        2.0G

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  22768   4276 S   0.0  0.3   0:40.59 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:06.04 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:02.65 glusterfs
###############################

======== after 12 hrs ================

############################### distrep Fri Dec 22 20:05:05 EST 2017 ###############################
## LOOP 75 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        648M        3.5G        258M        3.5G        6.2G
Swap:          2.0G          0B        2.0G

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  32376   4284 S   0.0  0.4   1:17.31 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:11.47 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:05.85 glusterfs
###############################

============= after close to 16.5 hrs ==========

[root@dhcp42-243 ~]# date
Sat Dec 23 00:47:48 EST 2017
[root@dhcp42-243 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           7.6G        704M        3.4G        306M        3.5G        6.1G
Swap:          2.0G          0B        2.0G
[root@dhcp42-243 ~]# top -n 1 -b|egrep "RES|gluster"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  37768   4284 S   0.0  0.5   1:44.60 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:15.64 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:08.34 glusterfs
[root@dhcp42-243 ~]#
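For context on how the numbers above were gathered: output of this shape comes from a simple loop around free and top. The script below is only a sketch of such a loop, not the actual script behind this report; the log path is arbitrary and the 10-minute interval is an assumption inferred from the LOOP counters advancing by 12 every 2 hrs.

#!/bin/bash
# Hypothetical monitoring loop: logs free -h output and gluster process
# RSS every 10 minutes, mirroring the "LOOP N" blocks shown above.
VOLNAME="distrep"                          # volume name taken from the log headers
LOGFILE="/var/log/gluster-mem-watch.log"   # arbitrary log path
INTERVAL=600                               # assumed sampling interval in seconds

loop=1
while true; do
    {
        echo "############################### ${VOLNAME} $(date) ###############################"
        echo "## LOOP ${loop} ####"
        free -h
        top -n 1 -b | egrep "RES|gluster"
        echo "###############################"
    } >> "${LOGFILE}"
    loop=$((loop + 1))
    sleep "${INTERVAL}"
done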
#############################################################################

The same on a 3.3.1 setup
###############################
vol present, just an iteration:

              total        used        free      shared  buff/cache   available
Mem:           7.6G        248M        7.1G        8.6M        265M        7.1G
Swap:          4.0G          0B        4.0G

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1605 root      20   0  678600  11376   4224 S   0.0  0.1   0:01.19 glusterd
 2989 root      20   0  957284  16816   3936 S   0.0  0.2   0:00.51 glusterfsd
 3009 root      20   0  678320  10948   2696 S   0.0  0.1   0:00.26 glusterfs
###############################
distrep Fri Dec 22 12:54:03 UTC 2017

===== after about 15 hrs: hardly any leak in resident memory, negligible (a few KB), with an increase of about 65 MB in virtual memory =====

[root@dhcp42-44 ~]# date;top -n 1 -b|egrep "gluster|RES"
Sat Dec 23 06:06:04 UTC 2017
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1605 root      20   0  744136  11408   4224 S   0.0  0.1   0:05.77 glusterd
 2989 root      20   0 1022820  16816   3936 S   0.0  0.2   0:13.11 glusterfsd
 3009 root      20   0  678320  11000   2736 S   0.0  0.1   0:04.12 glusterfs
[root@dhcp42-44 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           7.6G        277M        7.0G        8.7M        414M        7.1G
Swap:          4.0G          0B        4.0G
[root@dhcp42-44 ~]#
Leaks in:
  gf_common_mt_gf_timer_t
  gf_common_mt_asprintf
  gf_common_mt_strdup
  gf_common_mt_char
  gf_common_mt_socket_private_t
  gf_common_mt_rpcsvc_wrapper_t
  gf_common_mt_rpc_trans_t

Details of the comparison between two statedumps, with one gluster get-state command run in between, below:

[mgmt/glusterd.management - usage-type gf_common_mt_gf_timer_t memusage]
size=192
num_allocs=3
max_size=384
max_num_allocs=6
total_allocs=434240

vs

[mgmt/glusterd.management - usage-type gf_common_mt_gf_timer_t memusage]
size=192
num_allocs=3
max_size=384
max_num_allocs=6
total_allocs=434245

[mgmt/glusterd.management - usage-type gf_common_mt_asprintf memusage]
size=95942
num_allocs=10204
max_size=95975
max_num_allocs=10205
total_allocs=13049482

vs

[mgmt/glusterd.management - usage-type gf_common_mt_asprintf memusage]
size=96249
num_allocs=10237
max_size=96282
max_num_allocs=10238
total_allocs=13049519

[mgmt/glusterd.management - usage-type gf_common_mt_strdup memusage]
size=7056002
num_allocs=450809
max_size=7056020
max_num_allocs=450810
total_allocs=6742877

vs

[mgmt/glusterd.management - usage-type gf_common_mt_strdup memusage]
size=7058259
num_allocs=450870
max_size=7058277
max_num_allocs=450871
total_allocs=6742951

[mgmt/glusterd.management - usage-type gf_common_mt_char memusage]
size=49290
num_allocs=849
max_size=49367
max_num_allocs=1807
total_allocs=62985416

vs

[mgmt/glusterd.management - usage-type gf_common_mt_char memusage]
size=49410
num_allocs=850
max_size=49487
max_num_allocs=1807
total_allocs=62985424

[mgmt/glusterd.management - usage-type gf_common_mt_socket_private_t memusage]
size=69360
num_allocs=102
max_size=80240
max_num_allocs=118
total_allocs=16222

vs

[mgmt/glusterd.management - usage-type gf_common_mt_socket_private_t memusage]
size=69360
num_allocs=102
max_size=80240
max_num_allocs=118
total_allocs=16223

[mgmt/glusterd.management - usage-type gf_common_mt_rpcsvc_wrapper_t memusage]
size=64
num_allocs=2
max_size=96
max_num_allocs=3
total_allocs=16122

vs

[mgmt/glusterd.management - usage-type gf_common_mt_rpcsvc_wrapper_t memusage]
size=64
num_allocs=2
max_size=96
max_num_allocs=3
total_allocs=16123

[mgmt/glusterd.management - usage-type gf_common_mt_rpc_trans_t memusage]
size=129744
num_allocs=102
max_size=150096
max_num_allocs=118
total_allocs=16223

vs

[mgmt/glusterd.management - usage-type gf_common_mt_rpc_trans_t memusage]
size=129744
num_allocs=102
max_size=150096
max_num_allocs=118
total_allocs=16224
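For anyone trying to reproduce this comparison: glusterd writes a statedump containing these per-type memusage sections when it receives SIGUSR1. The sketch below shows one way to capture two dumps around a single gluster get-state run and diff the accounting sections; the default /var/run/gluster dump directory, the file-name pattern, and the grep pattern are assumptions on my side, not details from the original comparison.

#!/bin/bash
# Rough sketch: take a glusterd statedump, run gluster get-state once,
# take another statedump, then diff the gf_common_mt_* memusage sections.
DUMPDIR="/var/run/gluster"      # default statedump directory (assumed)
PID=$(pidof glusterd)

kill -USR1 "${PID}"             # SIGUSR1 asks glusterd to write a statedump
sleep 2
BEFORE=$(ls -t ${DUMPDIR}/glusterdump.${PID}.dump.* | head -1)

gluster get-state               # the operation suspected of leaking

kill -USR1 "${PID}"
sleep 2
AFTER=$(ls -t ${DUMPDIR}/glusterdump.${PID}.dump.* | head -1)

# Each memusage header is followed by size/num_allocs/max_size/
# max_num_allocs/total_allocs, hence -A5.
diff <(grep -A5 "usage-type gf_common_mt" "${BEFORE}") \
     <(grep -A5 "usage-type gf_common_mt" "${AFTER}")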
Have posted one patch https://review.gluster.org/19139 which reduces the leak by quite a margin. But we still have a small leak identified.
(In reply to Atin Mukherjee from comment #8)
> Have posted one patch https://review.gluster.org/19139 which reduces the
> leak by quite a margin. But we still have a small leak identified.

s/identified/unidentified
Build: 3.12.2-7

Created about 100 volumes with brick multiplexing enabled and ran gluster get-state every 10 seconds for 13 hours, occasionally executing gluster vol profile 2cross33_99 start and info. glusterd resident memory increased by only about 3 MB over the 13 hours. Since there is no significant growth in glusterd memory, marking this as verified.
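For reference, the verification workload described above can be approximated with a loop like the one below. This is only an illustrative sketch: the volume name 2cross33_99 is the one mentioned in this comment, while the duration, log path, RSS sampling frequency, and profiling cadence are arbitrary choices of mine.

#!/bin/bash
# Illustrative verification loop: gluster get-state every 10 seconds for
# ~13 hours, with occasional volume profiling and periodic RSS sampling.
VOLNAME="2cross33_99"
LOGFILE="/tmp/glusterd-verify.log"          # arbitrary log path
END=$(( $(date +%s) + 13*3600 ))            # run for roughly 13 hours

gluster volume profile "${VOLNAME}" start

i=0
while [ "$(date +%s)" -lt "${END}" ]; do
    gluster get-state > /dev/null
    if [ $((i % 60)) -eq 0 ]; then          # roughly every 10 minutes
        gluster volume profile "${VOLNAME}" info > /dev/null
        { date; top -n 1 -b | egrep "RES|gluster"; } >> "${LOGFILE}"
    fi
    i=$((i + 1))
    sleep 10
done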
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607