Bug 1528733

Summary: memory leak: get-state leaking memory in small amounts
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: glusterd
Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA
QA Contact: Bala Konda Reddy M <bmekala>
Severity: medium
Docs Contact:
Priority: medium
Version: rhgs-3.3
CC: amukherj, nchilaka, rhinduja, rhs-bugs, sheggodu, storage-qa-internal, vbellur
Target Milestone: ---
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.12.2-6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1531149 (view as bug list)
Environment:
Last Closed: 2018-09-04 06:40:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1531149, 1532475
Bug Blocks: 1503137

Description Nag Pavan Chilakam 2017-12-23 06:10:42 UTC
Description of problem:
=======================
glusterd appears to be leaking memory on an idle setup (possibly a regression).
I created a 2x2 volume on a 6-node setup (one brick each on 4 nodes; the remaining 2 nodes host no bricks).

I kept the setup idle without any I/O, yet resource monitoring shows glusterd's memory usage steadily growing.

glusterd leak (resident memory, as per top):
after every 2 hrs: about 3 MB
over 16 hrs: about 22 MB

Overall system usage (from free -h):
after every 2 hrs: 20-25 MB
over 16 hrs: about 200 MB



=====Every 2 hrs=====
##############################

distrep
Fri Dec 22 08:04:30 EST 2017
###############################
## LOOP 3 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        514M        3.8G        145M        3.3G        6.5G
Swap:          2.0G          0B        2.0G
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  14328   4276 S   0.0  0.2   0:05.14 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:00.57 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:00.36 glusterfs
###############################


###############################
distrep
Fri Dec 22 10:04:35 EST 2017
###############################
## LOOP 15 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        534M        3.8G        162M        3.3G        6.4G
Swap:          2.0G          0B        2.0G
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  17184   4276 S   0.0  0.2   0:16.61 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:02.25 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:01.02 glusterfs


###############################
distrep
Fri Dec 22 12:04:41 EST 2017
###############################
## LOOP 27 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        556M        3.7G        186M        3.4G        6.4G
Swap:          2.0G          0B        2.0G
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  19828   4276 S   0.0  0.2   0:28.58 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:04.12 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:01.76 glusterfs
###############################


distrep
Fri Dec 22 14:04:47 EST 2017
###############################
## LOOP 39 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        579M        3.7G        202M        3.4G        6.4G
Swap:          2.0G          0B        2.0G
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  22768   4276 S   0.0  0.3   0:40.59 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:06.04 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:02.65 glusterfs
###############################


======== after 12 hrs ================
distrep
Fri Dec 22 20:05:05 EST 2017
###############################
## LOOP 75 ####
              total        used        free      shared  buff/cache   available
Mem:           7.6G        648M        3.5G        258M        3.5G        6.2G
Swap:          2.0G          0B        2.0G
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  32376   4284 S   0.0  0.4   1:17.31 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:11.47 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:05.85 glusterfs
###############################


=============  after close to 16.5 hrs ==========
[root@dhcp42-243 ~]# date
Sat Dec 23 00:47:48 EST 2017
[root@dhcp42-243 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           7.6G        704M        3.4G        306M        3.5G        6.1G
Swap:          2.0G          0B        2.0G
[root@dhcp42-243 ~]# top -n 1 -b|egrep "RES|gluster"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20246 root      20   0  604876  37768   4284 S   0.0  0.5   1:44.60 glusterd
20799 root      20   0 1022824  20740   3936 S   0.0  0.3   0:15.64 glusterfsd
20819 root      20   0  678320  14856   2756 S   0.0  0.2   0:08.34 glusterfs
[root@dhcp42-243 ~]# 
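
(For reference, the per-loop captures above can be gathered with a small monitoring script along the lines of the sketch below. This is an assumed reconstruction, not the exact script used; the volume name, sampling interval and log path are placeholders.)

#!/bin/bash
# Hypothetical monitoring loop: every 10 minutes, record free -h output and
# the gluster processes' memory columns from top, tagged with a loop counter.
VOLNAME="distrep"        # volume under observation (placeholder)
INTERVAL=600             # seconds between samples (placeholder)
LOG=/var/tmp/gluster-mem.log
loop=1
while true; do
    {
        echo "###############################"
        echo "$VOLNAME"
        date
        echo "###############################"
        echo "## LOOP $loop ####"
        free -h
        top -n 1 -b | egrep "RES|gluster"
        echo
    } >> "$LOG"
    loop=$((loop + 1))
    sleep "$INTERVAL"
done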


#############################################################################################################################

For comparison, the same test on a 3.3.1 setup:

###############################
volume present, single-iteration snapshot
              total        used        free      shared  buff/cache   available
Mem:           7.6G        248M        7.1G        8.6M        265M        7.1G
Swap:          4.0G          0B        4.0G
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1605 root      20   0  678600  11376   4224 S   0.0  0.1   0:01.19 glusterd
 2989 root      20   0  957284  16816   3936 S   0.0  0.2   0:00.51 glusterfsd
 3009 root      20   0  678320  10948   2696 S   0.0  0.1   0:00.26 glusterfs
###############################
distrep
Fri Dec 22 12:54:03 UTC 2017

===== after about 15 hrs: hardly any leak in resident memory (negligible, only a few KB); virtual memory increased by about 65 MB =====

[root@dhcp42-44 ~]# date;top -n 1 -b|egrep "gluster|RES"
Sat Dec 23 06:06:04 UTC 2017
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1605 root      20   0  744136  11408   4224 S   0.0  0.1   0:05.77 glusterd
 2989 root      20   0 1022820  16816   3936 S   0.0  0.2   0:13.11 glusterfsd
 3009 root      20   0  678320  11000   2736 S   0.0  0.1   0:04.12 glusterfs
[root@dhcp42-44 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           7.6G        277M        7.0G        8.7M        414M        7.1G
Swap:          4.0G          0B        4.0G
[root@dhcp42-44 ~]#

Comment 7 Atin Mukherjee 2017-12-29 04:12:01 UTC
Leaks in:

gf_common_mt_gf_timer_t 
gf_common_mt_asprintf
gf_common_mt_strdup
gf_common_mt_char
gf_common_mt_socket_private_t
gf_common_mt_rpcsvc_wrapper_t
gf_common_mt_rpc_trans_t

Details below compare two statedumps taken before and after a single gluster get-state run:

[mgmt/glusterd.management - usage-type gf_common_mt_gf_timer_t memusage]
size=192
num_allocs=3
max_size=384
max_num_allocs=6
total_allocs=434240


vs

[mgmt/glusterd.management - usage-type gf_common_mt_gf_timer_t memusage]
size=192
num_allocs=3
max_size=384
max_num_allocs=6
total_allocs=434245


[mgmt/glusterd.management - usage-type gf_common_mt_asprintf memusage]
size=95942
num_allocs=10204
max_size=95975
max_num_allocs=10205
total_allocs=13049482


vs

[mgmt/glusterd.management - usage-type gf_common_mt_asprintf memusage]
size=96249
num_allocs=10237
max_size=96282
max_num_allocs=10238
total_allocs=13049519


[mgmt/glusterd.management - usage-type gf_common_mt_strdup memusage]
size=7056002
num_allocs=450809
max_size=7056020
max_num_allocs=450810
total_allocs=6742877


vs

[mgmt/glusterd.management - usage-type gf_common_mt_strdup memusage]
size=7058259
num_allocs=450870
max_size=7058277
max_num_allocs=450871
total_allocs=6742951


[mgmt/glusterd.management - usage-type gf_common_mt_char memusage]
size=49290
num_allocs=849
max_size=49367
max_num_allocs=1807
total_allocs=62985416


vs

[mgmt/glusterd.management - usage-type gf_common_mt_char memusage]
size=49410
num_allocs=850
max_size=49487
max_num_allocs=1807
total_allocs=62985424


[mgmt/glusterd.management - usage-type gf_common_mt_socket_private_t memusage]
size=69360
num_allocs=102
max_size=80240
max_num_allocs=118
total_allocs=16222

vs

[mgmt/glusterd.management - usage-type gf_common_mt_socket_private_t memusage]
size=69360
num_allocs=102
max_size=80240
max_num_allocs=118
total_allocs=16223


[mgmt/glusterd.management - usage-type gf_common_mt_rpcsvc_wrapper_t memusage]
size=64
num_allocs=2
max_size=96
max_num_allocs=3
total_allocs=16122


vs

[mgmt/glusterd.management - usage-type gf_common_mt_rpcsvc_wrapper_t memusage]
size=64
num_allocs=2
max_size=96
max_num_allocs=3
total_allocs=16123


[mgmt/glusterd.management - usage-type gf_common_mt_rpc_trans_t memusage]
size=129744
num_allocs=102
max_size=150096
max_num_allocs=118
total_allocs=16223


vs

[mgmt/glusterd.management - usage-type gf_common_mt_rpc_trans_t memusage]
size=129744
num_allocs=102
max_size=150096
max_num_allocs=118
total_allocs=16224
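
(The comparison above can be reproduced with a procedure along the lines of the sketch below. This is a hedged sketch, not the exact commands used; the statedump path and file naming are the assumed defaults.)

# Take a glusterd statedump, run get-state once, take a second dump,
# then compare the per-type memusage counters between the two dumps.
kill -SIGUSR1 $(pidof glusterd)     # first statedump
sleep 2
gluster get-state                   # the operation suspected of leaking
sleep 2
kill -SIGUSR1 $(pidof glusterd)     # second statedump
# Dumps are expected under /var/run/gluster as glusterdump.<pid>.dump.<timestamp>
cd /var/run/gluster
grep -A 5 "usage-type gf_common_mt_strdup memusage" glusterdump.*.dump.*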

Comment 8 Atin Mukherjee 2018-01-04 16:47:22 UTC
Have posted one patch https://review.gluster.org/19139 which reduces the leak by quite a margin. But we still have a small leak identified.

Comment 9 Atin Mukherjee 2018-01-04 16:47:52 UTC
(In reply to Atin Mukherjee from comment #8)
> Have posted one patch https://review.gluster.org/19139 which reduces the
> leak by quite a margin. But we still have a small leak identified.

s/identified/unidentified

Comment 14 Bala Konda Reddy M 2018-04-20 08:59:50 UTC
Build : 3.12.2-7

Created about 100 volumes with brick multiplexing enabled.

Ran gluster get-state every 10 seconds for 13 hours.
Once in a while, executed gluster vol profile 2cross33_99 start and info.

There is an increase of only about 3 MB over 13 hours, so there isn't any significant growth in glusterd memory.

Hence marking it as verified.
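
(A rough sketch of the verification workload described above, under the assumption that it was driven by a simple loop; the volume name is the one from this comment, the rest is illustrative.)

#!/bin/bash
# Run gluster get-state every 10 seconds for roughly 13 hours, discarding the output.
END=$((SECONDS + 13 * 3600))
while [ "$SECONDS" -lt "$END" ]; do
    gluster get-state > /dev/null
    sleep 10
done
# Once in a while, from another shell:
#   gluster volume profile 2cross33_99 start
#   gluster volume profile 2cross33_99 info
# Track glusterd resident memory before and after the run:
#   top -n 1 -b | egrep "RES|glusterd"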

Comment 16 errata-xmlrpc 2018-09-04 06:40:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607