Bug 1580352 - Glusterd memory leaking in gf_gld_mt_linebuf
Summary: Glusterd memory leaking in gf_gld_mt_linebuf
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Sanju
QA Contact:
URL:
Whiteboard:
Depends On: 1575539
Blocks: 1611110
 
Reported: 2018-05-21 10:49 UTC by Sanju
Modified: 2018-10-23 15:09 UTC
CC: 12 users

Fixed In Version: glusterfs-5.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1575539
: 1611110 (view as bug list)
Environment:
Last Closed: 2018-10-23 15:09:43 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3465301 0 None None None 2018-06-29 08:27:26 UTC

Comment 1 Worker Ant 2018-05-21 10:53:09 UTC
REVIEW: https://review.gluster.org/20046 (glusterd: memory leak in geo-rep status) posted (#1) for review on master by Sanju Rakonde

Comment 3 Kotresh HR 2018-05-21 12:24:09 UTC
Description of problem:

Four-node cluster.

On all four hosts, the glusterd process is leaking memory.

Looking at the ps output, the resident set size of the process is 1.6 GB on the QA nodes
==> here the process is consuming 1.6 GB and taking nearly 14% of memory:

- sosreport glusteredc1fs2uq.owfg.com

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     20590  0.5  6.2 1681632 753284 ?      Ssl  Apr27  41:38 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

- sosreport glusterldc1fs1up.owfg.com

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     18199  1.8  7.5 1999648 1230012 ?     Ssl  May01  20:14 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/9920bccf2a4c92d44d9f991404c5765d.socket

- sosreport glusterldc1fs2up.owfg.com

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      9102  3.5 36.7 8990320 5975468 ?     Ssl  Feb14 3927:15 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

==> On this gluster node, glusterd is using 36.7% of memory, nearly 6 GB resident.

Node glusteredc1fs1uq.owfg.com

glusterdump.1573.dump.1525458209

Looking at the highest memory size:

[mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
size=909495296 --> 909 MB leaked here
num_allocs=888179
max_size=909495296
max_num_allocs=888179
total_allocs=888179

[mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
size=170826728  --> 170 MB leaked here
num_allocs=1607174
max_size=170839680
max_num_allocs=1607398
total_allocs=80039329

Same thing for the second iteration - the same structures keep growing: 

glusterdump.1573.dump.1525466693


[mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
size=919127040  --> On this second iteration we have 919 MB
num_allocs=897585
max_size=919127040
max_num_allocs=897585
total_allocs=897585

[mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
size=172089816
num_allocs=1619086
max_size=172099544
max_num_allocs=1619240
total_allocs=80707302

The same pattern appears on node glusteredc1fs2uq.owfg.com

glusterdump.20590.dump.1525458352


[mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
size=476495872
num_allocs=465328
max_size=476495872  --> 476 MB leaked here
max_num_allocs=465328
total_allocs=465328

[mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
size=70239188  --> 70 MB here
num_allocs=627665
max_size=70284104
max_num_allocs=628212
total_allocs=86062168

glusterdump.20590.dump.1525466708


[mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
size=485989376
num_allocs=474599
max_size=485989376  --> On the second iteration, the memory has increased to 485 MB
max_num_allocs=474599
total_allocs=474599

[mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
size=71332824
num_allocs=637632
max_size=71335796
max_num_allocs=637669
total_allocs=87385904

The only place where I can find such an allocation is in the geo-replication code:

https://github.com/gluster/glusterfs/blob/master/xlators/mgmt/glusterd/src/glusterd-geo-rep.c

Specifically, in glusterd_urltransform():

...

        for (;;) {
                size_t len;
                line = GF_MALLOC (1024, gf_gld_mt_linebuf);
                if (!line) {
                        error = _gf_true;
                        goto out;
                }


...

I believe this is caused by geo-replication. Further assistance from engineering is required to understand the source of this memory leak.

Comment 4 Worker Ant 2018-05-28 02:43:52 UTC
COMMIT: https://review.gluster.org/20046 committed in master by "Amar Tumballi" <amarts> with a commit message: glusterd: memory leak in geo-rep status

Fixes: bz#1580352

Change-Id: I9648e73090f5a2edbac663a6fb49acdb702cdc49
Signed-off-by: Sanju Rakonde <srakonde>

Comment 5 Shyamsundar 2018-10-23 15:09:43 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/

