Bug 1580352
| Summary: | Glusterd memory leaking in gf_gld_mt_linebuf | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Sanju <srakonde> |
| Component: | glusterd | Assignee: | Sanju <srakonde> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | mainline | CC: | amukherj, bmekala, bugs, khiremat, nravinas, rhs-bugs, sankarshan, sheggodu, srakonde, storage-qa-internal, vbellur, vdas |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-5.0 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1575539 | | |
| | 1611110 (view as bug list) | Environment: | |
| Last Closed: | 2018-10-23 15:09:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1575539 | | |
| Bug Blocks: | 1611110 | | |
Comment 1
Worker Ant
2018-05-21 10:53:09 UTC
Description of problem:

Four-node cluster. On all four hosts the glusterd process is leaking memory. From the ps output, the resident set size of the process reaches 1.6 GB on the QA nodes, where it consumes nearly 14% of memory:

- sosreport glusteredc1fs2uq.owfg.com

    USER  PID   %CPU %MEM VSZ     RSS    TTY STAT START TIME  COMMAND
    root  20590 0.5  6.2  1681632 753284 ?   Ssl  Apr27 41:38 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

- sosreport glusterldc1fs1up.owfg.com

    USER  PID   %CPU %MEM VSZ     RSS     TTY STAT START TIME  COMMAND
    root  18199 1.8  7.5  1999648 1230012 ?   Ssl  May01 20:14 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/9920bccf2a4c92d44d9f991404c5765d.socket

- sosreport glusterldc1fs2up.owfg.com

    USER  PID   %CPU %MEM VSZ     RSS     TTY STAT START TIME    COMMAND
    root  9102  3.5  36.7 8990320 5975468 ?   Ssl  Feb14 3927:15 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

==> On this gluster node, glusterd is taking 36% of memory and consuming nearly 6 GB.

Node glusteredc1fs1uq.owfg.com, glusterdump.1573.dump.1525458209. Looking at the highest memory usage:

    [mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
    size=909495296          --> 909 MB leaked here
    num_allocs=888179
    max_size=909495296
    max_num_allocs=888179
    total_allocs=888179

    [mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
    size=170826728          --> 170 MB leaked here
    num_allocs=1607174
    max_size=170839680
    max_num_allocs=1607398
    total_allocs=80039329

Same thing for the second iteration - the same structures keep growing. glusterdump.1573.dump.1525466693:

    [mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
    size=919127040          --> on this second iteration we have 919 MB
    num_allocs=897585
    max_size=919127040
    max_num_allocs=897585
    total_allocs=897585

    [mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
    size=172089816
    num_allocs=1619086
    max_size=172099544
    max_num_allocs=1619240
    total_allocs=80707302

Identical results for node glusteredc1fs2uq.owfg.com. glusterdump.20590.dump.1525458352:

    [mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
    size=476495872          --> 476 MB leaked here
    num_allocs=465328
    max_size=476495872
    max_num_allocs=465328
    total_allocs=465328

    [mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
    size=70239188           --> 70 MB here
    num_allocs=627665
    max_size=70284104
    max_num_allocs=628212
    total_allocs=86062168

glusterdump.20590.dump.1525466708:

    [mgmt/glusterd.management - usage-type gf_gld_mt_linebuf memusage]
    size=485989376          --> on the second iteration the memory has increased to 485 MB
    num_allocs=474599
    max_size=485989376
    max_num_allocs=474599
    total_allocs=474599

    [mgmt/glusterd.management - usage-type gf_common_mt_mem_pool memusage]
    size=71332824
    num_allocs=637632
    max_size=71335796
    max_num_allocs=637669
    total_allocs=87385904

The only place where I can find such an allocation is in the geo-replication code:
https://github.com/gluster/glusterfs/blob/master/xlators/mgmt/glusterd/src/glusterd-geo-rep.c

specifically in glusterd_urltransform:

    ...
    for (;;) {
            size_t len;

            line = GF_MALLOC (1024, gf_gld_mt_linebuf);
            if (!line) {
                    error = _gf_true;
                    goto out;
            }
    ...

I believe this is caused by geo-replication. Further assistance from engineering is required to understand the source of this memory leak.
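For illustration, below is a minimal, self-contained sketch of this allocation pattern and the cleanup it requires. It is not the actual glusterd code or the eventual fix: collect_lines is a hypothetical stand-in for glusterd_urltransform, and plain malloc/free stand in for GF_MALLOC/GF_FREE. It only shows how a per-iteration line buffer that the caller never releases accumulates the same way the gf_gld_mt_linebuf counters above keep growing.

```c
/*
 * Sketch of the leak pattern (hypothetical names, not glusterd code):
 * a fresh 1024-byte line buffer is allocated on every loop iteration and
 * the array of lines is handed back to the caller; if a caller never frees
 * those lines, every call leaks one buffer per line of output.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_BUF_SZ 1024

/* Collect lines from `in` into a malloc'd array; returns the line count. */
static int
collect_lines(FILE *in, char ***linearr_out)
{
        char **linearr = NULL;
        int    count   = 0;

        for (;;) {
                char *line = malloc(LINE_BUF_SZ);    /* gf_gld_mt_linebuf-style allocation */
                if (!line)
                        break;
                if (!fgets(line, LINE_BUF_SZ, in)) {
                        free(line);                   /* EOF: drop the spare buffer */
                        break;
                }
                line[strcspn(line, "\n")] = '\0';

                char **tmp = realloc(linearr, (count + 1) * sizeof(*linearr));
                if (!tmp) {
                        free(line);
                        break;
                }
                linearr = tmp;
                linearr[count++] = line;
        }

        *linearr_out = linearr;
        return count;
}

int
main(void)
{
        char **lines = NULL;
        int    n     = collect_lines(stdin, &lines);

        for (int i = 0; i < n; i++)
                printf("line %d: %s\n", i, lines[i]);

        /*
         * The cleanup the leaking path is missing: every buffer handed out
         * by collect_lines() must be released, plus the array itself.
         * Without this loop, each invocation leaves `n` 1024-byte buffers
         * behind -- the steadily growing num_allocs seen in the statedumps.
         */
        for (int i = 0; i < n; i++)
                free(lines[i]);
        free(lines);

        return 0;
}
```

Because GF_MALLOC tags each buffer with the gf_gld_mt_linebuf accounting type, a missing free in any caller of glusterd_urltransform shows up directly as the ever-increasing num_allocs under that usage-type in the statedump.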
COMMIT: https://review.gluster.org/20046 committed in master by "Amar Tumballi" <amarts> with a commit message:

    glusterd: memory leak in geo-rep status

    Fixes: bz#1580352
    Change-Id: I9648e73090f5a2edbac663a6fb49acdb702cdc49
    Signed-off-by: Sanju Rakonde <srakonde>

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/