Bug 1377584 - memory leak problems are found in daemon:glusterd, server:glusterfsd and client:glusterfs
Summary: memory leak problems are found in daemon:glusterd, server:glusterfsd and client:glusterfs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Vijay Bellur
QA Contact: Matt Zywusko
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-09-20 07:14 UTC by bingxuan
Modified: 2018-02-22 02:31 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-06 17:26:53 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
statedump txt file and picture (311.40 KB, application/zip)
2016-09-20 07:14 UTC, bingxuan
no flags

Description bingxuan 2016-09-20 07:14:58 UTC
Created attachment 1202732 [details]
statedump txt file and picture

Description of problem:
Memory leaks are found in all glusterfs instance processes, including:
daemon: glusterd
server: glusterfsd
client: glusterfs
At the observed rate of growth, the processes will be OOM-killed after a few months.

Version-Release number of selected component (if applicable):
3.6.9
3.8.4

How reproducible:
Leave the environment running for several days and compare the statedumps of every glusterfs process.
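
For reference, a minimal sketch of triggering such a statedump from code. This assumes the target process writes a statedump on SIGUSR1 (the usual GlusterFS behaviour) and that the dump lands in the default statedump directory (commonly /var/run/gluster, which may differ per installation):

/* Hypothetical helper, not part of GlusterFS: send SIGUSR1 to a running
 * gluster process so it writes a statedump for later comparison. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
        if (argc != 2) {
                fprintf (stderr, "usage: %s <gluster-process-pid>\n", argv[0]);
                return EXIT_FAILURE;
        }

        pid_t pid = (pid_t) atoi (argv[1]);

        if (kill (pid, SIGUSR1) != 0) {
                perror ("kill");
                return EXIT_FAILURE;
        }

        printf ("SIGUSR1 sent to %d; the statedump should appear in the "
                "statedump directory (e.g. /var/run/gluster)\n", (int) pid);
        return EXIT_SUCCESS;
}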

Steps to Reproduce:
1.
2.
3.

Actual results:
Memory leaks are detected.

Expected results:
There should be no memory leakage.

Additional info:
The code below is based on 3.8.4. The problem was found on 3.6.9, but after checking the code, it is still present in 3.8.4.
===============================================================================
leak module:    glusterfs daemon(glusterd)
analysis:       leaked allocations of type data_t : data_pair_t : dict_t grow in a 2 : 1 : 1 ratio
file:           ./xlators/mgmt/glusterd/src/glusterd-handshake.c
function:       glusterd_get_args_from_dict
diff:         
--- glusterd-handshake.c
+++ glusterd-handshake_fix.c
@@ -473,6 +473,8 @@
 
         gf_msg_debug (this->name, 0, "brick_name = %s", *brick_name);
 out:
+        if(dict)
+            dict_destroy(dict);
         peerinfo->max_op_version = client_max_op_version;
         peerinfo->min_op_version = client_min_op_version;
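
For context, a minimal self-contained sketch of the ownership pattern the fix restores (toy model with hypothetical names such as args_dict_t / args_dict_unserialize, not the real glusterd dict API): a dictionary unserialized while parsing the request is owned by this function and must be released on every exit path, including the error path reached via goto out.

/* Toy model (hypothetical, not glusterd code). */
#include <stdlib.h>
#include <string.h>

typedef struct {
        char *brick_name;
} args_dict_t;

static args_dict_t *args_dict_unserialize (const char *buf)
{
        args_dict_t *d = calloc (1, sizeof (*d));
        if (d)
                d->brick_name = strdup (buf);
        return d;
}

static void args_dict_destroy (args_dict_t *d)
{
        if (!d)
                return;
        free (d->brick_name);
        free (d);
}

static int get_args_from_request (const char *buf, char **brick_name)
{
        int          ret  = -1;
        args_dict_t *dict = args_dict_unserialize (buf); /* owned here */

        if (!dict || !dict->brick_name)
                goto out;

        *brick_name = strdup (dict->brick_name);
        ret = 0;
out:
        args_dict_destroy (dict); /* the missing cleanup that caused the leak */
        return ret;
}

int main (void)
{
        char *name = NULL;

        if (get_args_from_request ("testvol-brick0", &name) == 0)
                free (name);
        return 0;
}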
===============================================================================
leak module:    glusterfs server(glusterfsd)
analysis:       leaked allocations are of type gf_common_mt_strdup, 37 bytes each
file:           ./libglusterfs/src/client_t.c
function:       client_destroy
diff: 
--- client_t.c
+++ client_t_fix.c
@@ -365,6 +365,8 @@
                         xtrav = xtrav->next;
                 }
         }
+        GF_FREE (client->auth.username);
+        GF_FREE (client->auth.passwd);
         GF_FREE (client->auth.data);
         GF_FREE (client->scratch_ctx.ctx);
         GF_FREE (client->client_uid);
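
For context, a minimal self-contained sketch of the pattern the two added GF_FREE() calls follow (toy model with a hypothetical client_model_t, not the real client_t): every field duplicated with strdup()/gf_strdup() at connect time needs a matching free in the destroy path, otherwise each connect/disconnect cycle leaks a few small strings, which matches the ~37-byte gf_common_mt_strdup allocations seen in the statedump.

/* Toy model (hypothetical, not client_t.c). */
#include <stdlib.h>
#include <string.h>

typedef struct {
        char *username;
        char *passwd;
        char *client_uid;
} client_model_t;

static client_model_t *client_create (const char *user, const char *pass,
                                      const char *uid)
{
        client_model_t *c = calloc (1, sizeof (*c));
        if (!c)
                return NULL;
        c->username   = strdup (user);
        c->passwd     = strdup (pass);
        c->client_uid = strdup (uid);
        return c;
}

static void client_destroy_model (client_model_t *c)
{
        if (!c)
                return;
        free (c->username);   /* the two frees the proposed patch adds */
        free (c->passwd);
        free (c->client_uid); /* already freed in the real code */
        free (c);
}

int main (void)
{
        client_model_t *c = client_create ("user", "secret", "client-0");
        client_destroy_model (c);
        return 0;
}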
===============================================================================
leak module:    glusterfs client
analysis:       dentry_t and inode_t counts grow in a 1 : 1 ratio and never decrease.
file:           ./libglusterfs/src/inode.c
function:       inode_table_new 
diff: 
--- inode.c
+++ inode_fix.c
@@ -1582,13 +1582,13 @@
         new->xl = xl;
         new->ctxcount = xl->graph->xl_count + 1;
 
-        new->lru_limit = lru_limit;
-
         new->hashsize = 14057; /* TODO: Random Number?? */
 
         /* In case FUSE is initing the inode table. */
         if (lru_limit == 0)
                 lru_limit = DEFAULT_INODE_MEMPOOL_ENTRIES;
+            
+        new->lru_limit = lru_limit;
 
         new->inode_pool = mem_pool_new (inode_t, lru_limit);
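
The point of moving the assignment, as a minimal self-contained sketch (toy model, not inode.c): storing the limit before substituting the default leaves a FUSE-initialized table with lru_limit == 0, so a prune guard of the form (lru_limit && lru_size > lru_limit) never fires and the LRU list grows without bound; storing it after the default makes pruning possible.

/* Toy model (hypothetical, not inode.c). */
#include <stdio.h>

#define DEFAULT_LRU_ENTRIES (32 * 1024) /* stand-in for the real default */

struct table_model {
        unsigned int lru_limit;
};

static void table_init_buggy (struct table_model *t, unsigned int lru_limit)
{
        t->lru_limit = lru_limit;                /* stored too early */
        if (lru_limit == 0)
                lru_limit = DEFAULT_LRU_ENTRIES; /* only the local copy */
}

static void table_init_fixed (struct table_model *t, unsigned int lru_limit)
{
        if (lru_limit == 0)
                lru_limit = DEFAULT_LRU_ENTRIES;
        t->lru_limit = lru_limit;                /* stored after the default */
}

static int should_prune (const struct table_model *t, unsigned int lru_size)
{
        return t->lru_limit && lru_size > t->lru_limit;
}

int main (void)
{
        struct table_model buggy, fixed;

        table_init_buggy (&buggy, 0);
        table_init_fixed (&fixed, 0);

        printf ("buggy table prunes at 100000 entries? %d\n",
                should_prune (&buggy, 100000));
        printf ("fixed table prunes at 100000 entries? %d\n",
                should_prune (&fixed, 100000));
        return 0;
}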
===============================================================================

Will you accept the above fixes and include them in 3.8.4 (and 3.6.9 as well)?

Comment 2 Atin Mukherjee 2016-09-20 08:01:35 UTC
Given that you are using the community version of GlusterFS, I am moving this BZ to GlusterFS.

Comment 3 Worker Ant 2016-10-04 17:56:19 UTC
REVIEW: http://review.gluster.org/15612 (mgmt/glusterd: Cleanup memory leaks in handshake) posted (#1) for review on master by Vijay Bellur (vbellur)

Comment 4 Worker Ant 2016-10-04 18:02:31 UTC
REVIEW: http://review.gluster.org/15613 (libglusterfs/client_t: cleanup username and passwd in client_destroy()) posted (#1) for review on master by Vijay Bellur (vbellur)

Comment 5 Worker Ant 2016-10-04 18:30:11 UTC
REVIEW: http://review.gluster.org/15614 (libglusterfs: Set right assignment for inode table's lru limit) posted (#1) for review on master by Vijay Bellur (vbellur)

Comment 6 Vijay Bellur 2016-10-04 18:45:46 UTC
(In reply to bingxuan from comment #0)

> ==
> 
> Will you accept the above fixes and include them in 3.8.4 (and 3.6.9 as well)?

Thank you for the report and patches! I have posted these patches for acceptance in mainline. We can certainly backport these patches to 3.8. Once 3.9 is released this month, 3.6 will enter end of life. We may not have further releases in 3.6 and hence backporting would probably not happen.

Comment 7 Worker Ant 2016-10-04 20:22:06 UTC
REVIEW: http://review.gluster.org/15613 (libglusterfs/client_t: cleanup username and passwd in client_destroy()) posted (#2) for review on master by Vijay Bellur (vbellur)

Comment 8 Worker Ant 2016-10-05 01:19:30 UTC
REVIEW: http://review.gluster.org/15612 (mgmt/glusterd: Cleanup memory leaks in handshake) posted (#2) for review on master by Vijay Bellur (vbellur)

Comment 9 Worker Ant 2016-10-11 12:18:25 UTC
COMMIT: http://review.gluster.org/15613 committed in master by Kaleb KEITHLEY (kkeithle) 
------
commit b77cf4d021f26eb3fda81abec10464594324fac2
Author: Vijay Bellur <vbellur>
Date:   Tue Oct 4 14:02:12 2016 -0400

    libglusterfs/client_t: cleanup username and passwd in client_destroy()
    
    Thanks to bingxuan.zhang at nokia dot com for the report and patch.
    
    Change-Id: If2b2151ab4df27d769126860d98770c80bc8a534
    BUG: 1377584
    Signed-off-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/15613
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>
    NetBSD-regression: NetBSD Build System <jenkins.org>

Comment 10 Worker Ant 2016-10-12 06:28:35 UTC
COMMIT: http://review.gluster.org/15612 committed in master by Atin Mukherjee (amukherj) 
------
commit f0c588e5e6fa1552325a31e0e01704ecf063c7e1
Author: Vijay Bellur <vbellur>
Date:   Tue Oct 4 13:55:53 2016 -0400

    mgmt/glusterd: Cleanup memory leaks in handshake
    
    Thanks to bingxuan.zhang at nokia dot com for the report and patch.
    
    Change-Id: I994f82493fec7827f31592340af5bda83322f878
    BUG: 1377584
    Signed-off-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/15612
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>

Comment 12 George 2016-12-12 06:17:13 UTC
diff: 
--- inode.c
+++ inode_fix.c
@@ -1582,13 +1582,13 @@
         new->xl = xl;
         new->ctxcount = xl->graph->xl_count + 1;
 
-        new->lru_limit = lru_limit;
-
         new->hashsize = 14057; /* TODO: Random Number?? */
 
         /* In case FUSE is initing the inode table. */
         if (lru_limit == 0)
                 lru_limit = DEFAULT_INODE_MEMPOOL_ENTRIES;
+            
+        new->lru_limit = lru_limit;
 
         new->inode_pool = mem_pool_new (inode_t, lru_limit);

The above change will affect the following code in inode_table_prune():
                while (table->lru_limit
                       && table->lru_size > (table->lru_limit)) {

                        entry = list_entry (table->lru.next, inode_t, list);

                        table->lru_size--;
                        __inode_retire (entry);

                        ret++;
                }

It will lead to __inode_retire() being called without notifying the kernel; when the kernel later issues fuse_forget, that will result in invalid memory access and ultimately a coredump, won't it?
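
A toy sketch of the failure mode being described (hypothetical entry_model / kernel_lookups names, not GlusterFS code): if the prune path frees an entry the kernel still remembers, the kernel's later forget would touch freed memory, so a safe prune has to skip (or first invalidate) entries with outstanding kernel lookups.

/* Toy model only; it does not claim to be how inode.c should be changed. */
#include <stdlib.h>

struct entry_model {
        unsigned long       kernel_lookups; /* stand-in for a FUSE lookup count */
        struct entry_model *lru_next;
};

static struct entry_model *prune_safe (struct entry_model *lru_head,
                                       unsigned int *lru_size,
                                       unsigned int  lru_limit)
{
        while (lru_limit && *lru_size > lru_limit && lru_head) {
                struct entry_model *victim = lru_head;

                if (victim->kernel_lookups > 0) {
                        /* The kernel would later forget this entry; freeing
                         * it now would turn that forget into a
                         * use-after-free. Stop instead. */
                        break;
                }
                lru_head = victim->lru_next;
                free (victim);
                (*lru_size)--;
        }
        return lru_head;
}

int main (void)
{
        struct entry_model *a = calloc (1, sizeof (*a));
        struct entry_model *b = calloc (1, sizeof (*b));
        unsigned int        size = 2;

        if (!a || !b)
                return 1;

        a->kernel_lookups = 1; /* the kernel still remembers this entry */
        a->lru_next = b;

        struct entry_model *rest = prune_safe (a, &size, 1);

        /* 'a' is not retired by prune_safe() because of its outstanding
         * lookup; release everything here for a clean exit. */
        while (rest) {
                struct entry_model *next = rest->lru_next;
                free (rest);
                rest = next;
        }
        return 0;
}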

Comment 13 Shyamsundar 2017-03-06 17:26:53 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

