Description of problem:
I am running glusterfs release-3.8 on Linux kernel 3.10.0-327.el7.x86_64 on three CentOS 7.2 hosts (node-1/2/3) with 8 GB of RAM and two network interface cards each. When 'ifconfig eno1 down' and 'ifconfig eno1 up' are run alternately on node-2/3 for more than 10 hours, the glustershd process memory (VIRT) usage grows to more than 20 GB on every node.

Version-Release number of selected component (if applicable):
glusterfs release-3.8, with commit 24dd33929bbbc9a72360793048f17bf4e6cec8a3 cherry-picked onto release-3.8:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
commit 24dd33929bbbc9a72360793048f17bf4e6cec8a3 --> (master)
Author: Kaleb S KEITHLEY <kkeithle>
Date:   Fri May 6 13:04:38 2016 -0400

    libglusterfs (timer): race conditions, illegal mem access, mem leak

    While investigating gfapi memory consumption with valgrind, valgrind
    reported several memory access issues. Also see the timer 'registry'
    being recreated (shortly) after being freed during teardown due to
    the way it's currently written.

    Passing ctx as data to gf_timer_proc() is prone to memory access
    issues if ctx is freed before gf_timer_proc() terminates. (And in
    fact this does happen, at least in valgrind.) gf_timer_proc()
    doesn't need ctx for anything, it only needs ctx->timer, so just
    pass that.

    Nothing ever calls gf_timer_registry_init(). Nothing outside of
    timer.c that is. Making it and gf_timer_proc() static.

    Change-Id: Ia28454dda0cf0de2fec94d76441d98c3927a906a
    BUG: 1333925
    Signed-off-by: Kaleb S KEITHLEY <kkeithle>
    Reviewed-on: http://review.gluster.org/14247
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Poornima G <pgurusid>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: Jeff Darcy <jdarcy>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Linux kernel 3.10.0-327.el7.x86_64
CentOS Linux release 7.2.1511 (Core)
16 disks (xfs) on every node (three nodes).

How reproducible:
Create a disperse test volume (16 * (2+1)): test-disperse-0, test-disperse-1, ..., test-disperse-15.
The bricks of every disperse group are node-1:/disk, node-2:/disk, node-3:/disk.
The network cards on every node are configured as follows:
eno1: 10.10.21.111 / 10.10.21.112 / 10.10.21.113, no gateway
eno2: 192.168.21.111 / 192.168.21.112 / 192.168.21.113, no gateway
eno1 is bound to the bricks (10.10.21.111:/brick).

Steps to Reproduce:
1. Create the disperse test volume (16 * (2+1)), start it, and mount it on every node (mount.glusterfs 127.0.0.1:/test /mnt/test); see the volume-creation sketch under Additional info below.
2. Execute a script on 10.10.21.111 that does the following (a runnable sketch is given under Additional info below):
   if runtime > 10 hours, exit()
   ssh 192.168.21.112 'ifconfig eno1 down'
   ssh 192.168.21.112 'ifconfig eno1 up'
   sleep(5)
   ssh 192.168.21.113 'ifconfig eno1 down'
   ssh 192.168.21.113 'ifconfig eno1 up'
   sleep(5)
3. Observe the glustershd process memory usage (see the monitoring sketch under Additional info below).

Actual results:
With the above test method, the glustershd process's memory usage becomes very high (about 20 GB) and swap space is used up.
This is abnormal; I suspect there is a memory leak in glustershd.

Expected results:
We would expect the memory usage to stay within a reasonable ceiling.

Additional info:
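For reference, step 1 could be performed roughly as below. This is only a sketch under assumptions not stated in the report (brick paths /disk1 .. /disk16 on each node, volume name 'test'); the exact brick layout of the original setup may differ:

    # Build the brick list: 16 disperse subvolumes of (2+1), one brick per node per disk.
    # (Assumed brick paths; adjust to the actual disk mount points.)
    bricks=""
    for i in $(seq 1 16); do
        bricks="$bricks 10.10.21.111:/disk$i/brick 10.10.21.112:/disk$i/brick 10.10.21.113:/disk$i/brick"
    done

    # Create a dispersed volume with 2 data + 1 redundancy bricks per subvolume,
    # then start it and mount it locally on every node.
    gluster volume create test disperse-data 2 redundancy 1 $bricks force
    gluster volume start test
    mount.glusterfs 127.0.0.1:/test /mnt/test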
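The pseudo-script in step 2, written out as a runnable bash sketch (run from 10.10.21.111; assumes passwordless ssh to the other two nodes over the eno2/192.168.21.x management network, as in the configuration above):

    #!/bin/bash
    # Flap the brick-side interface (eno1) on node-2 and node-3 for 10 hours.
    end=$((SECONDS + 10 * 3600))
    while [ $SECONDS -lt $end ]; do
        ssh 192.168.21.112 'ifconfig eno1 down'
        ssh 192.168.21.112 'ifconfig eno1 up'
        sleep 5
        ssh 192.168.21.113 'ifconfig eno1 down'
        ssh 192.168.21.113 'ifconfig eno1 up'
        sleep 5
    done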
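One possible way to watch the glustershd memory usage mentioned in step 3 (a sketch; it assumes the self-heal daemon can be found via its command line, which contains "glustershd"):

    # Print VIRT (vsz) and RSS of the self-heal daemon once a minute, in KiB.
    pid=$(pgrep -f glustershd)
    while true; do
        ps -o vsz=,rss= -p "$pid"
        # Optionally, 'kill -USR1 $pid' should write a statedump (by default under
        # /var/run/gluster/) with per-translator memory accounting, which may help
        # confirm where the leaked allocations live.
        sleep 60
    done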
Possible dup of https://bugzilla.redhat.com/show_bug.cgi?id=1348095 ?
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html
This bug is being closed because version 3.8 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.