Description of problem:
I am running glusterfs release-3.8 on Linux kernel 3.10.0-327.el7.x86_64 on three CentOS 7.2 hosts (node-1/2/3) with 8 GB of RAM and two network interface cards each. When 'ifconfig eno1 down' and 'ifconfig eno1 up' are run alternately on node-2/3 for more than 10 hours, the glustershd process memory (VIRT) usage grows to more than 20 GB on every node.

Version-Release number of selected component (if applicable):
glusterfs release-3.8, with commit 24dd33929bbbc9a72360793048f17bf4e6cec8a3 cherry-picked onto release-3.8:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
commit 24dd33929bbbc9a72360793048f17bf4e6cec8a3 --> (master)
Author: Kaleb S KEITHLEY <kkeithle>
Date:   Fri May 6 13:04:38 2016 -0400

    libglusterfs (timer): race conditions, illegal mem access, mem leak

    While investigating gfapi memory consumption with valgrind, valgrind
    reported several memory access issues. Also see the timer 'registry'
    being recreated (shortly) after being freed during teardown due to
    the way it's currently written.

    Passing ctx as data to gf_timer_proc() is prone to memory access
    issues if ctx is freed before gf_timer_proc() terminates. (And in
    fact this does happen, at least in valgrind.) gf_timer_proc()
    doesn't need ctx for anything, it only needs ctx->timer, so just
    pass that.

    Nothing ever calls gf_timer_registry_init(). Nothing outside of
    timer.c that is. Making it and gf_timer_proc() static.

    Change-Id: Ia28454dda0cf0de2fec94d76441d98c3927a906a
    BUG: 1333925
    Signed-off-by: Kaleb S KEITHLEY <kkeithle>
    Reviewed-on: http://review.gluster.org/14247
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Poornima G <pgurusid>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: Jeff Darcy <jdarcy>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Linux kernel 3.10.0-327.el7.x86_64
CentOS Linux release 7.2.1511 (Core)
16 disks (xfs) on every node (three nodes).

How reproducible:
Create a disperse test volume (16 * (2+1)): test-disperse-0, test-disperse-1, ..., test-disperse-15.
The bricks of every disperse group are node-1:/disk, node-2:/disk, node-3:/disk.
The network cards on every node are configured as follows:
eno1: 10.10.21.111 / 10.10.21.112 / 10.10.21.113, no gateway
eno2: 192.168.21.111 / 192.168.21.112 / 192.168.21.113, no gateway
eno1 is bound to the bricks (10.10.21.111:/brick).

Steps to Reproduce:
1. Create the disperse test volume (16 * (2+1)), start it, and mount it on every node (mount.glusterfs 127.0.0.1:/test /mnt/test); see the volume-creation sketch under Additional info below.
2. Execute a script on 10.10.21.111 that does the following (a runnable sketch is given under Additional info below):
   if runtime > 10 hours, exit()
   ssh 192.168.21.112 'ifconfig eno1 down'
   ssh 192.168.21.112 'ifconfig eno1 up'
   sleep(5)
   ssh 192.168.21.113 'ifconfig eno1 down'
   ssh 192.168.21.113 'ifconfig eno1 up'
   sleep(5)
3. Observe the glustershd process memory usage (see the monitoring sketch under Additional info below).

Actual results:
With the above test method, the glustershd process's memory usage becomes very high (about 20 GB) and swap space is used up.
This is abnormal; I suspect there is a memory leak in glustershd.

Expected results:
We would expect the memory usage to stay within a reasonable ceiling.

Additional info:
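For reference, step 1 could be performed roughly as below. This is only a sketch under assumptions not stated in the report (brick paths /disk1 .. /disk16 on each node, volume name 'test'); the exact brick layout of the original setup may differ:

    # Build the brick list: 16 disperse subvolumes of (2+1), one brick per node per disk.
    # (Assumed brick paths; adjust to the actual disk mount points.)
    bricks=""
    for i in $(seq 1 16); do
        bricks="$bricks 10.10.21.111:/disk$i/brick 10.10.21.112:/disk$i/brick 10.10.21.113:/disk$i/brick"
    done

    # Create a dispersed volume with 2 data + 1 redundancy bricks per subvolume,
    # then start it and mount it locally on every node.
    gluster volume create test disperse-data 2 redundancy 1 $bricks force
    gluster volume start test
    mount.glusterfs 127.0.0.1:/test /mnt/test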
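The pseudo-script in step 2, written out as a runnable bash sketch (run from 10.10.21.111; assumes passwordless ssh to the other two nodes over the eno2/192.168.21.x management network, as in the configuration above):

    #!/bin/bash
    # Flap the brick-side interface (eno1) on node-2 and node-3 for 10 hours.
    end=$((SECONDS + 10 * 3600))
    while [ $SECONDS -lt $end ]; do
        ssh 192.168.21.112 'ifconfig eno1 down'
        ssh 192.168.21.112 'ifconfig eno1 up'
        sleep 5
        ssh 192.168.21.113 'ifconfig eno1 down'
        ssh 192.168.21.113 'ifconfig eno1 up'
        sleep 5
    done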
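One possible way to watch the glustershd memory usage mentioned in step 3 (a sketch; it assumes the self-heal daemon can be found via its command line, which contains "glustershd"):

    # Print VIRT (vsz) and RSS of the self-heal daemon once a minute, in KiB.
    pid=$(pgrep -f glustershd)
    while true; do
        ps -o vsz=,rss= -p "$pid"
        # Optionally, 'kill -USR1 $pid' should write a statedump (by default under
        # /var/run/gluster/) with per-translator memory accounting, which may help
        # confirm where the leaked allocations live.
        sleep 60
    done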
Possible dup of https://bugzilla.redhat.com/show_bug.cgi?id=1348095 ?
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html
This bug is being closed because version 3.8 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.