+++ This bug was initially created as a clone of Bug #1329335 +++

We are using GlusterFS 3.7.11 (upgraded from 3.7.6 last week) on RHEL 7.x in AWS EC2. We continue to see memory utilization going up roughly every two days: the memory utilization of the glusterd daemon on the NFS server keeps increasing, and within about 30+ hours glusterd alone reaches 70% of the available memory. Since we have alarms for this threshold we get notified, and so far the only way to stop it is to restart glusterd. This happens even when there is not a lot of load on GlusterFS.

GlusterFS is configured on two server nodes with two mount locations:

$ df -i
Filesystem       Inodes  IUsed      IFree IUse% Mounted on
/dev/xvdf     125829120 120186  125708934    1% /nfs_app1
/dev/xvdg     125829120 142937  125686183    1% /nfs_app2

As part of debugging, we tried the following:

1. From the client side, in the mount location, we read and wrote around 1000 files (each 4 MB in size). There was no marked spike in memory utilization during this time.

2. We were using GlusterFS 3.7.6 and moved to 3.7.11, and the problem still persists.

3. We created a dump of the volume in question; the dump file is attached. Some memory-allocation types, such as gf_common_mt_asprintf, have huge total_allocs values. The three largest are listed below.

[global.glusterfs - usage-type gf_common_mt_asprintf memusage]
size=260
num_allocs=12
max_size=2464
max_num_allocs=294
total_allocs=927964

[global.glusterfs - usage-type gf_common_mt_char memusage]
size=6388
num_allocs=164
max_size=30134
max_num_allocs=645
total_allocs=1424017

[protocol/server.xyz-server - usage-type gf_common_mt_strdup memusage]
size=26055
num_allocs=2795
max_size=27198
max_num_allocs=2828
total_allocs=135503

4. We also noticed that the mempool has nr_files as a negative number. Not sure if this is also a cause of the problem.

[mempool]
[storage/posix.xyz-posix]
base_path=/nfs_xyz/abc
base_path_length=25
max_read=44215866
max_write=104925485
nr_files=-418

This is happening in production and, as expected, causes a lot of problems. Has anybody seen this before? Any insights into what we can try would be greatly appreciated.

--- Additional comment from Kaushal on 2016-04-25 03:37:03 EDT ---

Hi Nagendra,

Could you provide statedumps of the GlusterD process? The dumps you've provided are of the brick processes (i.e. glusterfsd).

You can get a statedump of the GlusterD process by sending it a SIGUSR1 signal: `kill -SIGUSR1 <pid of glusterd>`. The statedump files will be created in /var/run/gluster.

It would be nice if you could provide statedumps from two different times, so that we can compare what changed.

--- Additional comment from Nagendran on 2016-04-25 08:46 EDT ---

--- Additional comment from Nagendran on 2016-04-25 08:48:29 EDT ---

Hi Kaushal,

Thanks for your response. As you suggested, we have taken statedumps at different times: when we took the first dump, the memory utilization of glusterd was 5.6%; after one hour it had increased to 7.6%. Both dumps have been attached for your reference.

Note: the time between the dumps is 1 hour. If you need any more information, please let us know.

Thanks,
Nagendran N

--- Additional comment from Atin Mukherjee on 2016-04-25 11:24:51 EDT ---

What commands have been run? Could you also attach the cmd_history.log file from all the nodes?
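For reference, a minimal sketch of the two-statedump collection Kaushal describes above; the pidof lookup, the one-hour sleep and the directory listing at the end are assumptions added for illustration, not part of the original instructions.

```
# Take two glusterd statedumps an hour apart so they can be compared.
GLUSTERD_PID=$(pidof glusterd)      # assumes a single glusterd instance on this node
kill -SIGUSR1 "$GLUSTERD_PID"       # first statedump, written to /var/run/gluster
sleep 3600                          # wait an hour (the interval used in this report)
kill -SIGUSR1 "$GLUSTERD_PID"       # second statedump, to diff against the first
ls -lt /var/run/gluster | head      # the newest files are the two statedumps
```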
--- Additional comment from Nagendran on 2016-04-26 05:48 EDT ---

Command history for the first node.

--- Additional comment from Nagendran on 2016-04-26 05:51:01 EDT ---

Hi Kaushal,

I have attached the command history for both nodes as requested. Please let us know if you have any thoughts on what's going on.

Unfortunately, this is creating a serious problem (memory alarms in production are being triggered almost every other day) and we end up restarting glusterd during load. We would appreciate any pointers on the problem.

--- Additional comment from Mohammed Rafi KC on 2016-04-26 08:51:45 EDT ---

From your cmd_history, I see that you are running a lot of profile commands. Are you observing the memory leak when you run profile commands continuously? I'm asking because that would help us narrow down the memory leak path.

--- Additional comment from Nagendran on 2016-04-26 09:42:38 EDT ---

We are using a custom New Relic plugin to get some metrics about the GlusterFS peer for monitoring - basically a cron job that runs this command:

gluster volume profile ${1} info | grep -n "Brick" | sed -n 2p | cut -d":" -f 1

We have temporarily stopped the agent now and will check if this helps. Is this what you are suspecting as the root cause?

The reason we added the agent was to monitor the following:
- I/O operations per second
- Directory activities per second
- File activities per second
- File information activities per second
- File latency
- Directory latency

Are there any other ways of doing this without using volume profile?

--- Additional comment from Kaushal on 2016-04-27 07:53:10 EDT ---

As Rafi has already mentioned, I too think it's the volume profile polling that is causing the issue.

From the statedumps, I see that memory allocations for dict_t, data_t, data_pair_t, gf_common_mt_memdup, gf_common_mt_asprintf and gf_common_mt_strdup have increased quite a lot. These are the memory types generally associated with the GlusterFS dictionary data type and operations on it (including dict serialization and unserialization). Information in GlusterFS is passed between processes (brick to glusterd, glusterd to glusterd, and glusterd to cli) using dictionaries as containers. Certain operations, such as 'volume profile', generate a large amount of data, which makes the dictionaries huge. Collecting information from multiple sources in volume profile also involves a lot of data duplication, which uses a lot of memory. While in most cases the memory allocated to a dictionary should be freed when the dictionary is destroyed, it appears there is quite a significant leak in the volume profile path. We'll try to identify the leak as soon as we can. In the meantime, I do hope that stopping the agent has helped.

Apart from volume profile, GlusterFS doesn't have any other way to gather volume stats on the server. A client-side tool, glusterfsiostat [1], was implemented a couple of years back as a GSoC project; you could try it out.

If that doesn't work out and you really need to monitor the stats, I suggest that you poll less frequently. From the logs, I see that the polling interval is 1 minute right now; it could be increased to, say, 5 minutes. Also, the polling is being done from both servers, which effectively makes the polling period 30 seconds. You can use just one of the servers to get the stats, as volume profile gathers stats for a volume from the whole cluster.
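A minimal sketch of the reduced-polling setup suggested in the comment above: run the plugin's existing pipeline from a single server every 5 minutes instead of every minute from both nodes. The script path, cron entry and log file are illustrative assumptions; only the gluster pipeline itself comes from the plugin described earlier.

```
#!/bin/bash
# Hypothetical /usr/local/bin/gluster-profile-poll.sh, installed on ONE server only.
# Example cron entry (e.g. in /etc/cron.d/gluster-profile):
#   */5 * * * * root /usr/local/bin/gluster-profile-poll.sh <volume-name>
VOLUME="${1:?usage: $0 <volume-name>}"
# Same pipeline the New Relic plugin runs, with the output appended to a local log.
gluster volume profile "$VOLUME" info \
    | grep -n "Brick" | sed -n 2p | cut -d":" -f 1 \
    >> /var/log/gluster-profile-metrics.log
```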
--- Additional comment from Nagendran on 2016-04-27 10:02:20 EDT ---

Hi Kaushal,

Stopping the agent helped; we have not seen the leak since then, so the theory is right. Thanks very much for the pointer - it helped us zero in on the plugin.

As you said, if we decrease the polling frequency, wouldn't we still have this problem (just a little later)? We can try that in a lower environment, but I suspect it may not help much. The client side is also a problem, because there are many clients and they change as well. The best option seems to be finding a way to free the allocated memory right after the information is read. Is there a way to do that? If so, we can add it to the plugin so that every time we read we reset the allocation. Will that work?

--- Additional comment from Vijay Bellur on 2016-07-05 05:12:56 EDT ---

REVIEW: http://review.gluster.org/14862 (Memory leak in gluster volume profile command) posted (#1) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/14862 (Memory leak in gluster volume profile command) posted (#2) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/14862 (cli: Modify the code to cleanup memory leak in gluster volume profile command) posted (#3) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/14862 (cli: Modify the code to cleanup memory leak in gluster volume profile command) posted (#4) for review on master by Niels de Vos (ndevos)
REVIEW: http://review.gluster.org/14862 (cli: Modify the code to cleanup memory leak in gluster volume profile command) posted (#5) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/14862 (cli: Modify the code to cleanup memory leak in glusterd and cli) posted (#6) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/14862 (glusterd: Modify the code to cleanup memory leak in glusterd) posted (#7) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/14862 (glusterd: Fix memory leak in glusterd (un)lock RPCs) posted (#8) for review on master by MOHIT AGRAWAL (moagrawa)
COMMIT: http://review.gluster.org/14862 committed in master by Atin Mukherjee (amukherj)
------
commit 07b95cf8104da42d783d053d0fbb8497399f7d00
Author: root <root.eng.blr.redhat.com>
Date:   Tue Jul 5 14:33:15 2016 +0530

    glusterd: Fix memory leak in glusterd (un)lock RPCs

    Problem: Running the "gluster volume profile <vol> info" command leaks
    memory in glusterd.

    Solution: Modify the code to prevent the memory leak in glusterd.

    Fix:
    1) Unref the dict and free the dict_val buffer in glusterd_mgmt_v3_lock_peer
       and glusterd_mgmt_v3_unlock_peers.

    Test: To verify the patch, run the loop below to generate I/O traffic:
        for (( i=0 ; i<=1000000 ; i++ )); do echo "hi Start Line " > file$i; cat file$i >> /dev/null; done
    To verify the improvement in the glusterd memory leak, run:
        cnt=0; while [ $cnt -le 1000 ]; do pmap -x <glusterd-pid> | grep total; gluster volume profile distributed info > /dev/null; cnt=`expr $cnt + 1`; done
    After applying this patch the leak is reduced significantly.

    Change-Id: I52a0ca47adb20bfe4b1848a11df23e5e37c5cea9
    BUG: 1352854
    Signed-off-by: Mohit Agrawal <moagrawa>
    Reviewed-on: http://review.gluster.org/14862
    Reviewed-by: Atin Mukherjee <amukherj>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    CentOS-regression: Gluster Build System <jenkins.org>
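For readability, here is the verification loop from the commit message above rewritten as a small standalone script; the pidof lookup is an added assumption, and the volume name "distributed" is the one used in the commit and would need adjusting for other setups.

```
#!/bin/bash
# Watch glusterd's memory footprint while repeatedly running "volume profile ... info".
GLUSTERD_PID=$(pidof glusterd)      # assumes a single glusterd instance on this node
for cnt in $(seq 1 1000); do
    pmap -x "$GLUSTERD_PID" | grep total            # glusterd's total mapped/RSS memory
    gluster volume profile distributed info > /dev/null
done
```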
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/