Red Hat Bugzilla – Bug 1268125
glusterd memory overcommit
Last modified: 2016-06-28 08:13:22 EDT
Description of problem:
We were using Gluster 3.3 through 3.5 without issue but needed to add SSL support. Due to SSL bugs in earlier versions, the quickest path forward was to upgrade the network to 3.7. This generally appears to be fine as files added are appearing where they should, except that it's caused glusterd to vastly overcommit memory on both nodes where it runs.
On one (serverA, the 'master'), it had 4GB of RAM to work with, and on the other (serverB) 2GB. Both got up to around 30GB of committed virtual memory in a couple of weeks. When other processes were stopped on serverA and glusterd restarted, the overcommit problem appeared to be alleviated and, if it was growing at all, grew a lot slower. We had to resize serverA and took it offline; during that time, glusterd on serverB shot up to 140GB. Both are currently at 2GB of RAM (for other reasons) and, after restarting both daemons, appear to be growing in committed memory at around 4GB/day.
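A growth rate like the one described above can be tracked by sampling the daemon's virtual size at intervals. The following is only a minimal sketch, assuming a Linux /proc filesystem and that the "committed virtual memory" in the monitoring graphs roughly corresponds to the VmSize field; the helper name vmsize_kb and the log path in the example are hypothetical.

```shell
# Print the virtual memory size (VmSize, in kB) of a process, read from /proc.
# Assumption: Linux procfs; vmsize_kb is an illustrative helper, not a Gluster tool.
vmsize_kb() {
    awk '/^VmSize:/ { print $2 }' "/proc/$1/status"
}

# Example: append one timestamped sample for glusterd (e.g. from an hourly cron job).
# pid=$(pidof glusterd) && echo "$(date +%s) $(vmsize_kb "$pid")" >> /tmp/glusterd-vmsize.log
```

Comparing two samples taken a day apart gives the daily growth directly, without waiting for the process to exhaust memory.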
Version of GlusterFS package installed:
Location from which the packages are used:
GlusterFS Cluster Information:
Number of volumes: 2
Volume Names: backup, other
Volume on which the particular issue is seen: N/A
Type of volumes: backup Replicate, other Distribute
Volume options if available:
Volume Name: backup
Number of Bricks: 1 x 2 = 2
Volume Name: other
Number of Bricks: 1
[same as above]
OS Type: Linux
Mount type: GlusterFS
Have not tried (sorry).
Steps to Reproduce:
I stripped dates and ran uniq -c on the actual log messages for the glusterd process. The "server setup failed" and "SSL connect error" messages appear together.
17299 E [socket.c:2863:socket_connect] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_timer_proc+0xfb) [0x7f57813d366b] -->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_reconnect+0xb9) [0x7f5781185c59] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.4/rpc-transport/socket.so(+0x755d) [0x7f577a62255d] ) 0-socket: invalid argument: this->private [Invalid argument]
400 W [dict.c:1452:dict_get_with_ref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.4/xlator/mgmt/glusterd.so(build_shd_graph+0x69) [0x7f577c95ee99] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get_str_boolean+0x22) [0x7f57813af6a2] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x19406) [0x7f57813ad406] ) 0-dict: dict OR key (graph-check) is NULL [Invalid argument]
217 E [socket.c:2388:socket_poller] 0-socket.management: server setup failed
217 E [socket.c:352:ssl_setup_connection] 0-socket.management: SSL connect error
16 W [dict.c:1452:dict_get_with_ref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.4/xlator/mgmt/glusterd.so(build_shd_graph+0x69) [0x7f3c7c558e99] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get_str_boolean+0x22) [0x7f3c80fa96a2] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x19406) [0x7f3c80fa7406] ) 0-dict: dict OR key (graph-check) is NULL [Invalid argument]
11 E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop
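Counts like the ones above can be produced with a short pipeline. This is a sketch rather than the exact command the reporter ran: it assumes every log line begins with a bracketed timestamp followed by a space, and summarize_log is a hypothetical helper name.

```shell
# Strip the leading "[timestamp] " from each log line, then count identical
# messages and list the most frequent first.
# Assumption: each line starts with a bracketed timestamp such as
# "[2015-10-01 12:00:00.000000] "; summarize_log is a hypothetical helper.
summarize_log() {
    sed -E 's/^\[[^]]*\] //' "$1" | sort | uniq -c | sort -rn
}

# Example (the log path is an assumption and varies by distribution):
# summarize_log /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
```

Sorting before uniq -c matters because uniq only merges adjacent identical lines.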
This is totally unrelated (probably?), but in order to get glusterd to really start, I had to ln /usr/lib/x86_64-linux-gnu/glusterfs/3.7.4/xlator/rpc-transport -s /usr/lib/x86_64-linux-gnu/glusterfs/3.7.4/rpc-transport as the logs were essentially complaining about being unable to find xlator/rpc-transport/socket.so.
Thanks for reporting the issue. However, we would need a statedump of the running glusterd instance once the memory hike is seen. Along with that, cmd_history.log is also needed, as it records the commands performed in the cluster. Otherwise it's almost impossible to analyze what has caused the memory leak.
Created attachment 1080409
statedump of backup volume
Created attachment 1080410
statedump of other volume
I've generated the statedump files and cut out 6336 sections named xlator.features.locks.backup-locks.inode from backup.dump. I'm hoping the contents of the inode context and the specific names of clients aren't necessary bits of info and have removed them from the dump files.
The cmd_history.log is pretty empty. The memory starts leaking immediately on start/restart. We're currently mitigating by restarting the process once a week; the growth rate is quite steady without having done anything else. cmd_history does contain successes for updating volume settings (specifically, SSL allows) that our internal management system runs every half hour. We could probably stand to modify it to issue changes only if the values differ, but that doesn't look to me to be the source of the issue.
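The "issue changes only if the values differ" idea could look roughly like the sketch below. It is a dry-run helper that only prints the command it would run; how the management system obtains the current value is not shown, and plan_volume_set and the example values are hypothetical.

```shell
# Print a "gluster volume set" command only when the desired value differs
# from the current one; print nothing when no change is needed.
# Assumption: the caller already knows the current value (how it is obtained
# is not shown here); plan_volume_set is a hypothetical helper.
plan_volume_set() {
    volume=$1 key=$2 current=$3 desired=$4
    if [ "$current" != "$desired" ]; then
        printf 'gluster volume set %s %s %s\n' "$volume" "$key" "$desired"
    fi
}

# Example: nothing is printed, because the value is unchanged.
# plan_volume_set backup auth.ssl-allow 'clientA,clientB' 'clientA,clientB'
```

Skipping no-op updates would also keep cmd_history.log limited to real changes, which makes it easier to read when diagnosing a problem.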
(In reply to ryanlee from comment #4)
> I've generated the statedump files and cut out 6336 sections named
> xlator.features.locks.backup-locks.inode from backup.dump. I'm hoping the
> contents of the inode context and the specific names of clients aren't
> necessary bits of info and have removed them from the dump files.
> The cmd_history.log is pretty empty. The memory starts leaking immediately
> on start/restart. We're currently mitigating by restarting the process once
> a week; the growth rate is quite steady without having done anything else.
> cmd_history does contain successes for updating volume settings
> (specifically, SSL allows) that our internal management system runs every
> half hour. We could probably stand to modify it to issue changes only if
> the values differ, but that doesn't look to me to be the source of the issue.
You would need to take a statedump of the glusterd process, not the clients. The attachments indicate the statedumps are for clients. Would you be able to provide that?
When I searched for gluster and statedump, I found info for running
% gluster volume statedump backup all
% gluster volume statedump other all
on the server, which is what's provided. If you need something else, you're going to have to be more precise about how I can make it for you, not just that you need it.
You could take a statedump of the glusterd instance with the following command:
kill -SIGUSR1 <pid of glusterd instance>
The statedump will then be generated in /var/run/gluster, named glusterdump.<pid of glusterd>.timestamp
Hope this clarifies the confusion.
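For anyone scripting the step above, a small wrapper around the same kill command might look like this. It is a sketch only: request_statedump is a hypothetical helper, and it merely checks that the PID exists before signalling it.

```shell
# Ask a running process to write a statedump by sending it SIGUSR1.
# Assumption: request_statedump is a hypothetical wrapper around the kill
# command given above; it refuses to signal a PID that is empty or not running.
# Note: SIGUSR1 terminates a process that has no handler for it, so only
# send it to the glusterd PID.
request_statedump() {
    pid=$1
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: '$pid'" >&2
        return 1
    fi
    kill -USR1 "$pid"
}

# Example (as root): request_statedump "$(pidof glusterd)" && ls -lt /var/run/gluster
```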
Created attachment 1081093
statedump of serverA glusterd
Created attachment 1081094
statedump of serverB glusterd
Great, thanks. Perhaps the new attachments will be of more use.
The statedump helps; however, I did ask for the cmd_history.log & glusterd.log file as well. Along with it, could you also provide the output of gluster volume info?
(In reply to Atin Mukherjee from comment #11)
> however I did ask for the ... glusterd.log file as well
No, you didn't. But I already summarized the errors in it in the original bug report.
I've already mentioned it's effectively empty in comment #4.
> the output of gluster volume info
I already provided it in the original bug report.
Still present in 3.7.6. I wonder if this is related to bug 1258931?
(In reply to ryanlee from comment #13)
> Still present in 3.7.6. I wonder if this is related to bug 1258931?
Not really IMO. There is no memory leak/over commit reported in that bug.
Sorry, I should have added a bit more context. Several months on, and the overcommit spread to all Gluster clients and started forcing one particularly small node offline due to resource exhaustion. It maxed out at 1TB on serverA (probably because the system didn't allow it to commit any more). Our requirement for SSL support was for secure access from an offsite Gluster client, but while enabling SSL provided a way in from one angle, its apparent memory-hogging side effects meant that client was nearly always disconnected anyways, so it wasn't going to work.
We went to find a different solution so we could switch back to non-SSL mode for everything else. It may not be the same bug, certainly, but I turned off SSL yesterday, and everything is back to normal - all of the memory overcommits on the servers and the clients are mercifully gone, and there's no longer a growth pattern in our monitoring graphs.
Which is the type of related I meant. If not the same, they're both rooted in issues with SSL mode. Is it possible the inability to connect to the self-heal daemon manifests as a constantly growing memory overcommit over the course of months?
REVIEW: http://review.gluster.org/14143 (socket: Reap own-threads) posted (#1) for review on release-3.7 by Kaushal M (firstname.lastname@example.org)
COMMIT: http://review.gluster.org/14143 committed in release-3.7 by Jeff Darcy (email@example.com)
Author: Kaushal M <firstname.lastname@example.org>
Date: Wed Apr 27 16:12:49 2016 +0530
socket: Reap own-threads
Backport of f8948e2 from master
Dead own-threads are reaped periodically (currently every minute). This
helps avoid memory being leaked, and should help prevent memory
starvation issues with GlusterD.
Signed-off-by: Kaushal M <email@example.com>
Smoke: Gluster Build System <firstname.lastname@example.org>
CentOS-regression: Gluster Build System <email@example.com>
NetBSD-regression: NetBSD Build System <firstname.lastname@example.org>
Reviewed-by: Jeff Darcy <email@example.com>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.12, please open a new bug report.
glusterfs-3.7.12 has been announced on the Gluster mailing lists, and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list and the update infrastructure for your distribution.