We have a 6 node GlusterFS volume which is mounted on two clients, but only used from one. When running a "find -H . -user root -ls -exec chown -h user:user {} \;" on a subdirectory of the 22 TB volume (7 TB used), the client process started out quite large, around 17 GB, and over the three days this process has been running it has grown to about 41.5 GB, and is still growing.

Info on the setup:
6 nodes: 2-socket Dell R510s, 12 core Intel Westmere CPUs in each socket, 64 GB of memory, 12 x 1 TB disks, 2 x 10Gb links to a large Cisco switch
RHEL 6.6, RHGS 3.6.0.53-1.el6rhs
6 LVM volumes (2 disks per volume) for bricks
Distributed-Replicated, 12x3, 22 TB

Volume Options:
performance.readdir-ahead: on
performance.io-cache: off
performance.stat-prefetch: on
cluster.lookup-unhashed: off
client.event-threads: 8
cluster.read-hash-mode: 2
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
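For reference, a minimal sketch (not part of the original report) of one way to track the client's growth over time; it assumes the only glusterfs process on the client is the fuse mount:

while sleep 3600; do                           # sample once an hour
    date
    ps -o pid,vsz,rss,etime,args -C glusterfs  # VSZ/RSS of the fuse client
done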
The volume is currently running in a degraded state, where one member of the 6 nodes is no longer participating (problems with the OS install).
1288857 - mainline
1288922 - release-3.7 upstream
RHGS 3.0 -> release-3.6 ???
This seems to be fixed and targeted to be available in RHGS 3.1.2. Please check https://bugzilla.redhat.com/show_bug.cgi?id=1288921. The fixed-in version would be 3.7.5*. Please confirm whether you are okay with getting this fix in 3.1.2.
Has somebody looked at the environment in which this is failing and determined this is the same bug as those?
Please check /var/log/glusterfs/mnt.log on the client for the presence of "kernel notifier loop terminated". Thanks.
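The fuse client log is named after the mount point (slashes become dashes, so a mount at /mnt logs to mnt.log). A simple check that covers all client logs regardless of mount point:

grep 'kernel notifier loop terminated' /var/log/glusterfs/*.log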
I cannot find the file /var/log/glusterfs/mnt.log, and I don't see that string in the glusterfs/pbench.log, or in any other logs from that system. Sorry.
Not getting the "kernel notifier loop terminated" issue in my 3.6.x setup
There is an ongoing discussion on gluster-users about memory leaks in FUSE: http://www.gluster.org/pipermail/gluster-users/2016-January/024775.html

Soumya, Kaleb and Xavier Hernandez have sent some patches upstream. I'll talk to Soumya and update the bug.

Peter, could you provide the IP/login details of the client and servers?

Thanks,
Ravi
Sure, let's talk offline about what information you need to gather from the system.
Sure, but I'd also like to have all the info/logs here on the BZ, so that it gives context in case someone else wants to look.

So I spoke to Soumya. Her fixes are in gfapi and another one is related to upcall, so they wouldn't be relevant here. The FUSE-related fixes that have been merged in master are: http://review.gluster.org/13327 and http://review.gluster.org/13274. There is also a dict leak in DHT fixed recently: http://review.gluster.org/13322

But before that, could you provide the following?
1. When the memory of the client (fuse mount) is high, take a statedump of the process. Also note the memory consumed.
2. Perform:
# sync
# echo 3 > /proc/sys/vm/drop_caches
3. Note the memory consumed and take a statedump of the process again (a sketch of these steps follows below).

Feel free to ping me on #rhs, my nick is 'itisravi'
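A minimal sketch of the requested procedure, assuming a single glusterfs fuse mount process on the client:

PID=$(pidof glusterfs)              # the fuse mount process
grep VmRSS /proc/$PID/status        # note memory consumed
kill -SIGUSR1 $PID                  # statedump before dropping caches
sync
echo 3 > /proc/sys/vm/drop_caches   # drop pagecache, dentries and inodes
grep VmRSS /proc/$PID/status        # note memory consumed again
kill -SIGUSR1 $PID                  # statedump after dropping caches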
# slabtop --sort=c -o
 Active / Total Objects (% used)    : 19486645 / 20288665 (96.0%)
 Active / Total Slabs (% used)      : 397170 / 397170 (100.0%)
 Active / Total Caches (% used)     : 78 / 123 (63.4%)
 Active / Total Size (% used)       : 5943909.85K / 6061758.42K (98.1%)
 Minimum / Average / Maximum Object : 0.01K / 0.30K / 15.88K

   OBJS  ACTIVE   USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
5343645 5343645  100%    0.75K 127301       42   4073632K fuse_inode
5696292 5696292  100%    0.19K 135626       42   1085008K dentry
 475076  448395   94%    0.57K  16967       28    271472K radix_tree_node
 179936  172950   96%    1.00K   5623       32    179936K xfs_inode
5347072 5347072  100%    0.03K  41774      128    167096K kmalloc-32
1485393  795740   53%    0.10K  38087       39    152348K buffer_head
1251648 1251648  100%    0.06K  19557       64     78228K kmalloc-64
   9696    8973   92%    2.00K    606       16     19392K kmalloc-2048
  27342   11260   41%    0.64K    558       49     17856K proc_inode_cache
  11470   11470  100%    1.02K    370       31     11840K ext4_inode_cache
  28896   16671   57%    0.38K    688       42     11008K blkdev_requests
# pidstat -r -u -C glusterfs 2 4
Linux 3.10.0-229.el7.x86_64 (perf42.perf.lab.eng.bos.redhat.com)  02/05/2016  _x86_64_  (12 CPU)

05:52:36 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
05:52:38 PM     0      1706    4.98   11.44    0.00   16.42     4  glusterfs

05:52:36 PM   UID       PID  minflt/s  majflt/s       VSZ       RSS   %MEM  Command
05:52:38 PM     0      1706   8676.62      0.00  53101696  52261664  52.88  glusterfs

05:52:38 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
05:52:40 PM     0      1706    4.50    9.00    0.00   13.50     4  glusterfs

05:52:38 PM   UID       PID  minflt/s  majflt/s       VSZ       RSS   %MEM  Command
05:52:40 PM     0      1706   7074.00      0.00  53101696  52261664  52.88  glusterfs

05:52:40 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
05:52:42 PM     0      1706  170.50    0.00    0.00  170.50     4  glusterfs

05:52:40 PM   UID       PID  minflt/s  majflt/s       VSZ       RSS   %MEM  Command
05:52:42 PM     0      1706   6117.00      0.00  53101696  52261660  52.88  glusterfs

05:52:42 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command

05:52:42 PM   UID       PID  minflt/s  majflt/s       VSZ       RSS   %MEM  Command
05:52:44 PM     0      1706   9855.50      0.00  53101696  52261664  52.88  glusterfs

Average:      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0      1706   44.94    5.12    0.00   50.06     -  glusterfs

Average:      UID       PID  minflt/s  majflt/s       VSZ       RSS   %MEM  Command
Average:        0      1706   7931.71      0.00  53101696  52261663  52.88  glusterfs
# cat /proc/meminfo
MemTotal:       98825916 kB
MemFree:         2955356 kB
MemAvailable:   40674920 kB
Buffers:            9168 kB
Cached:         36311612 kB
SwapCached:           32 kB
Active:         51902544 kB
Inactive:       36867284 kB
Active(anon):   48488272 kB
Inactive(anon):  3961580 kB
Active(file):    3414272 kB
Inactive(file): 32905704 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4194300 kB
SwapFree:        4188980 kB
Dirty:               116 kB
Writeback:             0 kB
AnonPages:      52449140 kB
Mapped:            40988 kB
Shmem:               804 kB
Slab:            6180372 kB
SReclaimable:    1737436 kB
SUnreclaim:      4442936 kB
KernelStack:        7744 kB
PageTables:       111456 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    53607256 kB
Committed_AS:    1760364 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      320092 kB
VmallocChunk:   34308293628 kB
HardwareCorrupted:     0 kB
AnonHugePages:  27322368 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      145596 kB
DirectMap2M:     6135808 kB
DirectMap1G:    94371840 kB
# lsof -p 1706
lsof: WARNING: can't stat() nfs4 file system /pub
      Output information may be incomplete.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
glusterfs 1706 root cwd DIR 253,1 4096 128 /
glusterfs 1706 root rtd DIR 253,1 4096 128 /
glusterfs 1706 root txt REG 253,1 89320 201460176 /usr/sbin/glusterfsd
glusterfs 1706 root mem REG 253,1 83384 607939 /usr/lib64/glusterfs/3.6.0.53/xlator/meta.so
glusterfs 1706 root mem REG 253,1 119512 67415717 /usr/lib64/glusterfs/3.6.0.53/xlator/debug/io-stats.so
glusterfs 1706 root mem REG 253,1 61336 608029 /usr/lib64/glusterfs/3.6.0.53/xlator/performance/md-cache.so
glusterfs 1706 root mem REG 253,1 36504 608031 /usr/lib64/glusterfs/3.6.0.53/xlator/performance/quick-read.so
glusterfs 1706 root mem REG 253,1 22864 608033 /usr/lib64/glusterfs/3.6.0.53/xlator/performance/readdir-ahead.so
glusterfs 1706 root mem REG 253,1 52192 608032 /usr/lib64/glusterfs/3.6.0.53/xlator/performance/read-ahead.so
glusterfs 1706 root mem REG 253,1 57888 608035 /usr/lib64/glusterfs/3.6.0.53/xlator/performance/write-behind.so
glusterfs 1706 root mem REG 253,1 381024 607927 /usr/lib64/glusterfs/3.6.0.53/xlator/cluster/dht.so
glusterfs 1706 root mem REG 253,1 521544 607925 /usr/lib64/glusterfs/3.6.0.53/xlator/cluster/afr.so
glusterfs 1706 root mem REG 253,1 271992 67415720 /usr/lib64/glusterfs/3.6.0.53/xlator/protocol/client.so
glusterfs 1706 root mem REG 253,1 37152 201664912 /usr/lib64/libnss_sss.so.2
glusterfs 1706 root mem REG 253,1 27512 201351271 /usr/lib64/libnss_dns-2.17.so
glusterfs 1706 root mem REG 253,1 58288 201351273 /usr/lib64/libnss_files-2.17.so
glusterfs 1706 root mem REG 253,1 153184 201328204 /usr/lib64/liblzma.so.5.0.99
glusterfs 1706 root mem REG 253,1 398272 201328279 /usr/lib64/libpcre.so.1.2.0
glusterfs 1706 root mem REG 253,1 147096 201328289 /usr/lib64/libselinux.so.1
glusterfs 1706 root mem REG 253,1 110808 201351283 /usr/lib64/libresolv-2.17.so
glusterfs 1706 root mem REG 253,1 15688 201328475 /usr/lib64/libkeyutils.so.1.5
glusterfs 1706 root mem REG 253,1 62720 201481349 /usr/lib64/libkrb5support.so.0.1
glusterfs 1706 root mem REG 253,1 202576 201481337 /usr/lib64/libk5crypto.so.3.1
glusterfs 1706 root mem REG 253,1 15840 201328198 /usr/lib64/libcom_err.so.2.1
glusterfs 1706 root mem REG 253,1 942024 201481347 /usr/lib64/libkrb5.so.3.3
glusterfs 1706 root mem REG 253,1 316560 201481333 /usr/lib64/libgssapi_krb5.so.2.2
glusterfs 1706 root mem REG 253,1 449808 201338341 /usr/lib64/libssl.so.1.0.1e
glusterfs 1706 root mem REG 253,1 82976 201408758 /usr/lib64/glusterfs/3.6.0.53/rpc-transport/socket.so
glusterfs 1706 root mem REG 253,1 202168 67415726 /usr/lib64/glusterfs/3.6.0.53/xlator/mount/fuse.so
glusterfs 1706 root mem REG 253,1 106065056 67129219 /usr/lib/locale/locale-archive
glusterfs 1706 root mem REG 253,1 90632 201328292 /usr/lib64/libz.so.1.2.7
glusterfs 1706 root mem REG 253,1 2107760 201327679 /usr/lib64/libc-2.17.so
glusterfs 1706 root mem REG 253,1 2013048 201338339 /usr/lib64/libcrypto.so.1.0.1e
glusterfs 1706 root mem REG 253,1 141616 201351281 /usr/lib64/libpthread-2.17.so
glusterfs 1706 root mem REG 253,1 19512 201328126 /usr/lib64/libdl-2.17.so
glusterfs 1706 root mem REG 253,1 100384 201408747 /usr/lib64/libgfxdr.so.0.0.0
glusterfs 1706 root mem REG 253,1 112584 201408745 /usr/lib64/libgfrpc.so.0.0.0
glusterfs 1706 root mem REG 253,1 673560 201408753 /usr/lib64/libglusterfs.so.0.0.0
glusterfs 1706 root mem REG 253,1 164336 201328123 /usr/lib64/ld-2.17.so
glusterfs 1706 root mem REG 253,1 26254 134323387 /usr/lib64/gconv/gconv-modules.cache
glusterfs 1706 root 0r CHR 1,3 0t0 6146 /dev/null
glusterfs 1706 root 1w CHR 1,3 0t0 6146 /dev/null
glusterfs 1706 root 2w CHR 1,3 0t0 6146 /dev/null
glusterfs 1706 root 3u a_inode 0,9 0 5833 [eventpoll]
glusterfs 1706 root 4u unix 0xffff8817e2eacb00 0t0 27006 socket
glusterfs 1706 root 5u IPv4 22587659 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1023->gprfs001.sbu.lab.eng.bos.redhat.com:24007 (ESTABLISHED)
glusterfs 1706 root 6r FIFO 0,8 0t0 23473 pipe
glusterfs 1706 root 7w FIFO 0,8 0t0 23473 pipe
glusterfs 1706 root 8u CHR 10,229 0t0 16567 /dev/fuse
glusterfs 1706 root 10u IPv4 23472 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:958->gprfs010-b-10ge.sbu.lab.eng.bos.redhat.com:49157 (ESTABLISHED)
glusterfs 1706 root 11r CHR 1,9 0t0 6151 /dev/urandom
glusterfs 1706 root 12u IPv4 23426 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1004->gprfs002-b-10ge.sbu.lab.eng.bos.redhat.com:49153 (ESTABLISHED)
glusterfs 1706 root 13u IPv4 23417 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1013->gprfs010-b-10ge.sbu.lab.eng.bos.redhat.com:49152 (ESTABLISHED)
glusterfs 1706 root 14u IPv4 27015 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1023->gprfs001-b-10ge.sbu.lab.eng.bos.redhat.com:49152 (ESTABLISHED)
glusterfs 1706 root 15u IPv4 23411 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1019->gprfs009-b-10ge.sbu.lab.eng.bos.redhat.com:49152 (ESTABLISHED)
glusterfs 1706 root 16u IPv4 23413 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1017->gprfs011-b-10ge.sbu.lab.eng.bos.redhat.com:49152 (ESTABLISHED)
glusterfs 1706 root 17u IPv4 23428 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1002->gprfs010-b-10ge.sbu.lab.eng.bos.redhat.com:49153 (ESTABLISHED)
glusterfs 1706 root 18u IPv4 23415 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1015->gprfs002-b-10ge.sbu.lab.eng.bos.redhat.com:49152 (ESTABLISHED)
glusterfs 1706 root 19u IPv4 23420 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:surf->gprfs001-b-10ge.sbu.lab.eng.bos.redhat.com:49153 (ESTABLISHED)
glusterfs 1706 root 20u IPv4 23422 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1008->gprfs009-b-10ge.sbu.lab.eng.bos.redhat.com:49153 (ESTABLISHED)
glusterfs 1706 root 21u IPv4 23424 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:1006->gprfs011-b-10ge.sbu.lab.eng.bos.redhat.com:49153 (ESTABLISHED)
glusterfs 1706 root 22w REG 253,1 815103913 40555 /var/log/glusterfs/pbench.log
glusterfs 1706 root 23u IPv4 23439 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:nas->gprfs010-b-10ge.sbu.lab.eng.bos.redhat.com:49154 (ESTABLISHED)
glusterfs 1706 root 24u IPv4 23431 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:garcon->gprfs001-b-10ge.sbu.lab.eng.bos.redhat.com:49154 (ESTABLISHED)
glusterfs 1706 root 25u IPv4 23433 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:maitrd->gprfs009-b-10ge.sbu.lab.eng.bos.redhat.com:49154 (ESTABLISHED)
glusterfs 1706 root 26u IPv4 23435 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:pop3s->gprfs011-b-10ge.sbu.lab.eng.bos.redhat.com:49154 (ESTABLISHED)
glusterfs 1706 root 27u IPv4 23437 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:imaps->gprfs002-b-10ge.sbu.lab.eng.bos.redhat.com:49154 (ESTABLISHED)
glusterfs 1706 root 28u IPv4 23450 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:980->gprfs010-b-10ge.sbu.lab.eng.bos.redhat.com:49155 (ESTABLISHED)
glusterfs 1706 root 29u IPv4 23442 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:988->gprfs001-b-10ge.sbu.lab.eng.bos.redhat.com:49155 (ESTABLISHED)
glusterfs 1706 root 30u IPv4 23444 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:986->gprfs009-b-10ge.sbu.lab.eng.bos.redhat.com:49155 (ESTABLISHED)
glusterfs 1706 root 31u IPv4 23446 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:984->gprfs011-b-10ge.sbu.lab.eng.bos.redhat.com:49155 (ESTABLISHED)
glusterfs 1706 root 32u IPv4 23448 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:982->gprfs002-b-10ge.sbu.lab.eng.bos.redhat.com:49155 (ESTABLISHED)
glusterfs 1706 root 33u IPv4 23461 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:969->gprfs010-b-10ge.sbu.lab.eng.bos.redhat.com:49156 (ESTABLISHED)
glusterfs 1706 root 34u IPv4 23453 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:977->gprfs001-b-10ge.sbu.lab.eng.bos.redhat.com:49158 (ESTABLISHED)
glusterfs 1706 root 35u IPv4 23455 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:975->gprfs009-b-10ge.sbu.lab.eng.bos.redhat.com:49156 (ESTABLISHED)
glusterfs 1706 root 36u IPv4 23457 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:973->gprfs011-b-10ge.sbu.lab.eng.bos.redhat.com:49156 (ESTABLISHED)
glusterfs 1706 root 37u IPv4 23459 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:971->gprfs002-b-10ge.sbu.lab.eng.bos.redhat.com:49156 (ESTABLISHED)
glusterfs 1706 root 39u IPv4 23464 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:966->gprfs001-b-10ge.sbu.lab.eng.bos.redhat.com:49157 (ESTABLISHED)
glusterfs 1706 root 40u IPv4 23466 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:964->gprfs009-b-10ge.sbu.lab.eng.bos.redhat.com:49157 (ESTABLISHED)
glusterfs 1706 root 41u IPv4 23468 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:962->gprfs011-b-10ge.sbu.lab.eng.bos.redhat.com:49157 (ESTABLISHED)
glusterfs 1706 root 42u IPv4 23470 0t0 TCP perf42.perf.lab.eng.bos.redhat.com:960->gprfs002-b-10ge.sbu.lab.eng.bos.redhat.com:49157 (ESTABLISHED)
What do you mean by a "statedump" of the process?
(In reply to Peter Portante from comment #17)
> What do you mean by a "statedump" of the process?

You would need to do a `kill -SIGUSR1 <fuse_mount_process_pid>` to get the statedump of that process. `gluster --print-statedumpdir` should give you the location where it gets saved. Section 14.6 of the admin guide https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/pdf/Administration_Guide/Red_Hat_Storage-3-Administration_Guide-en-US.pdf has some information on statedumps.
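For example, a minimal sketch using the client PID from the lsof output above (the glusterdump.<pid>.dump.<timestamp> filename pattern is an assumption based on the statedump docs):

kill -SIGUSR1 1706          # PID of the fuse mount process on the client
ls -l /var/run/gluster/     # default dump location; files are typically
                            # named glusterdump.<pid>.dump.<timestamp>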
From what I can tell, this is a server side state dump, not a state dump of the glusterfs daemon on the client side. Do I understand this right?

There is no "gluster" command available on our client perf42, where the memory leak is occurring. And there does not appear to be any signal handling for SIGUSR1 applied to the glusterfs daemon process when I send the signal; at least when I strace the process, nothing happens.

Do I take statedumps for all 6 servers in the cluster? I am not sure how that addresses the memory leak on the client, though.
(In reply to Peter Portante from comment #19)
> From what I can tell, this is a server side state dump, not a state dump of
> the glusterfs daemon on the client side. Do I understand this right?

The command I gave is for the client. Sending a SIGUSR1 to the client process generates the statedump for it, as mentioned in the admin guide.

> There is no "gluster" command available on our client perf42, where the
> memory leak is occurring. And there does not appear to be any signal
> handling for SIGUSR1 applied to the glusterfs daemon process when I send
> the signal; at least when I strace the process, nothing happens.

Is there a /var/run/gluster/ path on the client? If not, create one and try again.
Created attachment 1121563 [details] State dump before drop cache executed Thanks for the clarification of the documentation. I'll file a bug against the documentation to make that easier to understand.
Created attachment 1121564 [details] State dump after drop caches performed Ran the drop caches operation and then took a state dump. No change in the memory usage of the GlusterFS process.
Below is a quick capture of an IRC conversation about the state of the system in this BZ:

itisravi: portante: btw, what kind of files are there in the volume?
portante: thanks, this is helpful, and I'll file a doc enhancement bug to get that clearer, because I was confused by the documentation text
itisravi: portante: regular files or VM images..?
portante: regular files
itisravi: sure
itisravi: ok
portante: lots of small files, though there are a few that are 1 or 2 GB
itisravi: portante: the process still consumes 40 odd GB is it?
portante: I started the command, "find -H . -user root -ls -exec chown -h pbench:pbench {} \;", running about a week ago; it has yet to complete, it is slowly chugging through it
itisravi: right
portante: yes, right now it is reported by top as 49.4GB RES
itisravi: hmm.
portante: all the users of the volume don't see any problems.
portante: the volume is running fine for all intents and purposes
portante: and it is running in a degraded state, no less
portante: One of the peers is reported as: State: Peer Rejected (Connected)
portante: somebody toasted the setup on that host, so I am following the steps in section 8.6.2 of the guide you posted in the BZ to restore that cluster member
itisravi: portante: oh ok, this node in question is hosting one of the replica bricks?
portante: yes, this cluster is a 2x3
itisravi: If the bricks of this node are down, then once they are back, self-heal is going to kick in.
portante: yes, that is what I understand to be the case as well
itisravi: you'll probably notice high cpu usage during the heal.
portante: okay, good to know, it will have the weekend to get a good head start ahead of the coming week
itisravi: right.
itisravi: so the thing is, running a find on the mount triggers lookups and creates inodes for all these files. The no. of inodes in memory is only limited by the RAM, which could explain the high mem. usage.
itisravi: But the fact that dropping caches is not reducing the mem usage seems to indicate memory leaks.
portante: okay
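One way to test that last point (a sketch, not from the conversation): watch the fuse_inode and dentry slabs across a cache drop; entries pinned by leaked references will not be reclaimed:

grep -E 'fuse_inode|dentry' /proc/slabinfo   # counts before
sync
echo 2 > /proc/sys/vm/drop_caches            # 2 = reclaim dentries/inodes only
grep -E 'fuse_inode|dentry' /proc/slabinfo   # counts after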
Notes to self from looking at the client state dumps (Reference: https://github.com/gluster/glusterfs/blob/master/doc/debugging/statedump.md):

1. Mallinfo: mallinfo_fordblks --> /* Total free space (bytes) */
   Before drop_caches: 1538645840
   After drop_caches: 1305856192
   Not sure how the free space can _reduce_ after a drop_cache.
2. Data structure allocation stats: No difference before and after.
3. Mempools: fuse:dentry_t and inode_t have cold_count=0 (hot count ~32K) before drop_cache and high pool misses. This is probably due to the high number of lookups triggered by FUSE. After drop_cache, cold_count and hot_count are approximately 16K each. Seems some inodes were reclaimed due to drop_caches. None of the other xlator pools have zero cold_count.
4. iobufs: All arena.x have been moved to purged.x ==> seems a logical effect of drop_caches.
5. Call stack and frame: I see mostly lookup and readdirp. Nothing suspicious here.
6. [mount/fuse] gf_common_mt_inode_ctx memory usage:
   Before: num_allocs=461330
   After: num_allocs=2007
7. History of operations in FUSE: mostly lookups, stats and readdirps. See lots of ENOENT errors in fuse_entry_cbk for LOOKUP().
8. Memory accounting highest consumers (num_allocs):
   Before:
   num_allocs=5703132 -- [mount/fuse.fuse - usage-type 40 memusage], type=gf_common_mt_strdup
   num_allocs=5703133 -- [performance/md-cache.pbench-md-cache - usage-type 117 memusage], type=gf_mdc_mt_md_cache_t
   num_allocs=694438 -- [mount/fuse.fuse - usage-type 119 memusage], type=gf_fuse_mt_iov_base
   num_allocs=6142231 -- [mount/fuse.fuse - usage-type 48 memusage], type=gf_common_mt_mem_pool
   After:
   num_allocs=695587 -- [mount/fuse.fuse - usage-type 119 memusage], type=gf_fuse_mt_iov_base
   num_allocs=194492

It seemed odd that the num_allocs for gf_fuse_mt_iov_base was the same before and after drop_caches while every other data type saw a drastic reduction. But with some code reading, Raghavendra (Du) pointed out that this is a bug in memory accounting. Need to fix that.
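For anyone repeating point 8, a small sketch that ranks a statedump's memusage sections by num_allocs (the dump filename below is hypothetical):

awk '/^\[/ { sec = $0 }
     /^num_allocs=/ { sub("num_allocs=", ""); print $0, sec }' \
    /var/run/gluster/glusterdump.1706.dump.* | sort -rn | head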
Hi Peter,

While I'm not getting anything conclusive about the high memory usage from the statedumps, they do seem to indicate a high number of lookups and readdirps triggered from FUSE (points 3, 5, 6, 7 and 8 above). The fixes that went in recently upstream, specifically Xavier's patch http://review.gluster.org/#/c/13327/, should help in reducing the leaks.

If I provide a build with the FUSE fixes on top of the latest RHGS-3.1.2 branch, would you be able to test and see if you are still observing the high memory usage? You would need to update both clients and servers.
Yes, I'd be happy to try something out. However, we are running 3.0.4, I believe, so let me get the machines upgraded to RHEL 7.2 and RHGS-3.1.2 first and then we can try that out. Thanks!
I should amend this: we are running on the *client* RHEL 7.1 with:

Installed Packages
glusterfs.x86_64       3.6.0.53-1.el7rhs   @/glusterfs-3.6.0.53-1.el7rhs.x86_64
glusterfs-api.x86_64   3.6.0.53-1.el7rhs   @/glusterfs-api-3.6.0.53-1.el7rhs.x86_64
glusterfs-fuse.x86_64  3.6.0.53-1.el7rhs   @/glusterfs-fuse-3.6.0.53-1.el7rhs.x86_64
glusterfs-libs.x86_64  3.6.0.53-1.el7rhs   @/glusterfs-libs-3.6.0.53-1.el7rhs.x86_64

On the servers we are running RHEL 6.6 with RHGS 3.0.4. So as long as an RHGS-3.1.2 client can work with an RHGS 3.0.4 cluster, then we can do this once the self-heal completes (estimated EOD Tuesday, Feb 16th, right now).
(In reply to Peter Portante from comment #27)
> I should amend this: we are running on the *client* RHEL 7.1 with:
>
> Installed Packages
> glusterfs.x86_64       3.6.0.53-1.el7rhs   @/glusterfs-3.6.0.53-1.el7rhs.x86_64
> glusterfs-api.x86_64   3.6.0.53-1.el7rhs   @/glusterfs-api-3.6.0.53-1.el7rhs.x86_64
> glusterfs-fuse.x86_64  3.6.0.53-1.el7rhs   @/glusterfs-fuse-3.6.0.53-1.el7rhs.x86_64
> glusterfs-libs.x86_64  3.6.0.53-1.el7rhs   @/glusterfs-libs-3.6.0.53-1.el7rhs.x86_64
>
> On the servers we are running RHEL 6.6 with RHGS 3.0.4.

Oh my! You seem to be running a newer version of the client with an older version of the server. This is not supported in general. Moreover, RHGS-3.0.4 has the older afr-v1 (replication translator) code while glusterfs-3.6.x has the re-factored afr-v2 code, which is not backward compatible. :-(

> So as long as an RHGS-3.1.2 client can work with an RHGS 3.0.4 cluster, then
> we can do this once the self-heal completes (estimated EOD Tuesday, Feb
> 16th, right now).
Hi Peter, shall I provide a build with the fixes?
I think we need to get this setup to a supported state first. I'll post here when we have moved this to the supported setup.
Removing the BZ from 3.1.3 for now.
Hi Peter, Are you experiencing any more memory leak issues in 3.1.3? If not, we could close this BZ.
One of our clients recently grew to about 55.8GB, and at that point we took a state dump, and Vijay was going to analyze it to see if any of it represented a memory leak.
We have noticed that the bug is not reproducible in the latest version of the product (RHGS-3.3.1+). If the bug is still relevant and still reproducible, feel free to reopen it.
Clearing needinfo as this bug is closed now.