Bug 1738878
| Field | Value |
|---|---|
| Summary | FUSE client's memory leak |
| Product | [Community] GlusterFS |
| Component | core |
| Status | CLOSED NEXTRELEASE |
| Severity | low |
| Priority | high |
| Version | 5 |
| Reporter | Sergey Pleshkov <s.pleshkov> |
| Assignee | Csaba Henk <csaba> |
| CC | bugs, nbalacha, pasik, s.pleshkov |
| Hardware | Unspecified |
| OS | Linux |
| Type | Bug |
| Last Closed | 2019-10-25 05:03:40 UTC |
| Attachments | statedump1 (attachment 1609209) |

Description
Sergey Pleshkov
2019-08-08 10:38:26 UTC
Shared two statedumps: https://cloud.hostco.ru/s/w9MY6jj5Hpj2qoa

Server and client OS: Red Hat Enterprise Linux Server release 7.6 (Maipo) / Red Hat Enterprise Linux Server release 7.5 (Maipo). When the client used the gluster client from the RH repo (version 3.12), the situation was the same.

If it isn't a version-specific bug, would you have suggestions as to what it could be? Which gluster volume options should be checked, and so on?

(In reply to Sergey Pleshkov from comment #2)
Please provide the gluster volume info for this volume. Do you have any script/steps we can use to reproduce the leak?

    [root@LSY-GL-01 host]# gluster volume info PROD
    Volume Name: PROD
    Type: Replicate
    Volume ID: f54a0ce9-d2ec-4d44-a1f8-c53cf1c49a52
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 3 = 3
    Transport-type: tcp
    Bricks:
    Brick1: lsy-gl-01:/diskForData/prod
    Brick2: lsy-gl-02:/diskForData/prod
    Brick3: lsy-gl-03:/diskForData/prod
    Options Reconfigured:
    performance.readdir-ahead: off
    client.event-threads: 24
    server.event-threads: 24
    server.allow-insecure: on
    features.shard-block-size: 64MB
    features.shard: on
    network.ping-timeout: 5
    transport.address-family: inet
    nfs.disable: on
    performance.client-io-threads: off
    performance.io-thread-count: 24
    cluster.heal-timeout: 120

This problem arose on one production client, so I can't immediately check which steps reproduce it without interrupting business processes. I will try to reproduce the behavior on the test cluster and let you know.

Hi Sergey, can you please take statedumps at regular intervals during your test (say, every 30 minutes, but feel free to adjust in light of the dynamics of the situation) so that we can observe the progress, and tar them up and attach them to the bug?

Hi everybody, I will try to reproduce this problem in a test environment next week and will take statedumps during testing.

Created attachment 1609209 [details]
statedump1
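
(For reference, periodic FUSE client statedumps of the kind requested above can be taken by sending SIGUSR1 to the glusterfs client process; by default the dumps are written to /var/run/gluster as glusterdump.<pid>.dump.<timestamp>, the naming seen later in this bug. The following is only a sketch: the mount point, interval and dump count are illustrative assumptions, not values from this bug.)

    #!/bin/sh
    # Sketch only: take a statedump of the FUSE client every 30 minutes
    # during the test, then archive the dumps for attaching to the bug.
    # /mnt/prod, the 30-minute interval and the count of 12 are assumptions.
    MOUNT=/mnt/prod
    PID=$(pgrep -f "glusterfs.*$MOUNT")   # assumes a single glusterfs client for this mount
    i=0
    while [ $i -lt 12 ]
    do
        kill -USR1 "$PID"                 # SIGUSR1 makes glusterfs write a statedump
        sleep 1800
        i=`expr $i + 1`
    done
    tar czf statedumps.tar.gz /var/run/gluster/glusterdump."$PID".dump.*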
Hello. Yesterday I ran tests on the problem client: the find and chmod commands on the gluster share. The glusterfs process continuously eats RAM while they run and does not free it afterwards. On another client that uses glusterfs version 3.12.2 (from the RHEL 7 repo) I encountered a similar situation: the glusterfs process eats RAM and does not free it either (but it grows very slowly). On other clients that access the same gluster volume, RAM is also eaten up while the find and chmod tests run, but it is freed once the tests stop.

I collected a few statedumps from the problem client and put them in the cloud: https://cloud.hostco.ru/s/w9MY6jj5Hpj2qoa

In the near future I plan to upgrade glusterfs on the client to version 6.5 and set lru-limit (I don't know what else I can do about this problem). Do you have any advice about it?

Script to reproduce the problem:

    #!/bin/sh
    a=0
    while [ $a -lt 36000 ]
    do
        find $gluster_mount_point -type f > /dev/null
        sleep 1
        a=`expr $a + 1`
    done
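
(For reference, the lru-limit mentioned above is settable on 6.x clients, presumably via the lru-limit FUSE mount option. A minimal sketch of setting it: the server and volume names come from the volume info earlier in this bug, while the mount point and the value 65000 (the value seen in the later statedump) are assumptions.)

    # Sketch only: mount the PROD volume with a bounded inode LRU list.
    # lsy-gl-01 and PROD come from the volume info above; /mnt/prod and
    # the limit of 65000 are assumptions for illustration.
    mount -t glusterfs -o lru-limit=65000 lsy-gl-01:/PROD /mnt/prod

    # Equivalent /etc/fstab entry:
    # lsy-gl-01:/PROD  /mnt/prod  glusterfs  defaults,_netdev,lru-limit=65000  0 0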
Hello. Yesterday I upgraded the client to version 6.5 and set the lru-limit; the continuous consumption of RAM was solved by this workaround. I gathered a couple of statedumps if anybody wants to see them: https://cloud.hostco.ru/s/w9MY6jj5Hpj2qoa

But I ran into another problem after this update. The server software is version 5.5 and the client is version 6.5, and every time I write a file to a mounted shared folder I see this error in the logs (with or without the lru-limit option):

    [2019-08-30 08:31:04.763118] E [fuse-bridge.c:220:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f361d877a3b] (--> /usr/lib64/glusterfs/6.5/xlator/mount/fuse.so(+0x81d1)[0x7f3614c261d1] (--> /usr/lib64/glusterfs/6.5/xlator/mount/fuse.so(+0x8aaa)[0x7f3614c26aaa] (--> /lib64/libpthread.so.0(+0x7dd5)[0x7f361c6b5dd5] (--> /lib64/libc.so.6(clone+0x6d)[0x7f361bf7dead] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Files are created in the shared folder, I can see them on other clients, and their contents can be updated. Do I need to open another bug for this issue?

Yes, this is a known phenomenon and basically harmless, if not too frequent. The effect of the lru-limit patch is that the glusterfs client asks the kernel to drop those inodes which have been inactive for a long time. (The client can't get rid of them on its own; it needs to keep the inode context as long as the kernel holds a reference to them. The kernel indicates to the glusterfs client when it abandons all references to an inode (this is called "forgetting the inode"), so this condition is known to the client; what the glusterfs client can do is ask the kernel to evict an inode from its caches (this is called "inode invalidation"), which usually implies abandoning references to it and forgetting it.) However, this is a naturally racy situation: by the time the glusterfs client sends the kernel the request to invalidate a given inode, the kernel might have already forgotten it, and the reference sent with the invalidation request is dangling. The kernel provides feedback about this situation by failing the write to /dev/fuse that carries the invalidation request with errno ENOENT, "No such file or directory". If this occurs, there is nothing wrong about it in itself. However, if this scenario proliferates, it indicates that the glusterfs client is getting overwhelmed by its own invalidation requests, so that they accumulate faster than they can be processed and written to /dev/fuse. This might result in thrashing performance (while being useless in terms of inode footprint reduction). To overcome this, we introduced a tunable that stops filing invalidation requests once the number of outstanding invalidation requests hits a threshold. This is implemented in https://review.gluster.org/23187. Do you think you need, or would be interested in trying, this patch?

Hello. Users of this client software do not complain about performance at the moment; this error just bothered me. Will this patch be included in client version 6.6? Since this client is in production, the software is installed from the repository, and there are no user complaints, I think I will wait for the next release. Thanks for the clarification of the error.

Hello. After 6 days of using client 6.5 with the lru-limit option, the client's glusterfs process is still accumulating RAM (growing from 0.9% to 1.3%). I took two statedumps, one immediately after installing the client software and one today; everything looks the same as before installing 6.5. If I'm not mistaken, the dict_t mempool is filling up. https://cloud.hostco.ru/s/w9MY6jj5Hpj2qoa

Are there any other suggestions or tips for this situation?

From the statedump:

    [mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
    size=117432480
    num_allocs=212740

This is likely due to the bug that was fixed in https://review.gluster.org/#/c/glusterfs/+/23016/. That fix needs to be backported to release-6 as well.

The lru-limit has been set to 65000, so there are still over 65K inodes in memory. Those will not be freed until the entries are deleted or the client is remounted. The memory for these inodes, dentries and the associated per-inode information of the various xlators will use up memory.

    [nbalacha@dhcp35-62 Downloads]$ grep -A3 -B3 lru_size glusterdump.6947.dump.1567662323
    xlator.mount.fuse.itable.name=meta-autoload/inode
    xlator.mount.fuse.itable.lru_limit=65000
    xlator.mount.fuse.itable.active_size=4613
    xlator.mount.fuse.itable.lru_size=64996
    xlator.mount.fuse.itable.purge_size=0
    xlator.mount.fuse.itable.invalidate_size=0

You could reduce the lru-limit value further; that will lower the maximum number of inodes held in memory and the associated structures the various xlators create per inode. For the dict_t, we will need more information before we can proceed. Do you have a test script that reproduces the problem?
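
(Following the suggestion above to reduce lru-limit further, a hedged sketch of remounting with a lower value and checking the FUSE inode table counters in a fresh statedump; the mount point, the value 16384 and the dump path are assumptions, not recommendations from this bug.)

    # Sketch only: remount with a lower lru-limit, then trigger a statedump
    # and inspect the inode table sizes. /mnt/prod and 16384 are illustrative.
    umount /mnt/prod
    mount -t glusterfs -o lru-limit=16384 lsy-gl-01:/PROD /mnt/prod

    kill -USR1 $(pgrep -f "glusterfs.*/mnt/prod")   # write a new statedump
    sleep 2
    grep -E 'itable\.(lru_limit|lru_size|active_size)' /var/run/gluster/glusterdump.*.dump.*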