Created attachment 1479256 [details]
volume options

Description of problem:
FUSE clients consume a lot of memory. In our production environment, a single FUSE client uses ~16G RSS after two weeks of uptime.

Version-Release number of selected component (if applicable):
4.1

How reproducible:
Set up a glusterfs replication cluster with tons of small files. Mount the storage with the FUSE client. Read file metadata and file contents.

Actual results:
RSS memory of the FUSE client grows without bound.

Expected results:
RSS memory doesn't grow without bound :)
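A minimal sketch of the reproduction steps (volume, host, and path names here are illustrative, not our exact production layout):

    # mount the volume with the FUSE client (read-only, as in our setup)
    mount -t glusterfs -o ro srv1:/gl /mnt/gl

    # walk the tree to touch metadata, then read file contents
    find /mnt/gl -type f -exec stat {} + > /dev/null
    find /mnt/gl -type f -exec cat {} + > /dev/null

    # watch the client's RSS grow over time
    ps -o pid,rss,cmd -C glusterfs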
Created attachment 1479257 [details] FUSE client statedump
Please let me know if I can do anything else to help investigate the problem.
Also, I summed all `size` fields in the statedump, and the total is about 4G, which is already too much; interestingly, the other 12G (16G - 4G) are unaccounted for.
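For reference, the size fields can be summed with something like this (just a sketch; the dump file name is illustrative):

    # rough total of all size= fields in one client statedump, printed in GiB
    grep '^size=' /var/run/gluster/glusterdump.<pid>.dump.* \
        | awk -F= '{ total += $2 } END { printf "%.2f GiB\n", total / 1024 / 1024 / 1024 }'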
Finally, I tried to find a leak with valgrind:

valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all --log-file=gluster_fuse.log /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/gl /mnt/gl

Then I did:

find /mnt/gl -type f

Part of the valgrind report (full report in attachment):

==15836== 22,504,608 bytes in 104,188 blocks are possibly lost in loss record 976 of 978
==15836==    at 0x4C2B9B5: calloc (vg_replace_malloc.c:711)
==15836==    by 0x4E85E95: __gf_calloc (in /usr/lib64/libglusterfs.so.0.0.1)
==15836==    by 0x14495FEA: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/md-cache.so)
==15836==    by 0x144961C3: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/md-cache.so)
==15836==    by 0x1449ECC7: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/md-cache.so)
==15836==    by 0x1407C52A: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/quick-read.so)
==15836==    by 0x13E69552: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/io-cache.so)
==15836==    by 0x13C5AA26: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/readdir-ahead.so)
==15836==    by 0x138364C4: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/write-behind.so)
==15836==    by 0x135E3CBF: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/cluster/dht.so)
==15836==    by 0x132F02B5: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/cluster/afr.so)
==15836==    by 0x130B0439: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so)
==15836==
==15836== 25,005,360 bytes in 104,189 blocks are possibly lost in loss record 977 of 978
==15836==    at 0x4C2B9B5: calloc (vg_replace_malloc.c:711)
==15836==    by 0x4E85E95: __gf_calloc (in /usr/lib64/libglusterfs.so.0.0.1)
==15836==    by 0x13E70496: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/io-cache.so)
==15836==    by 0x13E68FF9: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/io-cache.so)
==15836==    by 0x13E693FA: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/io-cache.so)
==15836==    by 0x13C5AA26: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/readdir-ahead.so)
==15836==    by 0x138364C4: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/performance/write-behind.so)
==15836==    by 0x135E3CBF: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/cluster/dht.so)
==15836==    by 0x132F02B5: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/cluster/afr.so)
==15836==    by 0x130B0439: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so)
==15836==    by 0x515306F: ??? (in /usr/lib64/libgfrpc.so.0.0.1)
==15836==    by 0x5153402: ??? (in /usr/lib64/libgfrpc.so.0.0.1)
==15836==
==15836== 66,357,720 bytes in 127,611 blocks are possibly lost in loss record 978 of 978
==15836==    at 0x4C2B9B5: calloc (vg_replace_malloc.c:711)
==15836==    by 0x4E85E95: __gf_calloc (in /usr/lib64/libglusterfs.so.0.0.1)
==15836==    by 0x4E6A620: ??? (in /usr/lib64/libglusterfs.so.0.0.1)
==15836==    by 0x4E6BACA: inode_new (in /usr/lib64/libglusterfs.so.0.0.1)
==15836==    by 0x13067018: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so)
==15836==    by 0x130A695C: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so)
==15836==    by 0x130B02C2: ??? (in /usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so)
==15836==    by 0x515306F: ??? (in /usr/lib64/libgfrpc.so.0.0.1)
==15836==    by 0x5153402: ??? (in /usr/lib64/libgfrpc.so.0.0.1)
==15836==    by 0x514F2F2: rpc_transport_notify (in /usr/lib64/libgfrpc.so.0.0.1)
==15836==    by 0x11FAA675: ??? (in /usr/lib64/glusterfs/4.1.3/rpc-transport/socket.so)
==15836==    by 0x11FAD146: ??? (in /usr/lib64/glusterfs/4.1.3/rpc-transport/socket.so)
==15836==
==15836== LEAK SUMMARY:
==15836==    definitely lost: 3,832,872 bytes in 15,543 blocks
==15836==    indirectly lost: 1,736,704 bytes in 6,784 blocks
==15836==      possibly lost: 233,130,874 bytes in 1,009,120 blocks
==15836==    still reachable: 1,792,889 bytes in 7,738 blocks
==15836==         suppressed: 0 bytes in 0 blocks
==15836==
==15836== For counts of detected and suppressed errors, rerun with: -v
==15836== ERROR SUMMARY: 2528 errors from 602 contexts (suppressed: 0 from 0)
Created attachment 1479358 [details] valgrind report
What is the available size of RAM? We have previously noticed huge memory consumption by the client process when there is no other pressure on client memory. I.e., if there is memory pressure on the process, the kernel sends 'forget()' for the inodes it has looked up. That is the only time we see memory usage coming down. Otherwise, the kernel remembers all these inodes, and so glusterfs keeps all of them in memory, which results in high memory usage.
~16G RSS was consumed on a server with 64G RAM.
On another server it was ~8G RSS on 16G RAM (available memory at that moment was ~1G).
After I dropped the inode cache with `echo 3 > /proc/sys/vm/drop_caches`, nothing changed.
This issue is critical for us because the mount process keeps getting killed by the OOM killer. Can I help you track down the problem? Debug mode, more statedumps, valgrind reports?
Znamensky, so that we can set up an almost identical environment, we would like you to provide the 'gluster volume info' output. That way, we can try to use the same set of options. We will go through the valgrind report too. ETA to get back is 7th Sept. Meanwhile, a statedump of the client taken while the issue is happening would also be great, in addition to the valgrind report that is already attached.
Something unrelated to the issue here: I noticed the snippet below in the statedump:

----
[mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
size=3434686672
num_allocs=1
max_size=4294962768
max_num_allocs=19
total_allocs=66157424
----

This is the result of a 32-bit overflow while computing size and max_size... It made me look in the wrong place and search fuse-bridge for a leak for more than 20 minutes just because of such a huge usage report. That needs to be fixed for sure.

I see from the valgrind report that SSL is enabled in the setup, so I am not sure whether that contributes to the issue. I wasn't able to pinpoint the leak by looking at the reports. Will keep this bug updated!
Znamensky Pavel, if this is happening frequently (i.e., memory growth and OOM kill), I have one request while I analyse the issue. Can you try the commands below (one at a time; assuming gl is the volume name)?

$ gluster volume set gl readdir-ahead disable
run your tests, check memory usage...

$ gluster volume set gl io-cache disable
run your tests, check memory usage...

$ gluster volume set gl quick-read disable
run your tests, check memory usage...

From your statedump, I see that quick-read was very useful for this workload, mainly because the hit rate was very high compared to misses. So I wouldn't recommend disabling it permanently; this is only an ask for the test period.
Created attachment 1480809 [details] reports for v4.1 and directory with tons of files and subdirs
I checked version 3.12 and it looks like it already has this problem.
Also, I ran `find -type f ...` on 4.1 on a huge directory with tons of files. After 12 hours, the valgrind process consumed about 10G RSS on a server with 16G RAM (I don't know valgrind's memory overhead, but RSS definitely grew over time). You can find all the info and reports in v4.1_12hours_experiment.zip.

About the last 3 tests: on each iteration I ran `find -type f ...` on the same directory (not very big, in fact) and then remounted the storage. Results:
- without `readdir-ahead`, the gluster process consumed about 270MB RSS (nearly the same as with `readdir-ahead`)
- without `io-cache` and `readdir-ahead`: ~220MB
- without `quick-read`, `io-cache`, `readdir-ahead`: ~220MB

Our base config:

Volume Name: gl
Type: Replicate
Volume ID: a90603ab-cf81-423d-b28f-3978e5afdcb8
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv1:/gl
Brick2: srv2:/gl
Options Reconfigured:
cluster.readdir-optimize: on
client.event-threads: 2
server.event-threads: 16
network.ping-timeout: 20
cluster.data-self-heal-algorithm: full
performance.io-thread-count: 16
performance.io-cache: on
performance.cache-size: 32MB
performance.quick-read: on
transport.address-family: inet6
performance.readdir-ahead: enable
nfs.disable: on
I just tried putting glusterfs inside a cgroup with a memory limit (thanks to systemd). And it works! Glusterfs consumes nearly 100MB RSS (the limit for the cgroup is 100MB), and even the 12-hour test can't force glusterfs to overuse memory! I'm not sure about the performance impact, but it looks like a workaround for cases when the FUSE client doesn't release memory.
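For reference, a minimal sketch of this kind of setup with a transient systemd unit (the unit name, limit, and command line are illustrative, not my exact configuration; MemoryLimit= is the cgroup-v1 property, MemoryMax= is the cgroup-v2 equivalent):

    # run the FUSE client in the foreground inside a memory-limited transient unit
    systemd-run --unit=glusterfs-gl -p MemoryLimit=100M \
        /usr/sbin/glusterfs -N --read-only --process-name fuse \
        --volfile-server=srv --volfile-id=/gl /mnt/gl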
Good to hear! It is not a bad option; let us know how it goes over a longer period!
Unfortunately, after ~10 hours of testing the process was killed anyway. But during all that time it didn't consume more than 100MB.
Would be good to try the runs with SSL enabled, and see if we find anything different.
We have to set up SSL on our production cluster. It's not easy and it looks like it can't be done without downtime. Can I help you in another way?
(In reply to Znamensky Pavel from comment #18)
> We have to set up SSL on our production cluster. It's not easy and it looks
> like it can't be done without downtime.
> Can I help you in another way?

No, Znamensky, let's first try to see if we can understand the leak locally. Don't disturb the production cluster for now! I guess that with the proper workload, it should be possible to identify the leak outside of the production cluster too.
(In reply to Amar Tumballi from comment #17)
> Would be good to try the runs with SSL enabled, and see if we find anything
> different.

Is this for me?
Is there any news on this? It is still critical for our production. 46G RSS on v4.1.2 after two months of uptime.
We have seen a similar issue with v4.1.4 in our environment (2-way replication). We don't have SSL enabled either. Please let us know if there's any additional info that would help with the resolution.
The issue was observed in v5.0.1 as well. Can anyone help with a resolution?
(In reply to Yan from comment #23)
> The issue was observed in v5.0.1 as well. Can anyone help with a resolution?

Hi,

Please provide the following on the v5.0.1 client:

1. gluster volume info (I want to check whether anything has changed since comment #13)
2. A statedump of the client process that is consuming a lot of memory (one that is not controlled by cgroups):
   - make sure /var/run/gluster exists on the client machine
   - kill -SIGUSR1 <gluster-fuse-process-pid>

Do this twice with an interval of maybe 30 minutes in between. Please send me the glusterdump.* files generated in /var/run/gluster.

After collecting the above, if not already done, disable readdir-ahead on the volume:

gluster v set <volname> performance.readdir-ahead off

and see if the memory continues rising.
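Something along these lines works for collecting the two dumps (a sketch; replace <gluster-fuse-process-pid> with the actual client PID):

    # take two statedumps of the FUSE client, 30 minutes apart
    mkdir -p /var/run/gluster
    kill -SIGUSR1 <gluster-fuse-process-pid>
    sleep 1800
    kill -SIGUSR1 <gluster-fuse-process-pid>
    ls -l /var/run/gluster/glusterdump.*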
(In reply to Znamensky Pavel from comment #1)
> Created attachment 1479257 [details]
> FUSE client statedump

I just checked this:

pool-name=dict_t
active-count=1291721   <--- this is really high and points to a leak
sizeof-type=168
padded-sizeof=256
size=330680576
shared-pool=0x7f709e813cc8

This looks similar to the readdir-ahead leak fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1659432. Please disable readdir-ahead for the time being and see if the problem goes away.
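To spot this quickly in future dumps, the pool counters can be pulled out with something like this (a sketch; the glob assumes the dumps are in the default /var/run/gluster location):

    # show active-count for each memory pool in the collected statedumps
    grep -A1 '^pool-name=' /var/run/gluster/glusterdump.* | grep -E 'pool-name|active-count'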
After disabling the readdir-ahead flag, the memory leak is reduced but not completely eliminated. Please suggest further steps.
(In reply to Yan from comment #26)
> After disabling the readdir-ahead flag, the memory leak is reduced but not
> completely eliminated. Please suggest further steps.

Please provide the following:
1. Statedumps of the client process at intervals where you see an increase in memory
2. A test script to reproduce the issue, if available
Created attachment 1517308 [details] statedump output with readdir_ahead on and off with gluster 5.1
Please refer to attachment 1517308 [details] for statedump output taken every half hour.

1). The test was done with 5.1. A similar issue has been observed in 3.12.13 and 4.1.4 as well.

# gluster --version
glusterfs 5.1

2). The above statedump output was collected while running a "find" operation every second on the mount dir:

#!/bin/sh
a=0
while [ $a -lt 36000 ]
do
    find /gserver_mount1/ -type f > /dev/null
    sleep 1
    #a=`expr $a + 1`
done

3). Here's the volume info with readdir_ahead on:

# gluster volume info

Volume Name: glustervol1
Type: Replicate
Volume ID: 47aecf8c-de2f-43a5-8cab-64832bd28bd1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.65.246:/mnt/data1/1
Brick2: 10.76.65.247:/mnt/data1/1
Brick3: 10.76.65.248:/mnt/data1/1
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.read-ahead: on
performance.readdir-ahead: on

4). Memory leakage is reduced with readdir_ahead off (refer to Bug 1659432).

5). The section below has an invalid size number:

[mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
size=18446744073709524184
num_allocs=1
max_size=18446744073709551608
max_num_allocs=22
total_allocs=559978326
I'm sorry for the delay. But I can confirm that the version patched with https://bugzilla.redhat.com/show_bug.cgi?id=1659432 and with readdir-ahead disabled still leaks at nearly the same speed as v4.1.
(In reply to Znamensky Pavel from comment #30)
> I'm sorry for the delay. But I can confirm that the version patched with
> https://bugzilla.redhat.com/show_bug.cgi?id=1659432 and with readdir-ahead
> disabled still leaks at nearly the same speed as v4.1.

Then it is likely to be because the fuse client does not invalidate inodes. Does your workload access a lot of files? The earlier statedump showed around 3 million inodes in memory:

pool-name=inode_t
active-count=3050376   <---- Here
sizeof-type=160
padded-sizeof=256
size=780896256
shared-pool=0x5141cc8

Does the statedump without readdir-ahead also show a large number of inodes?

https://review.gluster.org/#/c/glusterfs/+/19778/ has a fix to invalidate inodes but is not targeted for release 5 as yet.
(In reply to Yan from comment #29)
> Please refer to attachment 1517308 [details] for statedump output taken every
> half hour.
>
> 1). The test was done with 5.1. A similar issue has been observed in 3.12.13
> and 4.1.4 as well.
>
> # gluster --version
> glusterfs 5.1
>
> 2). The above statedump output was collected while running a "find" operation
> every second on the mount dir.

Observations:

For the statedumps captured with readdir-ahead off:
1. There is no increase in the number of inodes. Did you see the memory rising continuously after the first run (where gluster will create and cache the inodes) while the test was running in this case? If yes, by how much?
2. The number of dict_t allocs also remains constant, though high. Was readdir-ahead on initially for this volume, and was it remounted after turning readdir-ahead off? If not, the high number is due to the leak from when readdir-ahead was enabled, as the memory will not be freed if readdir-ahead is disabled but the volume is not remounted.

With readdir-ahead on:
1. The inodes do not increase. dict_t does, but that is a known issue, so I would expect the memory to rise constantly.
(In reply to Nithya Balachandran from comment #31)
> Then it is likely to be because the fuse client does not invalidate inodes.
> Does your workload access a lot of files? The earlier statedump showed
> around 3 million inodes in memory.
>
> ...
>
> https://review.gluster.org/#/c/glusterfs/+/19778/ has a fix to invalidate
> inodes but is not targeted for release 5 as yet.

Nithya, you're right!
I built glusterfs from the current master (https://github.com/gluster/glusterfs/tree/d9a8ccd354df6db94477bf9ecb09735194523665) with the new inode invalidation mechanism that you mentioned, and RSS memory consumption indeed became much lower. And as you supposed, our apps quite often access a lot of files.
Here are two tests with clients on v6dev and v4.1 (the server is still on v4.1 and read-ahead=on).

The first test, with the default --lru-limit=0 (just did `find /in/big/dir -type f`):

v4.1 - ~3GB RSS:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 633 6.6 18.5 3570216 3056136 ? Ssl 19:44 6:25 /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

v6dev - ~1.5GB RSS:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 10851 16.5 9.2 2071036 1526456 ? Ssl 19:45 15:50 /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

It looks good. Let's do the next test.

The second test, with --lru-limit=10000 for v6dev:

v4.1 - ~3GB RSS:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 3589 4.7 18.6 3570216 3060364 ? Ssl 13:11 18:40 /usr/sbin/glusterfs --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

v6dev - ~170MB RSS:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 24152 14.2 1.0 758768 173704 ? Ssl 13:58 49:06 /usr/sbin/glusterfs --lru-limit=10000 --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

170MB vs. 3GB! It's incredible!
Unfortunately, the new version has a drawback: CPU time increased 2.5x. At the moment that doesn't matter for us.
Anyway, I'm sure this change solves our problem. And of course, we're looking forward to a stable version with it.
Thank you a lot!
We are seeing what looks like the same issue on glusterfs 4.1.5. I'm not sure if further information is still needed, given the last few comments, but I've collected client statedumps from three systems, along with the gluster volume info, here: https://s3.amazonaws.com/public-rhbz/glusterfs-client-4.1.5-statedumps.zip

This includes 2 x client statedumps from each of 3 systems, taken roughly 30 to 60 minutes apart.

The "webserver" and "appserver-active" gluster clients were restarted after setting readdir-ahead to off as suggested earlier in the ticket (this didn't seem to help much in this case), but the "webserver" client has already reached about 15GiB VIRT in just over 48 hours.

We had also historically seen somewhat slower, but still significant, fuse client memory leaks on v3.x (I think 3.11 or 3.12), but not (or at least not significantly) on 3.7.11 or 4.0.2 with very similar workloads.
I believe that I'm able to artificially trigger this using fs_mark, for example:

mkdir /srv/gluster/fsmark
cd /srv/gluster/fsmark
fs_mark -L 500 -d $PWD -v -S 0 -D 128 -n 1000 -s $[8*1024]

That's 500 rounds of 128 threads, each creating and deleting 1000 8KiB files in a per-thread subdirectory, where /srv/gluster is a gluster volume mounted with the fuse client.
(In reply to Znamensky Pavel from comment #33)
> (In reply to Nithya Balachandran from comment #31)
> > Then it is likely to be because the fuse client does not invalidate inodes.
> > Does your workload access a lot of files? The earlier statedump showed
> > around 3 million inodes in memory.
> >
> > ...
> >
> > https://review.gluster.org/#/c/glusterfs/+/19778/ has a fix to invalidate
> > inodes but is not targeted for release 5 as yet.
>
> Nithya, you're right!
> I built glusterfs from the current master
> (https://github.com/gluster/glusterfs/tree/d9a8ccd354df6db94477bf9ecb09735194523665)
> with the new inode invalidation mechanism that you mentioned, and RSS memory
> consumption indeed became much lower. And as you supposed, our apps quite
> often access a lot of files.
> Here are two tests with clients on v6dev and v4.1 (the server is still on
> v4.1 and read-ahead=on).
>
> The first test, with the default --lru-limit=0 (just did `find /in/big/dir -type f`):
>
> v4.1 - ~3GB RSS:
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> root 633 6.6 18.5 3570216 3056136 ? Ssl 19:44 6:25 /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1
>
> v6dev - ~1.5GB RSS:
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> root 10851 16.5 9.2 2071036 1526456 ? Ssl 19:45 15:50 /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1
>
> It looks good. Let's do the next test.
>
> The second test, with --lru-limit=10000 for v6dev:
>
> v4.1 - ~3GB RSS:
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> root 3589 4.7 18.6 3570216 3060364 ? Ssl 13:11 18:40 /usr/sbin/glusterfs --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1
>
> v6dev - ~170MB RSS:
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> root 24152 14.2 1.0 758768 173704 ? Ssl 13:58 49:06 /usr/sbin/glusterfs --lru-limit=10000 --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1
>
> 170MB vs. 3GB! It's incredible!
> Unfortunately, the new version has a drawback: CPU time increased 2.5x.
> At the moment that doesn't matter for us.
> Anyway, I'm sure this change solves our problem. And of course, we're
> looking forward to a stable version with it.
> Thank you a lot!

Thank you for testing this. I'm glad to hear the patch is working as expected to keep the memory use down.
> Unfortunately, the new version has a drawback: CPU time increased 2.5x.
> At the moment that doesn't matter for us.
> Anyway, I'm sure this change solves our problem. And of course, we're
> looking forward to a stable version with it.

While a release with this patch merged and tested is another 50 days away, we would also like to reduce the CPU load you see. Whenever you get time, please capture CPU info with the tool below:

perf record -ag --call-graph=dwarf -o perf.data -p <pid of glusterfs process>

and then look at "perf report" to see what actually caused the CPU usage; that will help us resolve it too.

Also note that lru-limit=10000 may not be a good value when many files are accessed. I recommend at least something like 64k. But it depends on your memory needs too; if you can give glusterfs 512MB - 1GB of RAM, that is better, at least for performance.
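For example, the client could be started with the suggested larger limit like this (a sketch reusing the command line from your earlier tests; pick a value that fits your memory budget):

    # FUSE client with a 64k inode LRU limit
    /usr/sbin/glusterfs --lru-limit=65536 --process-name fuse \
        --volfile-server=srv --volfile-id=/st1 /mnt/st1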
(In reply to Amar Tumballi from comment #37)
> While a release with this patch merged and tested is another 50 days away, we
> would also like to reduce the CPU load you see. Whenever you get time, please
> capture CPU info with the tool below:
>
> perf record -ag --call-graph=dwarf -o perf.data -p <pid of glusterfs process>
>
> and then look at "perf report" to see what actually caused the CPU usage; that
> will help us resolve it too.

I'm sorry for the delay. I ran `find` on a directory with ~200,000 files on v6 with lru-limit, on v6 without lru-limit, and on v4.1. `ps aux` shows this:

v6 with lru-limit=10000
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 30968 2.2 0.3 637936 54056 ? Ssl 18:15 0:11 /usr/sbin/glusterfs --read-only --lru-limit=10000 --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

v6 without lru-limit
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 31193 2.0 2.0 834544 330680 ? Ssl 18:32 0:09 /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

v4.1
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 31416 1.3 2.0 817704 344996 ? Ssl 18:41 0:06 /usr/sbin/glusterfs --read-only --process-name fuse --volfile-server=srv --volfile-id=/st1 /mnt/st1

Also, I've got the `perf` reports, but I'm afraid I can't interpret them correctly. Unfortunately, these reports contain sensitive information, so I can't attach them to this issue. Nevertheless, if it would be helpful, I could send them directly to you.

(In reply to Amar Tumballi from comment #37)
> Also note that lru-limit=10000 may not be a good value when many files are
> accessed. I recommend at least something like 64k. But it depends on your
> memory needs too; if you can give glusterfs 512MB - 1GB of RAM, that is
> better, at least for performance.

Thanks for the advice! Also, I suppose I should create a new issue for the CPU problem, shouldn't I?
REVIEW: https://review.gluster.org/22014 (fuse: add --lru-limit option) merged (#3) on release-5 by Shyamsundar Ranganathan
This bug is getting closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-5.3, please open a new bug report.

glusterfs-5.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-January/000118.html
[2] https://www.gluster.org/pipermail/gluster-users/