Created attachment 1378455 [details]
volumes configuration

Description of problem:
We have a production system with GlusterFS replicating data between two servers. The replicated directory is 2 GB in total and contains roughly 20-30 files. We noticed that memory consumption of the glusterfsd process grows permanently on both servers, at about 30 MB/day. The increase is irregular: memory can stay constant for 3 hours and then abruptly jump by 16 MB. We are running the 3.8.8-1 release, but we have verified that the issue is still reproducible on the latest GlusterFS 3.13 release. Our operating system is Debian Jessie. I can provide any log or debug info you need to fix this issue; just tell me what to provide.

Version-Release number of selected component (if applicable): 3.13

How reproducible:

Steps to Reproduce:
1. Configure two volumes on two different servers in replication mode (configuration in attachment).
2. Modify the content of the files periodically (for instance, once per minute).
3. Observe that memory usage of glusterfsd permanently increases.

Actual results:
Memory consumption of the glusterfsd process grows and is never released.

Expected results:
Memory consumption of glusterfsd should not grow permanently while the size of the mounted directory is not increasing.

Additional info:
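For anyone trying to follow the steps above, the setup can be sketched roughly as below. The hostnames (nodeA/nodeB), brick paths and mount point are placeholders, not the actual production values; the real volume options are in the attached configuration.

```shell
# Sketch of the reproduction setup (run the peer/volume commands on nodeA).
# Hostnames, paths and the 60 s interval are illustrative placeholders.
gluster peer probe nodeB
gluster volume create home replica 2 nodeA:/data/brick/home nodeB:/data/brick/home
gluster volume start home

# On a client: mount the volume and modify a file periodically,
# then watch glusterfsd's resident size over time.
mount -t glusterfs nodeA:/home /mnt/home
while true; do
    date > /mnt/home/heartbeat.txt   # small periodic modification
    sleep 60
done
```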
What do the modifications you make to the files consist of?

If you increase the frequency of the changes, does the memory usage grow faster?

How many days have you tracked this memory increase? Has it been growing at the same speed all the time, or has the rate of growth decreased after some days?

It would also be interesting to provide statedumps of one of the glusterfsd processes at two points in time when memory utilization is significantly different. You can find information on how to generate statedumps here: http://docs.gluster.org/en/latest/Troubleshooting/statedump/
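For quick reference, the statedump procedure from the page linked above boils down to either of the following (assuming the volume is named "home", as in the attached configuration); the dump files appear under /var/run/gluster by default:

```shell
# Trigger a statedump of the brick (glusterfsd) processes of volume "home";
# output files are named glusterdump.<pid>.dump.<timestamp>.
gluster volume statedump home

# Alternatively, signal a single process directly:
kill -USR1 "$(pidof glusterfsd)"
```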
If you increase the frequency of the changes, does the memory usage grow faster?
Yes.

How many days have you tracked this memory increase?
On our system I have seen permanent memory growth for 3 weeks.

Has it been growing at the same speed all the time, or has the rate of growth decreased after some days?
The same rate (see the rate.txt file in the attachment).

To collect debug statedumps, two different test cases were performed.

Test case 1: the system was left idle overnight. No new files were created. Every 5 seconds "sudo gluster volume heal home info" was called to check that replication is OK.

In the evening:

Node A:
~$ date
Tue Jan 9 16:28:43 UTC 2018
~$ sudo ps -ely | grep gluster
S 0  6060 1 0 80 0 38224 246871 - ? 00:00:00 glusterfsd
S 0  6085 1 0 80 0 22772 155022 - ? 00:00:00 glusterfs
S 0 39482 1 0 80 0 23552 117579 - ? 00:00:01 glusterd
S 0 40111 1 0 80 0 50496 150073 - ? 00:00:01 glusterfs

Node B:
~$ date
Tue Jan 9 16:27:33 UTC 2018
~$ sudo ps -ely | grep gluster
S 0 158949 1 0 80 0 20340 117579 - ? 00:00:00 glusterd
S 0 160258 1 0 80 0 38308 246871 - ? 00:00:00 glusterfsd
S 0 160278 1 0 80 0 22256 119670 - ? 00:00:00 glusterfs
S 0 160379 1 0 80 0 39068 161520 - ? 00:00:00 glusterfs

Attachment (statedumps): NodeA_test_case_1_evening.6060.dump, NodeB_test_case1_evening.160258.dump

In the morning:

Node A:
~$ date
Wed Jan 10 07:46:11 UTC 2018
~$ sudo ps -ely | grep gluster
S 0  6060 1 0 80 0 39812 263255 - ? 00:00:21 glusterfsd
S 0  6085 1 0 80 0 22772 155022 - ? 00:00:01 glusterfs
S 0 39482 1 0 80 0 26092 117579 - ? 00:00:04 glusterd
S 0 40111 1 0 80 0 50496 150073 - ? 00:00:01 glusterfs

Node B:
~$ date
Wed Jan 10 07:51:50 UTC 2018
~$ sudo ps -ely | grep gluster
S 0 158949 1 0 80 0 20384 117579 - ? 00:00:01 glusterd
S 0 160258 1 0 80 0 39892 263255 - ? 00:00:18 glusterfsd
S 0 160278 1 0 80 0 22256 119670 - ? 00:00:01 glusterfs
S 0 160379 1 0 80 0 39068 161520 - ? 00:00:00 glusterfs

Attachment (statedumps): NodeA_test_case_1_morning.6060.dump, NodeB_test_case_1_morning.160258.dump

Test case 2: the system was left for an hour with intensive file generation and removal in the mounted GlusterFS directory. Every 5 seconds "sudo gluster volume heal home info" was called to check that replication is OK.

Script for file generation:
----------------------
#!/bin/bash
while dd if=/dev/urandom of=tmp_file bs=64M count=16 iflag=fullblock; sleep 1; rm -f tmp_file; do
    sleep 1
done
----------------------

At the start:

Node A:
~$ sudo ps -ely | grep gluster
S 0 10859 1 0 80 0 21528 117579 - ? 00:00:00 glusterd
S 0 11900 1 0 80 0 37624 263255 - ? 00:00:00 glusterfsd
S 0 11920 1 0 80 0 22116 119671 - ? 00:00:00 glusterfs
S 0 12019 1 0 80 0 41440 145137 - ? 00:00:00 glusterfs

Node B:
~$ sudo ps -ely | grep gluster
S 0 12076 1 0 80 0 22320 117579 - ? 00:00:00 glusterd
S 0 12657 1 0 80 0 50268 150074 - ? 00:00:00 glusterfs
S 0 32953 1 0 80 0 37664 263255 - ? 00:00:00 glusterfsd
S 0 32973 1 0 80 0 22408 155023 - ? 00:00:00 glusterfs

Attachment (statedumps): NodeA_test_case_2_start.11900.dump, NodeB_test_case_2_start.32953.dump

After one hour:

Node A:
~$ sudo ps -ely | grep gluster
S 0 10859 1 0 80 0 21528 117579 - ? 00:00:00 glusterd
S 0 11900 1 3 80 0 38396 296280 - ? 00:03:37 glusterfsd
S 0 11920 1 0 80 0 22844 136589 - ? 00:00:00 glusterfs
S 0 12019 1 0 80 0 41628 145137 - ? 00:00:00 glusterfs

Node B:
~$ sudo ps -ely | grep gluster
S 0 12076 1 0 80 0 22404 117579 - ? 00:00:00 glusterd
S 0 12657 1 3 80 0 53500 150074 - ? 00:04:07 glusterfs
S 0 32953 1 1 80 0 38584 312921 - ? 00:02:09 glusterfsd
S 0 32973 1 0 80 0 22524 155023 - ? 00:00:00 glusterfs

Attachment (statedumps): NodeA_test_case_2_end.11900.dump, NodeB_test_case_2_start.32953.dump
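To make the attached statedump pairs easier to compare, here is a small sketch that reports which allocation types grew between two dumps. It assumes the "usage-type ... memusage" record layout described in the statedump documentation (a bracketed section header followed by size=/num_allocs= lines); the file names in the usage example are the attachments from test case 1.

```shell
#!/bin/bash
# statedump_growth OLD NEW
# Print every memory-accounting type whose "size" counter is larger in
# NEW than in OLD, as "old -> new  [section header]".  A sketch based on
# the statedump memusage record format; adjust if your dumps differ.
statedump_growth() {
    awk -F= '
        FNR == 1     { file++ }                         # 1 = old dump, 2 = new dump
        /usage-type/ { type = $0 }                      # remember current section header
        /^size=/     { size[file, type] = $2 + 0; seen[type] = 1 }
        END {
            for (t in seen)
                if (size[2, t] > size[1, t])
                    printf "%d -> %d  %s\n", size[1, t], size[2, t], t
        }
    ' "$1" "$2"
}
```

Usage, e.g.: statedump_growth NodeA_test_case_1_evening.6060.dump NodeA_test_case_1_morning.6060.dump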
Created attachment 1379547 [details] statedump for test case 1
Created attachment 1379548 [details] statedump for test case 1 (after 1 day)
Created attachment 1379549 [details] statedump for test case 1 (node B)
Created attachment 1379550 [details] statedump for test case 1 (Node B) (after 1 day)
Created attachment 1379551 [details] statedump for test case 2 (Node A) (start)
Created attachment 1379553 [details] statedump for test case 2 (Node A) (end)
Created attachment 1379554 [details] statedump for test case 2 (Node B) (start)
Created attachment 1379556 [details] statedump for test case 2 (Node B) (end)
Created attachment 1379557 [details] rate of memory growth
Thanks for the info. I'll take a look.
I've tried to reproduce the problem but haven't been able to. I've tested with 3.8.8, the latest 3.13 and the master branch. In all cases, after an initial increase while caches and some other internal data are being populated, I haven't observed a steady memory usage increase.

After looking at the code, I've identified a possible memory leak, but only if something fails. Can you attach the logs, so we can see if there are any issues that could help determine what's causing the memory increase?

I've also seen that you have network.ping-timeout set to 3. This is considered too small a value.
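For reference, the timeout can be put back to the stock default (42 seconds) with a volume set command; the volume name "home" is taken from the configuration in this report.

```shell
# Restore network.ping-timeout to the GlusterFS default of 42 seconds
gluster volume set home network.ping-timeout 42

# Verify the effective value
gluster volume get home network.ping-timeout
```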
This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result, this bug is being closed.

If the bug persists on a maintained version of Gluster or against the mainline Gluster repository, please request that it be reopened and mark the Version field appropriately.