Description of problem
======================

On a single machine of the trusted storage pool monitored by RHGSWA, the glusterd process memory usage grows by about 1.3 GB per day, consuming all available memory within a few days. On all other nodes, the memory growth was smaller (about 80 MB/day), which is within the limits of what has already been reported as BZ 1664046.

Version-Release number of selected component
============================================

GlusterFS:

```
# rpm -qa | grep gluster | sort
glusterfs-3.12.2-36.el7rhgs.x86_64
glusterfs-api-3.12.2-36.el7rhgs.x86_64
glusterfs-cli-3.12.2-36.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-36.el7rhgs.x86_64
glusterfs-events-3.12.2-36.el7rhgs.x86_64
glusterfs-fuse-3.12.2-36.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-36.el7rhgs.x86_64
glusterfs-libs-3.12.2-36.el7rhgs.x86_64
glusterfs-rdma-3.12.2-36.el7rhgs.x86_64
glusterfs-server-3.12.2-36.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64
python2-gluster-3.12.2-36.el7rhgs.x86_64
tendrl-gluster-integration-1.6.3-13.el7rhgs.noarch
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
```

RHGSWA:

```
# rpm -qa | grep tendrl | sort
tendrl-collectd-selinux-1.5.4-3.el7rhgs.noarch
tendrl-commons-1.6.3-14.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-13.el7rhgs.noarch
tendrl-node-agent-1.6.3-13.el7rhgs.noarch
tendrl-selinux-1.5.4-3.el7rhgs.noarch
```

How reproducible
================

I don't know. I haven't seen this before and haven't had a chance to reproduce it again (as I decided to collect data before retrying).

Steps to Reproduce
==================

1. Install and set up an RHGS cluster on 6 machines, with 2 volumes (using the standard usmqe configuration).
2. Install RHGSWA on a separate machine and import the trusted storage pool into RHGSWA.
3. Mount one volume on a dedicated client machine, fill it completely with 10 MB files, and then free the space (a sketch of this step is included at the end of this description).
4. Leave the cluster operational for a few days.

Actual results
==============

The memory usage on one node grows at about 1.3 GB per day. The machine has 7821 MB of memory, and within one day the memory consumption jumped from 65 % to 82 % (see the screenshot from the WA dashboard):

=> (7821 / 2**10) * 0.17 GB/day ≈ 1.3 GB/day

A sketch for tracking the glusterd memory growth over time is included at the end of this description.

Expected results
================

The memory utilization doesn't grow that rapidly.

Additional info
===============

Note: collecting the sos report killed all the bricks => the memory was freed and I was not able to create a proper statedump report (a statedump sketch is included at the end of this description).

I can't directly confirm that the node was used as the RHGSWA Provisioner Node; when I tried to find out which node is the provisioner, no machine was assigned. I will try to confirm this indirectly by checking the logs. That said, it's possible that the problem is triggered by commands executed by some RHGSWA component. See the attached cmd_history.log file.

Other memory leak BZs
=====================

At the time of reporting this bug, the following memory leak bugs were open:

* Bug 1651915
* Bug 1664046

Since I'm not sure about the reproducer, I list these bugs here as they could be related. That said, I noticed enough differences in my case compared to the already reported bugs that I created a separate bug:

* Compared to BZ 1664046, the memory growth is much faster: I see 1.3 GB/day, while in BZ 1664046 the rate is about 100 MB/day. Moreover, I see it on a single node (out of 6) only, while in BZ 1664046 all storage machines are affected.
* Compared to BZ 1651915, I see no "volume status" commands in cmd_history.log. The growth rate also differs, but that difference could be caused by differences in workloads.
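Sketch of step 3 of the reproducer, for completeness. It assumes the volume is already mounted; the mount point and file-name pattern are hypothetical, not taken from the actual test setup:

```
#!/bin/bash
# Fill the mounted volume with 10 MB files until it is full, then free the
# space again. MNT and the file-name pattern are assumptions for illustration.
MNT=/mnt/testvol
i=0
while dd if=/dev/zero of="$MNT/file_$i" bs=1M count=10 status=none 2>/dev/null; do
    i=$((i + 1))
done
echo "wrote $i files before running out of space"
rm -f "$MNT"/file_*
```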
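The 1.3 GB/day figure above is derived from the dashboard percentages. A minimal sketch to double-check the arithmetic and to sample the glusterd resident memory over time; the sampling interval and log path are my assumptions, not part of the original setup:

```
#!/bin/bash
# Sanity-check of the growth rate: 7821 MiB total memory, 17 % increase/day.
awk 'BEGIN { printf "%.2f GB/day\n", (7821 / 1024) * 0.17 }'   # -> 1.30 GB/day

# Sample glusterd resident memory once per hour so the per-day growth can be
# measured directly; /var/tmp/glusterd-rss.log is a hypothetical location.
while pid=$(pidof glusterd); do
    echo "$(date -u +%FT%TZ) $(ps -o rss= -p "$pid") kB" >> /var/tmp/glusterd-rss.log
    sleep 3600
done
```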
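Since no usable statedump was captured this time, here is a minimal sketch of how one could be taken the next time the growth is observed. It assumes the default statedump directory (/var/run/gluster); the exact path and file names may differ depending on configuration:

```
#!/bin/bash
# Ask glusterd to write a statedump (it responds to SIGUSR1), then list the
# newest dump files. /var/run/gluster and the glusterdump.* name pattern are
# the defaults and an assumption here; adjust if a custom path is configured.
kill -USR1 "$(pidof glusterd)"
sleep 2
ls -lt /var/run/gluster/glusterdump.* 2>/dev/null | head -n 5
```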
Created attachment 1521305 [details] screenshot of RHGSWA host dashboard, with Memory Utilization chart for 7 days
*** Bug 1664046 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0263