Created attachment 1457998 [details]
Gluster storage node memory utilization graph

Description of problem:
With the latest builds, memory utilization of tendrl-gluster-integration on one Gluster Storage server grows to very high numbers.

Version-Release number of selected component (if applicable):
Gluster Storage Server:
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Red Hat Gluster Storage Server 3.4.0
glusterfs-3.12.2-13.el7rhgs.x86_64
glusterfs-api-3.12.2-13.el7rhgs.x86_64
glusterfs-cli-3.12.2-13.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-13.el7rhgs.x86_64
glusterfs-events-3.12.2-13.el7rhgs.x86_64
glusterfs-fuse-3.12.2-13.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-13.el7rhgs.x86_64
glusterfs-libs-3.12.2-13.el7rhgs.x86_64
glusterfs-rdma-3.12.2-13.el7rhgs.x86_64
glusterfs-server-3.12.2-13.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
python2-gluster-3.12.2-13.el7rhgs.x86_64
tendrl-collectd-selinux-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.6.3-8.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-6.el7rhgs.noarch
tendrl-node-agent-1.6.3-8.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch

How reproducible:
I've spotted it on three different clusters since yesterday, but I'm not sure whether it is 100% reproducible on every cluster.

Steps to Reproduce:
1. Prepare, install and configure a Gluster Storage cluster (my environment: 6 storage nodes with 8 GB RAM, 2 GB swap, 2 vCPUs, 1-3 volumes).
2. Install and configure the RHGS WA Server and the RHGS WA Node Agents on the Gluster Storage nodes.
3. Import the Gluster cluster into RHGS WA.
4. Let it run for a couple of hours/one day.
5. Check the memory consumed by tendrl-gluster-integration on all Gluster Storage servers:

# ps -p $(echo $(ps aux | grep [t]endrl-gluster-integration | awk '{print $2}') | sed 's/ /,/g') -o %cpu,%mem,cmd -h

Actual results:
On one Gluster Storage server in the cluster, tendrl-gluster-integration consumes a huge amount of memory (more than 80% in my case; the second number in the output below is memory utilization):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# ps -p $(echo $(ps aux | grep [t]endrl-gluster-integration | awk '{print $2}') | sed 's/ /,/g') -o %cpu,%mem,cmd -h
 9.0 82.8 /usr/bin/python /usr/bin/tendrl-gluster-integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expected results:
tendrl-gluster-integration should not consume such a high amount of memory.

Additional info:
This problem was initially spotted because of an alert similar to the following shown in RHGS WA:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Memory utilization on node gl2.example.com in ClusterA at 80.09 % and running out of memory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I've also attached a graph of memory usage of the affected node.
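As a side note, the same check from step 5 can be written more simply with pgrep, which avoids the grep/awk/sed pipeline; this is just an equivalent sketch, assuming the daemon is matched by its full command line (the log path is a placeholder I made up):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# pgrep -d, -f prints matching PIDs comma-separated, which is
# exactly the format ps -p expects.
ps -h -o %cpu,%mem,rss,cmd -p "$(pgrep -d, -f tendrl-gluster-integration)"

# To confirm the leak, sample resident set size (KiB) every 5 minutes:
while sleep 300; do
    date +%FT%T
    ps -o rss= -p "$(pgrep -d, -f tendrl-gluster-integration)"
done >> /tmp/tendrl-gluster-integration-rss.log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~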
Created attachment 1457999 [details]
For comparison: "Not affected" Gluster storage node memory utilization graph

For comparison, I'm also attaching a graph of memory usage from a "not affected" storage node. As you can see, the graph tops out at 8 GB (the available memory), and the biggest part of the memory is used for cache (purple color).
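For anyone reproducing this without the graphs: the split between process memory and page cache (the purple area above) can also be checked from the shell; a minimal sketch using standard procps and kernel interfaces:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# free -h breaks total memory into used, free and buff/cache;
# on a healthy node most memory should sit in buff/cache.
free -h

# The same numbers straight from the kernel:
grep -E '^(MemTotal|MemFree|MemAvailable|Cached)' /proc/meminfo
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~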
On a cluster running for three days, tendrl-gluster-integration consumes a significantly smaller amount of memory (1% or less on a system with 8 GB RAM) on all nodes (see the sketch below for checking all nodes in one go).

RHGS WA Server:
Red Hat Enterprise Linux Server release 7.5 (Maipo)
grafana-4.3.2-3.el7rhgs.x86_64
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-4.el7rhgs.noarch
tendrl-api-httpd-1.6.3-4.el7rhgs.noarch
tendrl-commons-1.6.3-9.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-7.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-7.el7rhgs.noarch
tendrl-node-agent-1.6.3-9.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-8.el7rhgs.noarch

Gluster Storage Server:
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Red Hat Gluster Storage Server 3.4.0
glusterfs-3.12.2-14.el7rhgs.x86_64
glusterfs-api-3.12.2-14.el7rhgs.x86_64
glusterfs-cli-3.12.2-14.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-14.el7rhgs.x86_64
glusterfs-events-3.12.2-14.el7rhgs.x86_64
glusterfs-fuse-3.12.2-14.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-14.el7rhgs.x86_64
glusterfs-libs-3.12.2-14.el7rhgs.x86_64
glusterfs-rdma-3.12.2-14.el7rhgs.x86_64
glusterfs-server-3.12.2-14.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
python2-gluster-3.12.2-14.el7rhgs.x86_64
tendrl-collectd-selinux-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.6.3-9.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-7.el7rhgs.noarch
tendrl-node-agent-1.6.3-9.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch

>> VERIFIED
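A sketch of the per-node check run across the whole cluster in one loop (gl1 to gl6 are hypothetical hostnames standing in for the real storage nodes):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The [t] bracket in the regex keeps pgrep -f from matching the
# remote shell whose command line contains the pattern itself.
for node in gl{1..6}.example.com; do
    echo "== ${node} =="
    ssh "${node}" 'ps -h -o %mem,rss,cmd -p "$(pgrep -d, -f [t]endrl-gluster-integration)"'
done
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~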
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2616