Description of problem:
glusterfsd processes are using more memory after patching to 3.2 when compared to the unpatched environment, i.e., 3.1.3.

Version-Release number of selected component (if applicable):
glusterfs-server-3.8.4-18.4.el6rhs.x86_64

How reproducible:
In the customer environment.

Actual results:
A few of the brick processes are consuming more memory after patching to 3.2.
Created attachment 1330478 [details] State dump
Created attachment 1331309 [details] state-dump from 3.1.3 (prod, prod-moodle)
Hi Atin,

Yes, it is a regression. It was introduced in 3.8 for https://bugzilla.redhat.com/show_bug.cgi?id=1326085. This code is not present in 3.1.3 but is present in 3.2.

Regards,
Hari.
Below is what I ran for a span of ~4 days on 3.12.2-15:

Created an 18x3 volume with performance.client-io-threads off and brick multiplexing off (as in the customer case), mounted the volume on 8 different clients, and triggered different kinds of IO as below:

1) a script that takes locks on a file over multiple iterations, from 2 clients (a minimal stand-in for such a script is sketched after this list)
2) linux untar from 2 clients for multiple iterations
3) from 2 clients, creating, renaming, and deleting files simultaneously, as below:
   for x in {1..10000};do for i in {1..10000};do dd if=/dev/urandom of=file.$x.$i bs=123 count=10000;done;for j in {1..10000};do mv -f file.$x.$j file.$x.$j.$j;done;rm -rf file.$x.*;done
4) different IO from 2 clients using crefi, as below:
   for x in {1..1000};do for i in {create,chmod,chown,chgrp,symlink,truncate,rename,hardlink}; do ./crefi.py --multi -n 15 -b 100 -d 20 --max=10K --min=50 --random -T 3 -t text --fop=$i /mnt/locks/IOs/Crefi/$HOSTNAME/ ; sleep 10 ; done;rm -rf /mnt/locks/IOs/Crefi/$HOSTNAME/*;done
5) creation of the same directory tree, in depth and breadth, from 2 clients simultaneously

I also mounted the client locally on one server and issued clearing of locks, as below:
   for i in $(find IOs);do gluster volume clear-locks locks /$i kind all posix; done

Over these ~4 days I didn't see any significant memory consumption by the bricks.
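The lock-taking script from item 1 is not attached to the bug, so purely as an illustration, here is a minimal stand-in using flock(1). The lock file path, iteration count, and hold time are all assumptions, and the actual script may have taken fcntl/POSIX locks rather than flock locks:

   #!/bin/bash
   # Hypothetical reproducer: repeatedly take and release a lock on one
   # file on the gluster mount so the brick-side locks translator is
   # exercised. /mnt/locks/lockfile, the 10000 iterations, and the 0.1s
   # hold time are assumed values, not taken from the bug report.
   LOCKFILE=/mnt/locks/lockfile
   touch "$LOCKFILE"
   for i in $(seq 1 10000); do
       # flock(1) acquires the lock, runs the command, then releases it
       flock "$LOCKFILE" -c "sleep 0.1"
   done

Run concurrently from 2 clients, as in item 1, this keeps lock grant/release traffic flowing against the same file.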
sosreports and logs for my tests are at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1495161/onqa_verification

Memory info was captured in the file "fresh_top.log" for each node. I don't see any concern with the memory footprint: even after 4 days, resident memory has increased by only about 1% per glusterfsd, which is nowhere close to what the customer has seen.
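The exact command used to produce fresh_top.log isn't recorded in this comment; a minimal sampler along the following lines would produce a comparable log. The 60-second interval and the chosen ps fields are assumptions:

   #!/bin/bash
   # Hypothetical sampler: append a timestamped snapshot of every brick
   # process's memory usage to fresh_top.log once a minute. The interval
   # and field list are assumed, not taken from the bug report.
   while true; do
       date >> fresh_top.log
       # RSS/VSZ in KiB for all glusterfsd (brick) processes
       ps -C glusterfsd -o pid,rss,vsz,etime,args >> fresh_top.log
       sleep 60
   done

Comparing the rss column for a given brick PID across snapshots is enough to see the ~1% resident-memory growth mentioned above.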
I am moving this BZ to VERIFIED based on my above comments from testing. (If need be, I will raise a new BZ for comment #58.)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607