Bug 1495161 - [GSS] Few brick processes are consuming more memory after patching 3.2
Summary: [GSS] Few brick processes are consuming more memory after patching 3.2
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: locks
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
: RHGS 3.4.0
Assignee: Xavi Hernandez
QA Contact: Nag Pavan Chilakam
Depends On:
Blocks: 1503135 1507361 1526377
Reported: 2017-09-25 11:09 UTC by Prerna Sony
Modified: 2021-03-11 15:50 UTC (History)
14 users

Fixed In Version: glusterfs-3.12.2-2
Doc Type: Bug Fix
Doc Text:
Previously, processes that used many POSIX locks, possibly in combination with the gluster clear-locks command, could leak memory, causing high memory consumption in brick processes and in some cases triggering the 'OOM killer'. This release fixes the related leaks present in the translators.
Clone Of:
Last Closed: 2018-09-04 06:36:24 UTC
Target Upstream Version:

Attachments (Terms of Use)
state-dump from 3.1.3 (prod, prod-moodle) (7.16 MB, application/x-gzip)
2017-09-27 05:08 UTC, Prerna Sony
no flags Details

System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2607 0 None None None 2018-09-04 06:38:14 UTC

Description Prerna Sony 2017-09-25 11:09:30 UTC
Description of problem:

glusterfsd processes are using more memory after patching to 3.2, compared to the unpatched environment, i.e. 3.1.3.

Version-Release number of selected component (if applicable):


How reproducible:
In the customer environment

Actual results:
A few of the brick processes are consuming more memory after patching to 3.2.

Comment 2 Prerna Sony 2017-09-25 11:13:25 UTC
Created attachment 1330478 [details]
State dump

Comment 12 Prerna Sony 2017-09-27 05:08:36 UTC
Created attachment 1331309 [details]
state-dump from 3.1.3 (prod, prod-moodle)

Comment 22 hari gowtham 2017-10-09 07:45:01 UTC
Hi Atin,

Yes, it is a regression.

This was introduced in 3.8 for https://bugzilla.redhat.com/show_bug.cgi?id=1326085 

This code is not present in 3.1.3 but is present in 3.2.


Comment 56 Nag Pavan Chilakam 2018-08-14 09:36:58 UTC
Below is what I had run for a span of ~4 days on 3.12.2-15:
created an 18x3 volume with performance.client-io-threads off and brick-mux off (as in the customer case)
mounted the volume on 8 different clients, and triggered different kinds of IOs as below:
1) script to take locks on a file in multiple iterations (2 clients)
2) linux untar from 2 clients for multiple iterations
3) from 2 clients, creating, renaming, and deleting files simultaneously as below
for x in {1..10000};do for i in {1..10000};do dd if=/dev/urandom of=file.$x.$i bs=123 count=10000;done;for j in {1..10000};do mv -f file.$x.$j file.$x.$j.$j;done;rm -rf file.$x.*;done
4) different IOs from 2 clients using crefi as below
for x  in {1..1000};do for i in {create,chmod,chown,chgrp,symlink,truncate,rename,hardlink}; do ./crefi.py  --multi -n 15 -b 100 -d 20 --max=10K --min=50 --random -T 3 -t text --fop=$i /mnt/locks/IOs/Crefi/$HOSTNAME/  ; sleep 10 ; done;rm -rf /mnt/locks/IOs/Crefi/$HOSTNAME/*;done

5) same-directory creation in depth and breadth from 2 clients simultaneously

mounted the volume locally on one server and kept issuing lock clears as below
for i in $(find  IOs);do gluster volume clear-locks locks /$i kind all posix; done
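The lock-taking script from item 1 above is not included in the comment; as a minimal sketch (hypothetical, not the tester's actual script), a POSIX-lock exerciser using Python's fcntl.lockf on an assumed file path might look like:

```python
import fcntl
import time

# Hypothetical sketch of item 1's load: repeatedly take and release an
# exclusive POSIX (fcntl) lock on a file under the mount. LOCK_PATH and the
# iteration count/hold time are assumptions, not from the original test.
LOCK_PATH = "lockfile"

def exercise_locks(iterations):
    with open(LOCK_PATH, "w") as f:
        for _ in range(iterations):
            fcntl.lockf(f, fcntl.LOCK_EX)   # acquire exclusive POSIX lock
            time.sleep(0.01)                # hold briefly, as real IO would
            fcntl.lockf(f, fcntl.LOCK_UN)   # release it

exercise_locks(100)
```

Run from two clients against the same file on the mount point, this is the kind of repeated lock/unlock traffic that, combined with the clear-locks loop above, exercised the leak path.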

Over these 3 days I didn't see any significant memory consumption by bricks.

Comment 57 Nag Pavan Chilakam 2018-08-14 14:02:15 UTC
sosreports and logs for my tests @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1495161/onqa_verification

memory info captured in file "fresh_top.log" for each node

I don't see any concern with the memory footprint.

Even after 4 days, resident memory has increased by only about 1% per glusterfsd,
which is nowhere close to what the customer has seen.

Comment 59 Nag Pavan Chilakam 2018-08-16 09:09:31 UTC
I am moving the BZ to verified based on my above comments from testing.
(If need be, I will raise a new BZ for c#58.)

Comment 64 errata-xmlrpc 2018-09-04 06:36:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

