Bug 1529501 - [shd] : shd occupies ~7.8G in memory while toggling cluster.self-heal-daemon in a loop, possibly leaky.
Summary: [shd] : shd occupies ~7.8G in memory while toggling cluster.self-heal-daemon ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Mohit Agrawal
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On: RHGS34MemoryLeak 1725024 1727329
Blocks: 1696807
 
Reported: 2017-12-28 12:31 UTC by Ambarish
Modified: 2019-10-30 12:20 UTC
CC: 10 users

Fixed In Version: glusterfs-6.0-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-30 12:19:37 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3319451 0 None None None 2018-01-16 08:24:38 UTC
Red Hat Product Errata RHEA-2019:3249 0 None None None 2019-10-30 12:20:11 UTC

Description Ambarish 2017-12-28 12:31:54 UTC
Description of problem:
-------------------------

As part of verification of https://bugzilla.redhat.com/show_bug.cgi?id=1526363, created 300 distributed-replicate volumes.

Bricks are multiplexed.

Then proceeded to run volume set operations in a loop.

<snip>

for i in {1..300};do gluster volume create butcher$i replica 2 gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/brickA$i gqas016.sbu.lab.eng.bos.redhat.com:/bricks1/brickA$i gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/brickA$i gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/brickA$i gqas003.sbu.lab.eng.bos.redhat.com:/bricks1/brickA$i gqas007.sbu.lab.eng.bos.redhat.com:/bricks1/brickA$i;gluster v start butcher$i;sleep 2;done ;

followed by 

for i in {1..300};do gluster v set butcher$i cluster.self-heal-daemon off;sleep 3 ;gluster v set butcher$i group metadata-cache;sleep 3;gluster v set butcher$i cluster.lookup-optimize on;sleep 3 ;done


<snip>
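To track how the resident set grows per iteration, the volume set loop above can be instrumented to log shd RSS after every operation. This is a minimal sketch, not part of the original reproducer; it assumes the glustershd pid file path shown in the ps output below.

<snip>

# Hedged sketch: log shd resident memory after each volume set operation.
# The pid file path is the one visible in the ps output later in this report.
PIDFILE=/var/run/gluster/glustershd/glustershd.pid
for i in {1..300}; do
    gluster v set butcher$i cluster.self-heal-daemon off
    sleep 3
    # ps reports RSS in KB; convert to MB for readability
    ps -o rss= -p "$(cat "$PIDFILE")" | awk -v vol="butcher$i" '{printf "%s: shd RSS %d MB\n", vol, $1/1024}'
done

<snip>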



The self-heal daemon occupies almost 4.6 GB of resident memory after all the volume set operations.


**BEFORE VOL SET ** : 

[root@gqas008 /]# ps aux|grep glus
root      8078 12.4  2.6 28807468 1315220 ?    Ssl  05:13   0:28 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/e32d8903c5b60efed5cc4e725235c143.socket --xlator-option *replicate*.node-uuid=cedc8e7d-d3a0-47f2-a50e-ebe12fe964bc


**AFTER VOL SET ** :


[root@gqas008 /]# ps aux|grep glustershd
root      8078  3.0  9.4 31756588 4677648 ?    Ssl  05:13   3:56 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/e32d8903c5b60efed5cc4e725235c143.socket --xlator-option *replicate*.node-uuid=cedc8e7d-d3a0-47f2-a50e-ebe12fe964bc



Memory consumption increased from 1.3 GB to 4.6 GB.

Since the delta is massive, raising with high priority.
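For reference, the RSS figures above come straight from the ps output (sixth column, in KB). A quick, hedged way to read the same number directly from /proc, assuming the standard glustershd pid file path from the ps lines above:

<snip>

# Hedged sketch: report shd resident memory in GB (VmRSS in /proc is in kB).
PID=$(cat /var/run/gluster/glustershd/glustershd.pid)
awk '/VmRSS/ {printf "glustershd RSS: %.2f GB\n", $2 / 1024 / 1024}' /proc/"$PID"/status

<snip>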

Version-Release number of selected component (if applicable):
--------------------------------------------------------------

[root@gqas008 /]# rpm -qa|grep glus
glusterfs-libs-3.8.4-52.3.el7rhgs.x86_64
glusterfs-server-3.8.4-52.3.el7rhgs.x86_64


How reproducible:
------------------

2/2

Steps to Reproduce:
-------------------

1. Create lots of volumes of type dist-rep (say, 300).

2. Disable self-heal in a loop (a hedged sketch of this step follows below).
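
A hedged sketch of step 2, not verbatim from this report: toggle the self-heal daemon off and on across all volumes, matching the "toggling cluster.self-heal-daemon in loop" scenario from the title. Volume names follow the butcher$i convention used in the description.

<snip>

# Hedged sketch of the toggle loop; adjust the volume count/names to match step 1.
for i in {1..300}; do
    gluster volume set butcher$i cluster.self-heal-daemon off
    sleep 3
    gluster volume set butcher$i cluster.self-heal-daemon on
    sleep 3
done

<snip>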



Actual results:
----------------

Drastic increase in memory consumption by shd after the volume set operations.

Expected results:
------------------

Controlled memory consumption by shd.

Comment 18 RHEL Program Management 2018-05-18 07:12:46 UTC
Development Management has reviewed and declined this request.
You may appeal this decision by reopening this request.

Comment 24 Nag Pavan Chilakam 2019-07-18 05:48:21 UTC
I have rerun the test mentioned in the description.
Along with that, I kept toggling shd off/on for about 520 volumes (including disperse/arbiter/replicate types).
I saw that even after 5 iterations the resident memory was more or less stable at about 1.8-2 GB.
Hence moving the bug to verified, as I don't observe any substantial leak as mentioned in the description.

Version: 6.0.8, a test build issued after reverting the shd-mux feature.

Comment 28 errata-xmlrpc 2019-10-30 12:19:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249

