Bug 1372729

Summary: Possible VSZ leak in glusterfsd brick process
Product: [Community] GlusterFS
Reporter: Oleksandr Natalenko <oleksandr>
Component: core
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED EOL
QA Contact:
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.7.15
CC: bugs, sarumuga
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-08 11:00:33 UTC
Type: Bug
Regression: ---
Mount Type: fuse
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1373630
Bug Blocks:

Attachments:
Heimdall server node VSZ (flags: none)
Pussy server node VSZ (flags: none)

Description Oleksandr Natalenko 2016-09-02 13:48:39 UTC
Description of problem:

Given a replica-2 setup serving mainly one big distributed-replicated 5 × 2 volume (backups, with lots of small-file operations):

===
Volume Name: backups
Type: Distributed-Replicate
...
Status: Started
Number of Bricks: 5 x 2 = 10
Transport-type: tcp
Bricks:
Brick1: pussy.example.com:/mnt/backups_1/backups
Brick2: heimdall.example.com:/mnt/backups_1/backups
Brick3: pussy.example.com:/mnt/backups_2/backups
Brick4: heimdall.example.com:/mnt/backups_2/backups
Brick5: pussy.example.com:/mnt/backups_3/backups
Brick6: heimdall.example.com:/mnt/backups_3/backups
Brick7: pussy.example.com:/mnt/backups_4/backups
Brick8: heimdall.example.com:/mnt/backups_4/backups
Brick9: pussy.example.com:/mnt/backups_5/backups
Brick10: heimdall.example.com:/mnt/backups_5/backups
Options Reconfigured:
cluster.entry-self-heal: off
cluster.metadata-self-heal: off
cluster.data-self-heal: off
performance.readdir-ahead: on
nfs.disable: on
===

Over the last month we have observed constant growth of the VSZ of the brick processes. After updating to 3.7.15 the issue still seems to be there (please see the attached graphs).

Version-Release number of selected component (if applicable):

3.7.14, 3.7.15.

How reproducible:

Always.

Steps to Reproduce:
1. create a distributed-replicated 5 × 2 volume;
2. perform lots of operations with small files (like git commit on the GlusterFS volume; we back up /etc of many servers to it), as sketched below;
3. ...
4. PROFIT.
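
For reference, a minimal reproduction sketch of steps 1 and 2 (host names, brick paths, the mount point and iteration counts are illustrative placeholders, not the exact ones from this report; the real workload here is backing up /etc of many servers, the loop below is only a synthetic approximation):

===
# 1. create and start a 5 x 2 distributed-replicated volume
gluster volume create backups replica 2 \
  server1:/mnt/backups_1/backups server2:/mnt/backups_1/backups \
  server1:/mnt/backups_2/backups server2:/mnt/backups_2/backups \
  server1:/mnt/backups_3/backups server2:/mnt/backups_3/backups \
  server1:/mnt/backups_4/backups server2:/mnt/backups_4/backups \
  server1:/mnt/backups_5/backups server2:/mnt/backups_5/backups
gluster volume set backups cluster.entry-self-heal off
gluster volume set backups cluster.metadata-self-heal off
gluster volume set backups cluster.data-self-heal off
gluster volume set backups nfs.disable on
gluster volume start backups

# 2. mount it over FUSE and generate small-file churn via git commits
mount -t glusterfs server1:/backups /mnt/backups-client
cd /mnt/backups-client && git init repo && cd repo
for i in $(seq 1 100000); do
    echo "$i" > "file_$((i % 1000))"
    git add -A && git commit -q -m "commit $i"
done
===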

Actual results:

VSZ grows slowly.

Expected results:

VSZ should not grow to such values.

Additional info:

The attached graphs represent the sum of VSZ values over all glusterfsd processes within one node.

RSS also grows, from ~80M at the beginning to ~300M before the node upgrade. That is a considerably larger value than we observe for volumes that do not carry such a workload with millions of files.
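
For reference, per-node sums like the ones in the attached graphs can be collected with something along these lines (a sketch; the actual collection method behind the graphs is not spelled out here):

===
# sum VSZ and RSS (in KiB) over all glusterfsd brick processes on this node
ps -C glusterfsd -o vsz=,rss= | \
    awk '{vsz += $1; rss += $2} END {printf "VSZ: %d KiB, RSS: %d KiB\n", vsz, rss}'
===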

Comment 1 Oleksandr Natalenko 2016-09-02 13:50:14 UTC
Created attachment 1197217 [details]
Heimdall server node VSZ

Comment 2 Oleksandr Natalenko 2016-09-02 13:50:41 UTC
Created attachment 1197218 [details]
Pussy server node VSZ

Comment 3 Pranith Kumar K 2016-09-06 19:30:59 UTC
Oleksandr,
       The gluster-users mail has an update about massif. Do you want to give that a try?

Pranith
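
(For reference, one way to capture a Massif heap profile of a single brick is to stop that brick and restart it in the foreground under Valgrind with its original command line. A sketch follows; the glusterfsd arguments are placeholders and should be copied from the ps output of the real brick process.)

===
# copy the brick's full command line from `ps aux | grep glusterfsd` first;
# the arguments below are placeholders for one brick of the "backups" volume.
# -N keeps the brick in the foreground so Valgrind can profile it directly.
valgrind --tool=massif --massif-out-file=/var/tmp/brick-massif.out \
    /usr/sbin/glusterfsd -N -s pussy.example.com \
    --volfile-id backups.pussy.example.com.mnt-backups_1-backups \
    -p /var/run/gluster/brick1.pid \
    --brick-name /mnt/backups_1/backups \
    -l /var/log/glusterfs/bricks/mnt-backups_1-backups.log

# after the brick has run under the workload for a while, stop it and inspect:
ms_print /var/tmp/brick-massif.out | less
===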

Comment 4 Oleksandr Natalenko 2016-09-06 19:34:44 UTC
Pranith, sure, please, see new BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1373630

Comment 5 Kaushal 2017-03-08 11:00:33 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.