| Field | Value | Field | Value |
|---|---|---|---|
| Summary | [Compound FOPs] Client-side iobuf leaks consume all client memory at a high pace, making the gluster volume inaccessible | | |
| Product | Red Hat Gluster Storage | Reporter | Nag Pavan Chilakam <nchilaka> |
| Component | core | Assignee | Krutika Dhananjay <kdhananj> |
| Status | CLOSED ERRATA | QA Contact | Nag Pavan Chilakam <nchilaka> |
| Severity | urgent | Docs Contact | |
| Priority | unspecified | | |
| Version | rhgs-3.2 | CC | amukherj, kdhananj, rhinduja, rhs-bugs, storage-qa-internal |
| Target Milestone | --- | | |
| Target Release | RHGS 3.2.0 | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | compund-fop | | |
| Fixed In Version | glusterfs-3.8.4-6 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | | 1395687 (view as bug list) |
| Last Closed | 2017-03-23 06:18:07 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Bug Depends On | | | |
| Bug Blocks | 1351528, 1395687, 1395694 | | |
| Attachments | | | |
Description by Nag Pavan Chilakam, 2016-11-10 08:19:56 UTC

Created attachment 1219242 [details]: statedumps
From further tests, it appears very likely that the leak is triggered by appends. I simply appended to a file on the mount while capturing `top` and `free -h` output every 2 minutes. With every loop, memory consumption increased by about 5 MB. I took statedumps at the intervals below.
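The append workload described above can be sketched as a small shell function. This is a minimal reproduction sketch; the file path and chunk sizes are illustrative, not taken from the report:

```shell
# Append fixed-size chunks to a file in a loop, mimicking the workload
# that triggered the client-side memory growth. The file would normally
# live on the glusterfs FUSE mount (path below is hypothetical).
append_loop() {
    file=$1
    count=$2          # number of append iterations
    i=0
    while [ "$i" -lt "$count" ]; do
        # append 1 KiB of zeroes per iteration
        head -c 1024 /dev/zero >> "$file"
        i=$((i + 1))
    done
}

# e.g.: append_loop /mnt/glustervol/testfile 100
```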
```
[root@rhs-client45 gluster]# date;free -h;date ;kill -USR1 16078;date
Thu Nov 10 15:39:39 IST 2016
              total        used        free      shared  buff/cache   available
Mem:            15G        504M         11G         24M        3.2G         14G
Swap:          7.9G         51M        7.8G
Thu Nov 10 15:39:39 IST 2016
Thu Nov 10 15:39:39 IST 2016
[root@rhs-client45 gluster]# ls
glusterdump.16078.dump.1478772579
[root@rhs-client45 gluster]# date;free -h;date ;kill -USR1 16078;date
Thu Nov 10 15:41:05 IST 2016
              total        used        free      shared  buff/cache   available
Mem:            15G        509M         11G         24M        3.2G         14G
Swap:          7.9G         51M        7.8G
Thu Nov 10 15:41:05 IST 2016
Thu Nov 10 15:41:05 IST 2016
[root@rhs-client45 gluster]# ll
total 596
-rw-------. 1 root root 301703 Nov 10 15:39 glusterdump.16078.dump.1478772579
-rw-------. 1 root root 305566 Nov 10 15:41 glusterdump.16078.dump.1478772665
[root@rhs-client45 gluster]# ll
total 596
-rw-------. 1 root root 301703 Nov 10 15:39 glusterdump.16078.dump.1478772579
-rw-------. 1 root root 305566 Nov 10 15:41 glusterdump.16078.dump.1478772665
[root@rhs-client45 gluster]#
[root@rhs-client45 gluster]# ll
total 596
-rw-------. 1 root root 301703 Nov 10 15:39 glusterdump.16078.dump.1478772579
-rw-------. 1 root root 305566 Nov 10 15:41 glusterdump.16078.dump.1478772665
[root@rhs-client45 gluster]# date;free -h;date ;kill -USR1 16078;date
Thu Nov 10 15:44:20 IST 2016
              total        used        free      shared  buff/cache   available
Mem:            15G        518M         11G         24M        3.2G         14G
Swap:          7.9G         51M        7.8G
Thu Nov 10 15:44:20 IST 2016
Thu Nov 10 15:44:20 IST 2016
[root@rhs-client45 gluster]#
```
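The manual sampling in the transcript above (timestamp, `free -h`, then SIGUSR1 to make the glusterfs client write a statedump) can be wrapped in a small helper. The PID and interval below are illustrative; the statedump directory is whatever the client process is configured with:

```shell
# Take one memory sample and trigger a glusterfs statedump.
# Sending SIGUSR1 to a glusterfs process makes it write a statedump
# file (glusterdump.<pid>.dump.<timestamp>) in its statedump directory.
capture_once() {
    pid=$1
    date
    free -h
    kill -USR1 "$pid"
}

# Sample every 2 minutes, as in the transcript above (PID 16078):
#   while true; do capture_once 16078; sleep 120; done
```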
Attached are the new statedumps.

Created attachment 1219301 [details]: new statedumps with append only
Just a note: compound FOPs is enabled :)

So there is indeed an iobuf leak, as per the attached statedump. To confirm that it really is compound FOPs contributing to the leak (and not any of the several other options enabled on the volume), I ran a dd on a plain replicate volume without compound FOPs and captured a statedump before and after the dd: there was no increase in active iobufs. Once I enabled compound FOPs and ran dd again, the number of active iobufs increased from 1 to 6, and after another dd, from 6 to 13. I also checked the AFR changes for compound FOPs and didn't find any memory leaks there. So the only possibility is leaks at the protocol/client layer, which I am investigating through code reading at the moment.

Found the leaks. Nice catch, Nag! :)

The patch on upstream master is out for review: http://review.gluster.org/#/c/15860/

Moving this bug to POST state.

Patch posted downstream: https://code.engineering.redhat.com/gerrit/#/c/90421/

Waiting on QE and PM ack before it can be merged downstream.

QA validation: Raised https://bugzilla.redhat.com/show_bug.cgi?id=1401380 during on_qa verification. I ran the same case in the same environment and did see fuse memory leaks, but they have nothing to do with compound FOPs, as they were seen even without cfops enabled. I also shared the statedumps with Krutika, and they did not contain any pointers to cfops buffers; hence, moving to verified. Tested on 3.8.4-8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
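The before/after comparison used above to pin the leak on compound FOPs (counting active iobufs in successive statedumps) can be done with a one-line helper. The section-header pattern below is an assumption about the statedump format of this glusterfs generation; verify it against your own dump before relying on it:

```shell
# Count in-use iobufs recorded in a glusterfs statedump. Statedumps of
# this era are assumed to list each active buffer under a section header
# of the form [arena.<n>.active_iobuf.<m>].
count_active_iobufs() {
    grep -c '^\[arena\.[0-9][0-9]*\.active_iobuf\.[0-9][0-9]*\]' "$1"
}

# Usage: take a dump, run the workload, take another dump, compare:
#   count_active_iobufs glusterdump.16078.dump.1478772579
#   count_active_iobufs glusterdump.16078.dump.1478772665
```

A steadily growing count across dumps, as reported here (1 to 6 to 13), points at buffers being allocated but never unreferenced.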