Bug 1393316 - OOM Kill on client when heal is in progress on 1*(2+1) arbiter volume
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: readdir-ahead
3.2
All All
Priority: high, Severity: high
: ---
: RHGS 3.2.0
Assigned To: Raghavendra G
Karan Sandha
: Triaged
Depends On: 1356960 1408217 1408220 1408221
Blocks: 1277328 1351528
Reported: 2016-11-09 05:18 EST by Karan Sandha
Modified: 2017-03-23 02:17 EDT

See Also:
Fixed In Version: glusterfs-3.8.4-11
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1356960
Environment:
Last Closed: 2017-03-23 02:17:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 05:18:45 EDT

Comment 2 Raghavendra G 2016-11-22 22:46:48 EST
> Based on the statedump, one big leak I see is from dirents which are allocated by DHT but don't seem to be leaking in dht. I think some xlator above dht is not freeing it. Could you let me know the size of the directory you may have?

I think this can be readdir-ahead. Is it possible to turn off readdir-ahead and see whether it helps?

regards,
Raghavendra
Comment 3 Raghavendra G 2016-11-22 22:58:01 EST
(In reply to Raghavendra G from comment #2)
> > Based on the statedump, one big leak I see is from dirents which are allocated by DHT but don't seem to be leaking in dht. I think some xlator above dht is not freeing it. Could you let me know the size of the directory you may have?
> 
> I think this can be readdir-ahead. Is it possible to turn off readdir-ahead
> and see whether it helps?

The reason I suspect readdir-ahead is that, as of now, there is no upper limit on the number of dentries it can store. It keeps populating the cache until EOD is reached or a readdir from the lower xlators returns an error. So, in a scenario where readdirs from the application are infrequent and the directory is huge, all the dentries of the directory are cached in memory, and that could result in an OOM kill. Please note that this is not a leak, but a bug in readdir-ahead: it should have an upper limit.

> 
> regards,
> Raghavendra
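
To illustrate the behaviour described above, here is a minimal sketch of a read-ahead buffer that stops prefetching at an upper limit instead of reading until EOD. It is written in Python for brevity and is not the readdir-ahead code. The class name, the `low_wmark`/`high_wmark` parameters, and the accounting (bytes counted as the length of the entry name) are all assumptions made for the example.

```python
from collections import deque
from typing import Deque, Iterator, List


class BoundedPrefetcher:
    """Toy read-ahead buffer for directory entries, bounded by watermarks.

    Hypothetical illustration only; names and semantics are assumptions.
    """

    def __init__(self, source: Iterator[str], low_wmark: int, high_wmark: int):
        if not 0 <= low_wmark < high_wmark:
            raise ValueError("need 0 <= low_wmark < high_wmark")
        self.source = source
        self.low_wmark = low_wmark      # refill once the buffer drops to this size
        self.high_wmark = high_wmark    # stop prefetching at or above this size
        self.buf: Deque[str] = deque()
        self.buffered_bytes = 0
        self.peak_bytes = 0             # largest buffer size seen so far
        self.eod = False                # end of directory reached
        self._fill()

    def _fill(self) -> None:
        """Prefetch until the high watermark or end of directory is hit."""
        while not self.eod and self.buffered_bytes < self.high_wmark:
            try:
                name = next(self.source)
            except StopIteration:
                self.eod = True
                return
            self.buf.append(name)
            self.buffered_bytes += len(name)
            self.peak_bytes = max(self.peak_bytes, self.buffered_bytes)

    def readdir(self, count: int) -> List[str]:
        """Serve up to `count` buffered entries, then refill if running low."""
        out: List[str] = []
        while self.buf and len(out) < count:
            name = self.buf.popleft()
            self.buffered_bytes -= len(name)
            out.append(name)
        if self.buffered_bytes <= self.low_wmark:
            self._fill()
        return out
```

With this shape, memory use stays below `high_wmark` plus the size of one entry no matter how large the directory is or how rarely the application calls `readdir`. Without the `high_wmark` check in `_fill`, the loop would run to EOD and hold every entry at once, which is the scenario suspected here.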
Comment 4 Raghavendra G 2016-11-22 23:01:31 EST
From https://bugzilla.redhat.com/show_bug.cgi?id=1356960#c5,

<comment>

I have performed the same steps with "performance.readdir-ahead off" on the gluster 3.8.4.3 build and I am not hitting this issue.

</comment>
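
For anyone who wants to script the workaround quoted above, a small Python sketch follows. It builds the `gluster volume set` command line for the `performance.readdir-ahead` option. The helper names and the volume name used in the usage note are hypothetical, and actually running the command assumes the gluster CLI is installed and the caller has sufficient privileges.

```python
import subprocess
from typing import List


def readdir_ahead_cmd(volname: str, enabled: bool) -> List[str]:
    """Build the gluster CLI argv that toggles performance.readdir-ahead."""
    if not volname or volname.startswith("-"):
        raise ValueError("invalid volume name")
    return [
        "gluster", "volume", "set", volname,
        "performance.readdir-ahead", "on" if enabled else "off",
    ]


def set_readdir_ahead(volname: str, enabled: bool) -> None:
    """Run the command; raises CalledProcessError if gluster reports failure."""
    subprocess.run(readdir_ahead_cmd(volname, enabled), check=True)
```

For example, `set_readdir_ahead("testvol", False)` would turn the translator off for a volume named `testvol` (a placeholder name).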
Comment 6 surabhi 2016-11-29 05:00:44 EST
As per triage, we all agree that this BZ has to be fixed in rhgs-3.2.0. Providing qa_ack.
Comment 9 Poornima G 2016-12-09 01:33:00 EST
The unlimited caching behaviour of readdir-ahead has been there from day 0. Implementing an upper cache limit for readdir-ahead in the 3.2 time frame is difficult, as the fix is intrusive.

Can this be deferred for 3.2?
Comment 10 Atin Mukherjee 2016-12-15 00:44:51 EST
An upstream patch, http://review.gluster.org/#/c/16137/, has been posted for review.
Comment 11 Atin Mukherjee 2016-12-22 06:50:55 EST
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/93587
Comment 12 Atin Mukherjee 2016-12-22 11:53:27 EST
A compilation failure was introduced by https://code.engineering.redhat.com/gerrit/#/c/93587 and is now fixed by https://code.engineering.redhat.com/gerrit/#/c/93622/ (this issue existed only in the downstream code).
Comment 14 Atin Mukherjee 2016-12-28 07:41:13 EST
We'd need to pull in one more patch here to ensure rda-low-wmark & rda-high-wmark options are not exposed to the users.

upstream mainline patch : http://review.gluster.org/16297
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/93820/
Comment 16 Milind Changire 2017-01-06 10:47:24 EST
BZ added to erratum https://errata.devel.redhat.com/advisory/24866
Moving to ON_QA
Comment 19 errata-xmlrpc 2017-03-23 02:17:38 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
