Bug 1495161
Summary: [GSS] Few brick processes are consuming more memory after patching 3.2
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Prerna Sony <psony>
Component: locks
Assignee: Xavi Hernandez <jahernan>
Status: CLOSED ERRATA
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: abhishku, amukherj, bkunal, hgowtham, jahernan, nbalacha, nchilaka, psony, rcyriac, rhs-bugs, sankarshan, sheggodu, srmukher, storage-qa-internal
Target Milestone: ---
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.12.2-2
Doc Type: Bug Fix
Doc Text: Previously, processes that acquired many POSIX locks, possibly in combination with the gluster clear-locks command, could leak memory, causing high memory consumption in brick processes and, in some cases, triggering the OOM killer. This release fixes the leaks in the affected translators.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-04 06:36:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1503135, 1507361, 1526377
Attachments:
Description
Prerna Sony
2017-09-25 11:09:30 UTC
Created attachment 1330478 [details]
State dump
Created attachment 1331309 [details]
state-dump from 3.1.3 (prod, prod-moodle)
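For context, state dumps like the two attached above are normally generated with the gluster CLI and written to the server's run directory. The sketch below shows one way to capture a dump and inspect the lock-related memory accounting in it; the dump path and grep pattern are assumptions for illustration, not details taken from this case (the volume in the verification run below is named "locks").

    # Assumed volume name; replace with the affected volume.
    VOLNAME=locks

    # Ask every brick of the volume to write a state dump. By default the
    # dumps land under the glusterd run directory (commonly
    # /var/run/gluster; configurable via server.statedump-path).
    gluster volume statedump "$VOLNAME"

    # Look at the memory accounting of the locks translator in the newest
    # brick dump; num_allocs growing without bound over time can indicate
    # the kind of leak described in this bug.
    latest=$(ls -t /var/run/gluster/*.dump.* 2>/dev/null | head -1)
    grep -A5 'features/locks' "$latest" | grep -E 'num_allocs|size='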
Hi Atin,

Yes, it is a regression. This was introduced in 3.8 for https://bugzilla.redhat.com/show_bug.cgi?id=1326085. The code is not present in 3.1.3 but is present in 3.2.

Regards,
Hari.

Below is what I ran over a span of ~4 days on 3.12.2-15:

Created an 18x3 volume with performance.client-io-threads off and brick multiplexing off (as in the customer case), mounted the volume on 8 different clients, and triggered different kinds of I/O as below:

1) A script that takes locks on a file in multiple iterations (2 clients); the script itself is not attached, but a hypothetical stand-in is sketched at the end of this report.

2) Linux untar from 2 clients for multiple iterations.

3) From 2 clients, creating, renaming and deleting files simultaneously, as below:

    for x in {1..10000}; do
        for i in {1..10000}; do dd if=/dev/urandom of=file.$x.$i bs=123 count=10000; done
        for j in {1..10000}; do mv -f file.$x.$j file.$x.$j.$j; done
        rm -rf file.$x.*
    done

4) Different I/O from 2 clients using Crefi, as below:

    for x in {1..1000}; do
        for i in {create,chmod,chown,chgrp,symlink,truncate,rename,hardlink}; do
            ./crefi.py --multi -n 15 -b 100 -d 20 --max=10K --min=50 --random -T 3 -t text --fop=$i /mnt/locks/IOs/Crefi/$HOSTNAME/
            sleep 10
        done
        rm -rf /mnt/locks/IOs/Crefi/$HOSTNAME/*
    done

5) Creation of the same directory tree, in depth and breadth, from 2 clients simultaneously.

In addition, I mounted the volume locally on one server and kept clearing locks, as below:

    for i in $(find IOs); do gluster volume clear-locks locks /$i kind all posix; done

Over these 3 days I did not see any significant memory consumption by the bricks. Sosreports and logs for my tests are at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1495161/onqa_verification; memory information is captured in the file "fresh_top.log" for each node (a sketch of an equivalent per-brick RSS sampler appears at the end of this report).

I don't see any concern with the memory footprint: even after 4 days, resident memory has increased by about 1% per glusterfsd, which is nowhere close to what the customer had seen. I am moving the BZ to VERIFIED based on my testing comments above (if need be, I will raise a new BZ for c#58).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
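Step 1 of the verification run above refers to a lock-taking script that is not attached. The following is a minimal hypothetical stand-in, assuming a FUSE mount of the volume at /mnt/locks and using Python's fcntl.lockf() to issue POSIX record locks; none of the names below are taken from the actual test.

    #!/bin/bash
    # Hypothetical POSIX-lock exerciser (not the tester's actual script):
    # repeatedly acquires and releases an fcntl/POSIX record lock on one
    # file inside the Gluster FUSE mount. Intended to be run from each
    # client in parallel.
    MNT=/mnt/locks            # assumption: FUSE mount point of the volume
    FILE="$MNT/lockfile"      # assumption: file used for locking
    touch "$FILE"
    for i in $(seq 1 100000); do
        # fcntl.lockf() takes POSIX record locks, which the brick-side
        # features/locks translator has to track.
        python3 -c 'import fcntl,sys,time; f=open(sys.argv[1],"r+"); fcntl.lockf(f,fcntl.LOCK_EX); time.sleep(0.05); fcntl.lockf(f,fcntl.LOCK_UN); f.close()' "$FILE"
    done

Combined with the clear-locks loop shown in the comment above, this approximates the lock-heavy workload described in the Doc Text.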
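The verification comment tracks brick memory via top output saved to "fresh_top.log". Below is a minimal sketch of an equivalent per-brick RSS sampler; the output path and sampling interval are assumptions.

    #!/bin/bash
    # Record the resident set size (RSS, in KiB) of every glusterfsd brick
    # process once a minute, so growth can be compared across days.
    OUT=/var/tmp/brick_rss.log    # assumption: where to keep the samples
    while true; do
        date >> "$OUT"
        # One line per brick process: PID, RSS in KiB, full command line.
        ps -C glusterfsd -o pid=,rss=,args= >> "$OUT"
        sleep 60
    done

Comparing the earliest and latest samples for each PID shows how each brick's resident memory evolves over the run.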