Bug 985388 - running dbench results in leaked fds leading to OOM killer killing glusterfsd.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Pranith Kumar K
QA Contact: spandura
URL:
Whiteboard:
Depends On: 976800
Blocks: 977250
 
Reported: 2013-07-17 11:23 UTC by Pranith Kumar K
Modified: 2013-09-23 22:35 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.4.0.12rhs.beta6-1
Doc Type: Bug Fix
Doc Text:
Clone Of: 976800
Environment:
Last Closed: 2013-09-23 22:35:54 UTC
Embargoed:



Description Pranith Kumar K 2013-07-17 11:23:58 UTC
+++ This bug was initially created as a clone of Bug #976800 +++

Description of problem:
Running dbench on a distributed replicate volume causes leaked fds on the server thereby causing the OOM killer to kill the brick process.

Output of dmesg on server:
========================================================================
<snip>
VFS: file-max limit 188568 reached
.
.
.
Out of memory: Kill process 12235 (glusterfsd) score 215 or sacrifice child
Killed process 12235, UID 0, (glusterfsd) total-vm:3138856kB, anon-rss:466728kB, file-rss:1028kB
glusterfsd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
glusterfsd cpuset=/ mems_allowed=0
Pid: 12333, comm: glusterfsd Not tainted 2.6.32-358.6.2.el6.x86_64 #1

</snip>
========================================================================
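For anyone reproducing this, a minimal sketch for watching the system-wide handle count approach file-max while dbench runs (the 5-second interval and output format are arbitrary choices, not from this report):

#!/bin/bash
# Sketch: poll system-wide file handle usage during the dbench run.
# /proc/sys/fs/file-nr fields: allocated handles, free handles, file-max.
while sleep 5; do
    read -r alloc free max < /proc/sys/fs/file-nr
    echo "$(date '+%T') allocated=$alloc file-max=$max"
done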


How reproducible:
Always

Steps to Reproduce:
1. create a 2x2 distributed replicate volume and FUSE mount it
2. On the mount point, run "dbench -s -F -S -x --one-byte-write-fix --stat-check 10"
3. Kill dbench after running for about 3 minutes.
4. On the server, do
 ls -l /proc/pid_of_brick(s)/fd|grep deleted
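The steps above can be automated roughly as follows (a sketch: the hostnames server1/server2, the brick paths, and the volume name are placeholders, not taken from the actual setup):

#!/bin/bash
# Sketch: reproduce the leak on a 2x2 distributed replicate volume.
# Hostnames, brick paths and volume name are placeholders.
VOL=repro-vol
gluster volume create $VOL replica 2 \
    server1:/bricks/b1 server2:/bricks/b1 \
    server1:/bricks/b2 server2:/bricks/b2
gluster volume start $VOL
mkdir -p /mnt/$VOL
mount -t glusterfs server1:/$VOL /mnt/$VOL

cd /mnt/$VOL
dbench -s -F -S -x --one-byte-write-fix --stat-check 10 &
DBENCH_PID=$!
sleep 180                       # step 3: let dbench run ~3 minutes
kill $DBENCH_PID

# Step 4, run on each storage server: count open-but-unlinked fds per brick.
for pid in $(pgrep glusterfsd); do
    echo "brick pid $pid: $(ls -l /proc/$pid/fd | grep -c deleted) deleted fds"
done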


Actual results:
We can still see open/unlinked fds even though dbench was killed. Also, if dbench is run to completion, some of the bricks can be seen getting killed by the OOM killer (ps aux | grep glusterfsd).

Expected results:
Once dbench is killed, the brick processes must not have open/unlinked fds.

Additional info:
Bisected the leak to the following commit ID on upstream:
* 8909c28 - cluster/afr: fsync() guarantees POST-OP completion
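
For reference, the bisect itself is the standard git workflow, roughly as below (the endpoints are illustrative, not the exact ones used):

# Sketch: bisect the glusterfs tree for the commit introducing the leak.
git bisect start
git bisect bad master            # leak reproduces here
git bisect good v3.3.0           # illustrative known-good point
# git now checks out a midpoint commit; rebuild glusterfs, rerun the
# dbench reproducer, then mark the outcome with one of:
#   git bisect good              # no leaked fds
#   git bisect bad               # leaked fds seen
# Repeat until git reports the first bad commit (8909c28 here).
git bisect reset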

--- Additional comment from Anand Avati on 2013-06-24 03:03:34 EDT ---

REVIEW: http://review.gluster.org/5248 (cluster/afr: Fix fd/memory leak on fsync) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2013-06-24 12:45:57 EDT ---

COMMIT: http://review.gluster.org/5248 committed in master by Anand Avati (avati) 
------
commit 03f5172dd50b50988c65dd66e87a0d43e78a3810
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jun 24 08:15:09 2013 +0530

    cluster/afr: Fix fd/memory leak on fsync
    
    Change-Id: I764883811e30ca9d9c249ad00b6762101083a2fe
    BUG: 976800
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/5248
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 2 Amar Tumballi 2013-07-23 09:57:37 UTC
https://code.engineering.redhat.com/gerrit/10491

Comment 3 Pranith Kumar K 2013-07-26 09:18:43 UTC
Sac,
     There still seems to be one leaked-fd bug if open-behind is not disabled (is there a bug filed for it?). So, to verify this bug, please disable open-behind.

Pranith.
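
For reference, a sketch of disabling it from the CLI (VOLNAME is a placeholder; this assumes the usual performance.open-behind option key):

gluster volume set VOLNAME performance.open-behind off
gluster volume info VOLNAME    # option should now show under "Options Reconfigured"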

Comment 4 spandura 2013-07-31 10:43:37 UTC
Verified this bug on the build:
================================
root@king [Jul-31-2013-15:49:42] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.13rhs-1.el6rhs.x86_64

root@king [Jul-31-2013-15:49:49] >gluster --version
glusterfs 3.4.0.13rhs built on Jul 28 2013 15:22:56

Steps to verify:
================

Case 1:
~~~~~~~~~~~
1. create a 1 x 2 replicate volume

2. Create 2 FUSE mounts

3. On both the mount points, run "dbench -s -F -S -x --one-byte-write-fix --stat-check 10"

4. Kill dbench after running for about 3 minutes.

5. On both the storage nodes, do
 ls -l /proc/pid_of_brick(s)/fd|grep deleted 
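A compact sketch for running the step-5 check on both storage nodes from a single shell (node1/node2 are placeholder hostnames; assumes key-based ssh as root):

#!/bin/bash
# Sketch: count open-but-unlinked fds per brick on each storage node.
for node in node1 node2; do
    ssh "$node" '
        for pid in $(pgrep glusterfsd); do
            echo "$(hostname) pid $pid: $(ls -l /proc/$pid/fd | grep -c deleted) deleted fds"
        done'
done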

Actual Result:
================
No fd leaks observed. 

Case 2:
~~~~~~~~~~~~~
1. create a 1 x 2 replicate volume. Set the volume option "open-behind" to "off"

2. Create 2 FUSE mounts

3. On both the mount points, run "dbench -s -F -S -x --one-byte-write-fix --stat-check 10"

4. Kill dbench after running for about 3 minutes.

5. On both the storage nodes, do
 ls -l /proc/pid_of_brick(s)/fd|grep deleted 

Actual Result:
================
No fd leaks observed.

Bug is fixed. Moving it to verified state. 

However, there are fd leaks when running the same case with the "open-behind" volume option set to "on". Please refer to bug 990510.

Comment 5 Scott Haines 2013-09-23 22:35:54 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

