Bug 1631356 - glusterfsd keeping fd open in index xlator after stop the volume
Summary: glusterfsd keeping fd open in index xlator after stop the volume
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.4
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Mohit Agrawal
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On: 1631357
Blocks:
Reported: 2018-09-20 12:05 UTC by Mohit Agrawal
Modified: 2018-10-08 05:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1631357 1631372 (view as bug list)
Environment:
Last Closed: 2018-10-04 12:31:58 UTC
Embargoed:



Description Mohit Agrawal 2018-09-20 12:05:31 UTC
Description of problem:
glusterfsd keeps an fd open in the index xlator after the volume is stopped.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Enable brick_mux.
2. Create 100 volumes (test1..test100) in a 1x3 environment.
3. Start all the volumes.
4. Stop volumes test2..test100.
5. After stopping the volumes, check the open fds of the brick process (brick_pid) in /proc:
  ls -lrth /proc/<brick_pid>/fd | grep ".glusterfs"
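The check in step 5 can also be scripted. The following is a minimal sketch and not part of the original report: `list_open_fds` is a hypothetical helper name, and it assumes a Linux /proc filesystem and permission to read the brick process's fd directory.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print every open fd of process
# PID whose link target contains PATTERN, e.g. ".glusterfs".
list_open_fds() {
    pid=$1
    pattern=$2
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails if the fd was closed while iterating (or if the
        # glob matched nothing); skip such entries.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *"$pattern"*) printf '%s -> %s\n' "${fd##*/}" "$target" ;;
        esac
    done
}
```

Calling `list_open_fds <brick_pid> .glusterfs` after step 4 should list only entries that belong to the volume still running (test1) if the fds of the stopped bricks were released.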

Actual results:
After the volumes are stopped, /proc shows that .glusterfs is still held open
for a brick that is already stopped.

Expected results:
No internal directory should be held open for a stopped brick.

Additional info:

Comment 2 Atin Mukherjee 2018-09-20 12:40:15 UTC
upstream patch : https://review.gluster.org/21235

Comment 5 Pranith Kumar K 2018-09-24 13:28:53 UTC
Steps to re-create the issue:

There seems to be a race in which the graph is already cleaned up by the time fd_destroy() is called. As a result, the xlator_release()/xlator_releasedir() functions are never called and the fds are not closed.

I was able to consistently re-create the issue with the following change:

18:46:22 :) ⚡ git diff
diff --git a/xlators/protocol/server/src/server-helpers.c b/xlators/protocol/server/src/server-helpers.c
index c492ab164..29af4a946 100644
--- a/xlators/protocol/server/src/server-helpers.c
+++ b/xlators/protocol/server/src/server-helpers.c
@@ -249,6 +249,7 @@ server_connection_cleanup_flush_cbk (call_frame_t *frame, void *cookie,
         fd = frame->local;
         client = frame->root->client;
 
+        sleep (5);
         fd_unref (fd);
         frame->local = NULL;

Steps:
1) Start glusterd and set brick-mux to on.
2) Create 2 plain replicate volumes and set open-behind to off on the volume.
3) Mount one of the volumes and, on the mount, execute "exec >a".
4) Confirm that the file is open on the bricks.
5) Execute "gluster volume stop <volname>".
6) Wait a minute to be on the safe side, then check "ls /proc/<pid-of-brick>/fd". It still shows the file 'a'.
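
The wait-and-check in step 6 could be automated with a small polling loop. This is a sketch under assumptions, not something from the original comment: `wait_fd_closed` is a hypothetical helper name, it assumes a Linux /proc filesystem, and the pattern passed in is whatever path fragment identifies the file on the brick.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): poll /proc/PID/fd until no link
# target contains PATTERN, or until TIMEOUT seconds have passed.
# Returns 0 once no matching fd is open, 1 if one is still open at the timeout.
wait_fd_closed() {
    pid=$1
    pattern=$2
    timeout=$3
    elapsed=0
    while :; do
        found=0
        for fd in /proc/"$pid"/fd/*; do
            # Skip fds that vanished between the glob and the readlink.
            target=$(readlink "$fd" 2>/dev/null) || continue
            case $target in
                *"$pattern"*) found=1; break ;;
            esac
        done
        [ "$found" -eq 0 ] && return 0
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

For example, `wait_fd_closed <pid-of-brick> <brick-path>/a 60` (both placeholders) would return 1 after about a minute if the file is still open as described in step 6, and 0 as soon as the fd has been released.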

