Description of problem:
glusterfsd keeps fds open in the index xlator after the volume is stopped.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Enable brick_mux.
2. Create 100 volumes (test1..test100) in a 1x3 environment.
3. Start all the volumes.
4. Stop volumes test2..test100.
5. After stopping the volumes, check the fds of the brick process in /proc (a full shell sketch is included under Additional info below):
   ls -lrth /proc/<brick_pid>/fd | grep ".glusterfs"

Actual results:
After the volumes are stopped, /proc shows that .glusterfs fds are still held open by bricks that have already been stopped.

Expected results:
No internal directory should be held open for a stopped brick.

Additional info:
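A minimal shell sketch of the reproduction loop described in the steps above. The peer names (server1..server3) and brick paths (/bricks/testN) are assumptions; adjust them to the local setup.

    # Assumed: a 3-node trusted storage pool (server1..server3) with brick
    # directories under /bricks. Volume names follow the steps above.
    gluster volume set all cluster.brick-multiplex on

    for i in $(seq 1 100); do
        gluster volume create test$i replica 3 \
            server1:/bricks/test$i server2:/bricks/test$i server3:/bricks/test$i force
        gluster volume start test$i
    done

    # Stop every volume except test1 so the multiplexed brick process stays alive.
    for i in $(seq 2 100); do
        gluster --mode=script volume stop test$i
    done

    # Check which fds the surviving brick process still holds.
    brick_pid=$(pgrep -f glusterfsd | head -n1)
    ls -lrth /proc/$brick_pid/fd | grep ".glusterfs"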
Upstream patch: https://review.gluster.org/21235
Steps to re-create the issue:

There seems to be a race whereby, by the time fd_destroy() is called, the graph is already cleaned up. Because of this, the fds are not closed, since the xlator_release()/xlator_releasedir() functions don't get called.

I was able to consistently re-create the issue with the following change:

18:46:22 :) ⚡ git diff
diff --git a/xlators/protocol/server/src/server-helpers.c b/xlators/protocol/server/src/server-helpers.c
index c492ab164..29af4a946 100644
--- a/xlators/protocol/server/src/server-helpers.c
+++ b/xlators/protocol/server/src/server-helpers.c
@@ -249,6 +249,7 @@ server_connection_cleanup_flush_cbk (call_frame_t *frame, void *cookie,
         fd = frame->local;
         client = frame->root->client;
+        sleep (5);
         fd_unref (fd);
         frame->local = NULL;

Steps:
1) Start glusterd and set brick-mux to on.
2) Create 2 plain replicate volumes and set open-behind off on them.
3) Mount one of the volumes and, on the mount, execute "exec >a".
4) Confirm that the file is opened on the bricks.
5) Execute "gluster volume stop <volname>".
6) Wait for a minute, just to be on the safer side, and check "ls /proc/<pid-of-brick>/fd". It still shows the file 'a'.
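For reference, a rough shell transcript of the steps above. The volume names (vol1, vol2), peer names (server1..server3), brick paths (/bricks/...), replica count, and mount point (/mnt/vol1) are all placeholders, not taken from the original report.

    gluster volume set all cluster.brick-multiplex on

    # Two plain replicate volumes; peer names and brick paths are placeholders.
    gluster volume create vol1 replica 3 server1:/bricks/vol1 server2:/bricks/vol1 server3:/bricks/vol1 force
    gluster volume create vol2 replica 3 server1:/bricks/vol2 server2:/bricks/vol2 server3:/bricks/vol2 force
    gluster volume start vol1
    gluster volume start vol2
    gluster volume set vol1 open-behind off
    gluster volume set vol2 open-behind off

    # On a client: mount vol1 and keep an fd open on file 'a'.
    mount -t glusterfs server1:/vol1 /mnt/vol1
    cd /mnt/vol1
    exec >a          # the shell now holds 'a' open on the mount

    # From another shell on a brick node: confirm the brick holds 'a',
    # stop the volume, wait, and check again.
    brick_pid=$(pgrep -f glusterfsd | head -n1)
    ls -l /proc/$brick_pid/fd | grep '/a$'
    gluster --mode=script volume stop vol1
    sleep 60
    ls -l /proc/$brick_pid/fd | grep '/a$'   # the fd for 'a' is still listed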