Description of problem:
-----------------------
The glusterfs client process is not terminated for every unmount executed on the volume.

Version-Release number of selected component (if applicable):
------------------------------------------------------------
3.3.0qa45

How reproducible:
-----------------
Often

Steps to Reproduce:
-------------------
1. Create a distribute-replicate volume (3x3).
2. From Node1 and Node2, continuously mount and unmount the volume with type "fuse/nfs".

Actual results:
---------------
Node1:
------
[05/31/12 - 05:54:30 root@ARF-Client1 ~]# mount
/dev/mapper/vg_dhcp159180-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/vdb1 on /opt/export type xfs (rw)

[05/31/12 - 05:54:33 root@ARF-Client1 ~]# ps -ef | grep gluster
root      2968     1 11 05:26 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3155     1 11 05:27 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3247     1 12 05:28 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3340     1 12 05:29 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3941     1 15 05:34 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      4753     1  0 05:40 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      4815  4794  0 05:54 pts/0    00:00:00 grep gluster
[05/31/12 - 05:54:40 root@ARF-Client1 ~]#

Node2:
------
[05/31/12 - 05:56:17 root@AFR-Client2 ~]# mount
/dev/mapper/vg_dhcp159192-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

[05/31/12 - 05:56:19 root@AFR-Client2 ~]# ps -ef | grep gluster
root     13157     1  0 05:24 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     13354     1  0 05:25 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     13827     1  0 05:29 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     13923     1  0 05:30 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14148     1  0 05:33 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14543     1  0 05:36 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14687     1  0 05:37 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14782 14759  0 05:56 pts/0    00:00:00 grep gluster

Expected results:
-----------------
For every unmount, the glusterfs process should be terminated.
Additional info:
----------------
[05/31/12 - 06:13:34 root@AFR-Server1 ~]# gluster v info

Volume Name: dstore
Type: Distributed-Replicate
Volume ID: ebb5f2a8-b35c-4583-855b-65814c5a1b6e
Status: Started
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: 10.16.159.184:/export_b1/dir1
Brick2: 10.16.159.188:/export_b1/dir1
Brick3: 10.16.159.196:/export_b1/dir1
Brick4: 10.16.159.184:/export_c1/dir1
Brick5: 10.16.159.188:/export_c1/dir1
Brick6: 10.16.159.196:/export_c1/dir1
Brick7: 10.16.159.184:/export_d1/dir1
Brick8: 10.16.159.188:/export_d1/dir1
Brick9: 10.16.159.196:/export_d1/dir1
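The mount/unmount cycle from "Steps to Reproduce" can be sketched as a small shell script. This is only a sketch built from the transcript above (server, volume, and mountpoint are taken from the ps output); `leftover_count` and `repro_loop` are hypothetical helper names, not gluster tooling:

```shell
#!/bin/sh
# leftover_count: given `ps -ef` output on stdin, count glusterfs client
# processes whose command line ends with the mountpoint passed as $1.
leftover_count() {
    grep -c "glusterfs.*$1\$"
}

# repro_loop: the continuous mount/unmount cycle from the steps above.
# Requires root and a started gluster volume, so it is defined here but
# not invoked.
repro_loop() {
    server=10.16.159.184 vol=dstore mnt=/mnt/gfsc1
    i=0
    while [ "$i" -lt 20 ]; do
        mount -t glusterfs "$server:/$vol" "$mnt"
        umount "$mnt"
        i=$((i + 1))
    done
    # On a fixed build this prints 0; on the affected build the count
    # grows with each iteration, as in the ps output above.
    ps -ef | leftover_count "$mnt"
}
```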
Interesting... Shwetha, can you attach gdb to one of these processes and see where it is hung?

gdb -p <PID>
(gdb) thread apply all bt full

That will help to corner the issue.
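A non-interactive form of the same gdb invocation, for collecting the trace in one shot (a sketch: the output path is illustrative, and the PID would be one of the lingering clients, e.g. 2968 from the ps output above):

```shell
# dump_backtrace: attach gdb to a lingering glusterfs client, record a
# full backtrace of every thread in batch mode, then detach. $1 is the
# PID of the client process.
dump_backtrace() {
    gdb -p "$1" --batch -ex 'thread apply all bt full' \
        > "/tmp/glusterfs-$1.bt" 2>&1
}
```

Run as root on the client and attach the resulting /tmp/glusterfs-&lt;PID&gt;.bt file to the bug.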
Let's consider 2 nodes, Node1 and Node2. On the volume, auth.allow is set to Node1. The mount from Node1 succeeds. The mount from Node2 fails and an error message is reported, but a glusterfs process is still started.

Steps to recreate the issue:
----------------------------
[05/31/12 - 08:20:02 root@AFR-Server1 ~]# gluster v create vol1 10.16.159.184:/export11
Creation of volume vol1 has been successful. Please start the volume to access data.

[05/31/12 - 08:23:40 root@AFR-Server1 ~]# gluster v set vol1 auth.allow 10.16.159.180
Set volume successful

[05/31/12 - 08:23:58 root@AFR-Server1 ~]# gluster v info

Volume Name: vol1
Type: Distribute
Volume ID: f90a7384-f5d7-4f13-970f-6db6a01afce6
Status: Created
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.16.159.184:/export11
Options Reconfigured:
auth.allow: 10.16.159.180

[05/31/12 - 08:28:05 root@AFR-Server1 ~]# gluster v start vol1
Starting volume vol1 has been successful

Client1: 10.16.159.180
----------------------
[05/31/12 - 08:28:19 root@ARF-Client1 ~]# mount -t glusterfs 10.16.159.184:/vol1 /mnt/gfsc1
[05/31/12 - 08:28:26 root@ARF-Client1 ~]# ps -ef | grep gluster
root     15141     1  0 08:28 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/vol1 --volfile-server=10.16.159.184 /mnt/gfsc1
root     15154  4794  0 08:28 pts/0    00:00:00 grep gluster

Client2: 10.16.159.192
----------------------
[05/31/12 - 08:28:33 root@AFR-Client2 ~]# mount -t glusterfs 10.16.159.184:/vol1 /mnt/gfsc1
Mount failed. Please check the log file for more details.
[05/31/12 - 08:28:40 root@AFR-Client2 ~]# ps -ef | grep gluster
root     23120     1  0 08:28 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/vol1 --volfile-server=10.16.159.184 /mnt/gfsc1
root     23134 14759  0 08:28 pts/0    00:00:00 grep gluster
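One way to spot this state automatically is to compare /proc/mounts against the process table: the "unmounted but client still running" combination seen on Client2 is the bug. A sketch (`mount_state` is a hypothetical helper, not part of gluster):

```shell
# mount_state: print whether $1 appears in /proc/mounts and whether a
# glusterfs client process for it is still running. After a rejected
# mount the buggy output is "unmounted running"; a clean failure would
# print "unmounted gone".
mount_state() {
    mnt=$1
    if grep -q " $mnt " /proc/mounts; then m=mounted; else m=unmounted; fi
    if ps -ef | grep -q "[g]lusterfs.*$mnt"; then p=running; else p=gone; fi
    echo "$m $p"
}
```

The `[g]lusterfs` pattern is the usual trick to keep the grep process itself out of the ps match.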
Can you please attach gdb to any one of these processes and provide the backtrace? Also, a statedump of any one of these processes would help.
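For the statedump part, a glusterfs process writes one on receiving SIGUSR1, by default into /var/run/gluster (both details are the usual glusterfs conventions; worth double-checking on the build under test):

```shell
# take_statedump: ask the glusterfs client process with PID $1 to dump
# its internal state. The dump file appears under /var/run/gluster
# (the directory must exist and the sender needs permission to signal
# the process).
take_statedump() {
    kill -USR1 "$1"
}
```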
Shwetha, not happening anymore in upstream testing. Please re-open if seen again.