Bug 854656 - glusterfs process not terminated after unmount
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.0
Hardware: Unspecified OS: Unspecified
Priority: medium Severity: high
Assigned To: Amar Tumballi
QA Contact: spandura
Depends On: 826975
Blocks:
 
Reported: 2012-09-05 09:49 EDT by Vidya Sakar
Modified: 2013-12-18 19:08 EST
CC List: 7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 826975
Environment:
Last Closed: 2013-09-23 18:33:19 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Vidya Sakar 2012-09-05 09:49:10 EDT
+++ This bug was initially created as a clone of Bug #826975 +++

Description of problem:
----------------------
The glusterfs client process is not always terminated when the volume is unmounted; repeated mount/unmount cycles leave stale client processes running.

Version-Release number of selected component (if applicable):
------------------------------------------------------------
3.3.0qa45

How reproducible:
-----------------
Often

Steps to Reproduce:
---------------------
1. Create a distributed-replicate volume (3x3).

2. From Node1 and Node2, continuously mount and unmount the volume using mount types "fuse" and "nfs" (a loop like the sketch below can drive this).
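
A loop along these lines can drive the repeated mount/unmount cycle (a hypothetical sketch; the server address, volume name, and mount point are taken from the transcripts below and should be adjusted to your setup):

# Hypothetical reproduction loop; adjust server, volume and mount point as needed
for i in $(seq 1 50); do
    mount -t glusterfs 10.16.159.184:/dstore /mnt/gfsc1
    sleep 2
    umount /mnt/gfsc1
done
# After the loop, no glusterfs client process should remain for this mount point
ps -ef | grep '[g]lusterfs'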

Actual results:
-----------------
[05/31/12 - 05:54:30 root@ARF-Client1 ~]# mount
/dev/mapper/vg_dhcp159180-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/vdb1 on /opt/export type xfs (rw)

[05/31/12 - 05:54:33 root@ARF-Client1 ~]# ps -ef | grep gluster
root      2968     1 11 05:26 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3155     1 11 05:27 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3247     1 12 05:28 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3340     1 12 05:29 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      3941     1 15 05:34 ?        00:03:13 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      4753     1  0 05:40 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root      4815  4794  0 05:54 pts/0    00:00:00 grep gluster
[05/31/12 - 05:54:40 root@ARF-Client1 ~]# 

Node2:-
-------
[05/31/12 - 05:56:17 root@AFR-Client2 ~]# mount
/dev/mapper/vg_dhcp159192-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

[05/31/12 - 05:56:19 root@AFR-Client2 ~]# ps -ef | grep gluster
root     13157     1  0 05:24 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     13354     1  0 05:25 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     13827     1  0 05:29 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     13923     1  0 05:30 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14148     1  0 05:33 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14543     1  0 05:36 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14687     1  0 05:37 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/dstore --volfile-server=10.16.159.184 /mnt/gfsc1
root     14782 14759  0 05:56 pts/0    00:00:00 grep gluster

Expected results:
For every unmount, the glusterfs process should be terminated. 
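
A quick post-unmount check along these lines (a sketch; the mount point is a placeholder) should report no leftover client processes:

umount /mnt/gfsc1
# The if-branch runs only when a glusterfs client for this mount point is still in the process table
if ps -ef | grep '[g]lusterfs.*/mnt/gfsc1'; then
    echo "stale glusterfs client process(es) still running"
else
    echo "no leftover glusterfs clients"
fi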

Additional info:
------------------

[05/31/12 - 06:13:34 root@AFR-Server1 ~]# gluster v info
 
Volume Name: dstore
Type: Distributed-Replicate
Volume ID: ebb5f2a8-b35c-4583-855b-65814c5a1b6e
Status: Started
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: 10.16.159.184:/export_b1/dir1
Brick2: 10.16.159.188:/export_b1/dir1
Brick3: 10.16.159.196:/export_b1/dir1
Brick4: 10.16.159.184:/export_c1/dir1
Brick5: 10.16.159.188:/export_c1/dir1
Brick6: 10.16.159.196:/export_c1/dir1
Brick7: 10.16.159.184:/export_d1/dir1
Brick8: 10.16.159.188:/export_d1/dir1
Brick9: 10.16.159.196:/export_d1/dir1

--- Additional comment from amarts@redhat.com on 2012-05-31 08:15:56 EDT ---

Interesting. Shwetha, can you attach to one of these processes and see where it is hung?

gdb -p <PID>, then at the prompt: (gdb) thread apply all bt full

That will help us corner the issue.
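
For a non-interactive capture, something along these lines should also work (a sketch; the PID and output path are placeholders):

# Dump full backtraces of all threads from the hung glusterfs client without an interactive session
gdb -p <PID> -batch -ex 'thread apply all bt full' > /tmp/glusterfs-bt.txt 2>&1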

--- Additional comment from shwetha.h.panduranga@redhat.com on 2012-05-31 08:35:04 EDT ---

Let's consider two nodes, Node1 and Node2. On the volume, auth.allow is set to allow only Node1.

Mounting from Node1 succeeds. Mounting from Node2 fails and an error message is reported, but a glusterfs client process is still started and left running.

Steps to recreate the issue:
----------------------------

[05/31/12 - 08:20:02 root@AFR-Server1 ~]# gluster v create vol1 10.16.159.184:/export11
Creation of volume vol1 has been successful. Please start the volume to access data.

[05/31/12 - 08:23:40 root@AFR-Server1 ~]# gluster v set vol1 auth.allow 10.16.159.180
Set volume successful

[05/31/12 - 08:23:58 root@AFR-Server1 ~]# gluster v info
 
Volume Name: vol1
Type: Distribute
Volume ID: f90a7384-f5d7-4f13-970f-6db6a01afce6
Status: Created
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.16.159.184:/export11
Options Reconfigured:
auth.allow: 10.16.159.180

[05/31/12 - 08:28:05 root@AFR-Server1 ~]# gluster v start vol1
Starting volume vol1 has been successful

Client1 :- 10.16.159.180
--------------------------
[05/31/12 - 08:28:19 root@ARF-Client1 ~]# mount -t glusterfs 10.16.159.184:/vol1 /mnt/gfsc1
[05/31/12 - 08:28:26 root@ARF-Client1 ~]# ps -ef | grep gluster
root     15141     1  0 08:28 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/vol1 --volfile-server=10.16.159.184 /mnt/gfsc1
root     15154  4794  0 08:28 pts/0    00:00:00 grep gluster


Client2 :- 10.16.159.192
-------------------------
[05/31/12 - 08:28:33 root@AFR-Client2 ~]# mount -t glusterfs 10.16.159.184:/vol1 /mnt/gfsc1
Mount failed. Please check the log file for more details.
[05/31/12 - 08:28:40 root@AFR-Client2 ~]# ps -ef | grep gluster
root     23120     1  0 08:28 ?        00:00:00 /usr/local/sbin/glusterfs --volfile-id=/vol1 --volfile-server=10.16.159.184 /mnt/gfsc1
root     23134 14759  0 08:28 pts/0    00:00:00 grep gluster
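
Until this is fixed, stale clients left behind by failed mounts can be cleaned up manually; a hedged sketch (the mount point is a placeholder, and mountpoint/pkill are assumed to be available):

# Hypothetical cleanup: kill glusterfs clients whose mount point is no longer mounted
MNT=/mnt/gfsc1
if ! mountpoint -q "$MNT"; then
    pkill -f "glusterfs .*${MNT}$"
fi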

--- Additional comment from sgowda@redhat.com on 2012-07-06 01:42:14 EDT ---

Can you please attach gdb to any one of these processes and provide the bt? A statedump of any one of these processes would also help.
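
On builds of this vintage, a client statedump can typically be triggered by sending SIGUSR1 to the process (an assumption; the dump file usually lands under /var/run/gluster or /tmp depending on the build):

# Ask the running glusterfs client to write a statedump, then look for the newest dump file
kill -USR1 <PID>
ls -lt /var/run/gluster /tmp 2>/dev/null | head
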
Comment 4 spandura 2013-07-09 07:14:50 EDT
Verified whether the bug exists on the build:

root@king [Jul-09-2013-16:43:15] >rpm -qa | grep glusterfs
glusterfs-fuse-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-server-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-devel-3.4.0.12rhs.beta3-1.el6rhs.x86_64


root@king [Jul-09-2013-16:43:16] >gluster --version
glusterfs 3.4.0.12rhs.beta3 built on Jul  6 2013 14:35:18


The issue no longer occurs on this build.
Comment 5 Scott Haines 2013-09-23 18:33:19 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
