Bug 763249 (GLUSTER-1517)

Summary: gluster volume stop - starts a new nfs server.
Product: [Community] GlusterFS Reporter: Lakshmipathi G <lakshmipathi>
Component: nfs Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: low    
Version: 3.1-alpha CC: gluster-bugs, shehjart, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Regression: RTP Mount Type: nfs

Description Lakshmipathi G 2010-09-03 09:33:41 UTC
While trying to reproduce this issue again, it worked. I will verify this with 3.1.0qa15 and update with the results.

Comment 1 Lakshmipathi G 2010-09-03 10:34:55 UTC
Created two volumes as follows:
gluster volume create DHT4 10.192.134.144:/mnt/d1 10.214.231.112:/mnt/d1 10.198.110.16:/mnt/d1 10.192.141.187:/mnt/d1
gluster volume create AFR4 10.192.134.144:/mnt/a1 10.214.231.112:/mnt/a1 10.198.110.16:/mnt/a1 10.192.141.187:/mnt/a1

After starting both of them:
#showmount -e
Export list for domU-12-31-39-0E-8E-31:
/DHT4 *
/AFR4 *
10.192.141.187#showmount -e 10.198.110.16
Export list for 10.198.110.16:
/DHT4 *
/AFR4 *
10.192.141.187#showmount -e 10.214.231.112 
Export list for 10.214.231.112:
/DHT4 *
/AFR4 *
10.192.141.187#showmount -e 10.192.134.144
Export list for 10.192.134.144:
/DHT4 *
/AFR4 *

10.214.231.112#ps aux | grep gluste
root      1679  0.0  0.0  56936  1836 ?        Ssl  02:45   0:01 /old_opt/3.0.4/sbin/glusterfs -f cfg.vol /opt/ -l /tmp/client.log
root      1852  0.0  0.1  66128 15444 ?        Ssl  06:16   0:00 glusterd
root      1888  0.0  0.7 140516 55224 ?        Ssl  06:23   0:00 /usr/local/sbin/glusterfs --xlator-option DHT4-server.listen-port=6971 -s localhost --volfile-id DHT4.10.214.231.112.mnt-d1 -p /etc/glusterd/vols/DHT4/run/10.214.231.112-mnt-d1.pid --brick-name /mnt/d1 --brick-port 6971 -l /etc/glusterd/logs/mnt-d1.log
root      1900  0.1  0.7 140516 55216 ?        Ssl  06:24   0:00 /usr/local/sbin/glusterfs --xlator-option AFR4-server.listen-port=6972 -s localhost --volfile-id AFR4.10.214.231.112.mnt-a1 -p /etc/glusterd/vols/AFR4/run/10.214.231.112-mnt-a1.pid --brick-name /mnt/a1 --brick-port 6972 -l /etc/glusterd/logs/mnt-a1.log
root      1904  0.4  1.5 251676 121192 ?       Ssl  06:24   0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid
root      1912  0.0  0.0   6068   648 pts/0    R+   06:24   0:00 grep gluste


Now stop one volume, DHT4:
#gluster volume stop DHT4
Stopping volume DHT4 has been successful

10.192.141.187#showmount -e
Export list for domU-12-31-39-0E-8E-31:
/AFR4 *
10.192.141.187#showmount -e 10.192.134.144
Export list for 10.192.134.144:
/AFR4 *
10.192.141.187#showmount -e 10.214.231.112 
Export list for 10.214.231.112:
/DHT4 *
/AFR4 *
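
The stale /DHT4 export on 10.214.231.112 can also be spotted mechanically. The following is only a rough sketch, assuming the `showmount -e` output of each server has already been captured as text; the function names `parse_exports` and `hosts_still_exporting` are made up for illustration and are not part of GlusterFS.

```python
from typing import Dict, List


def parse_exports(showmount_output: str) -> List[str]:
    """Return the export paths listed in one `showmount -e` output."""
    exports = []
    for line in showmount_output.splitlines():
        line = line.strip()
        # Export lines start with a path; the "Export list for ..." header does not.
        if line.startswith("/"):
            exports.append(line.split()[0])
    return exports


def hosts_still_exporting(outputs_by_host: Dict[str, str], volume: str) -> List[str]:
    """Return the hosts whose export list still contains the given (stopped) volume."""
    target = "/" + volume
    return sorted(
        host
        for host, output in outputs_by_host.items()
        if target in parse_exports(output)
    )
```

Fed with the two remote outputs captured after `gluster volume stop DHT4`, this reports only 10.214.231.112 as still exporting DHT4.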


ps shows two NFS server processes running (PIDs 1904 and 1914 below):

10.214.231.112#ps aux | grep gluste
root      1679  0.0  0.0  56936  1836 ?        Ssl  02:45   0:01 /old_opt/3.0.4/sbin/glusterfs -f cfg.vol /opt/ -l /tmp/client.log
root      1852  0.0  0.1  66148 15448 ?        Ssl  06:16   0:00 glusterd
root      1900  0.0  0.7 140508 55228 ?        Ssl  06:24   0:00 /usr/local/sbin/glusterfs --xlator-option AFR4-server.listen-port=6972 -s localhost --volfile-id AFR4.10.214.231.112.mnt-a1 -p /etc/glusterd/vols/AFR4/run/10.214.231.112-mnt-a1.pid --brick-name /mnt/a1 --brick-port 6972 -l /etc/glusterd/logs/mnt-a1.log
root      1904  0.2  1.5 252308 121716 ?       Ssl  06:24   0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid
root      1914  0.3  0.8 128280 63004 ?        Ssl  06:25   0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid
root      1919  0.0  0.0   6060   612 pts/0    S+   06:25   0:00 grep gluste
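
Both NFS server processes above were started with the same `-p /etc/glusterd/nfs/run/nfs.pid` argument. A small check for this condition could look like the sketch below. It assumes the `ps aux` column layout (PID in the second column), and `find_duplicate_pidfiles` is a hypothetical helper name, not an existing tool.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_pidfiles(ps_aux_output: str) -> Dict[str, List[int]]:
    """Map each `-p <pidfile>` argument to the PIDs using it, keeping only shared pidfiles."""
    pids_by_pidfile = defaultdict(list)
    for line in ps_aux_output.splitlines():
        fields = line.split()
        if "-p" not in fields:
            continue
        idx = fields.index("-p")
        if idx + 1 >= len(fields):
            continue  # "-p" was the last token, so no pidfile path follows
        try:
            pid = int(fields[1])  # `ps aux` prints the PID in the second column
        except ValueError:
            continue  # header line or unexpected layout
        pids_by_pidfile[fields[idx + 1]].append(pid)
    # Only a pidfile claimed by two or more processes indicates the problem.
    return {path: pids for path, pids in pids_by_pidfile.items() if len(pids) > 1}
```

For the listing above, this would report `/etc/glusterd/nfs/run/nfs.pid` with PIDs 1904 and 1914. The `ps ax` layout has the PID in the first column, so the index would need adjusting for that format.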

Comment 2 Lakshmipathi G 2010-09-03 11:08:26 UTC
This setup has two NFS servers running again. I added a brick and was testing it with DHT.

Log files: /share/ticket/1517

#ps ax|grep glust
 8681 ?        Ssl    0:00 glusterd
 8692 ?        Ssl    0:03 /usr/local/sbin/glusterfs --xlator-option afr4-server.listen-port=6971 -s localhost --volfile-id afr4.10.192.134.144.mnt-a1 -p /etc/glusterd/vols/afr4/run/10.192.134.144-mnt-a1.pid --brick-name /mnt/a1 --brick-port 6971 -l /etc/glusterd/logs/mnt-a1.log
 8719 ?        Ssl    0:03 /usr/local/sbin/glusterfs --xlator-option afr2-server.listen-port=6972 -s localhost --volfile-id afr2.10.192.134.144.mnt-d1 -p /etc/glusterd/vols/afr2/run/10.192.134.144-mnt-d1.pid --brick-name /mnt/d1 --brick-port 6972 -l /etc/glusterd/logs/mnt-d1.log
 8832 ?        Ssl    0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid
 8844 ?        Ssl    0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid
 8885 pts/0    S+     0:00 grep glust
25273 ?        Dsl    0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/vols/afr4/rb_client.vol /etc/glusterd/vols/afr4/rb_mount
25339 ?        Ssl    0:00 /old_opt/3.0.4/sbin/glusterfsd -f /root/laks/cfg.vol /opt -l client.log -L NONE

Comment 3 Lakshmipathi G 2010-09-04 06:38:24 UTC
(gdb) bt
#0  0x00002aaaab5738f9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aaaaad07006 in event_dispatch_epoll (event_pool=0x62c348) at event.c:838
#2  0x00002aaaaad0749b in event_dispatch (event_pool=0x62c348) at event.c:984
#3  0x0000000000405caa in main (argc=5, argv=0x7fff2175ffc8) at glusterfsd.c:1398
Missing separate debuginfos, use: debuginfo-install glibc.x86_64
(gdb) info threads
  4 Thread 1084229968 (LWP 19019)  0x00002aaaab577268 in do_sigwait () from /lib64/libpthread.so.0
  3 Thread 1085282640 (LWP 19020)  0x00002aaaab8590d8 in epoll_wait () from /lib64/libc.so.6
  2 Thread 1095772496 (LWP 19021)  0x00002aaaab81ec61 in nanosleep () from /lib64/libc.so.6
  1 Thread 46912513096960 (LWP 19018)  0x00002aaaab5738f9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) fr 1
#1  0x00002aaaaad07006 in event_dispatch_epoll (event_pool=0x62c348) at event.c:838
838					pthread_cond_wait (&event_pool->cond,
(gdb) fr 2
#2  0x00002aaaaad0749b in event_dispatch (event_pool=0x62c348) at event.c:984
984		ret = event_pool->ops->event_dispatch (event_pool);
(gdb) th 2
Ambiguous command "th 2": thbreak, thread.
(gdb) t 2
[Switching to thread 2 (Thread 1095772496 (LWP 19021))]#0  0x00002aaaab81ec61 in nanosleep () from /lib64/libc.so.6
(gdb) bt
#0  0x00002aaaab81ec61 in nanosleep () from /lib64/libc.so.6
#1  0x00002aaaab8525c4 in usleep () from /lib64/libc.so.6
#2  0x00002aaaaacf0406 in gf_timer_proc (ctx=0x62a010) at timer.c:182
#3  0x00002aaaab56f407 in start_thread () from /lib64/libpthread.so.0
#4  0x00002aaaab858b0d in clone () from /lib64/libc.so.6
(gdb) t 3
[Switching to thread 3 (Thread 1085282640 (LWP 19020))]#0  0x00002aaaab8590d8 in epoll_wait () from /lib64/libc.so.6
(gdb) bt
#0  0x00002aaaab8590d8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aaaaad070d9 in event_dispatch_epoll (event_pool=0x63d0a8) at event.c:859
#2  0x00002aaaaad0749b in event_dispatch (event_pool=0x63d0a8) at event.c:984
#3  0x00002aaaad6fbe62 in nfs_rpcsvc_stage_proc (arg=0x63d068) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#4  0x00002aaaab56f407 in start_thread () from /lib64/libpthread.so.0
#5  0x00002aaaab858b0d in clone () from /lib64/libc.so.6
(gdb) t 4
[Switching to thread 4 (Thread 1084229968 (LWP 19019))]#0  0x00002aaaab577268 in do_sigwait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002aaaab577268 in do_sigwait () from /lib64/libpthread.so.0
#1  0x00002aaaab57730d in sigwait () from /lib64/libpthread.so.0
#2  0x0000000000405584 in glusterfs_sigwaiter (arg=0x7fff2175fdd0) at glusterfsd.c:1161
#3  0x00002aaaab56f407 in start_thread () from /lib64/libpthread.so.0
#4  0x00002aaaab858b0d in clone () from /lib64/libc.so.6
(gdb) thread
[Current thread is 4 (Thread 1084229968 (LWP 19019))]
(gdb) threads
Undefined command: "threads".  Try "help".
(gdb) info threads
* 4 Thread 1084229968 (LWP 19019)  0x00002aaaab577268 in do_sigwait () from /lib64/libpthread.so.0
  3 Thread 1085282640 (LWP 19020)  0x00002aaaab8590d8 in epoll_wait () from /lib64/libc.so.6
  2 Thread 1095772496 (LWP 19021)  0x00002aaaab81ec61 in nanosleep () from /lib64/libc.so.6
  1 Thread 46912513096960 (LWP 19018)  0x00002aaaab5738f9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) quit
The program is running.  Quit anyway (and detach it)? (y or n) y

Comment 4 Vijay Bellur 2010-09-06 06:59:31 UTC
PATCH: http://patches.gluster.com/patch/4576 in master (protocol/client: ignore rpc_clnt_destroy as temp fix for sigterm handling)

Comment 5 Shehjar Tikoo 2010-09-07 03:50:16 UTC
*** Bug 1519 has been marked as a duplicate of this bug. ***

Comment 6 Vijay Bellur 2010-09-23 02:29:02 UTC
PATCH: http://patches.gluster.com/patch/4936 in master (glusterfsd: destroy mgmt in cleanup)

Comment 7 Vijay Bellur 2010-09-24 07:54:07 UTC
PATCH: http://patches.gluster.com/patch/4956 in master (mgmt/glusterd: add option to force kill gnfs process)
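
The patch titles above mention SIGTERM handling and an option to force kill the gnfs process. The general pattern of asking a process to exit and escalating to a forced kill if it does not can be sketched as follows. This is only an illustration in Python and not the glusterd change itself, which is not shown in this report; `stop_with_escalation` and `grace_seconds` are hypothetical names.

```python
import subprocess


def stop_with_escalation(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a child process to exit, then force-kill it if it does not exit in time."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # The process did not exit on SIGTERM within the grace period.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

Starting a replacement server only after this returns would avoid having two processes alive at the same time, assuming the old server is a direct child of the caller.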