Tested with 3.1.0qa19. As the transcript below shows, glusterd is killed first and then started again, after which the currently running volume is stopped and started again. Afterwards there are two NFS processes (a condensed reproduction sketch follows the transcript).
# gluster volume info

Volume Name: dht
Type: Distribute
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.192.134.144:/mnt/a1
Brick2: 10.192.141.187:/mnt/a1
Brick3: 10.214.231.112:/mnt/a1

10.192.141.187# ps aux | grep gluster
root 19607 0.0 0.1 73320 12748 ? Ssl 02:37 0:00 glusterd
root 19624 0.0 0.7 208540 58168 ? Ssl 02:38 0:00 /usr/local/sbin/glusterfs --xlator-option dht-server.listen-port=6971 -s localhost --volfile-id dht.10.192.141.187.mnt-a1 -p /etc/glusterd/vols/dht/run/10.192.141.187-mnt-a1.pid --brick-name /mnt/a1 --brick-port 6971 -l /etc/glusterd/logs/mnt-a1.log
root 19652 0.0 0.8 129848 64712 ? Ssl 02:43 0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid
root 19701 0.0 0.0 6060 604 pts/0 R+ 02:51 0:00 grep gluster
root 32072 0.0 1.0 258600 82448 ? Ssl Aug16 3:08 /old_opt/3.0.4/sbin/glusterfsd -f /root/laks/cfg.vol /opt -l client.log -L NONE

kill now -
10.192.141.187# killall glusterd

restart -
10.192.141.187# glusterd

10.192.141.187# gluster volume stop dht
Stopping volume will make its data inaccessible. Do you want to Continue? (y/n) y
Stopping volume dht has been successful

10.192.141.187# gluster volume start dht
Starting volume dht has been successful

10.192.141.187# gluster volume info

Volume Name: dht
Type: Distribute
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 10.192.134.144:/mnt/a1
Brick2: 10.192.141.187:/mnt/a1
Brick3: 10.214.231.112:/mnt/a1

10.192.141.187# ps aux | grep gluster
root 19652 0.0 0.8 129848 64712 ? Ssl 02:43 0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid <<<<<<<<<<<<<<<<<<<1
root 19704 1.5 0.1 63080 12700 ? Ssl 02:51 0:00 glusterd
root 19714 0.3 0.7 143012 57680 ? Ssl 02:51 0:00 /usr/local/sbin/glusterfs --xlator-option dht-server.listen-port=6971 -s localhost --volfile-id dht.10.192.141.187.mnt-a1 -p /etc/glusterd/vols/dht/run/10.192.141.187-mnt-a1.pid --brick-name /mnt/a1 --brick-port 6971 -l /etc/glusterd/logs/mnt-a1.log
root 19718 0.6 0.7 127872 62740 ? Ssl 02:51 0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid <<<<<<<<<<<<<<<<<<<2
root 19729 0.0 0.0 6064 612 pts/0 S+ 02:51 0:00 grep gluster
root 32072 0.0 1.0 258600 82448 ? Ssl Aug16 3:08 /old_opt/3.0.4/sbin/glusterfsd -f /root/laks/cfg.vol /opt -l client.log -L NONE
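For reference, the steps in the transcript above condense to the sketch below. Volume name, host, and commands are taken from the transcript itself; it assumes 'dht' is the only started volume and is run on one of its peers (10.192.141.187 above).

killall glusterd               # kill only the management daemon; brick and NFS processes keep running
glusterd                       # start it again
gluster volume stop dht        # answer 'y' at the confirmation prompt
gluster volume start dht
ps aux | grep 'nfs-server.vol' # affected builds now list two NFS server processes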
When glusterd restarts, the uuid in the brickinfo will be null; this is why the stop does not kill the NFS process that is running. In the latest build this condition is checked and the bricks are resolved before starting or stopping. I tested this case again and it works fine; a scripted version of the check follows the transcripts below.

pranith @ ~/workspace/1repo 13:07:04 :( $ ps aux | grep gluster
root 5866 0.0 1.3 69128 27924 pts/2 S 11:38 0:00 gdb glusterd
root 13437 0.4 0.4 49956 9104 pts/2 Sl+ 13:06 0:00 /usr/local/sbin/glusterd --debug
root 13439 0.0 0.4 57096 10044 pts/0 Sl+ 13:06 0:00 gluster
root 13445 0.7 2.9 136888 59596 ? Ssl 13:07 0:00 /usr/local/sbin/glusterfs --xlator-option pranith-server.listen-port=6971 -s localhost --volfile-id pranith.pranith-laptop.home-export6996 -p /etc/glusterd/vols/pranith/run/pranith-laptop-home-export6996.pid --brick-name /home/export6996 --brick-port 6971 -l /etc/glusterd/logs/bricks/home-export6996.log
root 13449 0.6 3.0 120052 61084 ? Ssl 13:07 0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /etc/glusterd/logs/nfs.log
pranith 13458 0.0 0.0 7624 896 pts/3 S+ 13:07 0:00 grep --color=auto gluster

pranith @ ~/workspace/1repo 13:07:11 :) $ ps aux | grep gluster
root 5866 0.0 1.3 69128 27924 pts/2 S 11:38 0:00 gdb glusterd
root 13439 0.0 0.4 57100 10064 pts/0 Sl+ 13:06 0:00 gluster
root 13459 0.8 0.4 50088 9044 pts/2 Sl+ 13:07 0:00 /usr/local/sbin/glusterd --debug
pranith 13463 0.0 0.0 7624 896 pts/3 S+ 13:07 0:00 grep --color=auto gluster

pranith @ ~/workspace/1repo 13:07:54 :) $ ps aux | grep gluster
root 5866 0.0 1.3 69128 27924 pts/2 S 11:38 0:00 gdb glusterd
root 13439 0.0 0.4 57100 10072 pts/0 Sl+ 13:06 0:00 gluster
root 13459 0.4 0.4 50088 9088 pts/2 Sl+ 13:07 0:00 /usr/local/sbin/glusterd --debug
root 13465 0.5 2.9 136888 59596 ? Ssl 13:08 0:00 /usr/local/sbin/glusterfs --xlator-option pranith-server.listen-port=6971 -s localhost --volfile-id pranith.pranith-laptop.home-export6996 -p /etc/glusterd/vols/pranith/run/pranith-laptop-home-export6996.pid --brick-name /home/export6996 --brick-port 6971 -l /etc/glusterd/logs/bricks/home-export6996.log
root 13469 0.6 3.0 120052 61084 ? Ssl 13:08 0:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /etc/glusterd/logs/nfs.log
pranith 13479 0.0 0.0 7624 896 pts/3 S+ 13:08 0:00 grep --color=auto gluster
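For anyone retesting, the manual check above can be scripted roughly as follows. This is a sketch under the assumptions of the transcript: 'pranith' is the only started volume on this machine, and the NFS server shows up in the process list with 'nfs-server.vol' on its command line.

killall glusterd && glusterd   # restart only the management daemon
gluster volume stop pranith    # answer 'y' at the confirmation prompt
sleep 2                        # give the brick and NFS processes a moment to exit
if pgrep -f 'nfs-server.vol' > /dev/null; then
    echo "BUG: a stale NFS server survived the volume stop"
else
    echo "OK: the NFS server was stopped along with the volume"
fi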
The analysis above does not apply to this bug, as it was logged after the fix. Since it is working fine for me, I am moving it to works-for-me. If it is seen again, we can retest.