+++ This bug was initially created as a clone of Bug #1434448 +++

Description of problem:
=======================
After enabling brick multiplexing, I killed the brick process on one of the
nodes (with multiplexing, this single process is shared by all bricks of all
volumes on that node). The process does get killed, and every brick then
shows its Online status as "N" and its port numbers as "N/A"; however,
volume status still reports the old PID of the killed process. That PID
should also be shown as "N/A".

Before killing the brick process (grep'ing only for bricks on this local
node; the columns are TCP Port, RDMA Port, Online, and Pid):

[root@dhcp35-215 bricks]# gluster v status | grep 215
Brick 10.70.35.215:/rhs/brick3/cross3       49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick4/cross3       49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick1/ecvol        49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick2/ecvol        49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick3/ecvol        49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick4/ecvol        49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick1/ecx          49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick2/ecx          49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick3/ecx          49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick4/ecx          49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick3/rep2         49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick4/rep2         49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick3/rep3         49152     0          Y       13072
Brick 10.70.35.215:/rhs/brick4/rep3         49152     0          Y       13072

[root@dhcp35-215 bricks]# kill -9 13072

After killing the brick process:

[root@dhcp35-215 bricks]# gluster v status | grep 215
Brick 10.70.35.215:/rhs/brick3/cross3       N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick4/cross3       N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick1/ecvol        N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick2/ecvol        N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick3/ecvol        N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick4/ecvol        N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick1/ecx          N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick2/ecx          N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick3/ecx          N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick4/ecx          N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick3/rep2         N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick4/rep2         N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick3/rep3         N/A       N/A        N       13072
Brick 10.70.35.215:/rhs/brick4/rep3         N/A       N/A        N       13072

The process is confirmed gone (the only match is the grep itself), yet its
PID is still reported in volume status:

[root@dhcp35-215 bricks]# ps -ef | grep 13072
root      2258 21234  0 19:35 pts/0    00:00:00 grep --color=auto 13072
[root@dhcp35-215 bricks]#

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-libs-3.10.0-1.el7.x86_64
glusterfs-api-3.10.0-1.el7.x86_64
glusterfs-rdma-3.10.0-1.el7.x86_64
glusterfs-3.10.0-1.el7.x86_64
python2-gluster-3.10.0-1.el7.x86_64
glusterfs-fuse-3.10.0-1.el7.x86_64
glusterfs-server-3.10.0-1.el7.x86_64
glusterfs-geo-replication-3.10.0-1.el7.x86_64
glusterfs-extra-xlators-3.10.0-1.el7.x86_64
glusterfs-client-xlators-3.10.0-1.el7.x86_64
glusterfs-cli-3.10.0-1.el7.x86_64

How reproducible:
=================
Always

Steps to Reproduce:
===================
1. Enable the brick multiplexing feature.
2. Create one or more volumes and start them.
3. Note that all bricks hosted on the same node have the same PID.
4. Select a node and kill that PID.
5. Issue "gluster volume status".

Actual results:
===============
Volume status still shows the old PID against each brick even though that
process has been killed.

Expected results:
=================
The PID must be shown as N/A.

--- Additional comment from Jeff Darcy on 2017-03-21 11:16:58 EDT ---

I would say that killing a process is an invalid test, but this probably
needs to be fixed anyway.
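A note on the verification above: "ps -ef | grep <pid>" only pattern-matches
the process table, while the direct test for whether a PID still exists is
kill(pid, 0), which delivers no signal and only performs error checking.
Below is a minimal, self-contained C sketch of such a probe (illustrative
only, not gluster source; the hard-coded PID is the one from this report):

/* pidcheck.c - minimal PID liveness probe (illustrative sketch, not
 * gluster source). kill(pid, 0) sends no signal; it only checks whether
 * a process with that PID exists. Build with: cc -o pidcheck pidcheck.c */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

static int pid_is_running(pid_t pid)
{
        if (pid <= 0)
                return 0;
        if (kill(pid, 0) == 0)
                return 1;          /* process exists and is signalable */
        return errno == EPERM;     /* exists, but owned by another user */
}

int main(void)
{
        pid_t pid = 13072;         /* the killed brick PID from the report */

        printf("PID %d is %s\n", (int) pid,
               pid_is_running(pid) ? "running" : "gone");
        return 0;
}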
REVIEW: https://review.gluster.org/16971 (glusterd: reset pid to -1 if brick is not online) posted (#1) for review on master by Atin Mukherjee (amukherj)
COMMIT: https://review.gluster.org/16971 committed in master by Jeff Darcy (jeff.us)

------

commit e325479cf222d2f25dbc0a4c6b80bfe5a7f09f43
Author: Atin Mukherjee <amukherj>
Date:   Thu Mar 30 14:47:45 2017 +0530

    glusterd: reset pid to -1 if brick is not online

    While populating brick details in the gluster volume status response
    payload, if a brick is not online then its pid should be reset back to
    -1 so that the volume status output doesn't show a pid that was never
    cleaned up, especially with brick multiplexing where multiple bricks
    belong to the same process.

    Change-Id: Iba346da9a8cb5b5f5dd38031d4c5ef2097808387
    BUG: 1437494
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: https://review.gluster.org/16971
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Gaurav Yadav <gyadav>
    Reviewed-by: Prashanth Pai <ppai>
    Reviewed-by: Jeff Darcy <jeff.us>
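In essence, the change makes glusterd stop trusting the PID read from the
brick's pidfile once the brick is known to be offline, resetting it to -1
(which the CLI then renders as N/A). The following self-contained C sketch
mirrors that logic with hypothetical stand-in names (brick_process_running,
fill_brick_status); it is not the actual patch, which lives in glusterd's
status-response population code:

/* Sketch of the fix's logic with hypothetical names; not the verbatim
 * glusterd change. Build with: cc -o statusfix statusfix.c */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

struct brick_status {
        pid_t pid;      /* -1 stands in for "no live process" (N/A) */
        int   online;   /* 1 => Y, 0 => N */
};

/* Stand-in for glusterd's pidfile-based service check. */
static int brick_process_running(pid_t pid)
{
        return pid > 0 && (kill(pid, 0) == 0 || errno == EPERM);
}

/* The fix, in essence: when the brick is not online, do not report the
 * stale PID recorded in the pidfile; reset it to -1 instead. */
static void fill_brick_status(struct brick_status *st, pid_t recorded_pid)
{
        if (brick_process_running(recorded_pid)) {
                st->online = 1;
                st->pid    = recorded_pid;
        } else {
                st->online = 0;
                st->pid    = -1;
        }
}

int main(void)
{
        struct brick_status st;

        fill_brick_status(&st, 13072);  /* PID of the killed brick process */
        printf("online=%c pid=%d\n", st.online ? 'Y' : 'N', (int) st.pid);
        return 0;
}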
This bug is being closed because a release has been made available that
should address the reported issue. If the problem is still not fixed with
glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1];
packages for several distributions should become available in the near
future. Keep an eye on the Gluster Users mailing list [2] and the update
infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/