Description of problem:
=======================
"gluster get-state" captures a non-zero port number for brick processes that are in the Stopped state.

Volume1.Brick1.path: 10.70.41.198:/bricks/brick0/q0
Volume1.Brick1.hostname: 10.70.41.198
Volume1.Brick1.port: 49152 <==========
Volume1.Brick1.rdma_port: 0
Volume1.Brick1.status: Stopped <===========
Volume1.Brick1.signedin: False
Volume1.Brick2.path: 10.70.41.217:/bricks/brick0/q1
Volume1.Brick2.hostname: 10.70.41.217
Volume1.Brick3.path: 10.70.41.198:/bricks/brick1/q2
Volume1.Brick3.hostname: 10.70.41.198
Volume1.Brick3.port: 49153
Volume1.Brick3.rdma_port: 0
Volume1.Brick3.status: Stopped
Volume1.Brick3.signedin: False
Volume1.Brick4.path: 10.70.41.217:/bricks/brick1/q3
Volume1.Brick4.hostname: 10.70.41.217

Actual results:
===============
"gluster get-state" captures the last-used port number even for bricks whose status is Stopped.

Expected results:
=================
The port should be reported as 0 for a stopped brick.
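To make the expected behaviour concrete, here is a minimal, hypothetical C sketch of a state-dump routine that reports port 0 for a brick that is not running. The struct, field, and function names below are purely illustrative assumptions, not glusterd's actual code.

#include <stdio.h>

typedef enum { BRICK_STOPPED = 0, BRICK_STARTED = 1 } brick_status_t;

typedef struct {
    char          *path;
    char          *hostname;
    int            port;
    brick_status_t status;
} brick_info_t;

static void
dump_brick_state (FILE *fp, int volnum, int bricknum, const brick_info_t *brick)
{
    /* Expected behaviour per this bug: a stopped brick should not
     * advertise the last port it listened on. */
    int port = (brick->status == BRICK_STARTED) ? brick->port : 0;

    fprintf (fp, "Volume%d.Brick%d.path: %s:%s\n", volnum, bricknum,
             brick->hostname, brick->path);
    fprintf (fp, "Volume%d.Brick%d.hostname: %s\n", volnum, bricknum,
             brick->hostname);
    fprintf (fp, "Volume%d.Brick%d.port: %d\n", volnum, bricknum, port);
    fprintf (fp, "Volume%d.Brick%d.status: %s\n", volnum, bricknum,
             brick->status == BRICK_STARTED ? "Started" : "Stopped");
}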
REVIEW: http://review.gluster.org/16064 (glusterd: reset port when a daemon is brought down) posted (#1) for review on master by Atin Mukherjee (amukherj)
REVIEW: http://review.gluster.org/16064 (glusterd: reset port when a daemon is brought down) posted (#2) for review on master by Atin Mukherjee (amukherj)
Initially I thought we could reset the port value to 0 when an RPC disconnect is received. But it turns out we can't: if we end up with stale port entries (in the case of an abrupt shutdown of a daemon, where pmap_signout is never received by glusterd), resetting the port to 0 would not help clean up those entries. It looks like we have to live with this problem.
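For illustration only, a minimal sketch of the approach considered above, resetting the recorded ports in an RPC-disconnect handler, and why it falls short. All names here are hypothetical placeholders, not glusterd's real API.

/* Hypothetical disconnect handler: clear the ports we recorded for the
 * brick connection when the RPC layer reports a disconnect. */
typedef struct {
    int port;
    int rdma_port;
    int signed_in;
} brick_conn_state_t;

static void
on_brick_rpc_disconnect (brick_conn_state_t *brick)
{
    /* Clearing the port here only covers clean disconnects.  If the
     * daemon dies abruptly, glusterd never receives a pmap_signout, so
     * the stale portmap entry survives regardless of this reset --
     * which is why the change was not pursued. */
    brick->port      = 0;
    brick->rdma_port = 0;
    brick->signed_in = 0;
}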
This is no longer a valid bug. With the brick multiplexing changes, this is taken care of.