Bug 1507749 - clean up port map on brick disconnect
Summary: clean up port map on brick disconnect
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On: 1503246
Blocks: 1503244 1507747 1526371 1530512
Reported: 2017-10-31 04:43 UTC by Atin Mukherjee
Modified: 2018-01-03 07:47 UTC (History)

Fixed In Version: glusterfs-3.10.8
Doc Text:
Clone Of: 1503246
Environment:
Last Closed: 2017-12-08 16:46:32 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Description Atin Mukherjee 2017-10-31 04:43:13 UTC
+++ This bug was initially created as a clone of Bug #1503246 +++

Description of problem:

GlusterD's portmap entry for a brick is cleaned up when the brick process initiates a PMAP_SIGNOUT event at shutdown. If the brick process crashes or is killed with SIGKILL, this event is never initiated and glusterd is left with a stale port entry. Because GlusterD traverses the portmap in both directions, forward for allocation and backward for registry search, glusterd can end up serving a stale port for a brick, which eventually causes clients to fail to connect to the bricks or other daemons.
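
The failure mode can be illustrated with a small Python model. This is only a sketch of the behavior described above, not glusterd's C implementation: the PortMap class, its method names, the brick names, and the base port 49152 are all assumptions made for illustration.

```python
BASE_PORT = 49152  # assumed start of the brick port range (illustrative)


class PortMap:
    """Toy port registry: port -> brick name, or None when the port is free."""

    def __init__(self, size=8):
        self.ports = {BASE_PORT + i: None for i in range(size)}

    def allocate(self, brick):
        # Forward traversal: the lowest free port wins.
        for port in sorted(self.ports):
            if self.ports[port] is None:
                self.ports[port] = brick
                return port
        raise RuntimeError("no free port")

    def search(self, brick):
        # Backward traversal: the highest port registered to the brick wins.
        for port in sorted(self.ports, reverse=True):
            if self.ports[port] == brick:
                return port
        return None

    def sign_out(self, port):
        # Models the cleanup performed when a brick signs out at shutdown.
        self.ports[port] = None


def stale_port_scenario():
    pmap = PortMap()
    port_a = pmap.allocate("brick-a")
    pmap.allocate("brick-b")
    pmap.sign_out(port_a)  # brick-a shuts down cleanly and frees its port
    # brick-b is killed with SIGKILL here: no sign-out, so its entry remains.
    new_port_b = pmap.allocate("brick-b")  # restart takes the lowest free port
    return new_port_b, pmap.search("brick-b")
```

In this model the restarted brick listens on the port freed by brick-a, while the backward search still returns the higher, stale entry, so a client asking for the brick's port would be sent to a port nobody is listening on.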

--- Additional comment from Worker Ant on 2017-10-17 12:07:18 EDT ---

REVIEW: https://review.gluster.org/18541 (glusterd: clean up portmap on brick disconnect) posted (#1) for review on master by Atin Mukherjee (amukherj@redhat.com)

--- Additional comment from Worker Ant on 2017-10-19 09:43:56 EDT ---

REVIEW: https://review.gluster.org/18541 (glusterd: clean up portmap on brick disconnect) posted (#2) for review on master by Atin Mukherjee (amukherj@redhat.com)

--- Additional comment from Worker Ant on 2017-10-31 00:37:16 EDT ---

COMMIT: https://review.gluster.org/18541 committed in master by  

------------- glusterd: clean up portmap on brick disconnect

GlusterD's portmap entry for a brick is cleaned up when the brick process
initiates a PMAP_SIGNOUT event at shutdown. If the brick process crashes or is
killed with SIGKILL, this event is never initiated and glusterd is left with a
stale port entry. Because GlusterD traverses the portmap in both directions,
forward for allocation and backward for registry search, glusterd can end up
serving a stale port for a brick, which eventually causes clients to fail to
connect to the bricks.

The solution is to clean up the port entry as part of the brick disconnect
event when the process is down. Although this makes the handling of the
PMAP_SIGNOUT event redundant in most cases, it serves as a safeguard that
keeps glusterd from running into stale port issues.

Change-Id: I04c5be6d11e772ee4de16caf56dbb37d5c944303
BUG: 1503246
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
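
The effect of the fix described in the commit message can be sketched with a simplified Python model. It is not the patch itself: the function names, the dictionary-based registry, the brick name, and the port numbers are assumptions chosen for illustration.

```python
def remove_brick(ports, brick):
    """Free every port registered to `brick`; harmless if called twice."""
    for port, owner in ports.items():
        if owner == brick:
            ports[port] = None


def on_brick_disconnect(ports, brick, process_is_running):
    # Clean up only when the brick process is really gone.
    if not process_is_running:
        remove_brick(ports, brick)


def on_sign_out(ports, brick):
    # Sign-out handling stays in place; after a disconnect cleanup it is a no-op.
    remove_brick(ports, brick)


def lookup(ports, brick):
    # Backward search: the highest port registered to the brick wins.
    for port in sorted(ports, reverse=True):
        if ports[port] == brick:
            return port
    return None


def restart_after_kill(cleanup_on_disconnect):
    # 49152 was freed earlier; brick-b held 49153 when it was killed.
    ports = {49152: None, 49153: "brick-b"}
    if cleanup_on_disconnect:
        on_brick_disconnect(ports, "brick-b", process_is_running=False)
        on_sign_out(ports, "brick-b")  # a late sign-out changes nothing
    # The restarted brick registers on the lowest free port.
    free_port = min(p for p in ports if ports[p] is None)
    ports[free_port] = "brick-b"
    return lookup(ports, "brick-b")
```

With cleanup on disconnect, the lookup returns the port the restarted brick actually uses; without it, the same lookup returns the stale entry left behind by the killed process.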

Comment 1 Worker Ant 2017-10-31 04:48:33 UTC
REVIEW: https://review.gluster.org/18588 (glusterd: clean up portmap on brick disconnect) posted (#1) for review on release-3.10 by Atin Mukherjee

Comment 2 Worker Ant 2017-10-31 18:08:00 UTC
COMMIT: https://review.gluster.org/18588 committed in release-3.10 by  

------------- glusterd: clean up portmap on brick disconnect

GlusterD's portmap entry for a brick is cleaned up when the brick process
initiates a PMAP_SIGNOUT event at shutdown. If the brick process crashes or is
killed with SIGKILL, this event is never initiated and glusterd is left with a
stale port entry. Because GlusterD traverses the portmap in both directions,
forward for allocation and backward for registry search, glusterd can end up
serving a stale port for a brick, which eventually causes clients to fail to
connect to the bricks.

The solution is to clean up the port entry as part of the brick disconnect
event when the process is down. Although this makes the handling of the
PMAP_SIGNOUT event redundant in most cases, it serves as a safeguard that
keeps glusterd from running into stale port issues.

This patch also brings in the changes from Change-Id
I705f101739ab1647ff52a92820d478354407264a, which are needed for
compilation to succeed.

> mainline patch : https://review.gluster.org/#/c/18541/
>                  https://review.gluster.org/#/c/17129/

Change-Id: I04c5be6d11e772ee4de16caf56dbb37d5c944303
BUG: 1507749
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>

Comment 3 Shyamsundar 2017-12-08 16:46:32 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.10.8, please open a new bug report.

glusterfs-3.10.8 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000086.html
[2] https://www.gluster.org/pipermail/gluster-users/

