Bug 1560957 - After performing remove-brick followed by add-brick operation, brick went offline
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1560955
 
Reported: 2018-03-27 11:12 UTC by Atin Mukherjee
Modified: 2018-06-20 18:03 UTC (History)
CC: 5 users

Fixed In Version: glusterfs-v4.1.0
Clone Of: 1560955
Environment:
Last Closed: 2018-06-20 18:03:13 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Atin Mukherjee 2018-03-27 11:13:26 UTC
Description of problem:

On a three-node cluster, enable brick multiplexing and create a replica 3 volume. Stop glusterd on node 3 and perform a replace-brick operation on node 1. The replace-brick succeeds; now start glusterd on node 3 again and perform an add-brick (3 bricks) on the volume. The add-brick succeeds, but one brick (on node 2) goes offline.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
2/2

Steps to Reproduce (a command-level sketch follows this list):
1. Create a replica 3 volume and mount it; start I/O
2. Stop glusterd on one node (N3)
3. Perform a replace-brick operation on node N1
4. Start glusterd on the node where it was stopped (N3)
5. Add 3 bricks to the volume; perform this operation on node N1
6. One brick on node N2 goes offline
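
A minimal command-level sketch of the steps above, assuming brick multiplexing is enabled cluster-wide; the volume name (testvol), the hostnames (n1/n2/n3) and the brick paths are placeholders, not taken from this report:

# enable brick multiplexing (cluster-wide option)
gluster volume set all cluster.brick-multiplex on

# step 1: create, start and mount a replica 3 volume, then start I/O on the mount
gluster volume create testvol replica 3 n1:/bricks/b1 n2:/bricks/b1 n3:/bricks/b1
gluster volume start testvol
mount -t glusterfs n1:/testvol /mnt

# step 2: on N3, stop glusterd
systemctl stop glusterd

# step 3: on N1, replace one of the bricks
gluster volume replace-brick testvol n1:/bricks/b1 n1:/bricks/b2 commit force

# step 4: on N3, start glusterd again
systemctl start glusterd

# step 5: on N1, add three more bricks
gluster volume add-brick testvol n1:/bricks/b3 n2:/bricks/b3 n3:/bricks/b3

# step 6: check the result; one brick on N2 shows up offline
gluster volume status testvol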

Actual results:
Brick on node (N2) is offline

Expected results:
All bricks should be online in the volume


RCA:

glusterd maintains a boolean flag 'port_registered' which is used to determine whether a brick has completed its portmap sign-in process. This flag is (re)set on pmap_signin and pmap_signout events. With brick multiplexing, this flag is the identifier that tells whether the very first brick with which the process was spawned has completed its sign-in. However, on a glusterd restart, when a brick is already identified as running, glusterd does a pmap_registry_bind to ensure its portmap table is updated, but this flag isn't set. That is fine in the non-brick-multiplex case, but it causes an issue because a subsequent brick attach can depend on this flag. With a replace-brick operation, I think this is more visible, as the replacement brick is attached first and then the old brick is brought down, so there is eventually no provision for a pmap_signin here: with brick multiplexing, only the very first brick performs the pmap sign-in.
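
For context on the observable effect (the volume name and commands below are illustrative, not from this report): with brick multiplexing enabled, all bricks hosted on a node are served by a single glusterfsd process, so only the first brick performs the pmap sign-in, and a brick that never completes it shows up with no port in the status output.

# with brick multiplexing, bricks on a node share one glusterfsd process
pgrep -c glusterfsd          # typically 1 per node when brick mux is on
# the affected brick reports N/A for its port / Online state
gluster volume status testvol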

Comment 2 Atin Mukherjee 2018-03-31 05:54:18 UTC
Revised steps to reproduce (a command-level sketch follows the list):

1. Create and start a volume (with more than one brick)
2. Remove the first brick
3. Add one more brick; this operation takes a very long time (because of this bug)
4. Check volume status; all bricks barring the newly added one report an N/A status.
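
A hedged sketch of the revised reproducer, again with placeholder volume and brick names, and with brick multiplexing enabled as in the original scenario:

# brick multiplexing enabled, as in the original scenario
gluster volume set all cluster.brick-multiplex on

# step 1: create and start a volume with more than one brick
gluster volume create testvol n1:/bricks/b1 n2:/bricks/b1
gluster volume start testvol

# step 2: remove the first brick
gluster volume remove-brick testvol n1:/bricks/b1 force

# step 3: add one more brick; with this bug the command takes a very long time
gluster volume add-brick testvol n1:/bricks/b2

# step 4: all bricks except the newly added one report N/A
gluster volume status testvol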

Comment 3 Worker Ant 2018-03-31 11:28:43 UTC
REVIEW: https://review.gluster.org/19800 (glusterd: mark port_registered to true for all running bricks with brick mux) posted (#2) for review on master by Atin Mukherjee

Comment 4 Worker Ant 2018-04-05 07:18:29 UTC
COMMIT: https://review.gluster.org/19800 committed in master by "Atin Mukherjee" <amukherj> with a commit message- glusterd: mark port_registered to true for all running bricks with brick mux

glusterd maintains a boolean flag 'port_registered' which is used to determine
if a brick has completed its portmap sign-in process. This flag is (re)set on
pmap_signin and pmap_signout events. In case of brick multiplexing this flag is
the identifier to determine if the very first brick with which the process is
spawned up has completed its sign-in process. However, in case of a glusterd
restart when a brick is already identified as running, glusterd does a
pmap_registry_bind to ensure its portmap table is updated, but this flag isn't
set, which is fine in the non-brick-multiplex case but causes an issue if the
very first brick which came as part of the process is replaced; then the
subsequent brick attach will fail. One way to validate this is to create and
start a volume, remove the first brick and then add-brick a new one. The
add-brick operation will take a very long time and after that the volume
status will show the status of all bricks apart from the new brick as down.

Solution is to set brickinfo->port_registered to true for all the
running bricks when brick multiplexing is enabled.

Change-Id: Ib0662d99d0fa66b1538947fd96b43f1cbc04e4ff
Fixes: bz#1560957
Signed-off-by: Atin Mukherjee <amukherj>

Comment 5 Shyamsundar 2018-06-20 18:03:13 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/

