Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1427461

Summary: Bricks take up new ports upon volume restart after add-brick op with brick mux enabled
Product: [Community] GlusterFS
Reporter: Samikshan Bairagya <sbairagy>
Component: glusterd
Assignee: Samikshan Bairagya <sbairagy>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 3.10
CC: amukherj, bugs, jdarcy, sbairagy
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: brick-multiplexing-testing
Fixed In Version: glusterfs-3.10.1
Doc Type: If docs needed, set a value
Story Points: ---
Clone Of: 1421590
Last Closed: 2017-04-05 00:01:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Depends On: 1421590

Description Samikshan Bairagya 2017-02-28 09:32:23 UTC
+++ This bug was initially created as a clone of Bug #1421590 +++

Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce and actual results:

The steps below use a single-node cluster, but the issue can be reproduced on a multi-node cluster as well. A rough shell transcript of the steps follows the list.

1. Enable brick multiplexing.
2. Create a volume with one brick.
3. Start the volume and check volume status. The brick will be using port 49152.
4. Add a brick to the volume and check volume status. Both bricks use port 49152.
5. Stop the volume and then start it again.
6. Check volume status. Both bricks now use port 49153.
7. Restart the volume again and check the status: the bricks now use port 49154. On every restart, the bricks take up the next port.
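
A rough shell transcript of the steps above; the volume name testvol, the host name host1, and the /bricks/* paths are placeholders chosen for illustration:

    # Assumed single-node setup; testvol, host1 and /bricks/* are placeholders.
    gluster volume set all cluster.brick-multiplex on

    gluster volume create testvol host1:/bricks/b0 force
    gluster volume start testvol
    gluster volume status testvol        # brick listed on port 49152

    gluster volume add-brick testvol host1:/bricks/b1 force
    gluster volume status testvol        # both bricks multiplexed on port 49152

    # Stop prompts for confirmation when run interactively; --mode=script skips it.
    gluster volume stop testvol
    gluster volume start testvol
    gluster volume status testvol        # bug: both bricks have moved to port 49153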

Expected results:
On restart, the bricks should continue to use the ports they were already using instead of taking up a new port.

--- Additional comment from Atin Mukherjee on 2017-02-13 03:48:38 EST ---

Samikshan - just to double check, is this issue not seen if brick mux is disabled?

--- Additional comment from Samikshan Bairagya on 2017-02-13 04:20:47 EST ---

(In reply to Atin Mukherjee from comment #1)
> Samikshan - just to double check, is this issue not seen if brick mux is
> disabled?

No. I tested this with brick mux disabled. This issue wasn't seen.

--- Additional comment from Jeff Darcy on 2017-02-13 09:53:07 EST ---

We're likely to encounter many of these "grey area" bugs which are not addressed by any existing requirements or tests.  Since fixing them is already likely to become a bottleneck, and manual testing is likely to make that even worse, it would be very helpful if other developers could provide the missing tests.  Any suggestions for how best to do that?
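
As one concrete illustration of what such a test could look like, here is a minimal sketch in the style of the existing tests/*.t prove scripts. The $CLI, $H0, $B0 and $V0 variables come from include.rc, but the brick_port helper, the include-path depth, and the lack of explicit online-state waits are assumptions for illustration, not the regression test that eventually landed:

    #!/bin/bash
    . $(dirname $0)/../../include.rc
    . $(dirname $0)/../../volume.rc

    cleanup;

    # Crude port lookup from 'volume status' output; the column layout is assumed.
    function brick_port {
            $CLI volume status $V0 | awk -v b="$1" '$0 ~ b {print $3; exit}'
    }

    TEST glusterd
    TEST $CLI volume set all cluster.brick-multiplex on
    TEST $CLI volume create $V0 $H0:$B0/${V0}0
    TEST $CLI volume start $V0

    # A real test would wait for the brick to come online (EXPECT_WITHIN)
    # before reading its port.
    port_before=$(brick_port "${V0}0")

    TEST $CLI volume add-brick $V0 $H0:$B0/${V0}1
    TEST $CLI volume stop $V0
    TEST $CLI volume start $V0

    # With the fix, both multiplexed bricks keep using the original port.
    EXPECT "$port_before" brick_port "${V0}0"
    EXPECT "$port_before" brick_port "${V0}1"

    cleanup;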

--- Additional comment from Worker Ant on 2017-02-20 08:14:38 EST ---

REVIEW: https://review.gluster.org/16689 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#1) for review on master by Samikshan Bairagya (samikshan)

--- Additional comment from Worker Ant on 2017-02-20 09:24:55 EST ---

REVIEW: https://review.gluster.org/16689 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#2) for review on master by Samikshan Bairagya (samikshan)

--- Additional comment from Worker Ant on 2017-02-21 09:48:07 EST ---

REVIEW: https://review.gluster.org/16689 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#3) for review on master by Samikshan Bairagya (samikshan)

--- Additional comment from Worker Ant on 2017-02-27 17:59:07 EST ---

COMMIT: https://review.gluster.org/16689 committed in master by Jeff Darcy (jdarcy) 
------
commit 1e3538baab7abc29ac329c78182b62558da56d98
Author: Samikshan Bairagya <samikshan>
Date:   Mon Feb 20 18:35:01 2017 +0530

    core: Clean up pmap registry up correctly on volume/brick stop
    
    This commit changes the following:
    1. In glusterfs_handle_terminate, send out individual pmap signout
    requests to glusterd for every brick.
    2. Add another parameter to glusterfs_mgmt_pmap_signout function to
    pass the brickname that needs to be removed from the pmap registry.
    3. Make sure pmap_registry_search doesn't break out from the loop
    iterating over the list of bricks per port if the first brick entry
    corresponding to a port is whitespaced out.
    4. Make sure the pmap registry entries are removed for other
    daemons like snapd.
    
    Change-Id: I69949874435b02699e5708dab811777ccb297174
    BUG: 1421590
    Signed-off-by: Samikshan Bairagya <samikshan>
    Reviewed-on: https://review.gluster.org/16689
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Gaurav Yadav <gyadav>
    Reviewed-by: Jeff Darcy <jdarcy>
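
To make point 3 of the commit message concrete, here is a simplified C sketch of the search behaviour being fixed: keep scanning a port's brick list past entries that have been blanked out instead of stopping at the first one. The structure and function names are invented for illustration and do not mirror the actual glusterd pmap code:

    #include <stddef.h>
    #include <string.h>

    /* Illustrative registry entry: each port keeps a NULL-terminated list of
     * brick paths; a removed brick is "whitespaced out" rather than compacted. */
    struct pmap_port_entry {
            int   port;
            char *bricknames[64];
    };

    /* Return the port serving 'brickname', or -1 if it is not registered. */
    static int
    pmap_search_sketch(struct pmap_port_entry *ports, int nports,
                       const char *brickname)
    {
            int p, i;

            for (p = 0; p < nports; p++) {
                    for (i = 0; i < 64 && ports[p].bricknames[i] != NULL; i++) {
                            const char *name = ports[p].bricknames[i];

                            if (name[0] == ' ' || name[0] == '\0')
                                    continue;  /* blanked-out slot: skip it, don't stop scanning */
                            if (strcmp(name, brickname) == 0)
                                    return ports[p].port;
                    }
            }
            return -1;
    }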

Comment 1 Worker Ant 2017-02-28 09:36:24 UTC
REVIEW: https://review.gluster.org/16786 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#1) for review on release-3.10 by Samikshan Bairagya (samikshan)

Comment 2 Worker Ant 2017-02-28 15:29:58 UTC
COMMIT: https://review.gluster.org/16786 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit b2fd61bd37d92839b1745e68c9c3a3e0ec38e0a7
Author: Samikshan Bairagya <samikshan>
Date:   Mon Feb 20 18:35:01 2017 +0530

    core: Clean up pmap registry up correctly on volume/brick stop
    
    This commit changes the following:
    1. In glusterfs_handle_terminate, send out individual pmap signout
    requests to glusterd for every brick.
    2. Add another parameter to glusterfs_mgmt_pmap_signout function to
    pass the brickname that needs to be removed from the pmap registry.
    3. Make sure pmap_registry_search doesn't break out from the loop
    iterating over the list of bricks per port if the first brick entry
    corresponding to a port is whitespaced out.
    4. Make sure the pmap registry entries are removed for other
    daemons like snapd.
    
    > Reviewed-on: https://review.gluster.org/16689
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Gaurav Yadav <gyadav>
    > Reviewed-by: Jeff Darcy <jdarcy>
    
    (cherry picked from commit 1e3538baab7abc29ac329c78182b62558da56d98)
    
    Change-Id: I69949874435b02699e5708dab811777ccb297174
    BUG: 1427461
    Signed-off-by: Samikshan Bairagya <samikshan>
    Reviewed-on: https://review.gluster.org/16786
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>

Comment 3 Shyamsundar 2017-04-05 00:01:42 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.1, please open a new bug report.

glusterfs-3.10.1 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-April/030494.html
[2] https://www.gluster.org/pipermail/gluster-users/