Bug 1427461 - Bricks take up new ports upon volume restart after add-brick op with brick mux enabled
Summary: Bricks take up new ports upon volume restart after add-brick op with brick mu...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Samikshan Bairagya
QA Contact:
URL:
Whiteboard: brick-multiplexing-testing
Depends On: 1421590
Blocks:
Reported: 2017-02-28 09:32 UTC by Samikshan Bairagya
Modified: 2017-04-05 00:01 UTC (History)

Fixed In Version: glusterfs-3.10.1
Clone Of: 1421590
Environment:
Last Closed: 2017-04-05 00:01:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Samikshan Bairagya 2017-02-28 09:32:23 UTC
+++ This bug was initially created as a clone of Bug #1421590 +++

Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce and actual results:

The steps below use a one-node cluster, but the issue can be reproduced on a multi-node cluster too.

1. Enable brick multiplexing.
2. Create a volume with one brick.
3. Start the volume and check the volume status. The brick will be using port 49152.
4. Add a brick to the volume and check the volume status. Both bricks use port 49152.
5. Stop the volume and then start it again.
6. Check the volume status. Both bricks now use port 49153.
7. If you restart the volume again and check the status, the bricks now use port 49154. With every restart, the bricks take up the next port.

Expected results:
After a volume restart, the bricks should reuse the port they were using before the restart and not take up a new port.
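
For illustration only, here is a toy model in C of how the observed port sequence (49152, 49153, 49154) could come about. It assumes that a multiplexed brick process signs out only one of its bricks on stop, so the port stays marked as in use. That is one plausible reading of the fix description, not something taken from the glusterd source. All names (`registry`, `alloc_port`, `start_volume`, `stop_volume`) are hypothetical.

```c
#include <assert.h>

#define BASE_PORT 49152 /* first brick port, as seen in the report */
#define NUM_PORTS 16    /* arbitrary size for this toy model */

/* Hypothetical registry: number of brick names still registered per port. */
struct registry {
    int bricks_on_port[NUM_PORTS];
};

/* Return the lowest port with no registered bricks, or -1 if none is left. */
static int alloc_port(const struct registry *r)
{
    for (int i = 0; i < NUM_PORTS; i++)
        if (r->bricks_on_port[i] == 0)
            return BASE_PORT + i;
    return -1;
}

/* Start a volume: all bricks are multiplexed on one freshly allocated port. */
static int start_volume(struct registry *r, int nbricks)
{
    int port = alloc_port(r);

    assert(port >= 0 && nbricks > 0);
    r->bricks_on_port[port - BASE_PORT] = nbricks;
    return port;
}

/* Stop a volume. A non-zero sign_out_all models one sign-out per brick
 * (the assumed fixed behaviour); zero models a single sign-out for the
 * whole process (the assumed buggy behaviour). */
static void stop_volume(struct registry *r, int port, int sign_out_all)
{
    int *n = &r->bricks_on_port[port - BASE_PORT];

    assert(*n > 0);
    *n = sign_out_all ? 0 : *n - 1;
}
```

In this model, a two-brick volume that is stopped with a single sign-out leaves one stale entry behind on its port, so each restart moves on to the next port. Stopping with one sign-out per brick lets the restart reuse the same port. A one-brick volume is unaffected either way, which would be consistent with the issue only showing up after add-brick.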

--- Additional comment from Atin Mukherjee on 2017-02-13 03:48:38 EST ---

Samikshan - just to double check, is this issue not seen if brick mux is disabled?

--- Additional comment from Samikshan Bairagya on 2017-02-13 04:20:47 EST ---

(In reply to Atin Mukherjee from comment #1)
> Samikshan - just to double check, is this issue not seen if brick mux is
> disabled?

No. I tested this with brick mux disabled. This issue wasn't seen.

--- Additional comment from Jeff Darcy on 2017-02-13 09:53:07 EST ---

We're likely to encounter many of these "grey area" bugs which are not addressed by any existing requirements or tests.  Since fixing them is already likely to become a bottleneck, and manual testing is likely to make that even worse, it would be very helpful if other developers could provide the missing tests.  Any suggestions for how best to do that?

--- Additional comment from Worker Ant on 2017-02-20 08:14:38 EST ---

REVIEW: https://review.gluster.org/16689 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#1) for review on master by Samikshan Bairagya (samikshan)

--- Additional comment from Worker Ant on 2017-02-20 09:24:55 EST ---

REVIEW: https://review.gluster.org/16689 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#2) for review on master by Samikshan Bairagya (samikshan)

--- Additional comment from Worker Ant on 2017-02-21 09:48:07 EST ---

REVIEW: https://review.gluster.org/16689 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#3) for review on master by Samikshan Bairagya (samikshan)

--- Additional comment from Worker Ant on 2017-02-27 17:59:07 EST ---

COMMIT: https://review.gluster.org/16689 committed in master by Jeff Darcy (jdarcy) 
------
commit 1e3538baab7abc29ac329c78182b62558da56d98
Author: Samikshan Bairagya <samikshan>
Date:   Mon Feb 20 18:35:01 2017 +0530

    core: Clean up pmap registry up correctly on volume/brick stop
    
    This commit changes the following:
    1. In glusterfs_handle_terminate, send out individual pmap signout
    requests to glusterd for every brick.
    2. Add another parameter to glusterfs_mgmt_pmap_signout function to
    pass the brickname that needs to be removed from the pmap registry.
    3. Make sure pmap_registry_search doesn't break out from the loop
    iterating over the list of bricks per port if the first brick entry
    corresponding to a port is whitespaced out.
    4. Make sure the pmap registry entries are removed for other
    daemons like snapd.
    
    Change-Id: I69949874435b02699e5708dab811777ccb297174
    BUG: 1421590
    Signed-off-by: Samikshan Bairagya <samikshan>
    Reviewed-on: https://review.gluster.org/16689
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Gaurav Yadav <gyadav>
    Reviewed-by: Jeff Darcy <jdarcy>
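
A minimal sketch in C of the idea behind point 3 above, assuming each port's registry entry stores the names of its multiplexed bricks as one space-separated string and that removing a brick overwrites its name with spaces. The `port_entry` struct and the functions `find_brick`, `entry_remove` and `entry_is_free` are hypothetical and are not the actual glusterd code. The point shown is that a search has to skip blanked-out slots and keep scanning instead of giving up when the first slot is empty.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified registry entry for one port: the names of all
 * bricks multiplexed on that port, separated by spaces. */
struct port_entry {
    char bricks[256];
};

/* Return a pointer to the start of `name` inside the entry, or NULL.
 * Runs of whitespace (including blanked-out names) are skipped, so a
 * removed first brick does not hide the bricks that follow it. */
static const char *find_brick(const struct port_entry *e, const char *name)
{
    size_t want = strlen(name);
    const char *p = e->bricks;

    assert(want > 0);
    while (*p != '\0') {
        while (isspace((unsigned char)*p))
            p++;
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if ((size_t)(p - start) == want && strncmp(start, name, want) == 0)
            return start;
    }
    return NULL;
}

/* Remove one brick by overwriting its name with spaces. Returns true if
 * the brick was registered on this port. */
static bool entry_remove(struct port_entry *e, const char *name)
{
    const char *hit = find_brick(e, name);

    if (hit == NULL)
        return false;
    memset(e->bricks + (hit - e->bricks), ' ', strlen(name));
    return true;
}

/* The port can be reused only once every brick name has been blanked. */
static bool entry_is_free(const struct port_entry *e)
{
    for (const char *p = e->bricks; *p != '\0'; p++)
        if (!isspace((unsigned char)*p))
            return false;
    return true;
}
```

With an entry holding `/bricks/b1 /bricks/b2`, removing `/bricks/b1` leaves `/bricks/b2` findable, and the entry only counts as free after `/bricks/b2` has been removed as well.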

Comment 1 Worker Ant 2017-02-28 09:36:24 UTC
REVIEW: https://review.gluster.org/16786 (core: Clean up pmap registry up correctly on volume/brick stop) posted (#1) for review on release-3.10 by Samikshan Bairagya (samikshan)

Comment 2 Worker Ant 2017-02-28 15:29:58 UTC
COMMIT: https://review.gluster.org/16786 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit b2fd61bd37d92839b1745e68c9c3a3e0ec38e0a7
Author: Samikshan Bairagya <samikshan>
Date:   Mon Feb 20 18:35:01 2017 +0530

    core: Clean up pmap registry up correctly on volume/brick stop
    
    This commit changes the following:
    1. In glusterfs_handle_terminate, send out individual pmap signout
    requests to glusterd for every brick.
    2. Add another parameter to glusterfs_mgmt_pmap_signout function to
    pass the brickname that needs to be removed from the pmap registry.
    3. Make sure pmap_registry_search doesn't break out from the loop
    iterating over the list of bricks per port if the first brick entry
    corresponding to a port is whitespaced out.
    4. Make sure the pmap registry entries are removed for other
    daemons like snapd.
    
    > Reviewed-on: https://review.gluster.org/16689
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Gaurav Yadav <gyadav>
    > Reviewed-by: Jeff Darcy <jdarcy>
    
    (cherry picked from commit 1e3538baab7abc29ac329c78182b62558da56d98)
    
    Change-Id: I69949874435b02699e5708dab811777ccb297174
    BUG: 1427461
    Signed-off-by: Samikshan Bairagya <samikshan>
    Reviewed-on: https://review.gluster.org/16786
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>

Comment 3 Shyamsundar 2017-04-05 00:01:42 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.1, please open a new bug report.

glusterfs-3.10.1 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-April/030494.html
[2] https://www.gluster.org/pipermail/gluster-users/

