Bug 1451573 - AFR returns the node uuid of the same node for every file in the replica
Summary: AFR returns the node uuid of the same node for every file in the replica
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1315781 1366817 1487042
Blocks: 1451561
 
Reported: 2017-05-17 05:36 UTC by Nithya Balachandran
Modified: 2017-08-31 06:33 UTC (History)

Fixed In Version: glusterfs-3.11.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1366817
Environment:
Last Closed: 2017-05-30 18:52:40 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Worker Ant 2017-05-17 05:44:37 UTC
REVIEW: https://review.gluster.org/17312 (cluster/dht: Rebalance on all nodes should migrate files) posted (#1) for review on release-3.11 by N Balachandran (nbalacha)

Comment 2 Worker Ant 2017-05-17 13:52:50 UTC
COMMIT: https://review.gluster.org/17312 committed in release-3.11 by Shyamsundar Ranganathan (srangana) 
------
commit 960e97922e9638bd76685fe72f4e11fd1271e185
Author: N Balachandran <nbalacha>
Date:   Wed May 10 21:26:28 2017 +0530

    cluster/dht: Rebalance on all nodes should migrate files
    
    Problem:
    Rebalance compares the node-uuid of a file against its own
    and migrates a file only if they match. However, the
    current behaviour in both AFR and EC is to return
    the node-uuid of the first brick in a replica set for all
    files. This means a single node ends up migrating all
    the files if the first brick of every replica set is on the
    same node.
    
    Fix:
    AFR and EC will return all node-uuids for the replica set.
    The rebalance process will divide the files to be migrated
    among all the nodes by hashing the gfid of the file and
    using that value to select a node to perform the migration.
    This patch makes the required DHT and tiering changes.
    
    Some tests in rebal-all-nodes-migrate.t will need to be
    uncommented once the AFR and EC changes are merged.
    
    > BUG: 1366817
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17239
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Amar Tumballi <amarts>
    > Reviewed-by: Jeff Darcy <jeff.us>
    > Reviewed-by: Shyamsundar Ranganathan <srangana>
    
    (cherry picked from commit b23bd3dbc2c153171d0bb1205e6804afe022a55f)
    Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c
    BUG: 1451573
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17312
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
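
The node selection described in the commit message above can be sketched as follows. This is an illustration only, not the DHT implementation: the function names, the use of CRC32 as the hash, and the choice to skip all-zero uuids before hashing are assumptions made for the example. The commit message states only that the gfid is hashed and the value is used to select a node.

```python
import uuid
import zlib

# All-zero uuid; the AFR change reports this for a brick that is down.
NULL_UUID = uuid.UUID(int=0)


def pick_migrator(gfid, node_uuids):
    """Pick the node that should migrate the file identified by gfid.

    node_uuids is the ordered list returned for the replica set.
    Hypothetical helper: CRC32 stands in for whatever hash DHT uses.
    """
    # Assumption: nodes whose brick is down (all-zero uuid) are skipped.
    candidates = [n for n in node_uuids if n != NULL_UUID]
    if not candidates:
        return None
    index = zlib.crc32(gfid.bytes) % len(candidates)
    return candidates[index]


def should_migrate(gfid, node_uuids, local_uuid):
    """True if the rebalance process on local_uuid owns this file."""
    return pick_migrator(gfid, node_uuids) == local_uuid
```

Because every node evaluates the same deterministic function over the same list, exactly one node claims each file, and the files are spread across the nodes instead of all landing on the node that hosts the first brick.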

Comment 3 Worker Ant 2017-05-17 17:52:21 UTC
REVIEW: https://review.gluster.org/17318 (cluster/ec: return all node uuids from all subvolumes) posted (#1) for review on release-3.11 by Pranith Kumar Karampuri (pkarampu)

Comment 4 Worker Ant 2017-05-19 04:43:21 UTC
REVIEW: https://review.gluster.org/17336 (cluster/afr: Return the list of node_uuids for the subvolume) posted (#1) for review on release-3.11 by Ravishankar N (ravishankar)

Comment 5 Worker Ant 2017-05-19 13:21:01 UTC
COMMIT: https://review.gluster.org/17336 committed in release-3.11 by Shyamsundar Ranganathan (srangana) 
------
commit 74aa9ab2f2f6b2514847457101642b359823fde5
Author: karthik-us <ksubrahm>
Date:   Wed Apr 19 18:04:46 2017 +0530

    cluster/afr: Return the list of node_uuids for the subvolume
    
    Problem:
    AFR was returning the node uuid of the first node for every file if
    the replica set was healthy, which was resulting in only one node
    migrating all the files.
    
    Fix:
    With this patch AFR returns the list of node_uuids to the upper layer,
    so that they can decide on which node to migrate which files, resulting
    in improved performance. Ordering of node uuids will be maintained based
    on the ordering of the bricks. If a brick is down, then the node uuid
    for that will be set to all zeros.
    
    >Reviewed-on: https://review.gluster.org/17084
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    > Tested-by: Pranith Kumar Karampuri <pkarampu>
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    (cherry picked from commit 0a50167c0a8f950f5a1c76442b6c9abea466200d)
    
    Change-Id: I73ee0f9898ae473584fdf487a2980d7a6db22f31
    BUG: 1451573
    Signed-off-by: karthik-us <ksubrahm>
    Reviewed-on: https://review.gluster.org/17336
    Tested-by: Ravishankar N <ravishankar>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
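
As a rough illustration of the AFR behaviour described in the commit message above (node uuids kept in brick order, all zeros for a brick that is down), the returned list could be modelled like this. The function name and the `(node_uuid, is_up)` input shape are hypothetical; the actual on-wire format of the list is not given in this report.

```python
import uuid

# All-zero uuid used as the placeholder for a brick that is down.
NULL_UUID = uuid.UUID(int=0)


def replica_node_uuids(bricks):
    """Return one node uuid per brick, preserving brick order.

    bricks: list of (node_uuid, is_up) tuples in brick order
    (a hypothetical representation chosen for this sketch).
    """
    return [node if is_up else NULL_UUID for node, is_up in bricks]
```

Keeping a placeholder for a down brick, rather than dropping the entry, preserves the positional mapping between list entries and bricks.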

Comment 6 Worker Ant 2017-05-22 22:41:22 UTC
COMMIT: https://review.gluster.org/17318 committed in release-3.11 by Shyamsundar Ranganathan (srangana) 
------
commit c674cf28ab2de312e6bc53d65e82827667e0cad3
Author: Xavier Hernandez <xhernandez>
Date:   Fri May 12 09:23:47 2017 +0200

    cluster/ec: return all node uuids from all subvolumes
    
    EC was returning the UUID of the brick with the smallest value. This had
    the side effect of not evenly balancing the load between bricks on
    rebalance operations.
    
    This patch modifies the common functions that combine multiple subvolume
    values into a single result to take into account the subvolume order
    and, optionally, other subvolumes that could be damaged.
    
    This makes it easier to add future features where brick order is important.
    It also makes it possible to easily identify the originating brick of each
    answer, in case some brick has a special meaning in the future.
    
     >Change-Id: Iee0a4da710b41224a6dc8e13fa8dcddb36c73a2f
     >BUG: 1366817
     >Signed-off-by: Xavier Hernandez <xhernandez>
     >Reviewed-on: https://review.gluster.org/17297
     >Smoke: Gluster Build System <jenkins.org>
     >NetBSD-regression: NetBSD Build System <jenkins.org>
     >CentOS-regression: Gluster Build System <jenkins.org>
     >Reviewed-by: Ashish Pandey <aspandey>
     >Reviewed-by: Pranith Kumar Karampuri <pkarampu>
     >(cherry picked from commit bcc34ce05c1be76dae42838d55c15d3af5f80e48)
    
    Change-Id: I055713c3c25b7ba99248be880414fb0e8f36a67e
    BUG: 1451573
    Signed-off-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-on: https://review.gluster.org/17318
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 7 Shyamsundar 2017-05-30 18:52:40 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/

