Bug 1312722

Summary: DHT - rebalance - when any brick/sub-vol is down and rebalance is not performing any action (fixing layout or migrating data), it should not say 'Starting rebalance on volume <vol-name> has been successful'.
Product: [Community] GlusterFS
Component: distribute
Version: 3.7.9
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Reporter: Sakshi <sabansal>
Assignee: Sakshi <sabansal>
CC: amukherj, bugs, mzywusko, nbalacha, racpatel, rhs-bugs, smohan, vbellur
Keywords: Triaged
Type: Bug
Doc Type: Bug Fix
Fixed In Version: glusterfs-3.7.12
Clone Of: 1224857
Bug Depends On: 1224857
Last Closed: 2016-06-28 12:13:55 UTC

Description Sakshi 2016-02-29 03:54:32 UTC
+++ This bug was initially created as a clone of Bug #1224857 +++

+++ This bug was initially created as a clone of Bug #890637 +++

Description of problem:
DHT - rebalance - when any brick/sub-vol is down, rebalance will not perform any action, but the CLI says 'Starting rebalance on volume <vol-name> has been successful'.


Version-Release number of selected component (if applicable):
3.3.0.5rhs-40

How reproducible:
always

Steps to Reproduce:
1. Create a distributed volume with 3 or more sub-volumes on multiple servers and start that volume.

2. FUSE-mount the volume from client-1 using “mount -t glusterfs server:/<volume> <client-1_mount_point>”.

3. From the mount point, create some directories and files.
4. Bring one of the sub-volumes down.
[root@localhost ~]# gluster volume status
Status of volume: defect
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.35.173:/home/def1				24011	Y	6440
Brick 10.70.35.180:/home/def1				24011	Y	28882
Brick 10.70.35.170:/home/def1				24011	N	27711
NFS Server on localhost					38467	Y	6608
NFS Server on 10.70.35.170				38467	Y	6153
NFS Server on 10.70.35.173				38467	Y	6446


5. Execute rebalance.
[root@localhost ~]# gluster volume rebalance defect fix-layout start
Starting rebalance on volume defect has been successful

6. Check the rebalance status and the log.
[root@localhost ~]# gluster volume rebalance defect status
                                    Node Rebalanced-files          size       scanned      failures         status
                               ---------      -----------   -----------   -----------   -----------   ------------
                               localhost                0            0            0            1         failed
                            10.70.35.173                0            0            0            1         failed
                            10.70.35.170                0            0            0            1         failed


log:-
[2012-12-28 09:55:48.833293] I [dht-common.c:2337:dht_setxattr] 0-defect-dht: fixing the layout of /
[2012-12-28 09:55:48.833309] W [dht-selfheal.c:603:dht_fix_layout_of_directory] 0-defect-dht: 1 subvolume(s) are down. Skipping fix layout.
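
The warning above comes from DHT declining to touch the layout while any subvolume is unreachable. The following is a minimal, self-contained C sketch of that decision; the structures and names (struct subvol, fix_layout_of_directory) are hypothetical simplifications for illustration, not the actual dht-selfheal.c code:

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for DHT subvolume state. */
struct subvol {
    const char *name;
    bool up;
};

/* Count the down subvolumes and skip the layout fix if any are down,
 * which is what produces the "Skipping fix layout." warning above. */
static int
fix_layout_of_directory(const struct subvol *subvols, int subvol_count)
{
    int down = 0;

    for (int i = 0; i < subvol_count; i++)
        if (!subvols[i].up)
            down++;

    if (down) {
        fprintf(stderr, "%d subvolume(s) are down. Skipping fix layout.\n",
                down);
        return -1;
    }

    /* ... the layout would be recalculated and written here ... */
    return 0;
}

int
main(void)
{
    /* Mirrors the reproduction above: the third subvolume is down. */
    struct subvol subvols[] = {
        { "defect-client-0", true },
        { "defect-client-1", true },
        { "defect-client-2", false },
    };

    return fix_layout_of_directory(subvols, 3) ? 1 : 0;
}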
  
Actual results:
[root@localhost ~]# gluster volume rebalance defect fix-layout start
Starting rebalance on volume defect has been successful

Expected results:
All sub-volumes/bricks being up is a basic precondition for rebalance. When a sub-volume or brick is down, the CLI should print a clear message indicating that rebalance was not started because one of the bricks/sub-volumes is down, rather than reporting that it started.
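
For illustration only, a minimal C sketch of the kind of pre-check this expectation implies: refuse to start rebalance, with an explicit error, when any brick is down. The types, field names, and error text here are hypothetical and not taken from the GlusterFS sources; the actual fix (see the commit below) performs this validation in glusterd before rebalance is started.

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a volume's bricks; not the GlusterFS
 * data structures. */
struct brick {
    const char *host;
    const char *path;
    bool online;
};

/* Return 0 only if every brick is online; otherwise fill errstr with a
 * message that the CLI could show instead of claiming success. */
static int
check_all_bricks_online(const struct brick *bricks, int brick_count,
                        char *errstr, size_t len)
{
    for (int i = 0; i < brick_count; i++) {
        if (!bricks[i].online) {
            snprintf(errstr, len,
                     "Rebalance not started: brick %s:%s is down",
                     bricks[i].host, bricks[i].path);
            return -1;
        }
    }
    return 0;
}

int
main(void)
{
    /* Mirrors the 'gluster volume status' output above: the third brick
     * is offline. */
    struct brick bricks[] = {
        { "10.70.35.173", "/home/def1", true },
        { "10.70.35.180", "/home/def1", true },
        { "10.70.35.170", "/home/def1", false },
    };
    char err[256];

    if (check_all_bricks_online(bricks, 3, err, sizeof(err)) != 0) {
        fprintf(stderr, "%s\n", err);
        return 1;
    }
    printf("Starting rebalance on volume defect has been successful\n");
    return 0;
}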


Additional info:

--- Additional comment from Rachana Patel on 2012-12-28 06:19:40 EST ---

correction:-
Description of problem:
DHT - rebalance - when any brick/sub-vol is down, rebalance will not perform any action, but the CLI says 'Starting rebalance on volume <vol-name> has been successful'.

--- Additional comment from Scott Haines on 2013-09-27 13:07:32 EDT ---

Targeting for 3.0.0 (Denali) release.

--- Additional comment from Vivek Agarwal on 2014-04-07 07:40:57 EDT ---

Per bug triage between dev, PM, and QA, moving these out of Denali.

--- Additional comment from Anand Avati on 2015-05-27 10:36:38 EDT ---

REVIEW: http://review.gluster.org/10906 (dht: check if all bricks are started before performing rebalance) posted (#2) for review on master by Sakshi Bansal (sabansal)

--- Additional comment from Vijay Bellur on 2016-02-03 06:59:43 EST ---

REVIEW: http://review.gluster.org/10906 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#5) for review on master by Sakshi Bansal

--- Additional comment from Vijay Bellur on 2016-02-28 11:23:54 EST ---

COMMIT: http://review.gluster.org/10906 committed in master by Atin Mukherjee (amukherj) 
------
commit 368e26f454fe35477e46dc698fa6b8c3c608ea8d
Author: Sakshi <sabansal>
Date:   Tue May 26 09:53:55 2015 +0530

    glusterd: check if glusterd is started on all nodes and all
              bricks are started before performing rebalance
    
    Change-Id: I458ea9cd86cf35bdb7d758be55f951ae9f3e66f0
    BUG: 1224857
    Signed-off-by: Sakshi <sabansal>
    Reviewed-on: http://review.gluster.org/10906
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>

Comment 1 Vijay Bellur 2016-02-29 03:58:45 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#1) for review on release-3.7 by Sakshi Bansal

Comment 2 Vijay Bellur 2016-03-01 10:37:17 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#2) for review on release-3.7 by Sakshi Bansal

Comment 3 Vijay Bellur 2016-03-07 04:48:32 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#3) for review on release-3.7 by Sakshi Bansal

Comment 4 Vijay Bellur 2016-03-17 12:16:12 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#4) for review on release-3.7 by Raghavendra Talur (rtalur)

Comment 5 Mike McCune 2016-03-28 23:31:34 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation; please contact mmccune with any questions.

Comment 6 Vijay Bellur 2016-04-07 06:32:52 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#5) for review on release-3.7 by Sakshi Bansal

Comment 7 Vijay Bellur 2016-04-07 10:35:20 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#6) for review on release-3.7 by Sakshi Bansal

Comment 8 Vijay Bellur 2016-04-14 11:34:57 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#7) for review on release-3.7 by Raghavendra Talur (rtalur)

Comment 9 Vijay Bellur 2016-04-14 17:49:36 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#8) for review on release-3.7 by Raghavendra Talur (rtalur)

Comment 10 Vijay Bellur 2016-04-26 21:46:29 UTC
REVIEW: http://review.gluster.org/13537 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#9) for review on release-3.7 by Raghavendra Talur (rtalur)

Comment 11 Vijay Bellur 2016-04-29 06:59:45 UTC
COMMIT: http://review.gluster.org/13537 committed in release-3.7 by Raghavendra Talur (rtalur) 
------
commit a8e4a633d5ee42cbbf747ba31f5e3295e6d20ac0
Author: Sakshi <sabansal>
Date:   Tue May 26 09:53:55 2015 +0530

    glusterd: check if glusterd is started on all nodes and all
              bricks are started before performing rebalance
    
    Backport of http://review.gluster.org/#/c/10906/
    
    > Change-Id: I458ea9cd86cf35bdb7d758be55f951ae9f3e66f0
    > BUG: 1224857
    > Signed-off-by: Sakshi <sabansal>
    > Reviewed-on: http://review.gluster.org/10906
    > Smoke: Gluster Build System <jenkins.com>
    > CentOS-regression: Gluster Build System <jenkins.com>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Atin Mukherjee <amukherj>
    
    BUG: 1312722
    Change-Id: Ib8e59b33e064be8301f682a4b08cb5cf10c22fc9
    Signed-off-by: Sakshi <sabansal>
    Signed-off-by: Raghavendra Talur <rtalur>
    Reviewed-on: http://review.gluster.org/13537
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>

Comment 12 Kaushal 2016-06-28 12:13:55 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.12, please open a new bug report.

glusterfs-3.7.12 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-devel/2016-June/049918.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user