Bug 1224857 - DHT - rebalance - when any brick/sub-vol is down and rebalance is not performing any action(fixing lay-out or migrating data) it should not say 'Starting rebalance on volume <vol-name> has been successful' .
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Sakshi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1312722
Reported: 2015-05-26 04:15 UTC by Sakshi
Modified: 2016-06-16 13:04 UTC (History)

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 890637
: 1312722 (view as bug list)
Environment:
Last Closed: 2016-06-16 13:04:41 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Sakshi 2015-05-26 04:15:10 UTC
+++ This bug was initially created as a clone of Bug #890637 +++

Description of problem:
DHT - rebalance - when any brick/sub-vol is down, rebalance does not perform any action, but the CLI says 'Starting rebalance on volume <vol-name> has been successful'.


Version-Release number of selected component (if applicable):
3.3.0.5rhs-40

How reproducible:
always

Steps to Reproduce:
1. Create a distributed volume with 3 or more sub-volumes on multiple servers and start that volume.

2. FUSE-mount the volume from client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>".

3. From the mount point, create some directories and files inside it.
4. Bring one of the sub-volumes down.
[root@localhost ~]# gluster volume status
Status of volume: defect
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.35.173:/home/def1				24011	Y	6440
Brick 10.70.35.180:/home/def1				24011	Y	28882
Brick 10.70.35.170:/home/def1				24011	N	27711
NFS Server on localhost					38467	Y	6608
NFS Server on 10.70.35.170				38467	Y	6153
NFS Server on 10.70.35.173				38467	Y	6446


5. Execute rebalance.
[root@localhost ~]# gluster volume rebalance defect fix-layout start
Starting rebalance on volume defect has been successful

6. Check the rebalance status and the log.
[root@localhost ~]# gluster volume rebalance defect status
                                    Node Rebalanced-files          size       scanned      failures         status
                               ---------      -----------   -----------   -----------   -----------   ------------
                               localhost                0            0            0            1         failed
                            10.70.35.173                0            0            0            1         failed
                            10.70.35.170                0            0            0            1         failed


log:-
[2012-12-28 09:55:48.833293] I [dht-common.c:2337:dht_setxattr] 0-defect-dht: fixing the layout of /
[2012-12-28 09:55:48.833309] W [dht-selfheal.c:603:dht_fix_layout_of_directory] 0-defect-dht: 1 subvolume(s) are down. Skipping fix layout.
  
Actual results:
[root@localhost ~]# gluster volume rebalance defect fix-layout start
Starting rebalance on volume defect has been successful

Expected results:
Having all sub-volumes/bricks up is a basic condition for rebalance. So when a sub-volume or brick is down, the CLI should give a proper message indicating that rebalance was not started because one of the bricks/sub-volumes is down, rather than saying that it started.
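
The expected behaviour can be illustrated with a small client-side sketch. It only parses `gluster volume status` text in the layout shown in step 4 and reports bricks whose Online column is `N`. The helper names (`offline_bricks`, `rebalance_precheck`) and the message wording are hypothetical; this is not the check that glusterd itself performs.

```python
import re

# Matches brick rows in the layout shown in this report:
#   Brick <host>:<path>   <port>   <Y|N>   <pid>
BRICK_ROW = re.compile(r"^Brick\s+(\S+)\s+(\S+)\s+([YN])\s+(\S+)\s*$")


def offline_bricks(status_output):
    """Return the bricks whose Online column is 'N'."""
    down = []
    for line in status_output.splitlines():
        match = BRICK_ROW.match(line.strip())
        if match and match.group(3) == "N":
            down.append(match.group(1))
    return down


def rebalance_precheck(status_output, volume):
    """Hypothetical pre-check: return (ok, message) before starting rebalance."""
    down = offline_bricks(status_output)
    if down:
        # Message wording is an assumption, not actual gluster CLI output.
        return False, "Rebalance on volume %s not started; brick(s) down: %s" % (
            volume, ", ".join(down))
    return True, "All bricks of volume %s are online" % volume


# Status text taken from step 4 of this report (whitespace normalised).
SAMPLE = """\
Status of volume: defect
Gluster process                         Port    Online  Pid
Brick 10.70.35.173:/home/def1           24011   Y       6440
Brick 10.70.35.180:/home/def1           24011   Y       28882
Brick 10.70.35.170:/home/def1           24011   N       27711
NFS Server on localhost                 38467   Y       6608
"""

if __name__ == "__main__":
    print(rebalance_precheck(SAMPLE, "defect")[1])
```

With the status output from step 4, the pre-check reports brick 10.70.35.170:/home/def1 as down instead of claiming that rebalance started.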


Additional info:

--- Additional comment from Rachana Patel on 2012-12-28 06:19:40 EST ---

correction:-
Description of problem:
DHT - rebalance - when any brick/sub-vol is down, rebalance will not perform any action, but the CLI says 'Starting rebalance on volume <vol-name> has been successful'.

--- Additional comment from Scott Haines on 2013-09-27 13:07:32 EDT ---

Targeting for 3.0.0 (Denali) release.

--- Additional comment from Vivek Agarwal on 2014-04-07 07:40:57 EDT ---

Per bug triage, between dev, PM and QA, moving these out of denali

Comment 1 Anand Avati 2015-05-27 14:36:38 UTC
REVIEW: http://review.gluster.org/10906 (dht: check if all bricks are started before performing rebalance) posted (#2) for review on master by Sakshi Bansal (sabansal)

Comment 2 Vijay Bellur 2016-02-03 11:59:43 UTC
REVIEW: http://review.gluster.org/10906 (glusterd: check if glusterd is started on all nodes and all bricks are started before performing rebalance) posted (#5) for review on master by Sakshi Bansal

Comment 3 Vijay Bellur 2016-02-28 16:23:54 UTC
COMMIT: http://review.gluster.org/10906 committed in master by Atin Mukherjee (amukherj) 
------
commit 368e26f454fe35477e46dc698fa6b8c3c608ea8d
Author: Sakshi <sabansal>
Date:   Tue May 26 09:53:55 2015 +0530

    glusterd: check if glusterd is started on all nodes and all
              bricks are started before performing rebalance
    
    Change-Id: I458ea9cd86cf35bdb7d758be55f951ae9f3e66f0
    BUG: 1224857
    Signed-off-by: Sakshi <sabansal>
    Reviewed-on: http://review.gluster.org/10906
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>
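
The commit title describes two preconditions: glusterd is started on all nodes, and all bricks are started. A minimal Python model of that logic is sketched below. It is not the C code from the commit; the `Peer` and `Brick` types, the `can_start_rebalance` function, and the error strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Peer:
    hostname: str
    connected: bool  # assumed to mean: glusterd on this node is reachable


@dataclass
class Brick:
    path: str
    started: bool


def can_start_rebalance(peers: List[Peer], bricks: List[Brick]) -> Tuple[bool, str]:
    """Return (True, "") only if every peer is connected and every brick is started."""
    for peer in peers:
        if not peer.connected:
            return False, f"glusterd is not running on {peer.hostname}"
    for brick in bricks:
        if not brick.started:
            return False, f"brick {brick.path} is not started"
    return True, ""


if __name__ == "__main__":
    peers = [Peer("10.70.35.173", True), Peer("10.70.35.170", True)]
    bricks = [
        Brick("10.70.35.173:/home/def1", True),
        Brick("10.70.35.170:/home/def1", False),
    ]
    print(can_start_rebalance(peers, bricks))
```

Failing early with a specific reason is the point of the sketch: the caller can refuse the rebalance request up front rather than report success and let the rebalance process fail later.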

Comment 4 Niels de Vos 2016-06-16 13:04:41 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

