Bug 991025 - Rebalance : Status command showing incorrect status of nodes after rebooting while rebalance is in progress
Summary: Rebalance : Status command showing incorrect status of nodes after rebooting ...
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Nithya Balachandran
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1286104
 
Reported: 2013-08-01 12:36 UTC by senaik
Modified: 2015-11-27 10:51 UTC
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1286104 (view as bug list)
Environment:
Last Closed: 2015-11-27 10:46:55 UTC
Embargoed:



Description senaik 2013-08-01 12:36:41 UTC
Description of problem:
=========================== 
The rebalance status command shows an incorrect status for a node after it is rebooted while a rebalance is in progress.

Version-Release number of selected component (if applicable):
=========================================================== 
3.4.0.14rhs-1.el6rhs.x86_64.rpm
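
The installed build can be confirmed with something along these lines (the package name pattern is an assumption):

# list installed glusterfs packages, or print the client version
rpm -qa | grep glusterfs
gluster --version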


How reproducible:
================== 
Always


Steps to Reproduce:
==================== 
1. Create a 2x2 distributed-replicate volume.
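
As a sketch, a volume matching the bricks listed under 'Additional info' could be created and started as follows (the brick hosts and paths are taken from that output; that the first four bricks formed the original 2x2 layout is an assumption):

# create the initial 2x2 distributed-replicate volume and start it
gluster volume create dist_repl replica 2 \
    10.70.34.85:/rhs/brick1/ab1 10.70.34.86:/rhs/brick1/ab2 \
    10.70.34.87:/rhs/brick1/ab3 10.70.34.88:/rhs/brick1/ab4
gluster volume start dist_repl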

2. Fuse mount the volume and create some files:
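For example (the server address and mount point below are assumptions; the dd loop is then run from inside the mount point):

# fuse mount the volume and switch into it
mount -t glusterfs 10.70.34.85:/dist_repl /mnt/dist_repl
cd /mnt/dist_repl
# the loop below writes 300 files of 10MB of random data each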
for i in {1..300} ; do dd if=/dev/urandom of=f"$i" bs=10M count=1; done

3. Add 2 bricks to the volume and start rebalance.
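
For example (the two new bricks correspond to Brick5 and Brick6 in the volume info below; the exact paths are assumptions):

# expand the volume by one replica pair, then start rebalance
gluster volume add-brick dist_repl 10.70.34.86:/rhs/brick1/ab5 10.70.34.87:/rhs/brick1/ab6
gluster volume rebalance dist_repl start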

4. Some failures are reported in the rebalance status due to a space issue; execute the rebalance start force command and check the status.
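
The forced restart referred to above is, for example:

gluster volume rebalance dist_repl start force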

gluster v rebalance dist_repl status

Node  Rebalanced-files  size    scanned failures    status run time in secs
----- ----------------  ----   -------- --------   ------- -----------------
localhost      17      170.0MB   80       0      in progress    5.00
10.70.34.88    0       0Bytes    304      0      completed      1.00
10.70.34.86    0       0Bytes    304      0      completed      0.00
10.70.34.87    14      140.0MB   176      0      in progress    5.00

5. Reboot one of the nodes - 10.70.34.87 
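
For example (assuming root ssh access to that node):

ssh root@10.70.34.87 reboot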

6. Check Rebalance Status 

gluster v rebalance dist_repl status

Node  Rebalanced-files  size    scanned failures    status run time in secs
----- ----------------  ----   -------- --------   ------- -----------------
localhost      36      360.0MB   336      0      completed      5.00
10.70.34.88    0       0Bytes    304      0      completed      1.00
10.70.34.86    0       0Bytes    304      0      completed      0.00

volume rebalance: dist_repl: success:

7. After the node comes back online, check the status again. The status for the rebooted node is shown as 'not started'.

 Node  Rebalanced-files  size    scanned failures    status run time in secs
----- ----------------  ----   -------- --------   ------- -----------------
localhost      36      360.0MB   336      0      completed      15.00
10.70.34.88    0       0Bytes    304      0      completed      0.00
10.70.34.86    0       0Bytes    304      0      completed      0.00
10.70.34.87    0      0Bytes    0        0      not started    0.00
volume rebalance: dist_repl: success:

Actual results:
================= 
The rebalance status command shows the status of the rebooted node as 'not started' after the node comes back online.


Expected results:
================= 
The rebalance status command should show the status as 'completed' if the rebalance process on that node had completed, or, if the rebalance process was still in progress when the node went down, the rebalance process should be restarted when the node comes back online.

Additional info:
================= 
[root@boost tmp]# gluster v i dist_repl
 
Volume Name: dist_repl
Type: Distributed-Replicate
Volume ID: 767665be-5dba-4a74-8b1e-251bb9d91f50
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.34.85:/rhs/brick1/ab1
Brick2: 10.70.34.86:/rhs/brick1/ab2
Brick3: 10.70.34.87:/rhs/brick1/ab3
Brick4: 10.70.34.88:/rhs/brick1/ab4
Brick5: 10.70.34.86:/rhs/brick1/ab5
Brick6: 10.70.34.87:/rhs/brick1/ab6

