Bug 1034173

Summary: Rebalance : Status lists failures on stopping rebalance while it is in progress
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: senaik
Component: distribute Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED DEFERRED QA Contact: storage-qa-internal <storage-qa-internal>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 2.1 CC: spalai, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1286171 1286172 Environment:
Last Closed: 2015-11-27 12:13:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1286171, 1286172, 1800956    

Description senaik 2013-11-25 11:54:51 UTC
Description of problem:
======================= 
When rebalance is in progress and we run the rebalance stop command and then check the status, failures are listed in the output.


Version-Release number of selected component (if applicable):
============================================================= 
glusterfs 3.4.0.44rhs


How reproducible:
================= 
Seen twice.


Steps to Reproduce:
================== 
1. Create a distributed-replicate volume and start it.

2. FUSE-mount the volume and create some directories and files with the attached script.

3. Add 2 bricks and start rebalance.

4. Add 2 more bricks and start rebalance again. While rebalance is in progress, stop it and check the status:

gluster v rebalance vol1 status

Node          Rebalanced-files  size     scanned  failures  skipped  status   run time in secs
----          ----------------  ----     -------  --------  -------  ------   ----------------
localhost     95                1.4MB    1175     9         0        stopped  12.00
10.70.34.86   0                 0Bytes   1918     10        0        stopped  11.00
10.70.34.88   0                 0Bytes   1900     10        0        stopped  11.00
10.70.34.89   0                 0Bytes   1970     9         0        stopped  11.00
10.70.34.87   0                 0Bytes   1970     10        0        stopped  11.00

volume rebalance: vol1: success:
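
A check like this can be scripted. The following is only a minimal sketch, assuming the status output keeps the eight-column row layout shown above; nodes_with_failures is a hypothetical helper name, not part of gluster.

```python
def nodes_with_failures(status_output):
    """Return {node: failures} for nodes that stopped with a non-zero failure count.

    Assumes each data row has exactly 8 whitespace-separated fields:
    node, rebalanced files, size, scanned, failures, skipped, status, run time.
    """
    flagged = {}
    for line in status_output.splitlines():
        fields = line.split()
        # Skip the header, the dashed separator, and any trailing messages.
        if len(fields) != 8 or not fields[4].isdigit():
            continue
        node, failures, status = fields[0], int(fields[4]), fields[6]
        if status == "stopped" and failures > 0:
            flagged[node] = failures
    return flagged


# Two rows copied from the status output in this report.
SAMPLE = """\
localhost      95    1.4MB   1175    9         0     stopped    12.00
10.70.34.86     0    0Bytes  1918    10        0     stopped    11.00
"""

if __name__ == "__main__":
    print(nodes_with_failures(SAMPLE))
```

With the expected behaviour from this report, the returned dictionary would be empty after a plain stop.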


--------------part of the log ------------------

[2013-11-25 10:47:42.449124] I [dht-rebalance.c:1788:gf_defrag_stop] 0-: Received stop command on rebalance
[2013-11-25 10:47:42.449175] I [dht-rebalance.c:1766:gf_defrag_status_get] 0-glusterfs: Rebalance is stopped. Time taken is 7.00 secs
[2013-11-25 10:47:42.449193] I [dht-rebalance.c:1769:gf_defrag_status_get] 0-glusterfs: Files migrated: 77, size: 1129472, lookups: 522, failures: 0, skipped: 0
[2013-11-25 10:47:42.451809] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir3/TestDir1
[2013-11-25 10:47:42.452313] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir3
[2013-11-25 10:47:42.452629] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0
[2013-11-25 10:47:42.453153] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0
[2013-11-25 10:47:42.453648] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0/TestDir0/TestDir0
[2013-11-25 10:47:42.454289] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0/TestDir0
[2013-11-25 10:47:42.454914] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0/TestDir0
[2013-11-25 10:47:42.455506] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0/TestDir0
[2013-11-25 10:47:42.456053] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0/TestDir0
[2013-11-25 10:47:42.456615] E [dht-rebalance.c:1483:gf_defrag_fix_layout] 0-vol1-dht: Fix layout failed for /TestDir0
[2013-11-25 10:47:42.457202] I [dht-rebalance.c:1766:gf_defrag_status_get] 0-glusterfs: Rebalance is stopped. Time taken is 7.00 secs
[2013-11-25 10:47:42.457276] I [dht-rebalance.c:1769:gf_defrag_status_get] 0-glusterfs: Files migrated: 77, size: 1129472, lookups: 522, failures: 10, skipped: 0
--------------------------------------------------------------------------- 
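
In the excerpt above, the failures counter is 0 when the stop command is received and 10 after the "Fix layout failed" errors are logged. This suggests the counter is being driven by those errors, though the log alone does not prove it. The sketch below counts both so the two numbers can be compared on other logs. It assumes the message formats shown in this excerpt; summarize_stop is a hypothetical helper name.

```python
import re

# Patterns taken from the log excerpt in this report; other releases may differ.
FIX_LAYOUT_ERROR = re.compile(r"\] E \[dht-rebalance\.c:\d+:gf_defrag_fix_layout\]")
FAILURES = re.compile(r"failures: (\d+)")
STOP_MESSAGE = "Received stop command on rebalance"


def summarize_stop(log_text):
    """Return (errors_after_stop, failure_counters_after_stop).

    errors_after_stop counts gf_defrag_fix_layout error lines logged after the
    stop command; failure_counters_after_stop lists every "failures: N" value
    reported after it, in log order.
    """
    seen_stop = False
    errors = 0
    counters = []
    for line in log_text.splitlines():
        if STOP_MESSAGE in line:
            seen_stop = True
            continue
        if not seen_stop:
            continue
        if FIX_LAYOUT_ERROR.search(line):
            errors += 1
        match = FAILURES.search(line)
        if match:
            counters.append(int(match.group(1)))
    return errors, counters
```

Run on the excerpt above, this should report 10 errors and counter values [0, 10], which is what the comparison rests on.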

Actual results:
===============
Failures are listed in the rebalance status output after running the rebalance stop command while rebalance is in progress.


Expected results:
================
Stopping rebalance should not by itself cause failures to be listed in the rebalance status output; only failures caused by other issues should be reported.

Additional info:

Comment 4 Susant Kumar Palai 2015-11-27 12:13:43 UTC
Cloning this to 3.1, to be fixed in a future release.