Bug 990562

Summary: Rebalance/remove-brick: Treat migration failures due to space constraints as skipped
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: shishir gowda <sgowda>
Component: glusterfs Assignee: shishir gowda <sgowda>
Status: CLOSED ERRATA QA Contact: senaik
Severity: high Docs Contact:
Priority: high    
Version: unspecified CC: amarts, asriram, gluster-bugs, nsathyan, rhs-bugs, sdharane, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0.15rhs Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 989846 Environment:
Last Closed: 2013-09-23 22:35:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 989846, 998077    
Bug Blocks:    

Description shishir gowda 2013-07-31 13:04:20 UTC
+++ This bug was initially created as a clone of Bug #989846 +++

Description of problem:
Currently, when files are skipped during migration because of space constraints (the destination does not have enough space, or the migration would lead to an imbalance in the cluster), they are also counted as failures. These files are shown as failures in the CLI status output.

We need to separate these out and count them as skipped, so that users are not alarmed by the failure counts.
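
The intended split between skipped and failed files can be sketched as follows. This is a minimal Python illustration, not the actual dht-rebalance.c logic: the names `Outcome`, `Counters`, `check_free_space`, and `record`, as well as the exact comparison used for the imbalance check, are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    MIGRATED = "migrated"
    SKIPPED = "skipped"  # not moved because of space constraints
    FAILED = "failed"    # a genuine error, for example an I/O failure


@dataclass
class Counters:
    migrated: int = 0
    skipped: int = 0
    failures: int = 0


def check_free_space(file_size: int, src_free: int, dst_free: int) -> bool:
    """Return True if the move is allowed (hypothetical, simplified check)."""
    if dst_free < file_size:
        # Destination does not have enough space for the file.
        return False
    if dst_free - file_size < src_free:
        # Assumed imbalance rule: the move would leave the destination
        # with less free space than the source.
        return False
    return True


def record(counters: Counters, file_size: int, src_free: int,
           dst_free: int, io_error: bool = False) -> Outcome:
    """Classify one migration attempt and update the matching counter."""
    if not check_free_space(file_size, src_free, dst_free):
        counters.skipped += 1
        return Outcome.SKIPPED
    if io_error:
        counters.failures += 1
        return Outcome.FAILED
    counters.migrated += 1
    return Outcome.MIGRATED
```

The point of the sketch is only that space-related rejections increment a separate counter, so the failure count reflects real errors.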

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Anand Avati on 2013-07-30 00:27:59 EDT ---

REVIEW: http://review.gluster.org/5399 (cluster/dht: Treat migration failures due to space constraints as skipped) posted (#3) for review on master by Shishir Gowda (sgowda)

--- Additional comment from Anand Avati on 2013-07-30 23:52:49 EDT ---

REVIEW: http://review.gluster.org/5399 (cluster/dht: Treat migration failures due to space constraints as skipped) posted (#4) for review on master by Shishir Gowda (sgowda)

--- Additional comment from Anand Avati on 2013-07-31 02:56:38 EDT ---

COMMIT: http://review.gluster.org/5399 committed in master by Vijay Bellur (vbellur) 
------
commit e306698b00d2d3e736cbc97a1383bfb5d3724796
Author: shishir gowda <sgowda>
Date:   Fri Jul 26 11:59:12 2013 +0530

    cluster/dht: Treat migration failures due to space constraints as skipped
    
    Currently rebalance/remove-brick op's display migration failed count even
    for files which failed due to space issues (not enough space for file, or
    migration leading to cluster imbalance)
    
    These will now be counted as skipped, and rebalance/remove-brick status
    will display the additional counter
    
    Change-Id: I674904d380b5f8300e9ca9e6af557c3d30d6cff4
    BUG: 989846
    Signed-off-by: shishir gowda <sgowda>
    Reviewed-on: http://review.gluster.org/5399
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 4 senaik 2013-08-07 12:36:11 UTC
Version : 3.4.0.17rhs-1.el6rhs.x86_64
========

1) Files that fail to migrate because of space constraints are now reported as skipped in the rebalance status output

gluster v rebalance sample status

Node         Rebalanced-files  size     scanned  failures  skipped  status     run time in secs
-----------  ----------------  -------  -------  --------  -------  ---------  ----------------
localhost    3                 30.0MB   507      0         44       completed  2.00
10.70.34.88  0                 0Bytes   506      0         0        completed  1.00
10.70.34.86  22                220.0MB  525      0         0        completed  3.00
10.70.34.87  0                 0Bytes   509      0         45       completed  2.00

volume rebalance: sample: success:
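
For scripted verification, the status output above can be totalled per column. The following is a rough Python sketch that assumes the eight-column layout shown in this comment; `parse_rebalance_status` and `totals` are hypothetical helper names, and the column layout may differ in other glusterfs versions.

```python
SAMPLE = """\
localhost       3      30.0MB   507      0         44   completed   2.00
10.70.34.88     0      0Bytes   506      0          0   completed   1.00
10.70.34.86     22     220.0MB  525      0          0   completed   3.00
10.70.34.87     0      0Bytes   509      0         45   completed   2.00

volume rebalance: sample: success:
"""


def parse_rebalance_status(text):
    """Parse per-node rows from rebalance status output (assumed layout)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 8 columns with a numeric file count second;
        # header, separator, and summary lines are ignored.
        if len(fields) != 8 or not fields[1].isdigit():
            continue
        node, files, size, scanned, failures, skipped, status, run_time = fields
        rows.append({
            "node": node,
            "rebalanced_files": int(files),
            "size": size,
            "scanned": int(scanned),
            "failures": int(failures),
            "skipped": int(skipped),
            "status": status,
            "run_time_secs": float(run_time),
        })
    return rows


def totals(rows):
    """Sum the failures and skipped counters across all nodes."""
    return {
        "failures": sum(r["failures"] for r in rows),
        "skipped": sum(r["skipped"] for r in rows),
    }


if __name__ == "__main__":
    print(totals(parse_rebalance_status(SAMPLE)))
```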

-------------------Part of rebalance log-----------------

[2013-08-07 12:07:01.678969] I [dht-rebalance.c:672:dht_migrate_file] 0-sample-dht: /f423: attempting to move from sample-client-0 to sample-client-4
[2013-08-07 12:07:01.682079] W [dht-rebalance.c:374:__dht_check_free_space] 0-sample-dht: data movement attempted from node (sample-client-0) with higher disk space to a node (sample-client-4) with lesser disk space (/f423)
[2013-08-07 12:07:01.685689] I [dht-rebalance.c:672:dht_migrate_file] 0-sample-dht: /f426: attempting to move from sample-client-0 to sample-client-4
[2013-08-07 12:07:01.691630] W [dht-rebalance.c:374:__dht_check_free_space] 0-sample-dht: data movement attempted from node (sample-client-0) with higher disk space to a node (sample-client-4) with lesser disk space (/f426)
------------------------------------------------------------
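
The skipped files can also be listed from the rebalance log by matching the `__dht_check_free_space` warning. This Python sketch assumes each log entry is on a single line and follows the message format quoted in this comment; `SKIP_RE` and `skipped_files` are hypothetical names, and the message text may change between releases.

```python
import re

# Matches the free-space warning and captures source, destination, and path.
SKIP_RE = re.compile(
    r"\[dht-rebalance\.c:\d+:__dht_check_free_space\].*?"
    r"from node \((?P<src>[^)]+)\).*?"
    r"to a node \((?P<dst>[^)]+)\).*?"
    r"\((?P<path>[^)]+)\)\s*$"
)

LOG = (
    "[2013-08-07 12:07:01.678969] I [dht-rebalance.c:672:dht_migrate_file] "
    "0-sample-dht: /f423: attempting to move from sample-client-0 to "
    "sample-client-4\n"
    "[2013-08-07 12:07:01.682079] W [dht-rebalance.c:374:"
    "__dht_check_free_space] 0-sample-dht: data movement attempted from node "
    "(sample-client-0) with higher disk space to a node (sample-client-4) "
    "with lesser disk space (/f423)\n"
)


def skipped_files(log_text):
    """Return (path, source, destination) for each free-space warning."""
    results = []
    for line in log_text.splitlines():
        match = SKIP_RE.search(line)
        if match:
            results.append((match["path"], match["src"], match["dst"]))
    return results


if __name__ == "__main__":
    for path, src, dst in skipped_files(LOG):
        print(path, src, dst)
```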

2) Rebalance start force migrates all the skipped files

gluster v rebalance sample start force
volume rebalance: sample: success: Starting rebalance on volume sample has been successful.
ID: b5444208-5ebe-49fa-bfc1-32fddce2549e

gluster v rebalance sample status

Node         Rebalanced-files  size     scanned  failures  skipped  status     run time in secs
-----------  ----------------  -------  -------  --------  -------  ---------  ----------------
localhost    44                440.0MB  546      0         0        completed  7.00
10.70.34.88  0                 0Bytes   505      0         0        completed  1.00
10.70.34.86  0                 0Bytes   505      0         0        completed  1.00
10.70.34.87  45                450.0MB  589      0         0        completed  8.00

volume rebalance: sample: success:

Comment 5 Scott Haines 2013-09-23 22:35:57 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html