Bug 1475181 - dht remove-brick status does not indicate failures for files not migrated because of a lack of space
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Nithya Balachandran
Depends On: 1474318 1474284
Blocks:
Reported: 2017-07-26 03:41 EDT by Nithya Balachandran
Modified: 2017-09-05 13:37 EDT
CC: 5 users

See Also:
Fixed In Version: glusterfs-3.12.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1474318
Environment:
Last Closed: 2017-09-05 13:37:29 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Nithya Balachandran 2017-07-26 03:41:31 EDT
+++ This bug was initially created as a clone of Bug #1474318 +++

+++ This bug was initially created as a clone of Bug #1474284 +++

Description of problem:

The dht remove-brick operation is expected to treat skipped files as failures, since they are left behind on the removed bricks.

If a file cannot be migrated because no subvolume can accommodate it, the error is ignored due to an incorrect loop counter.

This is a regression from previous releases.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a 2x1 distribute volume with 500 MB bricks and create enough files that a single brick cannot accommodate all of them.
2. Remove the second brick.
3. Check the rebalance logs and the remove-brick status (see the command sketch below).
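
For example, a reproduction sketch along the following lines (host name, brick paths, volume name, and file sizes are hypothetical; a single-node setup may also need the trailing force on volume create):

    # Create a plain 2-brick distribute volume from two small (~500 MB) bricks.
    gluster volume create vol1 server1:/bricks/brick1 server1:/bricks/brick2 force
    gluster volume start vol1

    # Fill the volume so that one brick alone cannot hold all the data.
    mkdir -p /mnt/vol1
    mount -t glusterfs server1:/vol1 /mnt/vol1
    for i in $(seq 1 70); do dd if=/dev/zero of=/mnt/vol1/file_$i bs=1M count=10; done

    # Start draining the second brick and check the migration status.
    gluster volume remove-brick vol1 server1:/bricks/brick2 start
    gluster volume remove-brick vol1 server1:/bricks/brick2 status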

Actual results:
The remove-brick status shows no failures. However, the rebalance log contains messages such as:

[2017-07-24 09:56:20.191412] W [MSGID: 109033] [dht-rebalance.c:1021:__dht_check_free_space] 0-vol1-dht: Could not find any subvol with space accomodating the file - <filename>. Consider adding bricks



Expected results:
The remove-brick status should report a non-zero failure count, as some files could not be moved.


Additional info:

The loop counter used to iterate over the decommissioned bricks array in __dht_check_free_space() is wrong: the array has one slot per subvolume, so it must be scanned up to conf->subvolume_cnt, not conf->decommission_subvols_cnt.


                if (conf->decommission_subvols_cnt) {
                        *ignore_failure = _gf_true;
                        /* Wrong bound: decommissioned_bricks[] has one slot per
                         * subvolume, but only the first decommission_subvols_cnt
                         * slots are examined here. */
                        for (i = 0; i < conf->decommission_subvols_cnt; i++) {
                                if (conf->decommissioned_bricks[i] == from) {
                                        *ignore_failure = _gf_false;
                                        break;
                                }
                        }
                }



should be:


                if (conf->decommission_subvols_cnt) {
                        *ignore_failure = _gf_true;
                        /* Correct bound: scan every subvolume slot so a
                         * decommissioned source brick is always found. */
                        for (i = 0; i < conf->subvolume_cnt; i++) {
                                if (conf->decommissioned_bricks[i] == from) {
                                        *ignore_failure = _gf_false;
                                        break;
                                }
                        }
                }
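
For illustration only, a standalone sketch (not GlusterFS source; the names mirror the DHT fields, but the setup is simplified and hypothetical) of why the bound matters: decommissioned_bricks has one slot per subvolume and only the slots of decommissioned subvolumes are populated, so scanning just the first decommission_subvols_cnt entries can miss the matching slot entirely.

    /* iterator_sketch.c - demonstrates the effect of the loop bound.
     * Hypothetical standalone example, not GlusterFS code. */
    #include <stdio.h>
    #include <stdbool.h>

    #define SUBVOL_CNT 4

    int main(void)
    {
        const char *subvols[SUBVOL_CNT] = { "subvol-0", "subvol-1",
                                            "subvol-2", "subvol-3" };

        /* Only subvol-3 is being decommissioned: its slot (index 3) is set,
         * every other slot stays NULL, and decommission_subvols_cnt == 1. */
        const char *decommissioned_bricks[SUBVOL_CNT] = { NULL, NULL, NULL,
                                                          subvols[3] };
        int decommission_subvols_cnt = 1;

        /* The file is being migrated off the decommissioned subvolume. */
        const char *from = subvols[3];

        /* Buggy bound: only the first decommission_subvols_cnt slots are
         * scanned, index 3 is never examined, and the failure is ignored. */
        bool ignore_failure = true;
        for (int i = 0; i < decommission_subvols_cnt; i++)
            if (decommissioned_bricks[i] == from)
                ignore_failure = false;
        printf("buggy bound:   ignore_failure = %s\n",
               ignore_failure ? "true" : "false");

        /* Correct bound (as in the fix): scan every subvolume slot. */
        ignore_failure = true;
        for (int i = 0; i < SUBVOL_CNT; i++)
            if (decommissioned_bricks[i] == from)
                ignore_failure = false;
        printf("correct bound: ignore_failure = %s\n",
               ignore_failure ? "true" : "false");

        return 0;
    }

With the corrected bound the skipped file is counted as a failure, so the remove-brick status reports a non-zero failure count.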

--- Additional comment from Worker Ant on 2017-07-24 08:23:33 EDT ---

REVIEW: https://review.gluster.org/17861 (cluster/dht: Correct iterator for decommissioned bricks) posted (#1) for review on master by N Balachandran (nbalacha@redhat.com)

--- Additional comment from Worker Ant on 2017-07-25 05:31:40 EDT ---

REVIEW: https://review.gluster.org/17861 (cluster/dht: Correct iterator for decommissioned bricks) posted (#2) for review on master by Susant Palai (spalai@redhat.com)

--- Additional comment from Worker Ant on 2017-07-25 06:03:29 EDT ---

COMMIT: https://review.gluster.org/17861 committed in master by N Balachandran (nbalacha@redhat.com) 
------
commit 8c3e766fe0a473734e8eca0f70d0318a2b909e2e
Author: N Balachandran <nbalacha@redhat.com>
Date:   Mon Jul 24 17:48:47 2017 +0530

    cluster/dht: Correct iterator for decommissioned bricks
    
    Corrected the iterator for looping over the list of
    decommissioned bricks while checking if the new target
    determined because of min-free-disk values has been
    decommissioned.
    
    Change-Id: Iee778547eb7370a8069e954b5d629fcedf54e59b
    BUG: 1474318
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-on: https://review.gluster.org/17861
    Reviewed-by: Susant Palai <spalai@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Comment 1 Worker Ant 2017-07-26 03:52:46 EDT
REVIEW: https://review.gluster.org/17872 (cluster/dht: Correct iterator for decommissioned bricks) posted (#1) for review on release-3.12 by N Balachandran (nbalacha@redhat.com)
Comment 2 Worker Ant 2017-07-31 13:34:58 EDT
COMMIT: https://review.gluster.org/17872 committed in release-3.12 by Shyamsundar Ranganathan (srangana@redhat.com) 
------
commit a489aee130db4f6d04220f87e5c88ad4f5c3874e
Author: N Balachandran <nbalacha@redhat.com>
Date:   Mon Jul 24 17:48:47 2017 +0530

    cluster/dht: Correct iterator for decommissioned bricks
    
    Corrected the iterator for looping over the list of
    decommissioned bricks while checking if the new target
    determined because of min-free-disk values has been
    decommissioned.
    
    > BUG: 1474318
    > Signed-off-by: N Balachandran <nbalacha@redhat.com>
    > Reviewed-on: https://review.gluster.org/17861
    > Reviewed-by: Susant Palai <spalai@redhat.com>
    (cherry picked from commit 8c3e766fe0a473734e8eca0f70d0318a2b909e2e)
    Change-Id: Iee778547eb7370a8069e954b5d629fcedf54e59b
    BUG: 1475181
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-on: https://review.gluster.org/17872
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Comment 3 Shyamsundar 2017-09-05 13:37:29 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/
