Bug 1136702 - Add a warning message to check the removed-bricks for any files left post "remove-brick commit"
Summary: Add a warning message to check the removed-bricks for any files left post "remove-brick commit"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1136711
 
Reported: 2014-09-03 06:20 UTC by Susant Kumar Palai
Modified: 2015-05-14 17:43 UTC
CC List: 6 users

Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1136711
Environment:
Last Closed: 2015-05-14 17:27:28 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Susant Kumar Palai 2014-09-03 06:20:19 UTC
Description of problem:

Currently, rebalance as part of remove-brick intermittently leaves some files on the removed brick. There should be a warning message advising admins to check the removed bricks for any files that might not have been migrated and to move them back to the volume via the mount point.
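
As a rough illustration of the manual check the proposed warning asks admins to perform, the Python sketch below walks a removed brick, skips the GlusterFS-internal .glusterfs tree and zero-byte sticky-bit linkto pointers, and copies anything left behind back into the volume through a mount point. The paths and the linkto heuristic are assumptions for this example; this is not part of the patch.

    #!/usr/bin/env python3
    # Illustrative sketch only: recover files left behind on a removed brick
    # by copying them back into the volume through a GlusterFS mount point.
    # BRICK_PATH and MOUNT_POINT are assumed example paths.
    import os
    import shutil
    import stat

    BRICK_PATH = "/bricks/removed-brick"   # brick that was removed (assumed)
    MOUNT_POINT = "/mnt/glustervol"        # FUSE mount of the volume (assumed)

    def leftover_files(brick_path):
        """Yield files still present on the removed brick, skipping the
        .glusterfs metadata tree and zero-byte sticky-bit linkto pointers."""
        for root, dirs, files in os.walk(brick_path):
            dirs[:] = [d for d in dirs if d != ".glusterfs"]
            for name in files:
                path = os.path.join(root, name)
                st = os.lstat(path)
                if st.st_size == 0 and (st.st_mode & stat.S_ISVTX):
                    continue    # looks like a DHT linkto pointer, not real data
                yield path

    for src in leftover_files(BRICK_PATH):
        rel = os.path.relpath(src, BRICK_PATH)
        dst = os.path.join(MOUNT_POINT, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if not os.path.exists(dst):         # do not clobber files that did migrate
            shutil.copy2(src, dst)
            print("recovered:", rel)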

Comment 1 Anand Avati 2014-09-03 06:41:27 UTC
REVIEW: http://review.gluster.org/8577 (CLI: Adding warning message in case remove-brick commit executed) posted (#2) for review on master by susant palai (spalai)

Comment 2 Anand Avati 2014-09-03 06:48:22 UTC
REVIEW: http://review.gluster.org/8577 (CLI: Adding warning message in case remove-brick commit executed) posted (#3) for review on master by susant palai (spalai)

Comment 3 Anand Avati 2014-09-03 08:46:14 UTC
COMMIT: http://review.gluster.org/8577 committed in master by Vijay Bellur (vbellur) 
------
commit b81cec326d4d43519593cb56b7a0e68ea5c3421c
Author: Susant Palai <spalai>
Date:   Tue Sep 2 05:29:52 2014 -0400

    CLI: Adding warning message in case remove-brick commit executed
    
    Change-Id: Ia2f1b2cd2687ca8e739e7a1e245e668a7424ffac
    BUG: 1136702
    Signed-off-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/8577
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 4 Anand Avati 2014-09-09 10:07:01 UTC
REVIEW: http://review.gluster.org/8664 (CLI: Show warning on remove-brick commit Signed-off-by: Susant Palai <spalai>) posted (#1) for review on master by susant palai (spalai)

Comment 5 Anand Avati 2014-09-10 05:06:47 UTC
COMMIT: http://review.gluster.org/8664 committed in master by Vijay Bellur (vbellur) 
------
commit 1c8d4bf6ab299f8fb44dce354fb8f3232136be02
Author: Susant Palai <spalai>
Date:   Tue Sep 9 06:05:24 2014 -0400

    CLI: Show warning on remove-brick commit
    Signed-off-by: Susant Palai <spalai>
    
    Change-Id: I48a4168f81bd272216549c76b0bc1b23e34894d6
    BUG: 1136702
    Reviewed-on: http://review.gluster.org/8664
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Vijay Bellur <vbellur>
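
The committed change lives in the GlusterFS CLI (C code). As a rough sketch of the behaviour it introduces, the commit path asks the admin for confirmation and reminds them to check the removed bricks; the prompt wording and function name below are assumptions for illustration, not the patch itself.

    # Minimal sketch (not the actual GlusterFS CLI code, which is in C):
    # ask for confirmation before running "remove-brick ... commit" and
    # remind the admin to check the removed bricks for unmigrated files.
    import sys

    def confirm_remove_brick_commit():
        warning = ("remove-brick commit can result in data loss if rebalance "
                   "left files behind. Check the removed bricks for any "
                   "unmigrated files and copy them back via a gluster mount "
                   "point. Do you want to continue? (y/n) ")
        answer = input(warning).strip().lower()
        if answer not in ("y", "yes"):
            print("remove-brick commit aborted")
            sys.exit(1)

    if __name__ == "__main__":
        confirm_remove_brick_commit()
        print("proceeding with remove-brick commit ...")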

Comment 6 Joe Julian 2015-02-24 20:31:40 UTC
No, commit should fail if the migration did not complete successfully. We should *know* if the migration failed. There should *not* be files missing from the volume after a remove-brick completes.

Comment 7 Justin Clift 2015-02-24 21:52:23 UTC
Susant, this sounds like a real risk of data loss:

  Currently rebalance as part of remove-brick leaves some files on the removed-brick[intermittent].

Having only a warning seems like the wrong direction, and I'm generally inclined to agree with JoeJulian about this.

Why do we think having unmigrated files is OK at this point, and are we working on solving it completely? :)

Comment 8 Susant Kumar Palai 2015-02-25 09:51:42 UTC
Hi Joe/Justin,
   Agreed with comments 6 & 7. To start with, the above patch is not a permanent fix; it's just a workaround until we find a proper solution to the problem.

Comment 9 Joe Julian 2015-03-23 15:37:02 UTC
From my perspective, it will only take one user losing data to poison our reputation. Having a warning and a work-around will only cause confusion and cost me immeasurable time explaining to people what they will need to look for and how to fix it, and subject us to attacks on our competency.

I would prefer that this be a blocker and that the problem be corrected.

Comment 10 Alex 2015-04-02 20:33:22 UTC
The files remaining on the brick after a remove-brick seem to be due to the fact that the brick continues to accept new files created in the cluster during the removal process: any new files that hash onto that brick during removal stand to be orphaned after the process has completed.

see:

https://gist.github.com/mandb/93369097139c6cc3ff98

Expected behavior would be that while a brick is in remove-brick state, new file creation requests would be relayed to another brick.

The issue is likely complicated by a few scenarios:

a) Clients still see the brick's space as part of the volume capacity...

b) Files that are STILL on the brick need to remain write-available, and those files could grow, so the capacity in a) has to reflect this available write space.

I will test with 3.6.2 and see if the behavior has changed.
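
A toy sketch of the orphaning mechanism described in this comment, and of the expected behaviour of excluding a decommissioning brick from placement of new files. The hash-based placement, brick names, and file names are simplified assumptions, not GlusterFS's actual DHT logic.

    # Toy model: while brick3 is being removed, new files that hash onto it
    # land there and are orphaned after commit; excluding the decommissioning
    # brick from placement avoids that. Simplified assumption, not real DHT.
    import hashlib

    bricks = ["brick1", "brick2", "brick3"]
    decommissioning = {"brick3"}            # brick in remove-brick state

    def place(filename, exclude_decommissioning):
        candidates = [b for b in bricks
                      if not (exclude_decommissioning and b in decommissioning)]
        idx = int(hashlib.md5(filename.encode()).hexdigest(), 16) % len(candidates)
        return candidates[idx]

    new_files = ["file%d" % i for i in range(10)]

    # Current behaviour: the decommissioning brick still receives new files,
    # which are then left behind once remove-brick completes.
    orphaned = [f for f in new_files
                if place(f, exclude_decommissioning=False) == "brick3"]
    print("orphaned on brick3:", orphaned)

    # Expected behaviour: exclude the brick being removed from placement.
    assert all(place(f, exclude_decommissioning=True) != "brick3"
               for f in new_files)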

Comment 11 Niels de Vos 2015-05-14 17:27:28 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

