Bug 852556 - Need information on retiring bricks/nodes
Status: ASSIGNED
Product: Gluster-Documentation
Classification: Community
Component: Other
Hardware: x86_64 Linux
Severity: low
Assigned To: Divya
Reported: 2012-08-28 18:42 EDT by Shawn Heisey
Modified: 2016-01-11 03:46 EST

Doc Type: Bug Fix
Type: Bug
Description Shawn Heisey 2012-08-28 18:42:52 EDT
Description of problem:
The documentation does not explain how to retire bricks/nodes.  This is a multi-step process: migrate data off the brick(s), remove the brick(s) from the volume, then optionally remove the node(s) from the cluster.  Here's the proper way to take care of the bricks:

gluster volume remove-brick <volname> node1:/brick1 node2:/brick2 start
gluster volume remove-brick <volname> node1:/brick1 node2:/brick2 status
(repeat the status command until it reports completed)

<when status shows completed, do the following>

gluster volume remove-brick <volname> node1:/brick1 node2:/brick2 commit
<answer "y" to the confirmation prompt>
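The steps above can be sketched as a script.  This is a minimal sketch, not an official procedure: the volume name and brick list are examples, and matching "in progress" in the status output is an assumption, since the exact status text varies by GlusterFS version.  The stub `gluster` function stands in for the real CLI so the loop logic can be shown on its own; remove it on a real cluster.

```shell
#!/bin/sh
# Sketch of the retire procedure.  VOLNAME/BRICKS are example values.
VOLNAME=myvol
BRICKS="node1:/brick1 node2:/brick2"

# Stub standing in for the real gluster CLI so the loop logic runs anywhere;
# delete this function when running against a real cluster.
gluster() { echo "completed"; }

# Kick off the data migration away from the bricks being retired.
gluster volume remove-brick $VOLNAME $BRICKS start

# Poll until no brick still reports an in-progress migration
# (assumption: the status output contains "in progress" while running).
while gluster volume remove-brick $VOLNAME $BRICKS status | grep -q "in progress"; do
    sleep 60
done

# Commit only after every brick shows completed; "echo y" answers the
# confirmation prompt non-interactively.
echo y | gluster volume remove-brick $VOLNAME $BRICKS commit
```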

Version-Release number of selected component (if applicable):
3.3.0
Comment 1 Shawn Heisey 2012-09-28 17:59:48 EDT
When I did this procedure before, I did not test to see whether the migrated data was accessible.  Today I tried a new test.  This is probably going to require a new bug on gluster itself rather than the documentation, but I wanted to get the info written down while it's fresh.

I loaded a small 4x2 volume two-thirds full and tried to gracefully remove the last set of bricks with the procedure I have outlined above.  All of the remaining bricks ran out of disk space during the migration, and there were thousands of migration failures in the log.  I restarted the remove-brick, and it again ran out of disk space.  A third attempt completed without migration errors in the log.  At this point, I had not issued the commit.

After this, I tried to access files in the volume from a client mount.  Everything that originally existed on the removed bricks was inaccessible.

Final status: Once I did the remove-brick commit, everything magically started working.  I'm glad that there was no actual data loss, but if I am removing a set of 4TB bricks that's 75% or so full, it's going to take a really long time for 3TB of data (millions of files) to get migrated.  The files that get migrated first will be unavailable for the entire time of the migration effort, which is unacceptable by any standard.
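The out-of-space failures above suggest the documentation should also describe a pre-flight check: before starting remove-brick, compare the used space on the bricks being retired against the free space left on the bricks that remain.  A rough local sketch follows; the paths are stand-ins for real brick mount points, and comparing one retiring brick against one remaining brick is a simplification of a per-node check.

```shell
#!/bin/sh
# Rough pre-check before remove-brick: will the remaining bricks hold the
# migrated data?  The paths below are example stand-ins for brick mounts.
REMOVE_BRICK=/tmp       # brick being retired (example path)
KEEP_BRICK=/var/tmp     # brick staying in the volume (example path)

# Used KiB on the retiring brick, free KiB on the remaining brick
# (df -P guarantees one data line per filesystem for portable parsing).
used_kb=$(df -Pk "$REMOVE_BRICK" | awk 'NR==2 {print $3}')
free_kb=$(df -Pk "$KEEP_BRICK"   | awk 'NR==2 {print $4}')

if [ "$free_kb" -gt "$used_kb" ]; then
    echo "pre-check ok: ${free_kb} KiB free for ${used_kb} KiB to migrate"
else
    echo "pre-check FAILED: only ${free_kb} KiB free for ${used_kb} KiB" >&2
fi
```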
