Bug 1514683 - Removal of bricks in volume isn't prevented if remaining brick doesn't contain all the files
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: x86_64
OS: Linux
Priority: high
Severity: low
Target Milestone: ---
Assignee: Vishal Pandey
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1344758
Reported: 2017-11-18 01:45 UTC by Nithya Balachandran
Modified: 2019-10-04 09:21 UTC
CC List: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1344758
Environment:
Last Closed: 2019-08-25 05:20:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gluster.org Gerrit | 23111 | 0 | None | Abandoned | cli: Add warning for user before remove-brick commit | 2019-08-07 08:59:33 UTC
Gluster.org Gerrit | 23171 | 0 | None | Merged | glusterd: Add warning and abort in case of failures in migration during remove-brick commit | 2019-08-25 05:20:09 UTC

Description Nithya Balachandran 2017-11-18 01:45:40 UTC
+++ This bug was initially created as a clone of Bug #1344758 +++

Description of problem:

Customer wanted to test migration of data by adding and removing bricks to simulate moving between different storage.

1. Created a replicated volume made up of 2 x 279GB bricks.

2. Mounted it as a gluster mount and then filled it until it was approximately 500MB full.

3. Added additional bricks that were smaller, only 244MB.

4. Started removal of one 279GB brick which completed successfully.

5. Checked and there were still files left on the bricks being removed that couldn't be copied as the remaining brick wasn't large enough.

The brick was removed without error; the only output was a warning that files might be left behind.


Version-Release number of selected component (if applicable):

Gluster 3.7.5 (RHGS 3.1.2)


How reproducible:

Easily


Steps to Reproduce:

See above

Actual results:

Brick removal completed without error even though not all files were migrated.


Expected results:
Removal should fail or give more warning/require explicit approval before proceeding.

Additional info:

--- Additional comment from Nithya Balachandran on 2017-06-22 11:51:40 EDT ---


It was decided that we would check the remove-brick status to determine whether:

1. Any files were skipped/failed
2. Rebalance is still in progress on any node

If either is true, the remove-brick commit will inform the user and require explicit confirmation to proceed.
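
As a rough illustration of the check described above (not the actual GlusterFS code), the sketch below aggregates per-node remove-brick status rows and decides whether the commit should stop and ask for confirmation. The NodeStatus type, its field names, and the status strings are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStatus:
    # Hypothetical per-node row, loosely modelled on remove-brick status output.
    node: str
    skipped: int    # files skipped during migration
    failures: int   # files that failed to migrate
    status: str     # assumed values: "in progress", "completed", "failed"


def needs_confirmation(rows: List[NodeStatus]) -> bool:
    """Return True if the commit should inform the user and ask to confirm."""
    # Condition 1: any files were skipped or failed on any node.
    any_unmigrated = any(r.skipped > 0 or r.failures > 0 for r in rows)
    # Condition 2: rebalance is still in progress on any node.
    any_running = any(r.status == "in progress" for r in rows)
    return any_unmigrated or any_running
```

A single node reporting skipped files, or a single node still rebalancing, is enough to require confirmation.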

Comment 1 Worker Ant 2017-11-18 02:12:45 UTC
REVIEW: https://review.gluster.org/18801 (cli: WIP) posted (#2) for review on master by N Balachandran

Comment 2 Nithya Balachandran 2017-11-23 09:23:33 UTC
Explanation of the approach:

In gf_cli_remove_brick:

If cmd == GF_OP_CMD_COMMIT, get the rebalance status.
Check the rebalance status for failed file migrations or an in-progress/failed rebalance. If either is found, do not commit the operation. Display a warning asking the user to retry the remove-brick after fixing the issue, or to use force to commit the operation anyway.
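
The control flow above could be sketched roughly as follows. This is Python for illustration only, not the C code in gf_cli_remove_brick; the Cmd enum, the may_commit function, the state strings, and the warning text are all assumptions made for this example.

```python
from enum import Enum


class Cmd(Enum):
    # Hypothetical stand-ins for the remove-brick sub-commands.
    START = "start"
    STATUS = "status"
    COMMIT = "commit"
    COMMIT_FORCE = "force"


def may_commit(cmd, failed_files, rebalance_state):
    """Return (allowed, message) for a remove-brick request.

    rebalance_state is an assumed string: "completed", "in progress"
    or "failed".
    """
    if cmd is not Cmd.COMMIT:
        # Only a plain commit is gated; force bypasses the check.
        return True, ""
    if failed_files > 0 or rebalance_state in ("in progress", "failed"):
        return False, ("remove-brick commit refused: fix the migration "
                       "problem and retry, or use force to commit anyway")
    return True, ""
```

Only the plain commit path is gated, so force remains available as the explicit override.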

The remove-brick commit and status operations use the 'count' key in the dictionary differently. Remove-brick commit requires 'count' to get the brick count.
Remove-brick status will increment and update 'count', causing the rebalance status processing to go wrong.

So for now, for a remove-brick commit operation, the original 'count' is saved in 'tmp-count' in the dict before sending the status request. If the rebalance status indicates that the commit can go through, the value of 'count' in dict is updated to the value of 'tmp-count' before the commit request is sent.
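
A minimal sketch of the save-and-restore step, using a plain Python dict in place of the CLI's dictionary. The send_status and send_commit callables are hypothetical stand-ins for the status and commit requests; only the 'count' and 'tmp-count' keys come from the description above.

```python
def commit_with_status_check(request, send_status, send_commit):
    """Save 'count', run the status check, restore 'count', then commit.

    send_status(request) may modify request['count'] and is assumed to
    return True when the commit is allowed to go through.
    send_commit(request) is assumed to need the original brick count.
    """
    # Preserve the brick count before the status request can change it.
    request["tmp-count"] = request["count"]

    if not send_status(request):
        # Status reported a problem: do not send the commit request.
        return False

    # Restore the original brick count for the commit request.
    request["count"] = request["tmp-count"]
    send_commit(request)
    return True
```

The commit request sees the original brick count even if the status request changed 'count' in between.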

Comment 3 Worker Ant 2019-07-29 08:22:43 UTC
REVIEW: https://review.gluster.org/23111 (cli: Add warning for user before remove-brick commit) posted (#2) for review on master by Vishal Pandey

Comment 4 Worker Ant 2019-08-07 07:32:28 UTC
REVIEW: https://review.gluster.org/23171 (cli: Add warning for user before remove-brick commit) posted (#1) for review on master by Vishal Pandey

Comment 5 Worker Ant 2019-08-25 05:20:10 UTC
REVIEW: https://review.gluster.org/23171 (glusterd: Add warning and abort in case of failures in migration during remove-brick commit) merged (#8) on master by Atin Mukherjee

