Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1514683

Summary: Removal of bricks in volume isn't prevented if remaining brick doesn't contain all the files
Product: [Community] GlusterFS Reporter: Nithya Balachandran <nbalacha>
Component: distribute Assignee: Vishal Pandey <vpandey>
Status: CLOSED NEXTRELEASE QA Contact:
Severity: low Docs Contact:
Priority: high    
Version: mainline CC: amukherj, atumball, bkunal, bugs, ccalhoun, nbalacha, phil.coleman, rgowdapp, rhinduja, rhs-bugs, sheggodu, tdesala
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1344758 Environment:
Last Closed: 2019-08-25 05:20:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1344758    

Description Nithya Balachandran 2017-11-18 01:45:40 UTC
+++ This bug was initially created as a clone of Bug #1344758 +++

Description of problem:

Customer wanted to test migration of data by adding and removing bricks to simulate moving between different storage.

1. Created a replicated volume made up of 2 x 279GB bricks.

2. Mounted it as a gluster mount and then filled it up until approximately 500MB full.

3. Added additional bricks that were smaller, only 244MB.

4. Started removal of one 279GB brick which completed successfully.

5. Checked the bricks being removed and found files still left on them; they could not be copied because the remaining brick wasn't large enough.

The brick was removed without error; the only output was a warning that files might be left behind.


Version-Release number of selected component (if applicable):

Gluster 3.7.5 (RHGS 3.1.2)


How reproducible:

Easily


Steps to Reproduce:

See above

Actual results:

Brick removal completed without error even though all files were not migrated.


Expected results:
Removal should fail or give more warning/require explicit approval before proceeding.

Additional info:

--- Additional comment from Nithya Balachandran on 2017-06-22 11:51:40 EDT ---


It was decided that remove-brick commit would first check the remove-brick status to determine whether:

1. Any files were skipped/failed
2. Rebalance is still in progress on any node

If either is true, the remove-brick commit will inform the user and require explicit confirmation to proceed.
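
The two conditions above can be sketched as a single predicate over per-node status. This is only an illustration: `enum node_state`, `struct node_status`, and `commit_needs_confirmation` are hypothetical names invented here, not GlusterFS types or functions, and the real status reply carries more fields than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node migration state; not GlusterFS's real enum. */
enum node_state {
    NODE_NOT_STARTED,
    NODE_IN_PROGRESS,
    NODE_COMPLETED
};

/* Hypothetical per-node summary of a remove-brick status reply. */
struct node_status {
    uint64_t skipped;      /* files skipped during migration */
    uint64_t failures;     /* files that failed to migrate */
    enum node_state state; /* migration state on this node */
};

/*
 * Returns true if the commit should stop and ask the user to confirm:
 *   1. any node reports skipped or failed files, or
 *   2. any node is still migrating.
 */
bool commit_needs_confirmation(const struct node_status *nodes, size_t n)
{
    assert(nodes != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].skipped > 0 || nodes[i].failures > 0)
            return true;
        if (nodes[i].state == NODE_IN_PROGRESS)
            return true;
    }
    return false;
}
```

A caller would run this check on the collected status before sending the commit, and only prompt the user when it returns true.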

Comment 1 Worker Ant 2017-11-18 02:12:45 UTC
REVIEW: https://review.gluster.org/18801 (cli: WIP) posted (#2) for review on master by N Balachandran

Comment 2 Nithya Balachandran 2017-11-23 09:23:33 UTC
Explanation of the approach:

In gf_cli_remove_brick:

if (cmd == GF_OP_CMD_COMMIT), get the rebalance status.
Check the rebalance status for failed file migrations or an in-progress/failed rebalance. If either of these is found, do not commit the operation. Display a warning asking the user to retry the remove-brick after fixing the issue, or to use force to commit the operation anyway.

The remove-brick commit and status operations use the 'count' in the dictionary differently. Remove-brick commit requires 'count' to get the brick count.
Remove-brick status will increment and update the 'count', causing the rebalance status processing to go wrong.

So for now, for a remove-brick commit operation, the original 'count' is saved in 'tmp-count' in the dict before sending the status request. If the rebalance status indicates that the commit can go through, the value of 'count' in dict is updated to the value of 'tmp-count' before the commit request is sent.
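
The save-and-restore of 'count' described above might look roughly like the following. This is a self-contained toy, not the actual patch: `struct toy_dict`, `toy_set`, `toy_get`, `fake_status_request`, and `status_then_restore_count` are all hypothetical stand-ins, and the way the simulated status request changes 'count' is an assumption made purely so that the overwrite is visible.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_DICT_MAX 8

/* Toy stand-in for the request dictionary; NOT GlusterFS's dict_t. */
struct toy_dict {
    const char *keys[TOY_DICT_MAX];
    int32_t vals[TOY_DICT_MAX];
    int used;
};

/* Insert or overwrite an integer value. Returns 0, or -1 if the dict is full. */
int toy_set(struct toy_dict *d, const char *key, int32_t val)
{
    assert(d != NULL && key != NULL);
    for (int i = 0; i < d->used; i++) {
        if (strcmp(d->keys[i], key) == 0) {
            d->vals[i] = val;
            return 0;
        }
    }
    if (d->used == TOY_DICT_MAX)
        return -1;
    d->keys[d->used] = key;
    d->vals[d->used] = val;
    d->used++;
    return 0;
}

/* Look up an integer value. Returns 0 on success, -1 if the key is missing. */
int toy_get(const struct toy_dict *d, const char *key, int32_t *out)
{
    assert(d != NULL && key != NULL && out != NULL);
    for (int i = 0; i < d->used; i++) {
        if (strcmp(d->keys[i], key) == 0) {
            *out = d->vals[i];
            return 0;
        }
    }
    return -1;
}

/* Simulated status path: bumps "count" once per responding node (assumption). */
void fake_status_request(struct toy_dict *d, int32_t nodes)
{
    int32_t c = 0;
    (void)toy_get(d, "count", &c); /* a missing key is treated as 0 */
    (void)toy_set(d, "count", c + nodes);
}

/* Save the brick count, run the status request, then restore the count. */
int status_then_restore_count(struct toy_dict *d, int32_t nodes)
{
    int32_t saved = 0;

    if (toy_get(d, "count", &saved) != 0)
        return -1; /* commit request must already carry the brick count */
    if (toy_set(d, "tmp-count", saved) != 0)
        return -1;

    fake_status_request(d, nodes); /* clobbers "count" */

    if (toy_get(d, "tmp-count", &saved) != 0)
        return -1;
    return toy_set(d, "count", saved); /* commit sees the brick count again */
}
```

In this sketch 'tmp-count' is left in the dictionary after the restore; whether the real code removes it is not stated in the comment above.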

Comment 3 Worker Ant 2019-07-29 08:22:43 UTC
REVIEW: https://review.gluster.org/23111 (cli: Add warning for user before remove-brick commit) posted (#2) for review on master by Vishal Pandey

Comment 4 Worker Ant 2019-08-07 07:32:28 UTC
REVIEW: https://review.gluster.org/23171 (cli: Add warning for user before remove-brick commit) posted (#1) for review on master by Vishal Pandey

Comment 5 Worker Ant 2019-08-25 05:20:10 UTC
REVIEW: https://review.gluster.org/23171 (glusterd: Add warning and abort in case of failures in migration during remove-brick commit) merged (#8) on master by Atin Mukherjee