+++ This bug was initially created as a clone of Bug #1344758 +++

Description of problem:
The customer wanted to test data migration by adding and removing bricks to simulate moving between different storage.

1. Created a replicated volume made up of 2 x 279GB bricks.
2. Mounted it as a gluster mount and then filled it up until approximately 500MB full.
3. Added additional bricks that were smaller, only 244MB.
4. Started removal of one 279GB brick, which completed successfully.
5. Checked and there were still files left on the bricks being removed that couldn't be copied as the remaining brick wasn't large enough.

The brick was removed without error, just a warning that files might be left behind.

Version-Release number of selected component (if applicable):
Gluster 3.7.5 (RHGS 3.1.2)

How reproducible:
Easily

Steps to Reproduce:
See above

Actual results:
Brick removal completed without error even though not all files were migrated.

Expected results:
Removal should fail, or give more warning and require explicit approval before proceeding.

Additional info:

--- Additional comment from Nithya Balachandran on 2017-06-22 11:51:40 EDT ---

It was decided that we would check the remove-brick status to determine whether:
1. Any files were skipped/failed
2. Rebalance is still in progress on any node

If either is true, the remove-brick commit will inform the user and require explicit confirmation to proceed.
REVIEW: https://review.gluster.org/18801 (cli: WIP) posted (#2) for review on master by N Balachandran
Explanation of the approach:

In gf_cli_remove_brick:
    if (cmd == GF_OP_CMD_COMMIT)
        get the rebalance status

Check the rebalance status for failed file migrations or an in-progress/failed rebalance. If either is found, do not commit the operation. Display a warning to the user asking them to retry the remove-brick after fixing the issue, or to use force to commit the operation anyway.

The remove-brick commit and status operations use the 'count' in the dictionary differently. Remove-brick commit requires 'count' to get the brick count. Remove-brick status will increment and update the 'count', causing the rebalance status processing to go wrong.

So for now, for a remove-brick commit operation, the original 'count' is saved in 'tmp-count' in the dict before sending the status request. If the rebalance status indicates that the commit can go through, the value of 'count' in the dict is updated to the value of 'tmp-count' before the commit request is sent.
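The flow described above can be modelled in a few lines. The following is only an illustrative Python sketch, not the actual C code in the CLI: the `NodeStatus` record, the status strings, `commit_blockers`, `fake_status_request` and the `force` flag are all hypothetical names invented for this example. Only the 'count' and 'tmp-count' dictionary keys come from the explanation above.

```python
from dataclasses import dataclass

# Hypothetical status values; the real CLI uses its own rebalance status codes.
COMPLETED = "completed"
IN_PROGRESS = "in progress"
FAILED = "failed"


@dataclass
class NodeStatus:
    """Per-node rebalance status (hypothetical shape, for illustration only)."""
    node: str
    status: str
    failures: int = 0
    skipped: int = 0


def commit_blockers(nodes):
    """Return the reasons why a plain commit should not go through."""
    reasons = []
    for n in nodes:
        if n.status != COMPLETED:
            reasons.append(f"{n.node}: rebalance {n.status}")
        if n.failures or n.skipped:
            reasons.append(f"{n.node}: {n.failures} failed, {n.skipped} skipped")
    return reasons


def fake_status_request(req, nodes):
    """Stand-in for the status request, which overwrites 'count' in the dict."""
    req["count"] = len(nodes)
    return nodes


def remove_brick_commit(req, nodes, force=False):
    """Return (committed, reasons). 'force' skips the status check entirely."""
    if force:
        return True, []
    # Save the brick count before the status request clobbers it.
    req["tmp-count"] = req["count"]
    reasons = commit_blockers(fake_status_request(req, nodes))
    if reasons:
        # Do not commit; the caller shows the reasons as a warning.
        return False, reasons
    # Restore the brick count before the commit request would be sent.
    req["count"] = req.pop("tmp-count")
    return True, []
```

With one node reporting three skipped files, `remove_brick_commit({"count": 1}, [NodeStatus("n1", COMPLETED, skipped=3)])` returns `(False, ["n1: 0 failed, 3 skipped"])`, while the same call with `force=True` goes through without consulting the status at all.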
REVIEW: https://review.gluster.org/23111 (cli: Add warning for user before remove-brick commit) posted (#2) for review on master by Vishal Pandey
REVIEW: https://review.gluster.org/23171 (cli: Add warning for user before remove-brick commit) posted (#1) for review on master by Vishal Pandey
REVIEW: https://review.gluster.org/23171 (glusterd: Add warning and abort in case of failures in migration during remove-brick commit) merged (#8) on master by Atin Mukherjee