Description of problem:
=======================
Many files were not migrated from the decommissioned bricks; commit results in data loss.

Version-Release number of selected component (if applicable):
3.12.2-5.el7rhgs.x86_64

How reproducible:
Reporting at first occurrence

Steps to Reproduce:
===================
1) Create a x3 volume with brick-mux enabled and start it.
2) FUSE mount it on multiple clients.
3) From client-1 : run a script to create folders and files continuously
   From client-2 : start a Linux kernel untar
   From client-3 : while true;do find;done
   From client-4 : while true;do ls -lRt;done
4) While step 3 is in progress, kill the server-1 brick process using kill -9 <pid>. As brick-mux is enabled, killing a single brick on the server using kill -9 takes down all the bricks on the node.
5) Now, add 3 bricks to the volume and, a few seconds later, start removing the old bricks.
6) Wait for remove-brick to complete.

Actual results:
===============
Many files were not migrated from the decommissioned bricks; commit results in data loss.

Expected results:
=================
The remove-brick operation should migrate all the files from the decommissioned bricks.
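For reference, steps 5 and 6 can be sketched as the gluster CLI sequence below. This is only an illustration and not the exact commands used in the test: the helper name print_rebalance_steps, the volume name, and the brick paths are all hypothetical placeholders. The function prints the commands rather than running them, so they can be reviewed first.

```sh
#!/bin/sh
# Print (do not run) the gluster CLI calls for steps 5 and 6.
# The volume name and brick lists passed in are hypothetical placeholders.
print_rebalance_steps() {
    vol=$1   # volume name
    old=$2   # space-separated bricks to decommission
    new=$3   # space-separated bricks to add

    # Step 5: add one new replica set, then start draining the old one.
    echo "gluster volume add-brick $vol $new"
    echo "gluster volume remove-brick $vol $old start"

    # Step 6: poll status until migration is reported complete, then commit.
    echo "gluster volume remove-brick $vol $old status"
    echo "gluster volume remove-brick $vol $old commit"
}

# Example call (hypothetical hosts and paths):
# print_rebalance_steps testvol \
#     "s1:/bricks/old1 s2:/bricks/old2 s3:/bricks/old3" \
#     "s1:/bricks/new1 s2:/bricks/new2 s3:/bricks/new3"
```

The commit is the step that drops the old bricks from the volume, which is why any file still sitting on them at that point is lost.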
Upstream patches:
https://review.gluster.org/#/c/19827/
https://review.gluster.org/#/c/19831
Verified this BZ on glusterfs version 3.12.2-8.el7rhgs. Followed the same steps as in the description; there are no leftover files on the decommissioned bricks. Moving this BZ to Verified.
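One way to check a decommissioned brick for leftover files is sketched below. It is a minimal example and not necessarily the check used for this verification: the helper name leftover_files and the brick path in the usage line are hypothetical. It lists regular files under the brick root while skipping the internal .glusterfs directory.

```sh
#!/bin/sh
# List regular files still present on a brick, ignoring the internal
# .glusterfs metadata directory. The function name is hypothetical.
leftover_files() {
    brick=${1%/}   # strip one trailing slash so the -path match below works
    # Prune .glusterfs; print every other regular file under the brick root.
    find "$brick" -path "$brick/.glusterfs" -prune -o -type f -print
}

# Example (hypothetical brick path); prints the number of leftover files:
# leftover_files /bricks/old1 | wc -l
```

Running this on each removed brick after remove-brick status reports completion, and before commit, shows whether any data would be lost by committing.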
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607