Red Hat Bugzilla – Bug 1286042
rebalance :- rm -rf failed with error 'No such file or directory' for a few files and directories while rebalance is in progress
Last modified: 2017-03-25 10:24:17 EDT
From the logs posted in the initial bug report, opendir failed with ENOENT errors on the newly added bricks (client36 through client47). I didn't see any failure logs on bricks that were part of dht before add-brick (client0 through client35). So I assume this bug is due to directories not being healed after add-brick. There are some fixes in rhgs-3.1.3 which add healing of the directory and its layout even in the nameless lookup codepath. Since at least one nameless lookup is done on the gfid before opendir is sent in the new graph (which is aware of the new bricks), this issue should be fixed in 3.1.3.
Also, Sakshi reported that she didn't see any issues with parallel rm -rf and rebalance post add-brick in rhgs-3.1.3.
Waiting for Karthick's confirmation of our observations.
Please note that  is not a necessary fix for this bug (it is not present in rhgs-3.1.3). However, it solves a related issue of a directory having holes post lookup.
This issue is no longer seen with glusterfs version: 3.8.4-2.el7rhgs.x86_64.
Here are the steps that were followed:
1. Created a distributed replica volume and started it.
2. Created files and directories on it.
3. Added a few bricks to that volume.
4. Started rebalance with the start force option.
5. From the mount point, started deleting data using rm -rf *.
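
For anyone wanting to repeat the verification, the steps above can be sketched roughly as the following dry-run helper. It only builds and prints the commands; it does not execute them. The volume name, host names, brick paths, mount point, and file counts are all hypothetical placeholders (none of them come from this BZ), so adjust them to the actual setup before running anything by hand.

```python
import shlex


def repro_commands(volume="testvol", hosts=("server1", "server2"),
                   initial_sets=2, added_sets=1, mount="/mnt/testvol"):
    """Return the reproduction steps as shell command strings, in order.

    All names and paths are illustrative assumptions, not values from the BZ.
    """
    def bricks(start, count):
        # One brick per host for each replica set, so that consecutive
        # bricks on the command line form one replica set.
        return [f"{host}:/bricks/brick{i}/{volume}"
                for i in range(start, start + count)
                for host in hosts]

    replica = str(len(hosts))
    mnt = shlex.quote(mount)
    steps = [
        # Step 1: create a distributed replica volume and start it.
        ["gluster", "volume", "create", volume, "replica", replica,
         *bricks(0, initial_sets)],
        ["gluster", "volume", "start", volume],
        ["mount", "-t", "glusterfs", f"{hosts[0]}:/{volume}", mount],
        # Step 2: create files and directories on the mount.
        ["bash", "-c",
         f"cd {mnt} && mkdir -p dir{{1..50}} && touch dir{{1..50}}/file{{1..20}}"],
        # Step 3: add a few bricks (a whole number of replica sets).
        ["gluster", "volume", "add-brick", volume,
         *bricks(initial_sets, added_sets)],
        # Step 4: start rebalance with the force option.
        ["gluster", "volume", "rebalance", volume, "start", "force"],
        # Step 5: delete everything from the mount point while rebalance runs.
        ["bash", "-c", f"cd {mnt} && rm -rf -- *"],
    ]
    return [shlex.join(step) for step in steps]


if __name__ == "__main__":
    for command in repro_commands():
        print(command)
```

Step 5 is meant to be started while the rebalance from step 4 is still in progress, since the original failure was only seen with rm -rf and rebalance running in parallel.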
The error reported in this BZ was not seen and the command executed successfully without any errors/issues.
Hence, moving this BZ to Verified.