+++ This bug was initially created as a clone of Bug #1412069 +++

Description of problem:

With dht, dirs are present on all subvolumes, so renaming them is a compound operation. A partial success + partial failure scenario is therefore possible, resulting in an inconsistent state.

For purposes of reproduction, such a scenario can easily be produced by stopping the volume, editing the volfile of a certain subvolume to add an "option read-only on" setting, and then restarting the volume. Operations that would make changes on the affected subvolume will then fail with EROFS.

How reproducible: always

--- Additional comment from Worker Ant on 2017-01-11 02:01:30 EST ---

REVIEW: http://review.gluster.org/15739 (feature/dht: undo partially successful dir rename) posted (#7) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Worker Ant on 2017-01-11 10:40:29 EST ---

COMMIT: http://review.gluster.org/15739 committed in master by Raghavendra G (rgowdapp)
------
commit bb438d849a4a3941c1a9b525213f695f0a2c961b
Author: Csaba Henk <csaba>
Date: Thu Oct 27 07:30:48 2016 +0200

    feature/dht: undo partially successful dir rename

    With dht, dirs are present on all subvolumes, so renaming them is a
    compound operation. A partial success + partial failure scenario is
    therefore possible, resulting in an inconsistent state.

    For purposes of reproduction, such a scenario can easily be produced
    by stopping the volume, editing the volfile of a certain subvolume to
    add an "option read-only on" setting, and then restarting the volume.
    Operations that would make changes on the affected subvolume will
    then fail with EROFS.

    To handle such scenarios, we introduce an in-memory cache where we
    record the return values obtained from the subvolumes. At the final
    stage of the dir rename operation we check whether it is a partial
    success/fail situation. If so, we perform a reverse rename op on
    those subvolumes where the operation succeeded.

    Change-Id: I3d05f74f53932cb984a918d252a7309c1009a51d
    BUG: 1412069
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on: http://review.gluster.org/15739
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>

--- Additional comment from Shyamsundar on 2017-03-06 12:43:33 EST ---

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/
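
The rollback idea from the commit message above (record each subvolume's return value, detect a partial success, then reverse the rename where it succeeded) can be illustrated with a small simulation. This is only a sketch in Python, not the actual DHT translator code, which is written in C; the `Subvol` class and `rename_dir` function are hypothetical names invented for illustration.

```python
import errno


class Subvol:
    """Hypothetical stand-in for one DHT subvolume: just a set of dir names."""

    def __init__(self, name, dirs, read_only=False):
        self.name = name
        self.dirs = set(dirs)
        self.read_only = read_only

    def rename(self, src, dst):
        """Return 0 on success or a negative errno on failure."""
        if self.read_only:
            return -errno.EROFS
        if src not in self.dirs:
            return -errno.ENOENT
        self.dirs.remove(src)
        self.dirs.add(dst)
        return 0


def rename_dir(subvols, src, dst):
    """Rename src to dst on every subvolume, undoing a partial success."""
    # In-memory record of the return value obtained from each subvolume.
    results = {sv.name: sv.rename(src, dst) for sv in subvols}

    succeeded = [sv for sv in subvols if results[sv.name] == 0]
    failed = [sv for sv in subvols if results[sv.name] != 0]

    if not failed:
        return 0

    # Partial success: reverse the rename where it went through, so that
    # all subvolumes end up with the original name again.
    for sv in succeeded:
        sv.rename(dst, src)

    # Report the first error seen to the caller.
    return results[failed[0].name]
```

With one writable and one read-only subvolume that both hold `dir1`, `rename_dir` returns `-errno.EROFS` and both subvolumes still contain `dir1` afterwards. The sketch ignores the case where the reverse rename itself fails.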
Reproduced this issue on 3.3.1 and followed the same steps to verify this BZ on 3.4.0 (3.12.2-8.el7rhgs.x86_64).

1) Created a distributed-replicate volume and started it.
2) FUSE mounted it on a client.
3) On the mount point, created a directory "dir1".
4) Selected a replica pair and, for all the bricks in this replica pair, set the read-only option to on by editing the brick vol file.
5) Stopped and started the volume.
6) From the mount point, renamed the directory from dir1 to dir2.

Before the fix, dir1 is not renamed on the read-only bricks while the rename succeeds on the other bricks. This leads to inconsistency across the nodes, with dir1 and dir2 both having the same gfid.

After the fix, all the backend bricks have the same directory.

Moving this BZ to Verified.
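
The backend check in the verification above (does every brick hold the same directory name?) could be scripted roughly as follows. This is a minimal sketch in Python under stated assumptions: the helper names `dir_presence` and `is_consistent` are hypothetical, the brick root paths must be supplied by the caller, and only directory names are compared, not the gfid extended attribute.

```python
import os


def dir_presence(brick_roots, names):
    """Map each brick root to the subset of `names` present as directories."""
    return {
        root: {n for n in names if os.path.isdir(os.path.join(root, n))}
        for root in brick_roots
    }


def is_consistent(brick_roots, names):
    """True if every brick holds exactly the same subset of `names`."""
    presence = dir_presence(brick_roots, names)
    distinct = {frozenset(found) for found in presence.values()}
    return len(distinct) <= 1
```

Run on each node against its local brick roots with `names=["dir1", "dir2"]`: the inconsistent state described above shows up as `is_consistent(...)` returning `False`, with `dir_presence(...)` listing which bricks still have `dir1`.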
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607