1550896 – No rollback of renames on succeeded subvols during failure

Summary: No rollback of renames on succeeded subvols during failure

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	distribute
Sub Component:
Version:	rhgs-3.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	RHGS 3.4.0
Assignee:	Csaba Henk
QA Contact:	Prasad Desala
Docs Contact:
URL:
Whiteboard:
Depends On:	1412069
Blocks:	1503137 1550771

Reported:	2018-03-02 08:27 UTC by Raghavendra G
Modified:	2018-09-06 06:11 UTC
CC List:	4 users
Fixed In Version:	glusterfs-3.12.2-8
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1412069
Environment:
Last Closed:	2018-09-04 06:44:11 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2018:2607	0	None	None	None	2018-09-04 06:45:12 UTC

Description Raghavendra G 2018-03-02 08:27:14 UTC

+++ This bug was initially created as a clone of Bug #1412069 +++

Description of problem:
With DHT, directories are present on all subvolumes, so renaming one is a compound operation. A partial success plus partial failure is therefore possible, leaving the volume in an inconsistent state. To reproduce such a scenario, stop the volume, edit the volfile of one subvolume to set "option read-only on", and restart the volume. Operations that would modify the affected subvolume then fail with EROFS.

How reproducible:
always


--- Additional comment from Worker Ant on 2017-01-11 02:01:30 EST ---

REVIEW: http://review.gluster.org/15739 (feature/dht: undo partially successful dir rename) posted (#7) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Worker Ant on 2017-01-11 10:40:29 EST ---

COMMIT: http://review.gluster.org/15739 committed in master by Raghavendra G (rgowdapp) 
------
commit bb438d849a4a3941c1a9b525213f695f0a2c961b
Author: Csaba Henk <csaba>
Date:   Thu Oct 27 07:30:48 2016 +0200

    feature/dht: undo partially successful dir rename
    
    With dht, dirs are present on all subvolumes, so
    renaming them is a compound operation and thus a
    partial success + partial failure scenario is
    possible, resulting in an inconsistent state.
    
    For purposes of reproduction, such a scenario can
    easily be produced by stopping the volume, editing the
    volfile of a certain subvolume to set
    "option read-only on", and then restarting
    the volume. Operations that would make changes
    on the affected subvolume will then fail with EROFS.
    
    To handle such scenarios, we introduce an in-memory cache
    where we record the return values obtained from the
    subvolumes. At the final stage of the dir rename operation
    we check if it's a partial success/fail situation. If yes,
    then we perform a reverse rename op on those subvolumes
    where the operation succeeded.
    
    Change-Id: I3d05f74f53932cb984a918d252a7309c1009a51d
    BUG: 1412069
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on: http://review.gluster.org/15739
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
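
The approach described in the commit message can be illustrated with a small simulation. This is only a sketch of the idea (record each subvolume's return value, then reverse the rename where it succeeded if the outcome is mixed); it is not the actual dht code. The `Subvol` class and `rename_dir` function are hypothetical names invented for this illustration, and a subvolume is modelled as nothing more than a set of directory names.

```python
import errno


class Subvol:
    """Hypothetical stand-in for one DHT subvolume: just a set of directory names."""

    def __init__(self, name, dirs, read_only=False):
        self.name = name
        self.dirs = set(dirs)
        self.read_only = read_only

    def rename(self, src, dst):
        """Return 0 on success, or a negative errno value on failure."""
        if self.read_only:
            return -errno.EROFS
        if src not in self.dirs:
            return -errno.ENOENT
        self.dirs.remove(src)
        self.dirs.add(dst)
        return 0


def rename_dir(subvols, src, dst):
    """Rename src to dst on every subvolume; undo on partial success.

    Returns (overall result, per-subvolume results).
    """
    # Record the return value obtained from each subvolume.
    results = {sv.name: sv.rename(src, dst) for sv in subvols}
    failures = [ret for ret in results.values() if ret != 0]

    # Mixed outcome: reverse the rename on the subvolumes where it succeeded.
    if failures and len(failures) < len(subvols):
        for sv in subvols:
            if results[sv.name] == 0:
                sv.rename(dst, src)

    return (failures[0] if failures else 0), results


# One writable subvolume and one read-only subvolume, as in the reproduction.
subvols = [Subvol("sv0", {"dir1"}), Subvol("sv1", {"dir1"}, read_only=True)]
ret, results = rename_dir(subvols, "dir1", "dir2")
print(ret, results, [sorted(sv.dirs) for sv in subvols])
```

In this toy model the rename fails overall with EROFS, and both subvolumes still hold dir1 afterwards, which is the consistent state the fix aims for. The sketch assumes the reverse rename itself succeeds; handling a failure during the undo step is outside its scope.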

--- Additional comment from Shyamsundar on 2017-03-06 12:43:33 EST ---

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 7 Prasad Desala 2018-04-23 09:37:07 UTC

Reproduced this issue on 3.3.1 and followed the same steps to verify this
BZ on 3.4.0 (3.12.2-8.el7rhgs.x86_64).

1) Created a distributed-replicate volume and started it.
2) FUSE mounted it on a client.
3) On the mount point, created a directory "dir1".
4) Selected a replica pair and, for all the bricks in this replica pair, set
the read-only option to on by editing the brick volfile.
5) Stopped and started the volume.
6) From the mount point, renamed the directory from dir1 to dir2.

Before the fix, dir1 is not renamed on the read-only bricks while the rename
succeeds on the other bricks, leading to inconsistency across the nodes, with
dir1 and dir2 having the same gfid.

After the fix, all the backend bricks have the same directory.

Moving this BZ to Verified.
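
The backend check in the verification above could be scripted roughly as follows. This is a hedged sketch, not a tool used in this BZ: the function names `dir_presence` and `is_consistent` are made up for illustration, the brick paths are throwaway temporary directories standing in for real brick backends, and the check compares directory names only (it does not look at gfids).

```python
import os
import tempfile


def dir_presence(brick_roots, names):
    """Map each directory name to the brick roots where it exists."""
    return {
        name: [b for b in brick_roots if os.path.isdir(os.path.join(b, name))]
        for name in names
    }


def is_consistent(brick_roots, names):
    """True if every name is present on all bricks or on none of them."""
    presence = dir_presence(brick_roots, names)
    return all(len(found) in (0, len(brick_roots)) for found in presence.values())


# Demo: four fake bricks, where bricks 2 and 3 play the read-only replica pair
# that kept the old name while bricks 0 and 1 were renamed.
with tempfile.TemporaryDirectory() as tmp:
    bricks = [os.path.join(tmp, f"brick{i}") for i in range(4)]
    for i, brick in enumerate(bricks):
        os.makedirs(os.path.join(brick, "dir2" if i < 2 else "dir1"))
    print(is_consistent(bricks, ["dir1", "dir2"]))
```

The demo models the pre-fix state, so the check reports an inconsistency. On a real setup the list of brick roots would come from the volume's brick paths, which are not given in this report.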

Comment 9 errata-xmlrpc 2018-09-04 06:44:11 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
