Bug 1139999 - Rename of a file from 2 clients racing and resulting in an error on both clients
Summary: Rename of a file from 2 clients racing and resulting in an error on both clients
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: GlusterFS Bugs list
QA Contact:
URL:
Whiteboard:
Depends On: 1123950
Blocks: 1138390
Reported: 2014-09-10 07:09 UTC by Raghavendra G
Modified: 2014-10-20 15:27 UTC (History)

Fixed In Version:
Clone Of: 1123950
Environment:
Last Closed: 2014-10-20 15:08:26 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra G 2014-09-10 07:09:46 UTC
+++ This bug was initially created as a clone of Bug #1123950 +++

Description of problem:
This problem is hit by the test case tests/bugs/bug-1117851.t roughly once per 100 files; the rate depends on the backend disk used for the volume (RAM disk, SSD, or other).

The issue is seen when the hashed and cached subvols of a file are the same and the file is renamed to a name whose hashed subvol is different.

The root cause is that, in this scenario, both clients race to create the link and the linkto file. The losing client then deletes the linkto file during its cleanup, so the rename attempted by the winning client fails, and both clients end up failing to rename the file.

Changing the client that fails to create the linkto file so that it does not delete the linkto file (the change in this review: http://review.gluster.org/#/c/8338/) is not sufficient, because link and linkto are wound in parallel and that same client could have won the link race.

The additional fix for this failure is to make the winds that create the link and the linkto sequential, so that whichever client wins the link race then goes on to create the linkto file. This leaves one client clearly proceeding and the other client receiving the expected errors.
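The sequential ordering described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the DHT translator code: `Subvol`, `create_exclusive`, `rename_sequential`, and the `.linkto` name suffix are hypothetical names invented for the illustration, and each subvolume is reduced to a set of names with an atomic exclusive create.

```python
import errno
import threading


class Subvol:
    """Hypothetical stand-in for a subvolume: a set of names with atomic exclusive create."""

    def __init__(self):
        self._names = set()
        self._lock = threading.Lock()

    def create_exclusive(self, name):
        """Return 0 on success, or errno.EEXIST if the name is already present."""
        with self._lock:
            if name in self._names:
                return errno.EEXIST
            self._names.add(name)
            return 0

    def has(self, name):
        with self._lock:
            return name in self._names


def rename_sequential(cached, hashed, dst):
    """Create the link first; only the winner of that race creates the linkto."""
    ret = cached.create_exclusive(dst)  # step 1: the link race
    if ret != 0:
        # The loser stops here with an error and never touches the linkto file.
        return ret
    # Step 2: reached by the winning client only.
    return hashed.create_exclusive(dst + ".linkto")


def race(n_clients=2):
    """Run n_clients concurrent renames to the same destination name."""
    cached, hashed = Subvol(), Subvol()
    results = [None] * n_clients

    def client(i):
        results[i] = rename_sequential(cached, hashed, "newname")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, hashed.has("newname.linkto")
```

In this model, exactly one client returns 0 and the other returns EEXIST, and the linkto entry survives because the losing client has nothing to clean up.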

Version-Release number of selected component (if applicable):
Gluster master

How reproducible:
1 in 100 renames when run with bricks on SSD or RAM disk

Steps to Reproduce:
Run the test case tests/bugs/bug-1117851.t, treating warnings on file rename failures as errors (see the comment in the test case file).

Also, this should be a fork of bug #1117851, but as this is not a data loss issue, the original bug is only referenced here.

--- Additional comment from Anand Avati on 2014-07-28 15:37:34 EDT ---

REVIEW: http://review.gluster.org/8382 (cluster/dht: Fix rename failures when multiple clients race) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

--- Additional comment from Anand Avati on 2014-07-30 07:30:09 EDT ---

REVIEW: http://review.gluster.org/8382 (cluster/dht: Fix rename failures when multiple clients race) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)

--- Additional comment from Anand Avati on 2014-07-30 10:22:25 EDT ---

REVIEW: http://review.gluster.org/8382 (cluster/dht: Fix rename failures when multiple clients race) posted (#3) for review on master by Shyamsundar Ranganathan (srangana)

--- Additional comment from Anand Avati on 2014-07-31 09:54:35 EDT ---

REVIEW: http://review.gluster.org/8382 (cluster/dht: Fix rename failures when multiple clients race) posted (#4) for review on master by Shyamsundar Ranganathan (srangana)

--- Additional comment from Anand Avati on 2014-08-13 13:49:49 EDT ---

REVIEW: http://review.gluster.org/8382 (cluster/dht: Fix rename failures when multiple clients race) posted (#5) for review on master by Shyamsundar Ranganathan (srangana)

--- Additional comment from Shyamsundar on 2014-09-02 11:00:56 EDT ---

Abandoned: http://review.gluster.org/8382

The fix was implemented differently, since the linkto creation had to be handled first due to FUSE behavior.

These changes can be found here,

    http://review.gluster.org/#/c/8563/
    http://review.gluster.org/#/c/8570/

With these changes, the winning client no longer fails a rename when it fails to rename the linkto file. Hence, when one client wins the link race and the other still deletes the linkto file, the winning client's failure to rename the linkto file is not a critical failure, which resolves the issue.
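The behavior described above, where a linkto rename failure no longer fails the whole operation, might look like the following in outline. This is a hedged sketch, not the code from the reviews: `finish_rename`, its two callable parameters, and the order in which the two renames run are assumptions made for illustration.

```python
import errno
import os


def finish_rename(rename_data, rename_linkto):
    """Hypothetical completion path for the client that won the link race.

    Both arguments are callables returning 0 on success or an errno value.
    Returns (status, warnings); only the data rename decides the status.
    """
    warnings = []
    ret = rename_data()
    if ret != 0:
        # A failed data rename is still a real failure.
        return ret, warnings
    linkto_ret = rename_linkto()
    if linkto_ret != 0:
        # Non-critical: the linkto file may have been deleted by the other client.
        warnings.append("linkto rename failed: " + os.strerror(linkto_ret))
    return 0, warnings
```

For example, `finish_rename(lambda: 0, lambda: errno.ENOENT)` returns a status of 0 together with one warning, whereas a failing `rename_data` propagates its error unchanged.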

The test case modified as part of this commit will be posted as a separate commit for inclusion, after which this bug can be marked for verification.

--- Additional comment from Anand Avati on 2014-09-02 12:42:40 EDT ---

REVIEW: http://review.gluster.org/8579 (cluster/dht: Modified test case to note rename failures as errors) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

--- Additional comment from Anand Avati on 2014-09-02 14:48:55 EDT ---

COMMIT: http://review.gluster.org/8579 committed in master by Vijay Bellur (vbellur) 
------
commit 4adfb6fb7c371c6bc03acdaf61f1cca496388356
Author: Shyam <srangana>
Date:   Tue Sep 2 12:37:07 2014 -0400

    cluster/dht: Modified test case to note rename failures as errors
    
    The bug referenced in this change had a race condition that is now
    fixed by the following commits that are posted for review.
    
        http://review.gluster.org/#/c/8563/
        http://review.gluster.org/#/c/8570/
    
    These changes would now make the winning client not fail a rename,
    in case it failed to rename the linkto file. Hence when one client
    wins the link race, and the other still deletes the linkto file,
    the rename failure by the winning client is not a critical failure,
    hence it resolves the issue posted in the bug.
    
    As a result modifying the test case to treat the rename failures
    as errors, to catch any future issues.
    
    Change-Id: Ibe9caac7ee87dcbc4f581cfbd36173b734859ccb
    BUG: 1123950
    Signed-off-by: Shyam <srangana>
    Reviewed-on: http://review.gluster.org/8579
    Reviewed-by: Jeff Darcy <jdarcy>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 1 Anand Avati 2014-09-10 08:38:36 UTC
REVIEW: http://review.gluster.org/8684 (cluster/dht: Modified test case to note rename failures as errors) posted (#1) for review on release-3.4 by Raghavendra G (rgowdapp)

Comment 2 Anand Avati 2014-09-24 15:13:51 UTC
REVIEW: http://review.gluster.org/8684 (cluster/dht: Modified test case to note rename failures as errors) posted (#2) for review on release-3.4 by Raghavendra G (rgowdapp)

Comment 3 Kaleb KEITHLEY 2014-10-20 15:08:26 UTC
change was abandoned with reason "not needed"

