Bug 1138390

Summary: Rename of a file from 2 clients racing and resulting in an error on both clients
Product: [Community] GlusterFS Reporter: Shyamsundar <srangana>
Component: distribute Assignee: Shyamsundar <srangana>
Status: CLOSED WONTFIX QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: gluster-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1123950 Environment:
Last Closed: 2014-09-19 14:11:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1123950, 1139999    
Bug Blocks:    

Description Shyamsundar 2014-09-04 17:18:29 UTC
+++ This bug was initially created as a clone of Bug #1123950 +++

Description of problem:
This problem is hit when running the test case tests/bugs/bug-1117851.t, about once every 100 files, depending on the backend disk used for the volume (e.g. RAM disk, SSD, or others).

The issue is seen when the hashed and cached subvols for a file are the same, and the file is being renamed to a name whose hashed subvol is different.

The root cause is that, in the above scenario, both clients race to create the link and the linkto file. The losing client then deletes the linkto file during its cleanup, so the actual rename attempted by the winning client fails, and both clients end up failing to rename the file.

Changing the client that fails to create the linkto file so that it does not delete the linkto file (the change present in this review, http://review.gluster.org/#/c/8338/) will not be sufficient, as the client that lost the link race could have won the linkto race (link and linkto are wound in parallel).

The additional fix to handle this failure is to make the winds that create the link and the linkto sequential, so that whichever client wins the link race can then go ahead and create the linkto file. This leaves one clear client proceeding, while the other client gets the required errors.
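
The difference between the parallel and the sequential ordering can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the DHT code: the names `create_exclusive`, `parallel_rename` and `sequential_rename`, the clients "A" and "B", and the dictionary standing in for the bricks are all hypothetical. The parallel model already assumes that a client only cleans up entries it created itself, and it still lets both clients fail when each wins one of the two creates.

```python
def create_exclusive(namespace, name, owner):
    """Create `name` only if it is absent; return True on success."""
    if name in namespace:
        return False
    namespace[name] = owner
    return True


def parallel_rename(schedule):
    """Toy model: each client issues link and linkto independently.

    `schedule` is the order in which the creates reach the bricks,
    as a list of (client, entry_name) pairs.
    """
    namespace = {}
    created = {"A": set(), "B": set()}
    for client, name in schedule:
        if create_exclusive(namespace, name, client):
            created[client].add(name)

    outcome = {}
    # A client that did not win both creates gives up and removes
    # only the entries it created itself (hypothetical cleanup rule).
    for client in ("A", "B"):
        if created[client] != {"link", "linkto"}:
            for name in created[client]:
                del namespace[name]
            outcome[client] = "failed"
    # Any remaining client completes only if it still owns both entries.
    for client in ("A", "B"):
        if client not in outcome:
            ok = all(namespace.get(n) == client for n in ("link", "linkto"))
            outcome[client] = "renamed" if ok else "failed"
    return outcome


def sequential_rename(turns):
    """Toy model: a client attempts linkto only after winning link.

    `turns` lists which client takes its next step, in order.
    """
    namespace = {}
    won_link = set()
    outcome = {}
    for client in turns:
        if client in outcome:
            continue  # this client has already finished
        if client not in won_link:
            if create_exclusive(namespace, "link", client):
                won_link.add(client)
            else:
                outcome[client] = "failed"  # lost the link race, created nothing
        else:
            ok = create_exclusive(namespace, "linkto", client)
            outcome[client] = "renamed" if ok else "failed"
    return outcome


# Interleaving where A wins link and B wins linkto: nobody wins both.
split = [("A", "link"), ("B", "linkto"), ("A", "linkto"), ("B", "link")]
parallel_result = parallel_rename(split)

# The same alternation of clients, with link always attempted first.
sequential_result = sequential_rename(["A", "B", "A", "B"])
```

In this model the split interleaving leaves both clients failed under the parallel ordering, while the sequential ordering yields exactly one client that proceeds and one that gets an error.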

Version-Release number of selected component (if applicable):
Gluster master

How reproducible:
1 in 100 renames if run on bricks on SSD or RAM disk

Steps to Reproduce:
Run the test case tests/bugs/bug-1117851.t, treating warnings on file rename failures as errors (see the comment in the test case file).

Also, this should be a fork from bug #1117851, but as this is not a data loss issue, the original bug is only referenced here.

Comment 1 Anand Avati 2014-09-04 18:48:37 UTC
REVIEW: http://review.gluster.org/8616 (cluster/dht: Modified test case to note rename failures as errors) posted (#1) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 2 Anand Avati 2014-09-04 20:25:33 UTC
REVIEW: http://review.gluster.org/8616 (cluster/dht: Modified test case to note rename failures as errors) posted (#2) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 3 Anand Avati 2014-09-05 15:10:40 UTC
REVIEW: http://review.gluster.org/8616 (cluster/dht: Modified test case to note rename failures as errors) posted (#3) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 4 Anand Avati 2014-09-05 15:19:51 UTC
REVIEW: http://review.gluster.org/8616 (cluster/dht: Modified test case to note rename failures as errors) posted (#4) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 5 Anand Avati 2014-09-10 18:00:27 UTC
REVIEW: http://review.gluster.org/8616 (cluster/dht: Modified test case to note rename failures as errors) posted (#5) for review on release-3.6 by Shyamsundar Ranganathan (srangana)

Comment 6 Anand Avati 2014-09-16 17:11:55 UTC
REVIEW: http://review.gluster.org/8616 (cluster/dht: Modified test case to note rename failures as errors) posted (#6) for review on release-3.6 by Shyamsundar Ranganathan (srangana)