Bug 1304962 - Intermittent file creation failure while doing concurrent writes on a distributed volume that has more than 40 bricks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: posix
Version: mainline
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact:
URL:
Whiteboard:
Depends On: 1301474
Blocks:
Reported: 2016-02-05 05:47 UTC by Sakshi
Modified: 2018-06-20 17:56 UTC

Fixed In Version: glusterfs-v4.1.0
Clone Of: 1301474
Environment:
Last Closed: 2018-06-20 17:56:51 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Sakshi 2016-02-05 05:47:51 UTC
+++ This bug was initially created as a clone of Bug #1301474 +++

Description of problem:
File creation fails intermittently on a distributed volume when two clients write to it concurrently.


How reproducible:

Not always; intermittent.


Steps to Reproduce:

1. Create a distributed volume with 60 bricks, or a similarly high number of bricks, on different nodes.

2. Mount the volume on two different clients.

3. Create directories and files simultaneously from both clients.

Actual results:

For some files, creation fails on one of the clients (named server1 and server2 in the output below). At times it reports "No such file or directory" for the file, as shown below.

[root@server2 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/1': File exists
mkdir: cannot create directory `/mnt/test/2': File exists
mkdir: cannot create directory `/mnt/test/5': File exists
mkdir: cannot create directory `/mnt/test/6': File exists
touch: cannot touch `server2.6': No such file or directory
mkdir: cannot create directory `/mnt/test/7': File exists
mkdir: cannot create directory `/mnt/test/9': File exists

[root@server1 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/3': File exists
mkdir: cannot create directory `/mnt/test/4': File exists
mkdir: cannot create directory `/mnt/test/8': File exists


At other times it fails with "Stale file handle":

[root@server2 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/4': File exists
touch: cannot touch `server2.4': Stale file handle
mkdir: cannot create directory `/mnt/test/6': File exists
mkdir: cannot create directory `/mnt/test/7': File exists
mkdir: cannot create directory `/mnt/test/8': File exists
mkdir: cannot create directory `/mnt/test/9': File exists



[root@server1 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/1': File exists
mkdir: cannot create directory `/mnt/test/2': File exists
mkdir: cannot create directory `/mnt/test/3': File exists
mkdir: cannot create directory `/mnt/test/5': File exists


Expected results:

The file creation should work without any issue. 



The "No such file or directory" failure looks like a case of negative dentry caching by the kernel. One workaround is to set the entry timeout to 0 when mounting:

mount -t glusterfs -o entry-timeout=0 <host>:/<volname> <mount_point>
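
To illustrate why a zero entry timeout could help, here is a toy model of negative caching. It is not the kernel or FUSE implementation; the class NegativeEntryCache, the shared set standing in for the volume, and the 1.0 second timeout are all made up for the illustration.

```python
class NegativeEntryCache:
    """Toy client-side cache that remembers failed lookups for a while."""

    def __init__(self, backend, entry_timeout):
        self.backend = backend              # shared set of names on the volume
        self.entry_timeout = entry_timeout  # seconds a negative result stays valid
        self.negative = {}                  # name -> time the miss was cached

    def lookup(self, name, now):
        cached_at = self.negative.get(name)
        if cached_at is not None and now - cached_at < self.entry_timeout:
            return False  # answered from the cache; the backend is not asked
        if name in self.backend:
            self.negative.pop(name, None)
            return True
        self.negative[name] = now
        return False


volume = set()  # state shared by both clients
cached = NegativeEntryCache(volume, entry_timeout=1.0)
uncached = NegativeEntryCache(volume, entry_timeout=0)

cached.lookup("6", now=0.0)    # client 2 looks up "6" before it exists
uncached.lookup("6", now=0.0)
volume.add("6")                # client 1 creates "6"

print(cached.lookup("6", now=0.5))    # False: stale negative entry still valid
print(uncached.lookup("6", now=0.5))  # True: timeout 0 always asks the backend
print(cached.lookup("6", now=2.0))    # True: the negative entry has expired
```

In this model the client with a non-zero timeout keeps answering "does not exist" for a name another client has just created, which matches the symptom of touch failing right after mkdir reported "File exists".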

The ESTALE error is due to a race between mkdir and lookup. Consider the following scenario:

Client 1 issues mkdir <dir> and has just created the entry on, say, subvol1, but has not yet set the gfid. Meanwhile, client 2 tries to create the same directory on subvol1, which fails with EEXIST because the directory entry already exists. When client 2 then tries to take a lock on the subvol to perform setxattr, it needs the gfid, which has not been set yet. The inodelk therefore fails with ESTALE because the gfid cannot be found, and mkdir fails with ESTALE.
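
The interleaving in this scenario can be replayed deterministically with a small model. This is only a sketch of the sequence described here, not GlusterFS code: the Brick class, its method names and the gfid strings are hypothetical.

```python
import errno


class Brick:
    """Toy brick where mkdir is two separate steps: create entry, then set gfid."""

    def __init__(self):
        self.entries = {}  # directory name -> gfid, or None while gfid is unset

    def create_entry(self, name):
        if name in self.entries:
            raise OSError(errno.EEXIST, "File exists")
        self.entries[name] = None

    def set_gfid(self, name, gfid):
        self.entries[name] = gfid

    def inodelk(self, name):
        # The lock is keyed by gfid, so it cannot be taken before the gfid exists.
        if self.entries.get(name) is None:
            raise OSError(errno.ESTALE, "Stale file handle")
        return self.entries[name]


def second_mkdir(brick, name):
    """What client 2 does: mkdir, and on EEXIST take the lock for setxattr."""
    try:
        brick.create_entry(name)
    except OSError as exc:
        if exc.errno != errno.EEXIST:
            raise
    else:
        brick.set_gfid(name, "gfid-client2")
        return "created"
    try:
        brick.inodelk(name)
        return "ok"
    except OSError as exc:
        return errno.errorcode[exc.errno]


brick = Brick()
brick.create_entry("4")             # client 1: entry created, gfid not yet set
racing = second_mkdir(brick, "4")   # client 2 arrives inside the window
brick.set_gfid("4", "gfid-0001")    # client 1: gfid is finally set
late = second_mkdir(brick, "4")     # client 2 arriving after the window
print(racing, late)                 # ESTALE ok
```

The failure only appears when client 2 lands between the two steps of client 1's mkdir, which fits the intermittent nature of the bug.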

Comment 1 Vijay Bellur 2016-02-05 06:20:36 UTC
REVIEW: http://review.gluster.org/13362 (posix: create gfid handle before creating the directory) posted (#1) for review on master by Sakshi Bansal

Comment 2 Vijay Bellur 2016-02-08 07:08:05 UTC
REVIEW: http://review.gluster.org/13362 (posix: create gfid handle before creating the entry) posted (#2) for review on master by Sakshi Bansal

Comment 3 Vijay Bellur 2016-02-16 06:45:30 UTC
REVIEW: http://review.gluster.org/13451 (serial-fop: added a server side xlator for fop serialization) posted (#1) for review on master by Sakshi Bansal

Comment 4 Vijay Bellur 2016-02-16 16:20:18 UTC
REVIEW: http://review.gluster.org/13451 (serial-fop: added a server side xlator for fop serialization) posted (#2) for review on master by Sakshi Bansal

Comment 5 Vijay Bellur 2016-02-17 09:08:32 UTC
REVIEW: http://review.gluster.org/13451 (serial-fop: added a server side xlator for fop serialization) posted (#3) for review on master by Sakshi Bansal

Comment 6 Vijay Bellur 2016-02-17 11:52:12 UTC
REVIEW: http://review.gluster.org/13451 (serial-fop: added a server side xlator for fop serialization) posted (#4) for review on master by Sakshi Bansal

Comment 7 Vijay Bellur 2016-03-01 04:57:34 UTC
REVIEW: http://review.gluster.org/13451 (serial-fop: added a server side xlator for fop serialization) posted (#5) for review on master by Sakshi Bansal

Comment 8 Vijay Bellur 2016-03-07 04:49:43 UTC
REVIEW: http://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#6) for review on master by Sakshi Bansal

Comment 9 Mike McCune 2016-03-28 23:31:34 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Comment 10 Vijay Bellur 2016-04-18 15:25:30 UTC
REVIEW: http://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#7) for review on master by Sakshi Bansal

Comment 11 Vijay Bellur 2016-07-29 07:19:51 UTC
REVIEW: http://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#8) for review on master by Sakshi Bansal

Comment 14 Worker Ant 2017-08-22 03:38:46 UTC
REVIEW: https://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#11) for review on master by Raghavendra G (rgowdapp)

Comment 15 Worker Ant 2017-08-22 03:42:05 UTC
REVIEW: https://review.gluster.org/18081 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#1) for review on master by Raghavendra G (rgowdapp)

Comment 16 Worker Ant 2017-08-22 05:29:17 UTC
REVIEW: https://review.gluster.org/18082 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#1) for review on experimental by Raghavendra G (rgowdapp)

Comment 17 Worker Ant 2017-08-22 06:15:21 UTC
REVIEW: https://review.gluster.org/18082 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#2) for review on experimental by Raghavendra G (rgowdapp)

Comment 18 Worker Ant 2017-08-22 08:21:59 UTC
COMMIT: https://review.gluster.org/18082 committed in experimental by Amar Tumballi (amarts) 
------
commit f24055c6f0cb1695fbe04cf1c322bd12e8ca1d42
Author: Sakshi Bansal <sabansal>
Date:   Tue Feb 16 11:21:29 2016 +0530

    dentry fop serializer: added new server side xlator for dentry fop serialization
    
    Fops like mkdir/create, lookup, rename, unlink, link that
    happen on a particular dentry must be serialized to ensure
    atomicity. This helps prevent race between parallel mkdir,
    mkdir and lookup etc.
    
    To achieve this, all fops on a dentry must take entry locks
    before they proceed. Once they have acquired the locks, they
    perform the fop and then release the lock.
    
    Some documentation from email conversation:
    [1] http://www.gluster.org/pipermail/gluster-devel/2015-December/047314.html
    
    [2] http://www.gluster.org/pipermail/gluster-devel/2015-August/046428.html
    
    Change-Id: I6e80ba3cabfa6facd5dda63bd482b9bf18b6b79b
    BUG: 1304962
    Signed-off-by: Sakshi Bansal <sabansal>
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: https://review.gluster.org/18082
    Tested-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Amar Tumballi <amarts>

Comment 19 Worker Ant 2017-10-16 05:47:57 UTC
REVIEW: https://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#12) for review on master by Amar Tumballi (amarts)

Comment 20 Worker Ant 2017-10-16 06:49:22 UTC
REVIEW: https://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#13) for review on master by Amar Tumballi (amarts)

Comment 21 Worker Ant 2017-10-18 13:58:06 UTC
REVIEW: https://review.gluster.org/13451 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#14) for review on master by Amar Tumballi (amarts)

Comment 22 Worker Ant 2017-11-22 10:23:14 UTC
REVIEW: https://review.gluster.org/18839 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#1) for review on experimental by Amar Tumballi

Comment 23 Worker Ant 2017-11-22 10:43:08 UTC
COMMIT: https://review.gluster.org/18839 committed in experimental by "Amar Tumballi" <amarts> with a commit message- dentry fop serializer: added new server side xlator for dentry fop serialization

Fops like mkdir/create, lookup, rename, unlink, link that
happen on a particular dentry must be serialized to ensure
atomicity. This helps prevent race between parallel mkdir,
mkdir and lookup etc.

To achieve this, all fops on a dentry must take entry locks
before they proceed. Once they have acquired the locks, they
perform the fop and then release the lock.

Some documentation from email conversation:
[1] http://www.gluster.org/pipermail/gluster-devel/2015-December/047314.html

[2] http://www.gluster.org/pipermail/gluster-devel/2015-August/046428.html

With this patch, the feature is optional, enable it by running:

 `gluster volume set $volname features.sdfs enable`

Also, the feature has been tested for a month in the
experimental branch across all regression tests without issues.

> Original Change-Id: I6e80ba3cabfa6facd5dda63bd482b9bf18b6b79b

BUG: 1304962
Change-Id: I70b5c37899a3d7d693a09133f026057ea90c209a
Signed-off-by: Sakshi Bansal <sabansal>
Signed-off-by: Amar Tumballi <amarts>

Comment 24 Worker Ant 2018-01-24 05:40:16 UTC
COMMIT: https://review.gluster.org/13451 committed in master by  with a commit message- dentry fop serializer: added new server side xlator for dentry fop serialization

Problems addressed by this xlator :

[1]. To prevent races between parallel mkdirs, between mkdir and lookup, etc.

Fops like mkdir/create, lookup, rename, unlink, link that happen on a
particular dentry must be serialized to ensure atomicity.

Another possible case is a fresh lookup that checks the existence of a path
whose gfid is not set yet. Further, storage/posix employs a ctime-based
heuristic, 'is_fresh_file' (ctime within 1 second of the current time),
to check the freshness of a file. By serializing these two fops
(lookup & mkdir), we eliminate the race altogether.

[2]. Staleness of dentries

This causes an exponential increase in traversal time for any inode in the
subtree of the directory pointed to by the stale dentry.

Cause: A stale dentry is created because of the following two operations:

      a. dentry creation due to inode_link, done during operations like
         lookup, mkdir, create, mknod and symlink, and
      b. dentry unlinking due to various operations like rmdir, rename
         and unlink.

      The reason is that __inode_link uses __is_dentry_cyclic, which explores
      all possible paths to avoid forming a cyclic link during inode
      linkage. __is_dentry_cyclic explores stale dentries and all their
      ancestors, which increases traversal time exponentially.

Implementation: To achieve this, all fops on a dentry must take entry locks
before they proceed. Once they have acquired the locks, they perform the fop
and then release the lock.

Some documentation from email conversation:
[1] http://www.gluster.org/pipermail/gluster-devel/2015-December/047314.html

[2] http://www.gluster.org/pipermail/gluster-devel/2015-August/046428.html

With this patch, the feature is optional, enable it by running:

 `gluster volume set $volname features.sdfs enable`

Also, the feature has been tested for a month in the
experimental branch across all regression tests without issues.

Change-Id: I6e80ba3cabfa6facd5dda63bd482b9bf18b6b79b
Fixes: #397
BUG: 1304962
Signed-off-by: Sakshi Bansal <sabansal>
Signed-off-by: Amar Tumballi <amarts>
Signed-off-by: Sunny Kumar <sunkumar>
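
For readers unfamiliar with the approach, the "take an entry lock, perform the fop, release the lock" idea from the commit message above can be sketched as follows. This is an illustrative model using Python threads, not the sdfs xlator's code; DentrySerializer, the entries dict and the gfid strings are hypothetical names.

```python
import threading
from collections import defaultdict


class DentrySerializer:
    """Toy serializer: one lock per (parent, basename), held for the whole fop."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # (parent, basename) -> lock

    def run(self, parent, basename, fop):
        with self._guard:
            entry_lock = self._locks[(parent, basename)]
        with entry_lock:  # acquire, perform the fop, release
            return fop()


entries = {}  # (parent, basename) -> gfid
serializer = DentrySerializer()
results = []


def mkdir(parent, basename, gfid):
    def fop():
        key = (parent, basename)
        if key in entries:
            return "EEXIST"
        entries[key] = None  # step 1: create the entry
        entries[key] = gfid  # step 2: set the gfid, still under the lock
        return "created"
    results.append(serializer.run(parent, basename, fop))


threads = [threading.Thread(target=mkdir, args=("/test", "4", "gfid-%d" % i))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # exactly one "created", the rest "EEXIST"
```

Because both steps of mkdir run under the same per-dentry lock, no other fop on that dentry can observe an entry without a gfid. The sketch never prunes its lock table, which a real implementation would need to handle.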

Comment 25 Worker Ant 2018-01-26 11:37:33 UTC
REVIEW: https://review.gluster.org/19340 (dentry fop serializer: added new server side xlator for dentry fop serialization) posted (#1) for review on release-4.0 by Raghavendra G

Comment 26 Shyamsundar 2018-06-20 17:56:51 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/

