Bug 1136821 - Open fails with ENOENT while renames/readdirs are in progress
Summary: Open fails with ENOENT while renames/readdirs are in progress
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: posix
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On: 1136159
Blocks: 1136622
 
Reported: 2014-09-03 11:52 UTC by Pranith Kumar K
Modified: 2014-11-11 08:37 UTC
CC List: 2 users

Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1136159
Environment:
Last Closed: 2014-11-11 08:37:37 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Pranith Kumar K 2014-09-03 11:52:47 UTC
+++ This bug was initially created as a clone of Bug #1136159 +++

Description of problem:
Executing renames/readdirp/cat in a loop can lead to opens failing with ENOENT.

Version-Release number of selected component (if applicable):


How reproducible:
Very

Steps to Reproduce:
1. Create a plain replicate volume and disable all performance xlators:
gluster volume set $1 performance.quick-read off
gluster volume set $1 performance.io-cache off
gluster volume set $1 performance.write-behind off
gluster volume set $1 performance.stat-prefetch off
gluster volume set $1 performance.read-ahead off

2. Mount the volume at two mount points using -o direct-io-mode=yes
3. On one mount, execute ls -lR
4. On the other mount, execute the following loop (a consolidated sketch of all four steps appears after it):
echo abc > abc-ln
while true; do ln abc-ln abc; mv abc-ln abc; echo 3 > /proc/sys/vm/drop_caches; cat abc; ln abc abc-ln; mv abc abc-ln; echo 3 > /proc/sys/vm/drop_caches; cat abc-ln; done
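
For reference, the steps above can be combined into a single reproducer sketch. The volume name, server hosts, brick paths, and mount points below are hypothetical placeholders, not values from the original report:

# Hypothetical names: VOLNAME, server1/server2, /bricks/b1, /bricks/b2, /mnt/a, /mnt/b
gluster volume create VOLNAME replica 2 server1:/bricks/b1 server2:/bricks/b2
gluster volume start VOLNAME
for opt in quick-read io-cache write-behind stat-prefetch read-ahead; do
    gluster volume set VOLNAME performance.$opt off
done
mount -t glusterfs -o direct-io-mode=yes server1:/VOLNAME /mnt/a
mount -t glusterfs -o direct-io-mode=yes server1:/VOLNAME /mnt/b

# Terminal 1: keep readdirp/lookup traffic going on the first mount
while true; do ls -lR /mnt/a > /dev/null; done

# Terminal 2: the rename/open loop from the report (run as root so drop_caches works)
cd /mnt/b
echo abc > abc-ln
while true; do
    ln abc-ln abc; mv abc-ln abc; echo 3 > /proc/sys/vm/drop_caches; cat abc
    ln abc abc-ln; mv abc abc-ln; echo 3 > /proc/sys/vm/drop_caches; cat abc-ln
done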

Actual results:
The brick logs print 'Not able to open file, No such file or directory' quite a few times, even though the file is always present.
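
The messages can be spotted in the brick logs, which by default live under /var/log/glusterfs/bricks/ on the brick nodes, for example:

grep "Not able to open file" /var/log/glusterfs/bricks/*.log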

Expected results:
Opens of files should not fail.

Additional info:

--- Additional comment from Anand Avati on 2014-09-02 00:29:40 EDT ---

REVIEW: http://review.gluster.org/8575 (storage/posix: Prefer gfid links for inode-handle) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2014-09-02 04:14:19 EDT ---

REVIEW: http://review.gluster.org/8575 (storage/posix: Prefer gfid links for inode-handle) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2014-09-02 08:10:23 EDT ---

COMMIT: http://review.gluster.org/8575 committed in master by Vijay Bellur (vbellur) 
------
commit 2c0a694b8d910c530899077c1d242ad1ea250965
Author: Pranith Kumar K <pkarampu>
Date:   Tue Sep 2 09:40:44 2014 +0530

    storage/posix: Prefer gfid links for inode-handle
    
    Problem:
    A file's path can be changed by other entry operations in flight, so if renames
    are in progress at the time of other operations like open, those operations may
    fail. We observed that this issue can also occur while renames and
    readdirps/lookups are in progress, because the dentry table sometimes goes stale.

    Fix:
    Prefer gfid-handles over paths for files. For directory handles, preferring
    gfid-handles hurts performance because paths have to be resolved by
    traversing up the symlinks.
    Tests which check whether files are opened should check the gfid path after this
    change, so a couple of tests were updated accordingly.

    Note:
    This patch doesn't fix the issue for directories. I think a complete fix is to
    come up with an entry-operation serialization xlator. Until then, let's live with
    this.
    
    Change-Id: I10bda1083036d013f3a12588db7a71039d9da6c3
    BUG: 1136159
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8575
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur>
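
The gfid link referred to above is the hard link that every regular file has at <brick>/.glusterfs/<aa>/<bb>/<gfid-as-uuid> on the brick backend, where aa and bb are the first two byte pairs of the gfid. Since that path depends only on the gfid and not on the file's current name, an open through it cannot be broken by a concurrent rename. A rough sketch for locating the link on a brick, using placeholder brick and file paths:

BRICK=/bricks/b1                       # placeholder brick path
F=$BRICK/abc                           # placeholder file on the brick
# Read the file's gfid as hex (requires root on the brick node)
GFID_HEX=$(getfattr -n trusted.gfid -e hex "$F" 2>/dev/null | awk -F'=0x' '/trusted.gfid/ {print $2}')
# Re-format the 32 hex characters as a UUID
GFID_UUID=$(echo "$GFID_HEX" | sed -E 's/(.{8})(.{4})(.{4})(.{4})(.{12})/\1-\2-\3-\4-\5/')
# Both paths should show the same inode number: they are hard links to each other
ls -li "$F" "$BRICK/.glusterfs/${GFID_HEX:0:2}/${GFID_HEX:2:2}/$GFID_UUID"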

Comment 1 Anand Avati 2014-09-03 12:03:40 UTC
REVIEW: http://review.gluster.org/8594 (storage/posix: Prefer gfid links for inode-handle) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)

Comment 2 Anand Avati 2014-09-12 09:55:55 UTC
COMMIT: http://review.gluster.org/8594 committed in release-3.6 by Vijay Bellur (vbellur) 
------
commit 444ffda19e2052b5fc78f7dc020de161ebee8563
Author: Pranith Kumar K <pkarampu>
Date:   Tue Sep 2 09:40:44 2014 +0530

    storage/posix: Prefer gfid links for inode-handle
    
            Backport of http://review.gluster.org/8575
    
    Problem:
    A file's path can be changed by other entry operations in flight, so if renames
    are in progress at the time of other operations like open, those operations may
    fail. We observed that this issue can also occur while renames and
    readdirps/lookups are in progress, because the dentry table sometimes goes stale.

    Fix:
    Prefer gfid-handles over paths for files. For directory handles, preferring
    gfid-handles hurts performance because paths have to be resolved by
    traversing up the symlinks.
    Tests which check whether files are opened should check the gfid path after this
    change, so a couple of tests were updated accordingly.

    Note:
    This patch doesn't fix the issue for directories. I think a complete fix is to
    come up with an entry-operation serialization xlator. Until then, let's live with
    this.
    
    BUG: 1136821
    Change-Id: If93e46d542a4e96a81a0639b5210330f7dbe8be0
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8594
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>

Comment 3 Niels de Vos 2014-09-22 12:45:15 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether this release resolves this bug for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment on this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 4 Anand Avati 2014-10-31 09:36:53 UTC
REVIEW: http://review.gluster.org/9030 (cluster/afr: Perform post-op in entry selfheal inside locks) posted (#1) for review on release-3.6 by Krutika Dhananjay (kdhananj)

Comment 5 Anand Avati 2014-10-31 13:09:32 UTC
COMMIT: http://review.gluster.org/9030 committed in release-3.6 by Vijay Bellur (vbellur) 
------
commit e80bfe76b89ae3f40b3258a4ac388f18a0b53034
Author: Krutika Dhananjay <kdhananj>
Date:   Fri Oct 31 12:51:15 2014 +0530

    cluster/afr: Perform post-op in entry selfheal inside locks
    
            Backport of: http://review.gluster.org/#/c/9020
    
    Take entrylks in the xlator domain before doing the post-op (undo-pending) in
    entry self-heal. This prevents a parallel name self-heal on an entry under
    @fd->inode from reading pending xattrs while they are being modified by the SHD
    after the entry self-heal below, given that name self-heal takes locks ONLY in
    the xlator domain and is free to read the pending changelog in the absence of
    this locking.
    
    Change-Id: I0bc92978efc0741d6e3f2439540d008e31472313
    BUG: 1136821
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/9030
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>
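
If needed, the entry locks discussed above can be inspected at runtime through a brick statedump; the dump location and exact format vary by version, but something along these lines works (VOLNAME is a placeholder):

gluster volume statedump VOLNAME
# Statedump files are written on each brick node, by default under /var/run/gluster/
grep -A3 "entrylk" /var/run/gluster/*.dump.*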

Comment 6 Niels de Vos 2014-11-11 08:37:37 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

