Bug 1032894 - spurious ENOENTs when using libgfapi
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Anand Avati
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-11-21 07:45 UTC by Anand Avati
Modified: 2015-09-01 23:06 UTC

Fixed In Version: glusterfs-3.6.0beta1
Clone Of:
Environment:
Last Closed: 2014-11-11 08:24:46 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Anand Avati 2013-11-21 07:45:26 UTC
app1: glfs_mkdir("/dir")
app1: glfs_create("/dir/file")

app2: glfs_unlink("/dir/file")
app2: glfs_rmdir("/dir")
app2: glfs_mkdir("/dir")
app2: glfs_create("/dir/file")

app1: glfs_lstat("/dir/file") => ENOENT

This happens because gfapi expects the underlying layers to return ESTALE, not ENOENT, when a non-existent GFID is referenced (in this case, the stale GFID of "/dir" used by the resolver).
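
To illustrate the mechanism, here is a toy Python model. It is not GlusterFS code: the Server and Client classes and all of their methods are hypothetical. It assumes that the resolver takes parent directories from a client-side cache, always sends the lookup of the last path component to the server, and retries the whole resolution in uncached mode only when it sees ESTALE.

```python
import errno
import uuid


class Server:
    """Toy namespace: every entry has a GFID, directories map names to GFIDs."""

    def __init__(self, missing_gfid_errno):
        # errno reported when a request names a parent GFID that no longer exists
        self.missing_gfid_errno = missing_gfid_errno
        self.root = "root"
        self.children = {self.root: {}}  # directory GFID -> {name: GFID}

    def create(self, parent, name, is_dir):
        gfid = uuid.uuid4().hex
        self.children[parent][name] = gfid
        if is_dir:
            self.children[gfid] = {}
        return gfid

    def remove(self, parent, name):
        gfid = self.children[parent].pop(name)
        self.children.pop(gfid, None)  # drops the directory table, if any

    def lookup(self, parent, name):
        if parent not in self.children:
            raise OSError(self.missing_gfid_errno, "parent GFID does not exist")
        if name not in self.children[parent]:
            raise OSError(errno.ENOENT, "no such entry")
        return self.children[parent][name]


class Client:
    """Toy resolver with a cache of (parent GFID, name) -> GFID."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def _walk(self, path, use_cache):
        gfid = self.server.root
        names = path.strip("/").split("/")
        for i, name in enumerate(names):
            key = (gfid, name)
            is_last = i == len(names) - 1
            if use_cache and not is_last and key in self.cache:
                gfid = self.cache[key]  # parent taken from the cache
            else:
                gfid = self.cache[key] = self.server.lookup(gfid, name)
        return gfid

    def lstat(self, path):
        try:
            return self._walk(path, use_cache=True)
        except OSError as e:
            if e.errno != errno.ESTALE:
                raise  # ENOENT is final: no retry
            return self._walk(path, use_cache=False)  # uncached retry


def scenario(missing_gfid_errno):
    """Replay the app1/app2 sequence; return True on success or the errno name."""
    server = Server(missing_gfid_errno)
    app1 = Client(server)
    old_dir = server.create(server.root, "dir", True)   # app1: mkdir
    server.create(old_dir, "file", False)               # app1: create
    app1.lstat("/dir/file")                             # app1 caches "/dir"
    server.remove(old_dir, "file")                      # app2: unlink
    server.remove(server.root, "dir")                   # app2: rmdir
    new_dir = server.create(server.root, "dir", True)   # app2: mkdir
    new_file = server.create(new_dir, "file", False)    # app2: create
    try:
        return app1.lstat("/dir/file") == new_file
    except OSError as e:
        return errno.errorcode[e.errno]


print(scenario(errno.ENOENT))  # server reports ENOENT for the stale GFID
print(scenario(errno.ESTALE))  # server reports ESTALE for the stale GFID
```

In this model the only difference between the two runs is the errno the server uses for an unknown parent GFID. With ENOENT the client gives up even though "/dir/file" exists; with ESTALE it re-resolves the path from the root and finds the new file.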

Comment 1 Anand Avati 2013-11-25 20:20:00 UTC
REVIEW: http://review.gluster.org/6318 (core: fix errno for non-existent GFID) posted (#10) for review on master by Anand Avati (avati)

Comment 2 Anand Avati 2013-11-25 20:27:16 UTC
REVIEW: http://review.gluster.org/6322 (core: fix errno for non-existent GFID) posted (#3) for review on release-3.4 by Anand Avati (avati)

Comment 3 Anand Avati 2013-11-26 18:29:32 UTC
COMMIT: http://review.gluster.org/6318 committed in master by Vijay Bellur (vbellur) 
------
commit d1879d04e39258ea25a49eed3244b395d4af2c1d
Author: Anand Avati <avati>
Date:   Thu Nov 21 06:48:17 2013 -0800

    core: fix errno for non-existent GFID
    
    When clients refer to a GFID which does not exist, the errno to
    be returned is ESTALE (and not ENOENT). Even though ENOENT might
    look "proper" most of the time, as the application eventually expects
    ENOENT even if a parent directory does not exist, not returning
    ESTALE causes resolvers (FUSE and GFAPI) to not retry resolution
    in uncached mode. This can result in spurious ENOENTs during
    concurrent path modification operations.
    
    Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936
    BUG: 1032894
    Signed-off-by: Anand Avati <avati>
    Reviewed-on: http://review.gluster.org/6318
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Amar Tumballi <amarts>
    Reviewed-by: Brian Foster <bfoster>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 4 Anand Avati 2013-11-26 19:49:12 UTC
COMMIT: http://review.gluster.org/6322 committed in release-3.4 by Anand Avati (avati) 
------
commit 837422858c2e4ab447879a4141361fd382645406
Author: Anand Avati <avati>
Date:   Thu Nov 21 06:48:17 2013 -0800

    core: fix errno for non-existent GFID
    
    When clients refer to a GFID which does not exist, the errno to
    be returned is ESTALE (and not ENOENT). Even though ENOENT might
    look "proper" most of the time, as the application eventually expects
    ENOENT even if a parent directory does not exist, not returning
    ESTALE causes resolvers (FUSE and GFAPI) to not retry resolution
    in uncached mode. This can result in spurious ENOENTs during
    concurrent path modification operations.
    
    Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936
    BUG: 1032894
    Signed-off-by: Anand Avati <avati>
    Reviewed-on: http://review.gluster.org/6322
    Tested-by: Gluster Build System <jenkins.com>

Comment 5 Anand Avati 2013-12-12 23:55:36 UTC
REVIEW: http://review.gluster.org/6496 (dht: handle ESTALE/ENOENT in dht_access) posted (#1) for review on master by Anand Avati (avati)

Comment 6 Anand Avati 2013-12-13 10:19:34 UTC
COMMIT: http://review.gluster.org/6496 committed in master by Vijay Bellur (vbellur) 
------
commit ea89a25b0b4e8796c421c32fb6dbc4661081f6e1
Author: Anand Avati <avati>
Date:   Thu Dec 12 15:43:28 2013 -0800

    dht: handle ESTALE/ENOENT in dht_access
    
    Had missed out dht_access in the previous round of cleanup.
    
    Change-Id: Ib255b9ad13ca62a8bc2eea225c46632aff8e820f
    BUG: 1032894
    Signed-off-by: Anand Avati <avati>
    Reviewed-on: http://review.gluster.org/6496
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Amar Tumballi <amarts>

Comment 7 Anand Avati 2013-12-23 17:26:14 UTC
REVIEW: http://review.gluster.org/6582 (cluster/dht: interim fix for reverting 837422858c) posted (#1) for review on release-3.4 by Vijay Bellur (vbellur)

Comment 8 Anand Avati 2013-12-24 07:46:51 UTC
REVIEW: http://review.gluster.org/6582 (cluster/dht: interim fix for reverting 837422858c) posted (#2) for review on release-3.4 by Vijay Bellur (vbellur)

Comment 9 Anand Avati 2013-12-24 09:53:56 UTC
COMMIT: http://review.gluster.org/6582 committed in release-3.4 by Vijay Bellur (vbellur) 
------
commit 92ad6c28936904ed2a43d254892a325bc2c695dc
Author: Vijay Bellur <vbellur>
Date:   Mon Dec 23 22:55:15 2013 +0530

    cluster/dht: interim fix for reverting 837422858c
    
    Change-Id: I74818a03f7c5d7891561515af2fa35ea3775255c
    BUG: 1032894
    Signed-off-by: Vijay Bellur <vbellur>
    Reviewed-on: http://review.gluster.org/6582
    Tested-by: Gluster Build System <jenkins.com>

Comment 10 Anand Avati 2013-12-26 06:42:26 UTC
REVIEW: http://review.gluster.org/6592 (cluster/afr: Remove stale index in self-heal codepath) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 11 Anand Avati 2013-12-26 06:48:04 UTC
REVIEW: http://review.gluster.org/6593 (cluster/afr: Remove stale index in self-heal codepath) posted (#1) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)

Comment 12 Anand Avati 2014-01-27 11:19:31 UTC
REVIEW: http://review.gluster.org/6593 (cluster/afr: Treat ESTALE on nameless lookup as ENOENT) posted (#2) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)

Comment 13 Anand Avati 2014-01-27 17:17:47 UTC
COMMIT: http://review.gluster.org/6593 committed in release-3.5 by Vijay Bellur (vbellur) 
------
commit cc1728766620e13ccfe2cd0b162cbc848b20e422
Author: Pranith Kumar K <pkarampu>
Date:   Thu Dec 26 11:31:49 2013 +0530

    cluster/afr: Treat ESTALE on nameless lookup as ENOENT
    
    Change-Id: I635fc0fa955b33590f1c5b4dfec22d591ea8575c
    BUG: 1032894
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/6593
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 14 Anand Avati 2014-05-05 05:07:33 UTC
REVIEW: http://review.gluster.org/6592 (cluster/afr: Remove stale index in self-heal codepath) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 15 Anand Avati 2014-05-08 18:25:24 UTC
COMMIT: http://review.gluster.org/6592 committed in master by Anand Avati (avati) 
------
commit 1b042296ddc65f5eab9d6e5f1e30e353413d9bbb
Author: Pranith Kumar K <pkarampu>
Date:   Mon May 5 09:18:35 2014 +0530

    cluster/afr: Remove stale index in self-heal codepath
    
    Change-Id: I635fc0fa955b33590f1c5b4dfec22d591ea8575c
    BUG: 1032894
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/6592
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

Comment 16 Anand Avati 2014-06-21 14:18:26 UTC
REVIEW: http://review.gluster.org/8142 (cluster/dht: handle ESTALE appropriately in rmdir codepath.) posted (#1) for review on master by Raghavendra G (rgowdapp)

Comment 17 Anand Avati 2014-06-23 09:41:23 UTC
REVIEW: http://review.gluster.org/8142 (cluster/dht: handle ESTALE appropriately in rmdir codepath.) posted (#2) for review on master by Raghavendra G (rgowdapp)

Comment 18 Anand Avati 2014-06-23 10:29:07 UTC
REVIEW: http://review.gluster.org/8142 (cluster/dht: handle ESTALE appropriately in rmdir codepath.) posted (#3) for review on master by Raghavendra G (rgowdapp)

Comment 19 Anand Avati 2014-06-23 10:30:13 UTC
REVIEW: http://review.gluster.org/8142 (cluster/dht: handle ESTALE appropriately in rmdir codepath.) posted (#4) for review on master by Raghavendra G (rgowdapp)

Comment 20 Anand Avati 2014-06-23 11:56:51 UTC
REVIEW: http://review.gluster.org/8142 (cluster/dht: handle ESTALE appropriately in rmdir codepath.) posted (#5) for review on master by Raghavendra G (rgowdapp)

Comment 21 Anand Avati 2014-06-23 12:56:47 UTC
COMMIT: http://review.gluster.org/8142 committed in master by Vijay Bellur (vbellur) 
------
commit 83fa1cfe185f05319a0048a63c8c163e4e632cf7
Author: Raghavendra G <rgowdapp>
Date:   Sat Jun 21 19:20:46 2014 +0530

    cluster/dht: handle ESTALE appropriately in rmdir codepath.
    
    Till we separated the scenario of a file/directory not existing from
    parent not existing [1], we used to include a subvolume in the layout
    of a directory even if it is not present on that subvolume. This was
    done to allow a lookup racing with mkdir to create correct layout.
    However, there are other scenarios as well where a directory is not
    present. One such situation is trying to create a directory after an
    add-brick. Since there is no guarantee that all the ancestors are
    created after an add-brick (and hence directory cannot be created), the
    newly added brick should not be part of the layout. However, we used to
    consider newly added brick as part of layout (even before we do
    fix-layout of all the ancestors) and this was the root cause of [2].
    With [1], this issue got fixed and hence [2] got fixed too. However,
    [1] is not complete in the sense we didn't modify rmdir codepath
    appropriately. This patch fixes that gap.
    
    [1] http://review.gluster.org/6322
    [2] https://bugzilla.redhat.com/show_bug.cgi?id=1006809
    
    Change-Id: I79ab96bb8abb6f3d90bb6e235a1c465e1be0fd19
    BUG: 1032894
    Signed-off-by: Raghavendra G <rgowdapp>
    Reviewed-on: http://review.gluster.org/8142
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>

Comment 22 Niels de Vos 2014-09-22 12:32:53 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether the release solves this bug report for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 23 Niels de Vos 2014-11-11 08:24:46 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

Comment 24 Oleksandr Natalenko 2015-02-18 07:57:24 UTC
Unfortunately, 3.6.1 still has this bug. Here is the pseudocode I used to reproduce it:

===
func thread_1()
{
  glfs_mkdir(path);
  res = glfs_creat(fs, path + "/somefile1", O_CREAT | O_WRONLY | O_TRUNC, chmod_644);
}

func thread_2()
{
  glfs_mkdir(path);
  res = glfs_creat(fs, path + "/somefile2", O_CREAT | O_WRONLY | O_TRUNC, chmod_644);
}

func main_thread()
{
  call_thread(thread_1());
  call_thread(thread_2());
}
===

Occasionally, glfs_creat() fails with ENOENT even though both glfs_mkdir() calls report success.
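
For comparison, the shape of this reproduction can be sketched as a runnable Python harness. This is only an illustration under stated assumptions: it uses Python's os module on a local filesystem as a stand-in for libgfapi, the names worker and run_once are hypothetical, and a directory that already exists is treated as a successful mkdir.

```python
import errno
import os
import tempfile
import threading


def worker(path, filename, errors):
    """mkdir the shared directory, then create one file inside it."""
    try:
        try:
            os.mkdir(path, 0o755)
        except FileExistsError:
            pass  # assumption: the other thread winning the mkdir counts as success
        fd = os.open(os.path.join(path, filename),
                     os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
        os.close(fd)
    except OSError as e:
        # Record the failing file and the symbolic errno name (e.g. "ENOENT")
        errors.append((filename, errno.errorcode.get(e.errno, e.errno)))


def run_once(base):
    """Run both workers concurrently against base/dir and return their errors."""
    path = os.path.join(base, "dir")
    errors = []
    threads = [
        threading.Thread(target=worker, args=(path, name, errors))
        for name in ("somefile1", "somefile2")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    # Repeat the race many times, each round in a fresh directory
    for _ in range(100):
        with tempfile.TemporaryDirectory() as base:
            failures = run_once(base)
            if failures:
                print("unexpected failures:", failures)
```

The expectation being tested is that once the directory exists, creating a file in it never fails with ENOENT. A harness against libgfapi would replace the os calls with glfs_mkdir() and glfs_creat() on a shared glfs_t handle.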

