Bug 1529089 - opening a file that is destination of rename results in ENOENT errors
Summary: opening a file that is destination of rename results in ENOENT errors
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: fuse
Version: 3.10
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1500269
Blocks: 1492591 1529088
TreeView+ depends on / blocked
 
Reported: 2017-12-26 10:03 UTC by Raghavendra G
Modified: 2018-01-08 16:03 UTC (History)
1 user (show)

Fixed In Version: glusterfs-3.10.9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1500269
Environment:
Last Closed: 2018-01-08 16:03:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)

Description Raghavendra G 2017-12-26 10:03:20 UTC
+++ This bug was initially created as a clone of Bug #1500269 +++

Description of problem:

Consider two programs:

rename:
=======
while true; do touch newfile; mv newfile file; done

open:
=====
keeps opening an fd on "file"

If I run these two programs in parallel (after making sure rename results in "file"), open on existing file "file" fails with ENOENT. Since "file" always existed and rename is atomic open should not fail with ENOENT.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Raghavendra G on 2017-10-10 06:00 EDT ---



--- Additional comment from Raghavendra G on 2017-10-10 06:00 EDT ---



--- Additional comment from Raghavendra G on 2017-10-10 06:04:15 EDT ---

Consider rename (index.new, store.idx) and open (store.idx) being executed in parallel. When we break down operations following sequence is possible.

* lookup (store.idx) - as part of open(store.idx) returns gfid1 as the result.
* rename (index.new, store.idx) changes gfid of store.idx to gfid2. Note that gfid2 was the nodeid of index.new. Since rename is successful, gfid2 is associated with store.idx.
* open (store.idx) resumes and issues open fop to glusterfs with gfid1. open in glusterfs fails as gfid1 doesn't exist and the error returned by glusterfs to kernel-fuse is ENOENT.
* kernel passes back the same error to application as a result to open.

This error could've been prevented if kernel retries open with gfid2. Interestingly kernel do retry open when it receives ESTALE error.

From kernel source code fs/namei.c:

<do_filep_open>

        filp = path_openat(&nd, op, flags | LOOKUP_RCU);
        if (unlikely(filp == ERR_PTR(-ECHILD)))
                filp = path_openat(&nd, op, flags);

        if (unlikely(filp == ERR_PTR(-ESTALE)))           <==== NOTE HERE
                filp = path_openat(&nd, op, flags | LOOKUP_REVAL);

</do_filep_open>

However that this hypothesis is _VALID ONLY IF_ kernel doesn't maintain the atomicity  of (lookup + open) in do_sys_open. If it is not maintaining atomicity, but instead relying on ESTALE error, this could be solved by passing back ESTALE as the error when we find that gfid is missing.

--- Additional comment from Raghavendra G on 2017-10-10 06:09:36 EDT ---

To verify the hypothesis I made following fixes

* reverted [1]
* converted ENOENT error to ESTALE in fuse_fd_cbk as open is an operation on gfid and correct error for a missing gfid is ESTALE. Note that some cases brick can return ENOENT for missing gfid. This can happen if resolver in protocol/server successfully resolves gfid, but the fop (open in this example) fails in storage/posix due to gfid-handle missing in .glusterfs

On applying these two fixes and rerunning the above test - open and rename.sh in infinite loop in parallel on the same mount - I was not able to hit the issue.

However if I run open and rename (c program), the issue is reproducible even with fixes. The explanation I could come up with is that,
* Kernel retries only once on getting an ESTALE error during open (as can be seen in do_filep_open code in kernel VFS)
* Binary "rename" generated by C program is fast enough that it could do more than one rename in the window of single "open".

This explanation is bit counter intuitive given that the program rename has two syscalls and the program open has one syscall. But, that's the best explanation I can come up with for now.

[1] http://review.gluster.org/13816

--- Additional comment from Worker Ant on 2017-10-10 06:26:08 EDT ---

REVIEW: https://review.gluster.org/18463 (Revert "mount/fuse: report ESTALE as ENOENT") posted (#1) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Worker Ant on 2017-10-13 10:33:20 EDT ---

REVIEW: https://review.gluster.org/18463 (Revert "mount/fuse: report ESTALE as ENOENT") posted (#2) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Worker Ant on 2017-10-13 10:33:23 EDT ---

REVIEW: https://review.gluster.org/18521 (mount/fuse: never fail open(dir) with ENOENT) posted (#1) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Worker Ant on 2017-10-17 01:14:45 EDT ---

COMMIT: https://review.gluster.org/18463 committed in master by Raghavendra G (rgowdapp) 
------
commit 019a55e708375d2b1e576fcc948a691bcdc5c749
Author: Raghavendra G <rgowdapp>
Date:   Tue Oct 10 11:29:04 2017 +0530

    Revert "mount/fuse: report ESTALE as ENOENT"
    
    This reverts commit 26d16b90ec7f8acbe07e56e8fe1baf9c9fa1519e.
    
    Consider rename (index.new, store.idx) and open (store.idx) being
    executed in parallel. When we break down operations following sequence
    is possible.
    
    * lookup (store.idx) - as part of open(store.idx) returns gfid1 as the
      result.
    * rename (index.new, store.idx) changes gfid of store.idx to
      gfid2. Note that gfid2 was the nodeid of index.new. Since rename is
      successful, gfid2 is associated with store.idx.
    * open (store.idx) resumes and issues open fop to glusterfs with
      gfid1. open in glusterfs fails as gfid1 doesn't exist and the error
      returned by glusterfs to kernel-fuse is ENOENT.
    * kernel passes back the same error to application as a result to
      open.
    
    This error could've been prevented if kernel retries open with
    gfid2. Interestingly kernel do retry open when it receives ESTALE
    error. Even though failure to find gfid resulted in ESTALE error,
    commit 26d16b90ec7f8acb converted that error to ENOENT while sending
    an error reply to kernel. This prevented kernel from retrying open
    resulting in error.
    
    Change-Id: I2e752ca60dd8af1b989dd1d29c7b002ee58440b4
    BUG: 1500269
    Signed-off-by: Raghavendra G <rgowdapp>

--- Additional comment from Worker Ant on 2017-10-17 01:15:36 EDT ---

REVIEW: https://review.gluster.org/18521 (mount/fuse: never fail open(dir) with ENOENT) posted (#2) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Worker Ant on 2017-10-17 01:31:47 EDT ---

COMMIT: https://review.gluster.org/18521 committed in master by Raghavendra G (rgowdapp) 
------
commit fb4b914ce84bc83a5f418719c5ba7c25689a9251
Author: Raghavendra G <rgowdapp>
Date:   Fri Oct 13 20:00:47 2017 +0530

    mount/fuse: never fail open(dir) with ENOENT
    
    open(dir) being an operation on inode should never fail with
    ENOENT. If gfid is not present, the appropriate error is ESTALE. This
    will enable kernel to retry open after a revalidate lookup.
    
    Change-Id: I8d07d2ebb5a0da6c3ea478317442cb42f1797a4b
    BUG: 1500269
    Signed-off-by: Raghavendra G <rgowdapp>

--- Additional comment from Shyamsundar on 2017-12-08 12:43:00 EST ---

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 1 Worker Ant 2017-12-26 10:08:20 UTC
REVIEW: https://review.gluster.org/19088 (Revert \"mount/fuse: report ESTALE as ENOENT\") posted (#1) for review on release-3.10 by Raghavendra G

Comment 2 Worker Ant 2017-12-26 10:10:03 UTC
REVIEW: https://review.gluster.org/19089 (mount/fuse: never fail open(dir) with ENOENT) posted (#1) for review on release-3.10 by Raghavendra G

Comment 3 Worker Ant 2018-01-02 16:21:03 UTC
COMMIT: https://review.gluster.org/19089 committed in release-3.10 by \"Raghavendra G\" <rgowdapp> with a commit message- mount/fuse: never fail open(dir) with ENOENT

open(dir) being an operation on inode should never fail with
ENOENT. If gfid is not present, the appropriate error is ESTALE. This
will enable kernel to retry open after a revalidate lookup.

>Change-Id: I8d07d2ebb5a0da6c3ea478317442cb42f1797a4b
>BUG: 1500269
>Signed-off-by: Raghavendra G <rgowdapp>

(cherry picked from commit fb4b914ce84bc83a5f418719c5ba7c25689a9251)
Change-Id: I8d07d2ebb5a0da6c3ea478317442cb42f1797a4b
BUG: 1529089
Signed-off-by: Raghavendra G <rgowdapp>

Comment 4 Worker Ant 2018-01-02 16:21:57 UTC
COMMIT: https://review.gluster.org/19088 committed in release-3.10 by \"Raghavendra G\" <rgowdapp> with a commit message- Revert "mount/fuse: report ESTALE as ENOENT"

This reverts commit 26d16b90ec7f8acbe07e56e8fe1baf9c9fa1519e.

Consider rename (index.new, store.idx) and open (store.idx) being
executed in parallel. When we break down operations following sequence
is possible.

* lookup (store.idx) - as part of open(store.idx) returns gfid1 as the
  result.
* rename (index.new, store.idx) changes gfid of store.idx to
  gfid2. Note that gfid2 was the nodeid of index.new. Since rename is
  successful, gfid2 is associated with store.idx.
* open (store.idx) resumes and issues open fop to glusterfs with
  gfid1. open in glusterfs fails as gfid1 doesn't exist and the error
  returned by glusterfs to kernel-fuse is ENOENT.
* kernel passes back the same error to application as a result to
  open.

This error could've been prevented if kernel retries open with
gfid2. Interestingly kernel do retry open when it receives ESTALE
error. Even though failure to find gfid resulted in ESTALE error,
commit 26d16b90ec7f8acb converted that error to ENOENT while sending
an error reply to kernel. This prevented kernel from retrying open
resulting in error.

>Change-Id: I2e752ca60dd8af1b989dd1d29c7b002ee58440b4
>BUG: 1500269
>Signed-off-by: Raghavendra G <rgowdapp>

(cherry picked from commit 019a55e708375d2b1e576fcc948a691bcdc5c749)
Change-Id: I2e752ca60dd8af1b989dd1d29c7b002ee58440b4
BUG: 1529089
Signed-off-by: Raghavendra G <rgowdapp>

Comment 5 Shyamsundar 2018-01-08 16:03:38 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.9, please open a new bug report.

glusterfs-3.10.9 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-January/000088.html
[2] https://www.gluster.org/pipermail/gluster-users/


Note You need to log in before you can comment on or make changes to this bug.