Bug 1467010 - Fd based fops fail with EBADF on file migration
Summary: Fd based fops fail with EBADF on file migration
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.10
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On: 1465075
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-07-01 15:39 UTC by Raghavendra Talur
Modified: 2018-06-20 18:26 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1465075
Environment:
Last Closed: 2018-06-20 18:26:09 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)

Description Raghavendra Talur 2017-07-01 15:39:14 UTC
+++ This bug was initially created as a clone of Bug #1465075 +++

Description of problem:

In DHT, there exists a scenario where fd based fops may be sent on the dst subvolume after the file has been migrated but before the fd has been opened on it. This is because certain operations update the cached subvol in the dht inode ctx without checking to see if an fd has been opened on it on the original subvol. Dht fd based fops currently rely on a phase1/phase2 migration checks to open fds on the dst subvol. However, no such check is made causing the fop to fail with EBADF.

This is seen with dist-rep volumes.



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

1. Create a 2x2 volume
2. Create a file FILE1. Assume it is created on subvol1. Rename it to NFILE1 so it hashes to subvol2.
3. Open an fd on NFILE1 (on subvol1).
2. Perform a rebalance so the file is migrated to subvol2.                                           
3. On the same mount point, perform a lookup/readdirp so the cached subvol in the inode_ctx of NFILE1 is updated to subvol2.          
4. Perform a write on the fd.
                                                  
The write is sent to subvol2 on an fd which has been opened only on subvol1.         Since the migration phase checks don't kick in, the fd is not opened on subvol2 and the fop fails with EBADF.

Actual results:


Expected results:


Additional info:

This is being fixed by having every fd based fop check if the fd has been opened on the cached subvol before winding the fop down.

--- Additional comment from Worker Ant on 2017-06-26 21:21:07 IST ---

REVIEW: https://review.gluster.org/17630 (cluster/dht: Check if fd is opened on dst subvol) posted (#1) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-06-27 16:56:10 IST ---

REVIEW: https://review.gluster.org/17630 (cluster/dht: Check if fd is opened on dst subvol) posted (#2) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-06-27 21:17:07 IST ---

REVIEW: https://review.gluster.org/17630 (cluster/dht: Check if fd is opened on dst subvol) posted (#3) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-06-27 22:21:29 IST ---

REVIEW: https://review.gluster.org/17630 (cluster/dht: Check if fd is opened on dst subvol) posted (#4) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-06-28 09:55:55 IST ---

REVIEW: https://review.gluster.org/17630 (cluster/dht: Check if fd is opened on dst subvol) posted (#5) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-06-28 12:39:48 IST ---

REVIEW: https://review.gluster.org/17630 (cluster/dht: Check if fd is opened on dst subvol) posted (#6) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-06-28 17:12:25 IST ---

COMMIT: https://review.gluster.org/17630 committed in master by Raghavendra G (rgowdapp) 
------
commit 91db0d47ca267aecfc6124a3f337a4e2f2c9f1e2
Author: N Balachandran <nbalacha>
Date:   Mon Jun 26 21:12:56 2017 +0530

    cluster/dht: Check if fd is opened on dst subvol
    
    If an fd is opened on a file, the file is migrated
    and the cached subvol is updated in the inode_ctx
    before an fd based fop is sent, the fop is sent to
    the dst subvol on which the fd is not opened.
    This causes the FOP to fail with EBADF.
    
    Now, every fd based fop will check to see that the fd
    has been opened on the dst subvol before winding it down.
    
    Change-Id: Id92ef5eb7a5b5226688e2d2868b15e383f5f240e
    BUG: 1465075
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17630
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Susant Palai <spalai>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 1 Worker Ant 2017-07-01 15:42:40 UTC
REVIEW: https://review.gluster.org/17632 (Revert "Revert "glusterd: disallow rebalance & remove-brick on a sharded volume"") posted (#2) for review on release-3.10 by Raghavendra Talur (rtalur)

Comment 2 Worker Ant 2017-07-04 14:39:33 UTC
COMMIT: https://review.gluster.org/17632 committed in release-3.10 by Raghavendra Talur (rtalur) 
------
commit 5d21aaf97618f6e220fc0c028eba6b14c7439499
Author: Shyamsundar Ranganathan <srangana>
Date:   Mon Jun 26 16:26:11 2017 +0000

    Revert "Revert "glusterd: disallow rebalance & remove-brick on a sharded volume""
    
    This is being reverted as a new bug around rebalance has been uncovered. As a result we would like to retain the warning in the code and in the release-notes.
    
    The new bug being, https://bugzilla.redhat.com/show_bug.cgi?id=1465075
    
    A similar revert is being tracked for 3.11 here, https://review.gluster.org/17631 based on votes to both, we may want to consider this for 3.10 as well.
    
    This reverts commit abaf577626650edb4b9dfdddd43ba04a2a8e8ef3.
    
    BUG: 1467010
    Change-Id: Iecd0357c44e41e2b421222e8f98fe8300513f963
    Reviewed-on: https://review.gluster.org/17632
    Reviewed-by: Raghavendra Talur <rtalur>
    Tested-by: Raghavendra Talur <rtalur>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Worker Ant 2017-07-12 07:17:42 UTC
REVIEW: https://review.gluster.org/17752 (cluster/dht: Check if fd is opened on dst subvol) posted (#1) for review on release-3.10 by N Balachandran (nbalacha)

Comment 4 Worker Ant 2017-07-19 21:34:19 UTC
COMMIT: https://review.gluster.org/17752 committed in release-3.10 by Raghavendra Talur (rtalur) 
------
commit babde83eb1e2bb0fa072a2eb8de962c777616875
Author: N Balachandran <nbalacha>
Date:   Mon Jun 26 21:12:56 2017 +0530

    cluster/dht: Check if fd is opened on dst subvol
    
    If an fd is opened on a file, the file is migrated
    and the cached subvol is updated in the inode_ctx
    before an fd based fop is sent, the fop is sent to
    the dst subvol on which the fd is not opened.
    This causes the FOP to fail with EBADF.
    
    Now, every fd based fop will check to see that the fd
    has been opened on the dst subvol before winding it down.
    
    > BUG: 1465075
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17630
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Reviewed-by: Susant Palai <spalai>
    > CentOS-regression: Gluster Build System <jenkins.org>
    (cherry picked from commit 91db0d47ca267aecfc6124a3f337a4e2f2c9f1e2)
    Change-Id: Id92ef5eb7a5b5226688e2d2868b15e383f5f240e
    BUG: 1467010
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17752
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 5 Worker Ant 2017-07-20 04:37:28 UTC
REVIEW: https://review.gluster.org/17829 (cluster/dht: Fix fd check race) posted (#1) for review on release-3.10 by N Balachandran (nbalacha)

Comment 6 Worker Ant 2017-07-20 09:33:24 UTC
COMMIT: https://review.gluster.org/17829 committed in release-3.10 by Raghavendra Talur (rtalur) 
------
commit 7eff3f644b67fdb7a44174f836ffe1a84b6f479e
Author: N Balachandran <nbalacha>
Date:   Mon Jul 10 09:38:54 2017 +0530

    cluster/dht: Fix fd check race
    
    There is a another race between the cached subvol
    being updated in the inode_ctx and the fd being opened on
    the target.
    
    1. fop1 -> fd1 -> subvol0
    2. file migrated from subvol0 to subvol1 and cached_subvol
       changed to subvol1 in inode_ctx
    3. fop2 -> fd1 -> subvol1 [takes new cached subvol]
    4. fop2 -> checks fd ctx (fd not open on subvol1) -> opens fd1 on subvol1
    5. fop1 -> checks fd ctx (fd not open on subvol0)
       -> tries to open fd1 on subvol0 -> fails with "No such file on directory".
    
    Fix:
    If dht_fd_open_on_dst fails with ENOENT or ESTALE, wind to old subvol
    and let the phase1/phase2 checks handle it.
    
    Change-Id: I34f8011574a8b72e3bcfe03b0cc4f024b352f225
    BUG: 1467010
    > BUG: 1465075
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17731
    > Smoke: Gluster Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Reviewed-by: Amar Tumballi <amarts>
    (cherry picked from commit f7a450c17fee7e43c544473366220887f0534ed7)
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17829
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra Talur <rtalur>

Comment 7 Shyamsundar 2017-08-12 16:23:00 UTC
@nithya are we ready to revert the commit posted in comment #2 for 3.10.6 (3.10.5 was tagged today, 12th Aug, 2017), which is about 20 days away?

Looking at if these changes are the last in fixing the corruption to VM images when sharding is in the stack.

Comment 8 Worker Ant 2017-08-18 05:10:02 UTC
REVIEW: https://review.gluster.org/18057 (cluster/dht: Check for open fd only on EBADF) posted (#1) for review on release-3.10 by N Balachandran (nbalacha)

Comment 9 Nithya Balachandran 2017-08-18 05:12:50 UTC
(In reply to Shyamsundar from comment #7)
> @nithya are we ready to revert the commit posted in comment #2 for 3.10.6
> (3.10.5 was tagged today, 12th Aug, 2017), which is about 20 days away?
> 
> Looking at if these changes are the last in fixing the corruption to VM
> images when sharding is in the stack.

I have backported the first of these patches to 3.10. I will backport the second one once this is merged.

Comment 10 Worker Ant 2017-09-17 12:47:31 UTC
COMMIT: https://review.gluster.org/18057 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit de744c3cbb4ec91e4a2f66575d778633700933c1
Author: N Balachandran <nbalacha>
Date:   Fri Aug 4 14:46:38 2017 +0530

    cluster/dht: Check for open fd only on EBADF
    
    DHT fd based fops used to check if the fd was open
    on the cached subvol before winding the call. However,
    this introduced a performance regression of about
    30% for reads.
    
    This check was introduced to handle cases where files
    were migrated while IOs were happening. As this is not
    the common case, dht will now check if the fd is
    open on the cached subvol only if the call fails
    with EBADF.
    
    This will prevent a performance hit where a rebalance
    is not running.
    
    > BUG: 1476665
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17976
    > Smoke: Gluster Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Amar Tumballi <amarts>
    > Reviewed-by: Susant Palai <spalai>
    > Reviewed-by: Raghavendra G <rgowdapp>
    
    Change-Id: I2035a858d63c3fcd22bb634055bbb0ad01686808
    BUG: 1467010
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/18057
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 11 Worker Ant 2017-09-27 04:04:02 UTC
REVIEW: https://review.gluster.org/18400 (cluster/dht: EBADF handling for fremovexattr and fsetxattr) posted (#1) for review on release-3.10 by N Balachandran (nbalacha)

Comment 12 Worker Ant 2017-10-02 12:37:57 UTC
REVIEW: https://review.gluster.org/18400 (cluster/dht: EBADF handling for fremovexattr and fsetxattr) posted (#2) for review on release-3.10 by Shyamsundar Ranganathan (srangana)

Comment 13 Worker Ant 2017-10-03 12:26:52 UTC
COMMIT: https://review.gluster.org/18400 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 832d7fac5455be2f4a2ef9b374da2f2b9f1987da
Author: N Balachandran <nbalacha>
Date:   Wed Sep 27 09:32:48 2017 +0530

    cluster/dht: EBADF handling for fremovexattr and fsetxattr
    
    Add EBADF handling for dht_fremovexattr and dht_fsetxattr.
    
    > BUG: 1476665
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17999
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: Shyamsundar Ranganathan <srangana>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    
    (cherry picked from commit 747a08d34e2a1e94d7fce68a3577370288bb1955)
    Change-Id: Ide0d5812dae79655d2565157e5baabcd753b4309
    BUG: 1467010
    Signed-off-by: N Balachandran <nbalacha>

Comment 14 Shyamsundar 2018-06-20 18:26:09 UTC
This bug reported is against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.


Note You need to log in before you can comment on or make changes to this bug.