Bug 1293827 - fops-during-migration.t fails if hot and cold tiers are dist-rep
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.7.6
Hardware: All
OS: All
Priority: high
Severity: high
Assigned To: Nithya Balachandran
Keywords: Triaged
Depends On: 1284823
Blocks: 1285625 1285783
Reported: 2015-12-23 03:15 EST by Nithya Balachandran
Modified: 2016-04-19 03:51 EDT
CC List: 3 users

See Also:
Fixed In Version: glusterfs-3.7.7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1284823
Environment:
Last Closed: 2016-04-19 03:51:49 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None

Description Nithya Balachandran 2015-12-23 03:15:05 EST
+++ This bug was initially created as a clone of Bug #1284823 +++

Description of problem:

tests/basic/tier/fops-during-migration.t was written using pure distribute hot and cold tiers.
If the test is modified to use dist-rep hot and cold tiers, the following operation fails with EINVAL:

echo $TEST_STR > $M0/dir1/FILE1



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Modify tests/basic/tier/fops-during-migration.t to use dist-rep hot and cold tiers.
2. Run tests/basic/tier/fops-during-migration.t.

Actual results:

Fails with EINVAL

Expected results:

The test should pass.

Additional info:

--- Additional comment from Nithya Balachandran on 2015-11-24 04:58:28 EST ---

Analysis:

echo $TEST_STR > $M0/dir1/FILE1

performs an ftruncate operation.

Investigation reveals that when a file is being migrated by the tier layer, the redirection of the ftruncate FOP to the dst subvol fails as posix_ftruncate fails in sys_ftruncate with op_errno EINVAL. This is because the fd being used has flags=0.

AFR calls afr_fix_open() to open the fd on the dst subvol but ends up using flags = 0 (instead of using the required flags). This causes the ftruncate to fail with EINVAL.
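
The underlying system-call behaviour can be reproduced outside GlusterFS. The following is a minimal sketch, not GlusterFS code: it opens a scratch file with flags=0 (O_RDONLY) and calls ftruncate on it, which is the situation described above for the fd opened by afr_fix_open(). The helper name ftruncate_errno is hypothetical and exists only for this illustration. On Linux the call is expected to fail with EINVAL; POSIX also permits EBADF, so other platforms may differ.

```python
import errno
import os
import tempfile


def ftruncate_errno(open_flags):
    """Open a scratch file with open_flags, call ftruncate(fd, 0),
    and return the resulting errno (0 if the call succeeded).

    Hypothetical helper for illustration only; not part of GlusterFS.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        fd = os.open(path, open_flags)
        try:
            os.ftruncate(fd, 0)
            return 0
        except OSError as exc:
            return exc.errno
        finally:
            os.close(fd)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    cases = (("O_RDONLY (flags=0)", os.O_RDONLY), ("O_RDWR", os.O_RDWR))
    for name, flags in cases:
        err = ftruncate_errno(flags)
        # errno 0 has no symbolic name, so report it as "success".
        print(name, "->", errno.errorcode.get(err, "success"))
```

An fd opened with write access truncates successfully, which is why the flags used when the fd is opened on the dst subvol matter.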


--- Additional comment from Vijay Bellur on 2015-12-22 14:13:05 EST ---

COMMIT: http://review.gluster.org/12985 committed in master by Dan Lambright (dlambrig@redhat.com) 
------
commit 430ad405294993ebb16387232281cc5a4f854c75
Author: N Balachandran <nbalacha@redhat.com>
Date:   Wed Dec 16 21:09:22 2015 +0530

    cluster/dht : Ftruncate on migrating file fails with EINVAL
    
    What:
    If dht_open is called on a migrating file after the inode_ctx is set,
    subsequent FOPs on that fd do not open the fd on the dst subvol.
    This is seen when the open-ftruncate-close sequence is repeatedly
    called on a migrating file.
    A second call to the sequence described above causes dht_truncate_cbk
    to call dht_truncate2 as the dht_inode_ctx was already set by the first
    call. As dht_rebalance_in_progress_check is not called, the fd is not
    opened on the dst subvol.
    On a distributed-replicate volume, this causes AFR to
    open the fd using afr_fix_open, but with the wrong flags, causing
    posix_ftruncate to fail with EINVAL.
    The fix: We require fd specific information to make a decision while
    handling migrating files.
    Set the fd_ctx to indicate the fd has been opened on the dst subvol
    and check if it has been set while processing Phase1/Phase2 checks
    in the FOP callback functions.
    
    Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14
    BUG: 1284823
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-on: http://review.gluster.org/12985
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Dan Lambright <dlambrig@redhat.com>
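
The difference between deciding on inode-level state and on fd-level state, as described in the commit message above, can be illustrated with a toy model. This is only a sketch under simplifying assumptions and is not the DHT implementation: the classes Inode and Fd, the attributes migration_ctx_set and opened_on_dst, and the function names are all hypothetical stand-ins for dht_inode_ctx, the fd_ctx flag, and the Phase1/Phase2 checks.

```python
class Inode:
    """Toy inode; migration_ctx_set stands in for dht_inode_ctx being set."""

    def __init__(self):
        self.migration_ctx_set = False


class Fd:
    """Toy fd; opened_on_dst stands in for the fd_ctx flag added by the fix."""

    def __init__(self, inode):
        self.inode = inode
        self.opened_on_dst = False


def needs_dst_open_inode_only(fd):
    # Assumed pre-fix behaviour: the decision looks only at per-inode state,
    # so a new fd on an already-marked inode skips the open on dst.
    return not fd.inode.migration_ctx_set


def needs_dst_open_per_fd(fd):
    # Assumed post-fix behaviour: the decision looks at per-fd state.
    return not fd.opened_on_dst


def open_ftruncate_close(inode, needs_dst_open):
    """Simulate one open-ftruncate-close sequence on a migrating file.

    Returns True if the fd had been opened on the destination before the
    ftruncate was redirected there.
    """
    fd = Fd(inode)
    if needs_dst_open(fd):
        fd.opened_on_dst = True
        inode.migration_ctx_set = True
    return fd.opened_on_dst


if __name__ == "__main__":
    for label, check in (("inode-only", needs_dst_open_inode_only),
                         ("per-fd", needs_dst_open_per_fd)):
        inode = Inode()
        results = [open_ftruncate_close(inode, check) for _ in range(2)]
        print(label, results)
```

In this model the second sequence under the inode-only check redirects the ftruncate to an fd that was never opened on the destination, which corresponds to the case where AFR falls back to afr_fix_open.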
Comment 1 Vijay Bellur 2015-12-23 03:20:07 EST
REVIEW: http://review.gluster.org/13071 (cluster/dht : Ftruncate on migrating file fails with EINVAL) posted (#1) for review on release-3.7 by N Balachandran (nbalacha@redhat.com)
Comment 2 Vijay Bellur 2016-01-29 00:22:05 EST
REVIEW: http://review.gluster.org/13071 (cluster/dht : Ftruncate on migrating file fails with EINVAL) posted (#2) for review on release-3.7 by Pranith Kumar Karampuri (pkarampu@redhat.com)
Comment 3 Vijay Bellur 2016-01-29 03:42:54 EST
COMMIT: http://review.gluster.org/13071 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu@redhat.com) 
------
commit c2fbcb6140585d9fc82367a9101857bf5d05d732
Author: N Balachandran <nbalacha@redhat.com>
Date:   Wed Dec 16 21:09:22 2015 +0530

    cluster/dht : Ftruncate on migrating file fails with EINVAL
    
    What:
    If dht_open is called on a migrating file after the inode_ctx is set,
    subsequent FOPs on that fd do not open the fd on the dst subvol.
    This is seen when the open-ftruncate-close sequence is repeatedly
    called on a migrating file.
    A second call to the sequence described above causes dht_truncate_cbk
    to call dht_truncate2 as the dht_inode_ctx was already set by the first
    call. As dht_rebalance_in_progress_check is not called, the fd is not
    opened on the dst subvol.
    On a distributed-replicate volume, this causes AFR to
    open the fd using afr_fix_open, but with the wrong flags, causing
    posix_ftruncate to fail with EINVAL.
    The fix: We require fd specific information to make a decision while
    handling migrating files.
    Set the fd_ctx to indicate the fd has been opened on the dst subvol
    and check if it has been set while processing Phase1/Phase2 checks
    in the FOP callback functions.
    
    > Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14
    > Signed-off-by: N Balachandran <nbalacha@redhat.com>
    > Reviewed-on: http://review.gluster.org/12985
    > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    > Tested-by: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    > Tested-by: Dan Lambright <dlambrig@redhat.com>
    
    Change-Id: I99f8aceec105f16631def06a263f0561954d14b3
    BUG: 1293827
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-on: http://review.gluster.org/13071
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Comment 4 Kaushal 2016-04-19 03:51:49 EDT
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.7, please open a new bug report.

glusterfs-3.7.7 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-February/025292.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
