Bug 1367294 - IO error when multiple graph switches occur
Summary: IO error when multiple graph switches occur
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: 3.7.14
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Oleksandr Natalenko
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-16 07:08 UTC by Oleksandr Natalenko
Modified: 2016-09-01 09:33 UTC
CC List: 1 user

Fixed In Version: glusterfs-3.7.15
Clone Of:
Environment:
Last Closed: 2016-09-01 09:21:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Oleksandr Natalenko 2016-08-16 07:08:12 UTC
Backport of https://bugzilla.redhat.com/show_bug.cgi?id=1343038 fix to 3.7.

Comment 1 Vijay Bellur 2016-08-16 07:16:38 UTC
REVIEW: http://review.gluster.org/15165 (gfapi: fix I/O error on consecutive graph switches) posted (#1) for review on release-3.7 by Oleksandr Natalenko (oleksandr)

Comment 2 Vijay Bellur 2016-08-16 08:14:06 UTC
REVIEW: http://review.gluster.org/15167 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#3) for review on release-3.7 by Oleksandr Natalenko (oleksandr)

Comment 3 Vijay Bellur 2016-08-16 09:54:03 UTC
REVIEW: http://review.gluster.org/14835 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#12) for review on release-3.7 by Oleksandr Natalenko (oleksandr)

Comment 4 Worker Ant 2016-08-25 04:38:26 UTC
COMMIT: http://review.gluster.org/14835 committed in release-3.7 by Kaushal M (kaushal) 
------
commit 9cd5066226770cf3c06a21757b963d315b8fe32b
Author: Poornima G <pgurusid>
Date:   Mon Jun 6 06:29:40 2016 -0400

    gfapi: Fix IO error caused when there is consecutive graph switches
    
    Issue:
    Consider a simple situation where glfs_init() has completed, i.e. the
    initial graph is up. Now perform two volume sets, which result in two
    client-side graph changes. Any IO performed after this fails with
    ENOTCONN, and the only way to recover the client is another graph
    switch or a restart.
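
    A minimal sketch of this scenario, using only public gfapi calls
    (the volume, host and file names are placeholders; the two volume
    sets are assumed to be run from a shell while the program sleeps):

    /* repro sketch; build with: gcc repro.c -o repro -lgfapi */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <glusterfs/api/glfs.h>

    int main(void)
    {
        glfs_t *fs = glfs_new("testvol");       /* placeholder volume */
        if (!fs)
            return 1;
        glfs_set_volfile_server(fs, "tcp", "server1", 24007);
        if (glfs_init(fs) != 0)                 /* initial graph A is up */
            return 1;

        /* While we sleep, run two option changes from a shell, e.g.
         *   gluster volume set testvol performance.write-behind off
         *   gluster volume set testvol performance.stat-prefetch off
         * Each pushes a new client-side graph (B, then C). */
        sleep(30);

        glfs_fd_t *fd = glfs_creat(fs, "/f", O_RDWR, 0644);
        if (!fd)
            perror("glfs_creat");               /* ENOTCONN on 3.7.14 */
        else
            glfs_close(fd);
        glfs_fini(fs);
        return 0;
    }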
    
    What actually happens, from a code perspective:
    Call the initial graph A; it is followed by two consecutive graph
    switches to B and C, with no IO between the two switches.
    
    - graph_setup(A) as a result of GF_EVENT_CHILD_UP, and
    fs->next_subvol = A
    
    - glfs_init() results in fs->active_subvol = A, fs->next_subvol = NULL
    
    - graph_setup(B) as a result of GF_EVENT_CHILD_UP, and
    fs->next_subvol = B
    
    - graph_setup(C) as a result of GF_EVENT_CHILD_UP, and
    fs->next_subvol = C. It also sees that the previous graph B was never
    set as fs->active_subvol, i.e. no IO or anything else happened on B,
    so it can safely send GF_EVENT_PARENT_DOWN (by calling
    glfs_subvol_done(B)). This parent-down on B results in a child-down
    on B, which is fine. But the child-down also triggers graph_setup(B).
    
    - graph_setup(B) as a result of GF_EVENT_CHILD_DOWN: fs->next_subvol
    = B, and GF_EVENT_PARENT_DOWN is sent on C as explained above. This
    again leads to GF_EVENT_CHILD_DOWN on C.
    
    - graph_setup(C) as a result of GF_EVENT_CHILD_DOWN: fs->next_subvol
    = C, and GF_EVENT_PARENT_DOWN is sent on B as explained above.
    
    Thus both graphs B and C end up disconnected, hence the ENOTCONN.
    
    Solution:
    Remove the call to graph_setup() when the event is GF_EVENT_CHILD_DOWN.
    There is no apparent reason why graph_setup() should be called on a
    child-down; the original motivation is unclear, and git history shows
    that the very first patch already had this call.
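
    The shape of the change, as a self-contained paraphrase (the types
    and names below are illustrative stand-ins, not the actual libgfapi
    internals or the literal diff):

    #include <stdio.h>

    enum ev { EV_CHILD_UP, EV_CHILD_DOWN };
    struct fs_sketch { const char *next_subvol; };

    static void graph_setup(struct fs_sketch *fs, const char *graph)
    {
        fs->next_subvol = graph;        /* queue graph for migration */
    }

    static void notify(struct fs_sketch *fs, enum ev event,
                       const char *graph)
    {
        switch (event) {
        case EV_CHILD_UP:
            graph_setup(fs, graph);
            break;
        case EV_CHILD_DOWN:
            /* The pre-fix code also called graph_setup(fs, graph) here,
             * re-queueing the downed graph and tearing down the other
             * pending one in an endless exchange. The fix drops that
             * call: nothing graph-related happens on child-down. */
            break;
        }
    }

    int main(void)
    {
        struct fs_sketch fs = { 0 };
        notify(&fs, EV_CHILD_UP, "B");
        notify(&fs, EV_CHILD_UP, "C");   /* C supersedes B; B would get
                                          * PARENT_DOWN here */
        notify(&fs, EV_CHILD_DOWN, "B"); /* fixed: C stays queued */
        printf("next_subvol = %s\n", fs.next_subvol);
        return 0;
    }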
    
    > Reviewed-on: http://review.gluster.org/14656
    > Smoke: Gluster Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Jeff Darcy <jdarcy>
    
    BUG: 1367294
    Change-Id: I9de86555f66cc94a05649ac863b40ed3426ffd4b
    Signed-off-by: Poornima G <pgurusid>
    Signed-off-by: Oleksandr Natalenko <oleksandr>
    Reviewed-on: http://review.gluster.org/14835
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Kaushal M <kaushal>

Comment 5 Worker Ant 2016-08-25 04:42:53 UTC
COMMIT: http://review.gluster.org/15167 committed in release-3.7 by Kaushal M (kaushal) 
------
commit 9df752213e6f8a1cc9a5e875cf68ca8ef32f61db
Author: Poornima G <pgurusid>
Date:   Tue Jul 19 15:20:09 2016 +0530

    gfapi: Fix IO error caused when there is consecutive graph switches
    
    Backport of http://review.gluster.org/#/c/14722/
    
    This is part 2 of the fix; part 1 can be found at:
    http://review.gluster.org/#/c/14656/
    
    Problem:
    =======
    Consider a race between __glfs_active_subvol() and graph_setup().
    Let's say, at time T1:
    fs->active_subvol = A
    fs->next_subvol = B
    __glfs_active_subvol()                //under lock fs->mutex
    {
      ....
      new_subvol = fs->next_subvol       //which is B
      ....                               //Start migration from A to B
      __glfs_first_lookup(){
         ....
         unlock fs->mutex                //@TIME T2
         network fop
         lock fs->mutex
         ....
      }
      ....                                //migration continue on B
      fs->active_subvol = fs->next_subvol //which is C (explained below)
      ....
    }
    
    At time T2, say in another thread, graph_setup() is called with C;
    note that at T2, fs->mutex is unlocked.
    
    graph_setup(C...)
    {
      lock fs->mutex
      ....
      if (fs->next_subvol)                // which is B
          destroy subvol (fs->next_subvol)
      ....
      fs->next_subvol = C
      ....
      unlock fs->mutex
    }
    
    Thus, at the end of this:
    fs->old_subvol = A;
    fs->active_subvol = C;
    fs->next_subvol = NULL;
    which is wrong: B completed migration but was destroyed by
    graph_setup(), and C was never migrated.
    
    Solution:
    =========
    Any new graph can be in one of two states:
    - Picked for migration, migration in progress (fs->mip_subvol)
    - Not yet picked for migration (fs->next_subvol)
    graph_setup() updates only fs->next_subvol. __glfs_active_subvol()
    atomically moves fs->next_subvol to fs->mip_subvol and sets
    fs->next_subvol = NULL; then, once the migration is complete, it
    makes that graph the fs->active_subvol.
    
    > Reviewed-on: http://review.gluster.org/14722
    > Smoke: Gluster Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Raghavendra Talur <rtalur>
    > Reviewed-by: Rajesh Joseph <rjoseph>
    > Reviewed-by: Niels de Vos <ndevos>
    
    BUG: 1367294
    Change-Id: Ib6ff0565105c5eedb912a43da4017cd413243612
    Signed-off-by: Poornima G <pgurusid>
    Signed-off-by: Oleksandr Natalenko <oleksandr>
    Reviewed-on: http://review.gluster.org/15167
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Kaushal M <kaushal>

Comment 6 Kaushal 2016-09-01 09:21:42 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem persists with glusterfs-3.7.15, please open a new bug report.

glusterfs-3.7.15 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-devel/2016-September/050714.html
[2] https://www.gluster.org/pipermail/gluster-users/
