1367310 – IO ERROR when multiple graph switches

Bug 1367310 - IO ERROR when multiple graph switches

Summary: IO ERROR when multiple graph switches

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	libgfapi
Sub Component:
Version:	3.7.14
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Poornima G
QA Contact:	Sudhir D
Docs Contact:
URL:
Whiteboard:
Depends On:	1343038
Blocks:	1347489 1351436 1365821
TreeView+	depends on / blocked

Reported:	2016-08-16 07:49 UTC by Poornima G
Modified:	2017-03-08 10:51 UTC (History)
CC List:	3 users (show)
Fixed In Version:
Clone Of:	1343038
Environment:
Last Closed:	2017-03-08 10:51:11 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Description Poornima G 2016-08-16 07:49:35 UTC

+++ This bug was initially created as a clone of Bug #1343038 +++

Description of problem:
When the IO is going on a client, 2 or more graph switches one after the other can lead to IO Error from the client.

Version-Release number of selected component (if applicable):


How reproducible:
Attached is the reproducer.
It is also seen when, qemu(libgfapi based) has a VM on gluster storage and replace brick(add brick followed by remove brick) was executed.

Steps to Reproduce:
1. gcc -lgfapi tests/bugs/libgfapi/glfs_vol_set_IO_ERR.c -o tests/bugs/libgfapi/glfs_vol_set_IO_ERR -lgfapi
2. ./tests/bugs/libgfapi/glfs_vol_set_IO_ERR <volname> <log file>
3.

Actual results:
It exists with IO error

Expected results:
It should pass

Additional info:

--- Additional comment from Vijay Bellur on 2016-06-06 07:48:36 EDT ---

REVIEW: http://review.gluster.org/14656 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#1) for review on master by Poornima G (pgurusid)

--- Additional comment from Vijay Bellur on 2016-06-14 02:01:54 EDT ---

REVIEW: http://review.gluster.org/14722 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#1) for review on master by Poornima G (pgurusid)

--- Additional comment from Vijay Bellur on 2016-06-16 02:57:18 EDT ---

REVIEW: http://review.gluster.org/14656 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#2) for review on master by Poornima G (pgurusid)

--- Additional comment from Vijay Bellur on 2016-06-16 07:57:47 EDT ---

COMMIT: http://review.gluster.org/14656 committed in master by Jeff Darcy (jdarcy) 
------
commit b8ac20e888fbacad9d90cd8f1c6ff8579a5cefe9
Author: Poornima G <pgurusid>
Date:   Mon Jun 6 06:29:40 2016 -0400

    gfapi: Fix IO error caused when there is consecutive graph switches
    
    Issue:
    Consider a simple situation, where glfs_init() is done, i.e. initial
    graph is up. Now perform 2 volume sets that results in 2 client side
    graph changes. After this perform some IO, the IO fails with ENOTCON.
    The only way to recover this client is i guess another graph switch
    or restart.
    
    What actually is happening from code perspective:
    Initial graph lets say A, followed by 2 consecutive graph switches
    to B and C without any IO those two switches.
    
    - graph_setup (A) as a result of GF_EVENT_CHILD_UP, and
    fs->next_subvol = A
    
    - glfs_init() results in fs->active_subvol = A, fs->next_subvol = NULL
    
    - graph_setup (B) as a result of GF_EVENT_CHILD_UP, and
    fs->next_subvol = B
    
    - graph_setup (C) as a result of GF_EVENT_CHILD_UP, and
    fs->next_subvol = C. It also sees that the previous graph B was never
    set as fs->active_subvol, i.e. no IO or anything happened on B, so
    can safely send GF_EVENT_PARENT_DOWN (by calling glfs_subvol_done(B)).
    This parent down on B, results in child_down(B), which is fine.
    But child_down also triggers graph_setup(B).
    
    - graph_setup(B) as a result of GF_EVENT_CHILD_DOWN, and
    fs->next_subvol = B, and GF_EVENT_PARENT_DOWN on C as explained
    above. This again leads to GF_EVENT_CHILD_DOWN on C.
    
    - graph_setup(C) as a result of GF_EVENT_CHILD_DOWN, and
    fs->next_subvol = C, and GF_EVENT_PARENT_DOWN on B as explained
    above.
    
    Thus both the graphs B and C are disconnected, and hence the ENOTCON
    
    Solution:
    Remove the call to graph_setup() when the event is GF_EVENT_CHILD_DOWN.
    It don't see any reason why graph_setup should be called when there is
    child_down. Not sure what the original reason was, to have graph_setup
    in child_down. git hostory shows the first patch itself had this call.
    
    Change-Id: I9de86555f66cc94a05649ac863b40ed3426ffd4b
    BUG: 1343038
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/14656
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Jeff Darcy <jdarcy>

--- Additional comment from Vijay Bellur on 2016-07-19 06:51:25 EDT ---

REVIEW: http://review.gluster.org/14722 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#2) for review on master by Poornima G (pgurusid)

--- Additional comment from Vijay Bellur on 2016-07-29 23:05:08 EDT ---

REVIEW: http://review.gluster.org/14722 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#3) for review on master by Poornima G (pgurusid)

--- Additional comment from Vijay Bellur on 2016-08-10 06:41:01 EDT ---

COMMIT: http://review.gluster.org/14722 committed in master by Rajesh Joseph (rjoseph) 
------
commit 30019e51ddefc266c939a61d26d324b7ebf3ad95
Author: Poornima G <pgurusid>
Date:   Tue Jul 19 15:20:09 2016 +0530

    gfapi: Fix IO error caused when there is consecutive graph switches
    
    This is part 2 of the fix, the part 1 can be found at:
    http://review.gluster.org/#/c/14656/
    
    Problem:
    =======
    Consider a race between, __glfs_active_subvol() and graph_setup().
    Lets say @TIME T1:
    fs->active_subvol = A
    fs->next_subvol = B
    __glfs_active_subvol()                //under lock fs->mutex
    {
      ....
      new_subvol = fs->next_subvol       //which is B
      ....                               //Start migration from A to B
      __glfs_first_lookup(){
         ....
         unlock fs->mutex                //@TIME T2
         network fop
         lock fs->mutex
         ....
      }
      ....                                //migration continue on B
      fs->active_subvol = fs->next_subvol //which is C (explained below)
      ....
    }
    
    @Time T2, lets say in another thread, graph_setup() is called with C,
    note that at T2, fs->mutex is unlocked.
    
    graph_stup(C...)
    {
      lock fs->mutex
      ....
      if (fs->next_subvol)                // which is B
          destroy subvol (fs->next_subvol)
      ....
      fs->next_subvol = C
      ....
      unlock fs->mutex
    }
    
    Thus at the end of this,
    fs->old_subvol = A;
    fs->active_subvol = C;
    fs->next_subvol = NULL;
    which is wrong, as B completed migration, but was destroyed by
    graph_setup, and C never was migrated.
    
    Solution:
    =========
    Any new graph can be in one of the 2 states:
    - Picked for migration, migration in progress (fs->mip_subvol)
    - Not picked so far for migration (fs->next_subvol)
    graph_setup() updates fs->next_subvol only, __glfs_active_subvol()
    moves fs->next_subvol to fs->mip_subvol and fs->next_subvol = NULL
    atomically, and then once the migration is complete, make that the
    fs->active_subvol
    
    Change-Id: Ib6ff0565105c5eedb912a43da4017cd413243612
    BUG: 1343038
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/14722
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra Talur <rtalur>
    Reviewed-by: Rajesh Joseph <rjoseph>
    Reviewed-by: Niels de Vos <ndevos>

Comment 1 Vijay Bellur 2016-08-16 07:50:34 UTC

REVIEW: http://review.gluster.org/15167 (gfapi: Fix IO error caused when there is consecutive graph switches) posted (#2) for review on release-3.7 by Poornima G (pgurusid)

Comment 2 Kaushal 2017-03-08 10:51:11 UTC

This bug is getting closed because GlusteFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

Note You need to log in before you can comment on or make changes to this bug.