Bug 1211863 - RFE: Support in md-cache to use upcall notifications to invalidate its cache
Summary: RFE: Support in md-cache to use upcall notifications to invalidate its cache
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Poornima G
QA Contact:
URL:
Whiteboard:
Depends On: 1366284 1368842 1369430 1369432 1369638 1370708 1370710
Blocks: glusterfs-3.9.0
Reported: 2015-04-15 06:43 UTC by Soumya Koduri
Modified: 2017-03-06 17:18 UTC

Fixed In Version: glusterfs-3.10.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-06 17:18:18 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Soumya Koduri 2015-04-15 06:43:59 UTC
Description of problem:

The "Upcall" feature enables clients to receive callback notifications from the brick process for the file/dir entries they may cache, whenever another client changes those entries in the back-end.

Feature page URL:
http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure

md-cache can make use of this feature to invalidate its cache entries. This bug is to track those changes.

Comment 1 Soumya Koduri 2015-05-05 13:12:07 UTC
With change http://review.gluster.org/#/c/10581, cache-invalidation is turned off when NFS-Ganesha is disabled. We may need to avoid that once md-cache also uses the feature to invalidate its cache.

Comment 2 Vijay Bellur 2015-12-17 13:08:07 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache.) posted (#3) for review on master by Poornima G (pgurusid)

Comment 3 Vijay Bellur 2015-12-17 13:09:03 UTC
REVIEW: http://review.gluster.org/12996 (upcall: Add xattr invalidation) posted (#1) for review on master by Poornima G (pgurusid)

Comment 4 Vijay Bellur 2015-12-17 20:50:56 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#4) for review on master by Niels de Vos (ndevos)

Comment 5 Vijay Bellur 2015-12-17 20:51:01 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 6 Vijay Bellur 2015-12-17 20:51:05 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 7 Vijay Bellur 2015-12-29 07:16:02 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#5) for review on master by Poornima G (pgurusid)

Comment 8 Vijay Bellur 2015-12-29 07:16:06 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#3) for review on master by Poornima G (pgurusid)

Comment 9 Vijay Bellur 2015-12-29 07:16:10 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#3) for review on master by Poornima G (pgurusid)

Comment 10 Vijay Bellur 2016-01-13 11:19:11 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#6) for review on master by Poornima G (pgurusid)

Comment 11 Vijay Bellur 2016-01-13 11:19:14 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#4) for review on master by Poornima G (pgurusid)

Comment 12 Vijay Bellur 2016-01-13 11:19:22 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#4) for review on master by Poornima G (pgurusid)

Comment 13 Vijay Bellur 2016-01-13 11:32:05 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#7) for review on master by Poornima G (pgurusid)

Comment 14 Vijay Bellur 2016-01-13 11:32:07 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#5) for review on master by Poornima G (pgurusid)

Comment 15 Vijay Bellur 2016-01-13 11:32:10 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#5) for review on master by Poornima G (pgurusid)

Comment 16 Vijay Bellur 2016-02-09 11:21:47 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#8) for review on master by Poornima G (pgurusid)

Comment 17 Vijay Bellur 2016-02-09 11:21:50 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#6) for review on master by Poornima G (pgurusid)

Comment 18 Vijay Bellur 2016-02-09 11:21:53 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#6) for review on master by Poornima G (pgurusid)

Comment 19 Vijay Bellur 2016-02-09 11:21:56 UTC
REVIEW: http://review.gluster.org/13406 (md-cache: Add xattr caching support) posted (#1) for review on master by Poornima G (pgurusid)

Comment 20 Vijay Bellur 2016-03-10 05:43:54 UTC
REVIEW: http://review.gluster.org/13406 (md-cache: Add xattr caching support) posted (#2) for review on master by Poornima G (pgurusid)

Comment 21 Vijay Bellur 2016-03-10 05:43:57 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#9) for review on master by Poornima G (pgurusid)

Comment 22 Vijay Bellur 2016-03-10 05:44:00 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#7) for review on master by Poornima G (pgurusid)

Comment 23 Vijay Bellur 2016-03-10 05:44:03 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#7) for review on master by Poornima G (pgurusid)

Comment 24 Vijay Bellur 2016-03-10 08:48:17 UTC
REVIEW: http://review.gluster.org/13406 (md-cache: Add xattr caching support) posted (#3) for review on master by Poornima G (pgurusid)

Comment 25 Vijay Bellur 2016-03-10 08:48:20 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#10) for review on master by Poornima G (pgurusid)

Comment 26 Vijay Bellur 2016-03-10 08:48:24 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#8) for review on master by Poornima G (pgurusid)

Comment 27 Vijay Bellur 2016-03-10 08:48:27 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#8) for review on master by Poornima G (pgurusid)

Comment 28 Mike McCune 2016-03-28 23:16:15 UTC
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please contact mmccune with any questions.

Comment 29 Vijay Bellur 2016-04-05 06:09:19 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#9) for review on master by Poornima G (pgurusid)

Comment 30 Vijay Bellur 2016-05-06 05:28:22 UTC
REVIEW: http://review.gluster.org/12995 (upcall: add support to invalidate xattrs) posted (#10) for review on master by Poornima G (pgurusid)

Comment 31 Vijay Bellur 2016-05-06 05:50:24 UTC
REVIEW: http://review.gluster.org/12995 (upcall: Add support to invalidate xattrs) posted (#11) for review on master by Poornima G (pgurusid)

Comment 32 Vijay Bellur 2016-05-06 05:50:27 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#9) for review on master by Poornima G (pgurusid)

Comment 33 Vijay Bellur 2016-05-11 07:15:07 UTC
REVIEW: http://review.gluster.org/12995 (upcall: Add support to invalidate xattrs) posted (#12) for review on master by Poornima G (pgurusid)

Comment 34 Vijay Bellur 2016-05-11 07:15:09 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#10) for review on master by Poornima G (pgurusid)

Comment 35 Vijay Bellur 2016-05-11 18:23:03 UTC
COMMIT: http://review.gluster.org/12995 committed in master by Niels de Vos (ndevos) 
------
commit b2222c1e13d3bff17fa04b8f9b4870cefd457fe2
Author: Niels de Vos <ndevos>
Date:   Mon Dec 7 16:24:15 2015 +0000

    upcall: Add support to invalidate xattrs
    
    When SELinux is used, clients should get a notification that the
    extended attributes have been updated. Other components (like md-cache)
    will be able to use this too.
    
    A big part of the implementation comes from Poornima through the first
    version of http://review.gluster.org/12996.
    
    Also moving the flags from upcall-cache-invalidation.h to the main
    libglusterfs upcall-utils.h file, so that other places can easily use
    them in future.
    
    Change-Id: I525345bed8f22d029524ff19ccaf726a2c905454
    BUG: 1211863
    Signed-off-by: Niels de Vos <ndevos>
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/12995
    Reviewed-by: soumya k <skoduri>
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>

Comment 36 Vijay Bellur 2016-05-12 05:21:14 UTC
REVIEW: http://review.gluster.org/12996 (upcall: pass dict with xattrs on xattr invalidation) posted (#11) for review on master by Poornima G (pgurusid)

Comment 37 Vijay Bellur 2016-05-19 09:28:11 UTC
COMMIT: http://review.gluster.org/12996 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 8facd588f20ef8305b6f6b53da0f6d54d300093b
Author: Poornima G <pgurusid>
Date:   Wed Dec 16 05:45:03 2015 -0500

    upcall: pass dict with xattrs on xattr invalidation
    
    In case of xattr invalidation, return a dict containing
    the updated xattrs.
    
    [ndevos: move chunks to change 12995 and only address the xattrs-dict here]
    
    Change-Id: I8733f06a519a9a0f24be1bb4b2c38c9c9dce0ce2
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/12996
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: soumya k <skoduri>
    Tested-by: soumya k <skoduri>

Comment 38 Vijay Bellur 2016-06-29 04:28:18 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#11) for review on master by Poornima G (pgurusid)

Comment 39 Vijay Bellur 2016-06-29 04:31:54 UTC
REVIEW: http://review.gluster.org/14824 (md-cache: Cache gluster-samba metadata) posted (#1) for review on master by Poornima G (pgurusid)

Comment 40 Vijay Bellur 2016-06-29 09:45:43 UTC
REVIEW: http://review.gluster.org/14824 (md-cache: Cache gluster-samba metadata) posted (#2) for review on master by Poornima G (pgurusid)

Comment 41 Vijay Bellur 2016-06-29 10:42:21 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#12) for review on master by Poornima G (pgurusid)

Comment 42 Vijay Bellur 2016-06-30 07:00:55 UTC
REVIEW: http://review.gluster.org/14824 (md-cache: Cache gluster-samba metadata) posted (#3) for review on master by Poornima G (pgurusid)

Comment 43 Vijay Bellur 2016-07-05 11:00:54 UTC
COMMIT: http://review.gluster.org/14824 committed in master by Raghavendra G (rgowdapp) 
------
commit 01d6b17bac704a320bc0549ae063ee7f4bf3748b
Author: Poornima G <pgurusid>
Date:   Wed Jun 29 00:25:39 2016 -0400

    md-cache: Cache gluster-samba metadata
    
    Change-Id: I0a95f4897440c5bf6f54612d9c232e015c8bf983
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/14824
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 44 Vijay Bellur 2016-07-05 11:43:42 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#13) for review on master by Poornima G (pgurusid)

Comment 45 Vijay Bellur 2016-07-07 05:41:54 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#14) for review on master by Poornima G (pgurusid)

Comment 46 Vijay Bellur 2016-07-08 09:56:57 UTC
REVIEW: http://review.gluster.org/14879 (md-cache: Enable caching of stat fetched from readdirp) posted (#1) for review on master by Poornima G (pgurusid)

Comment 47 Vijay Bellur 2016-07-11 13:48:38 UTC
COMMIT: http://review.gluster.org/14879 committed in master by Jeff Darcy (jdarcy) 
------
commit 62f826de85489f47da506e0d20e9ed349278d597
Author: Poornima G <pgurusid>
Date:   Fri Jul 8 14:25:35 2016 +0530

    md-cache: Enable caching of stat fetched from readdirp
    
    Patch http://review.gluster.org/11894 removed the readdirp fop from
    md-cache, but it does not mention which xlator was failing because
    of it. As mentioned by Rafi (author of patch 11894), tiering and svc
    do not really need this, as the inode_ctx is populated in
    readdirp_cbk. Hence reverting that commit.
    This reverts commit c8c9308134ae4ce24c630a1b0ccfcf4e8f9b0fe7.
    
    Change-Id: Ib8d00b3f129596f3a54984f839199175f5c9b55b
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/14879
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Zhou Zhengping <johnzzpcrystal>
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 48 Vijay Bellur 2016-07-12 03:08:28 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#15) for review on master by Poornima G (pgurusid)

Comment 49 Vijay Bellur 2016-07-12 05:59:47 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#16) for review on master by Poornima G (pgurusid)

Comment 50 Vijay Bellur 2016-07-13 11:40:27 UTC
REVIEW: http://review.gluster.org/14912 (tier: properly update cached subvol during readdirp response) posted (#1) for review on master by mohammed rafi  kc (rkavunga)

Comment 51 Vijay Bellur 2016-07-15 09:11:48 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#17) for review on master by Poornima G (pgurusid)

Comment 52 Vijay Bellur 2016-07-17 03:09:09 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#18) for review on master by Poornima G (pgurusid)

Comment 53 Vijay Bellur 2016-07-17 13:33:29 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#19) for review on master by Poornima G (pgurusid)

Comment 54 Vijay Bellur 2016-07-17 14:29:54 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#20) for review on master by Poornima G (pgurusid)

Comment 55 Vijay Bellur 2016-07-19 05:32:48 UTC
REVIEW: http://review.gluster.org/12951 (md-cache: Add cache invalidation support to invalidate the meta data cache) posted (#21) for review on master by Poornima G (pgurusid)

Comment 56 Vijay Bellur 2016-07-20 12:12:09 UTC
COMMIT: http://review.gluster.org/12951 committed in master by Raghavendra G (rgowdapp) 
------
commit 1f97d7101b3313ce647638310e1028da8dac6785
Author: Poornima G <pgurusid>
Date:   Fri Dec 11 05:12:07 2015 -0500

    md-cache: Add cache invalidation support to invalidate the meta data cache
    
    Problem:
    md-cache currently updates its stat in cbks of selected fops.
    The default cache time is 1 second. If this is increased to reap the
    benefits of caching, we may end up with a stale cache for a long time,
    as there is no logic yet to notify md-cache of backend changes made by
    another client.
    
    Solution:
    Use the existing upcall mechanism to invalidate the cache.
    For this feature to work, the "features.cache-invalidation" volume
    option must be enabled.
    
    This patch by itself does not improve performance. Its benefit is
    that it provides coherency for the stat cache, so the cache timeout
    can be made much longer, which in turn can improve performance.
    
    Change-Id: I2dbb0afa7b5e4a5a248f910188e0918e02f18692
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/12951
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
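
The coherency argument in the commit message above can be illustrated with a minimal sketch. It is not the md-cache implementation: the structure and the `sketch_*` function names are hypothetical, and timeouts are assumed to be plain seconds. It only shows why an invalidation signal makes a long timeout safe: an entry is served from cache only while it is both un-invalidated and within its timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified cache entry; not the real md-cache structure. */
struct sketch_stat_entry {
    time_t fetched_at; /* when the stat was last fetched from the brick */
    bool   valid;      /* cleared when an invalidation notification arrives */
};

/* Record that a fresh stat was fetched from the brick at time `now`. */
static void sketch_entry_refresh(struct sketch_stat_entry *e, time_t now)
{
    assert(e != NULL);
    e->fetched_at = now;
    e->valid = true;
}

/* Called when the brick reports that another client changed the file. */
static void sketch_entry_invalidate(struct sketch_stat_entry *e)
{
    assert(e != NULL);
    e->valid = false;
}

/* An entry may be served from cache only if it has not been invalidated
 * and is still within the configured timeout (in seconds). */
static bool sketch_entry_usable(const struct sketch_stat_entry *e,
                                time_t now, time_t timeout)
{
    assert(e != NULL);
    return e->valid && (now - e->fetched_at) < timeout;
}
```

With a 600-second timeout, an entry fetched at t=100 stays usable at t=150 unless `sketch_entry_invalidate()` has been called in between; without the invalidation path, the only protection against stale data would be a short timeout.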

Comment 57 Vijay Bellur 2016-07-21 06:16:03 UTC
REVIEW: http://review.gluster.org/14971 (md-cache: fix indention to silence Coverity) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 58 Vijay Bellur 2016-07-21 18:06:03 UTC
COMMIT: http://review.gluster.org/14971 committed in master by Vijay Bellur (vbellur) 
------
commit 370197f6e8413c0c4c13571f1c5b613bfa1e50d5
Author: Niels de Vos <ndevos>
Date:   Thu Jul 21 08:12:04 2016 +0200

    md-cache: fix indention to silence Coverity
    
    Coverity complains about the incorrect indention:
    
        *** CID 1357867:  Control flow issues  (NESTING_INDENT_MISMATCH)
        ...
        2566                     if (conf->mdc_invalidation)
        2567                             ret = mdc_invalidate (this, data);
        >>>     CID 1357867:  Control flow issues  (NESTING_INDENT_MISMATCH)
        >>>     This 'if'  statement is indented to column 25, as if it were nested within the preceding parent statement, but it is not.
        2568                             if (default_notify (this, event, data) != 0)
        2569                                     ret = -1;
        2570                     break;
        ...
    
    Even when md-cache does not have cache-invalidation on, we need to pass
    the upcall to the next xlator.
    
    Change-Id: I6d2a174eb54e3df270920aae9673b5010c235f25
    CID: 1357867
    BUG: 1211863
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/14971
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    Reviewed-by: Poornima G <pgurusid>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Vijay Bellur <vbellur>
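
The control flow that the Coverity report and the commit message call for can be sketched as follows. This is an illustration only, not the committed patch: the `sketch_*` functions are hypothetical stand-ins for `mdc_invalidate` and `default_notify`, and the call counters exist purely to make the behaviour observable. The point is that forwarding the event is not nested under the invalidation check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping so the sketch can be exercised without GlusterFS. */
struct sketch_calls {
    int invalidate_calls; /* times the cache invalidation step ran */
    int forward_calls;    /* times the event was passed to the next xlator */
    int forward_result;   /* value the stubbed forwarding step returns */
};

/* Stand-in for the invalidation step; always succeeds in this sketch. */
static int sketch_invalidate(struct sketch_calls *c)
{
    c->invalidate_calls++;
    return 0;
}

/* Stand-in for forwarding the event to the next xlator. */
static int sketch_default_notify(struct sketch_calls *c)
{
    c->forward_calls++;
    return c->forward_result;
}

/* Handle an upcall event: invalidate only when enabled, always forward. */
static int sketch_handle_upcall(struct sketch_calls *c, bool invalidation_on)
{
    int ret = 0;

    assert(c != NULL);

    if (invalidation_on)
        ret = sketch_invalidate(c);

    /* Same nesting level as the 'if' above: forwarding is unconditional. */
    if (sketch_default_notify(c) != 0)
        ret = -1;

    return ret;
}
```

Calling `sketch_handle_upcall(&c, false)` still increments `forward_calls`, which matches the statement that the upcall must reach the next xlator even when cache-invalidation is off.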

Comment 59 Vijay Bellur 2016-07-25 12:02:55 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#1) for review on master by Poornima G (pgurusid)

Comment 60 Vijay Bellur 2016-07-27 10:16:24 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#2) for review on master by Poornima G (pgurusid)

Comment 61 Poornima G 2016-07-28 12:51:36 UTC
TODOs:
1. Implement cache size limit for md-cache
2. Implement register for notification in gfapi
3. IPC: Implement a way for md-cache to communicate the list of xattrs to upcall.
   - implement dht_ipc, afr_ipc, ec_ipc, shard_ipc
   - Enhance upcall to store the xattr list separately for each client, and enable each client to register and unregister its own set of xattrs.

For better debuggability:
1. Add logging in md-cache, option to enable md-cache trace logging
2. Integrate md-cache and upcall with statedump
3. Gfapi client perf profiling ability

Test cases for md-cache:
1. Add test cases for testing cache-invalidation, specific cases
2. Run the regression with cache inval on and large timeout

Comment 62 Vijay Bellur 2016-07-29 07:23:39 UTC
REVIEW: http://review.gluster.org/15043 (md-cache/upcall: In case of mode bit change invalidate xattr) posted (#1) for review on master by Poornima G (pgurusid)

Comment 63 Vijay Bellur 2016-07-29 09:49:37 UTC
REVIEW: http://review.gluster.org/15045 (md-cache: Fix wrong cache time update for xattrs) posted (#1) for review on master by Poornima G (pgurusid)

Comment 64 Vijay Bellur 2016-07-29 09:56:04 UTC
REVIEW: http://review.gluster.org/15043 (md-cache/upcall: In case of mode bit change invalidate xattr) posted (#2) for review on master by Poornima G (pgurusid)

Comment 65 Vijay Bellur 2016-08-02 04:11:36 UTC
REVIEW: http://review.gluster.org/15043 (md-cache/upcall: In case of mode bit change invalidate xattr) posted (#3) for review on master by Poornima G (pgurusid)

Comment 66 Vijay Bellur 2016-08-02 09:22:17 UTC
REVIEW: http://review.gluster.org/15069 (md-cache: Add logging to increase debuggability) posted (#1) for review on master by Poornima G (pgurusid)

Comment 67 Vijay Bellur 2016-08-04 11:07:23 UTC
REVIEW: http://review.gluster.org/15069 (md-cache: Add logging to increase debuggability) posted (#2) for review on master by Poornima G (pgurusid)

Comment 68 Vijay Bellur 2016-08-04 15:46:48 UTC
COMMIT: http://review.gluster.org/15043 committed in master by Jeff Darcy (jdarcy) 
------
commit 5ae0a5d1e92175c28cd5470b890e99ff4eac0673
Author: Poornima G <pgurusid>
Date:   Fri Jul 29 12:20:11 2016 +0530

    md-cache/upcall: In case of mode bit change invalidate xattr
    
    When the mode bits are changed, the ACL entries are also affected.
    Currently in upcall, setattr invalidates only the stat info.
    
    With this patch, if mode bits are changed, the upcall will invalidate
    all the xattrs.
    
    Change-Id: Iccda2e1a7440ee845aa5442bf51970f74d9b0862
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15043
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    Reviewed-by: Jeff Darcy <jdarcy>
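
The rule described in the commit above (a setattr always invalidates stat, and additionally invalidates xattrs when the mode bits changed) can be sketched as a small flag computation. The flag names and the function below are hypothetical and are not the identifiers used in the upcall xlator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical invalidation flags, for illustration only. */
#define SKETCH_INVAL_STAT  (1u << 0) /* cached stat must be dropped */
#define SKETCH_INVAL_XATTR (1u << 1) /* cached xattrs must be dropped */

/* Decide what a setattr should invalidate on other clients.
 * Stat is always invalidated; xattrs are added when the mode bits
 * changed, because ACL entries depend on the mode. */
static uint32_t sketch_setattr_inval_flags(bool mode_changed)
{
    uint32_t flags = SKETCH_INVAL_STAT;

    if (mode_changed)
        flags |= SKETCH_INVAL_XATTR;

    return flags;
}
```

A setattr that only touches timestamps yields `SKETCH_INVAL_STAT`; a chmod-style change yields both flags.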

Comment 69 Vijay Bellur 2016-08-10 09:33:04 UTC
REVIEW: http://review.gluster.org/15069 (md-cache: Add logging to increase debuggability) posted (#3) for review on master by Poornima G (pgurusid)

Comment 70 Vijay Bellur 2016-08-11 09:52:27 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#3) for review on master by Poornima G (pgurusid)

Comment 71 Vijay Bellur 2016-08-16 04:20:48 UTC
COMMIT: http://review.gluster.org/15069 committed in master by Raghavendra G (rgowdapp) 
------
commit 2b0e83e908b3be2043e92a974ba92ae942bff4d1
Author: Poornima G <pgurusid>
Date:   Tue Aug 2 14:51:23 2016 +0530

    md-cache: Add logging to increase debuggability
    
    Change-Id: I147d16ec3c20d3372892fdd5f62010e52f82f8bd
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15069
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Vijay Bellur <vbellur>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 72 Vijay Bellur 2016-08-17 04:49:53 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#4) for review on master by Poornima G (pgurusid)

Comment 73 Poornima G 2016-08-17 06:33:50 UTC
(In reply to Poornima G from comment #61)
> TODOs:
> 1. Implement cache size limit for md-cache
> 2. Implement register for notification in gfapi
> 3. IPC: Implement a way for md-cache to communicate the list of xattrs to
> upcall.
>    - implement dht_ipc, afr_ipc, ec_ipc, shard_ipc
>    - Enhance upcall to store the xattr list separately for each client, and
> enable each client to register and unregister its own set of xattrs.
  4. Transaction id framework to eliminate duplicate invalidations in case of replica and EC.
  5. Display upcall invalidations also in the profile info
  6. Add support for lower xlators to tell md-cache not to cache

> For better debuggability:
> 1. Add logging in md-cache, option to enable md-cache trace logging
> 2. Integrate md-cache and upcall with statedump
> 3. Gfapi client perf profiling ability
> 
> Test cases for md-cache:
> 1. Add test cases for testing cache-invalidation, specific cases
> 2. Run the regression with cache inval on and large timeout

Comment 74 Vijay Bellur 2016-08-17 07:51:36 UTC
REVIEW: http://review.gluster.org/15185 (md-cache: Add cache hit and miss counters) posted (#1) for review on master by Poornima G (pgurusid)

Comment 75 Vijay Bellur 2016-08-18 06:00:29 UTC
REVIEW: http://review.gluster.org/15193 (io-stats: Add stats for upcall notifications) posted (#1) for review on master by Poornima G (pgurusid)

Comment 76 Vijay Bellur 2016-08-19 04:43:13 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#5) for review on master by Poornima G (pgurusid)

Comment 77 Vijay Bellur 2016-08-19 05:35:53 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#6) for review on master by Poornima G (pgurusid)

Comment 78 Vijay Bellur 2016-08-20 09:32:56 UTC
REVIEW: http://review.gluster.org/15185 (md-cache: Add cache hit and miss counters) posted (#2) for review on master by Poornima G (pgurusid)

Comment 79 Vijay Bellur 2016-08-21 10:19:19 UTC
REVIEW: http://review.gluster.org/15185 (md-cache: Add cache hit and miss counters) posted (#3) for review on master by Poornima G (pgurusid)

Comment 80 Vijay Bellur 2016-08-21 13:35:25 UTC
REVIEW: http://review.gluster.org/15225 (dht: Implement ipc fop) posted (#1) for review on master by Poornima G (pgurusid)

Comment 81 Vijay Bellur 2016-08-21 13:45:25 UTC
REVIEW: http://review.gluster.org/15185 (md-cache: Add cache hit and miss counters) posted (#4) for review on master by Poornima G (pgurusid)

Comment 82 Worker Ant 2016-08-24 04:03:23 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#7) for review on master by Poornima G (pgurusid)

Comment 83 Worker Ant 2016-08-24 04:03:27 UTC
REVIEW: http://review.gluster.org/15225 (dht: Implement ipc fop) posted (#2) for review on master by Poornima G (pgurusid)

Comment 84 Worker Ant 2016-08-25 05:16:40 UTC
REVIEW: http://review.gluster.org/15314 (md-cache: Do not use features.cache-invalidation for both md-cache and upcall) posted (#1) for review on master by Poornima G (pgurusid)

Comment 85 Worker Ant 2016-08-25 05:55:34 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#8) for review on master by Poornima G (pgurusid)

Comment 86 Worker Ant 2016-08-25 05:55:37 UTC
REVIEW: http://review.gluster.org/15225 (dht: Implement ipc fop) posted (#3) for review on master by Poornima G (pgurusid)

Comment 87 Worker Ant 2016-08-26 06:27:47 UTC
REVIEW: http://review.gluster.org/15324 (md-cache: Process all the cache invalidation flags) posted (#1) for review on master by Poornima G (pgurusid)

Comment 88 Worker Ant 2016-08-26 09:45:48 UTC
REVIEW: http://review.gluster.org/15193 (io-stats: Add stats for upcall notifications) posted (#2) for review on master by Poornima G (pgurusid)

Comment 89 Worker Ant 2016-08-26 11:25:23 UTC
REVIEW: http://review.gluster.org/15314 (md-cache: Do not use features.cache-invalidation for both md-cache and upcall) posted (#2) for review on master by Poornima G (pgurusid)

Comment 90 Worker Ant 2016-08-26 12:54:43 UTC
REVIEW: http://review.gluster.org/15193 (io-stats: Add stats for upcall notifications) posted (#3) for review on master by Poornima G (pgurusid)

Comment 91 Worker Ant 2016-08-27 06:25:39 UTC
REVIEW: http://review.gluster.org/15193 (io-stats: Add stats for upcall notifications) posted (#4) for review on master by Poornima G (pgurusid)

Comment 92 Worker Ant 2016-08-27 11:14:40 UTC
COMMIT: http://review.gluster.org/15185 committed in master by Raghavendra G (rgowdapp) 
------
commit 3cc7f6588c281846f8c590553da03dd16f150e8a
Author: Poornima G <pgurusid>
Date:   Wed Aug 17 12:55:37 2016 +0530

    md-cache: Add cache hit and miss counters
    
    These counters can be accessed either by .meta interface
    or statedump.
    
    From meta: cat on the private file in md-cache directory.
    Eg: cat /mnt/glusterfs/0/.meta/graphs/active/patchy-md-cache/private
    [performance/md-cache.patchy-md-cache]
    stat_hit_count = 2
    stat_miss_count = 8
    xattr_hit_count = 4
    xattr_miss_count = 3
    nameless_lookup_count = 1
    negative_lookup_count = 0
    stat_invalidations_recieved = 1
    xattr_invalidations_recieved = 1
    
    Change-Id: Ib62a8822f263a9f75858b15832d0119fbe382629
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15185
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
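
One way to read the counters shown above is as a hit ratio. The helper below is a hypothetical sketch, not part of GlusterFS; it assumes the hit and miss counts have already been read from the `.meta` file or a statedump.

```c
#include <stdint.h>

/* Fraction of requests served from cache; 0.0 when nothing was counted. */
static double sketch_hit_ratio(uint64_t hits, uint64_t misses)
{
    uint64_t total = hits + misses;

    if (total == 0)
        return 0.0;

    return (double)hits / (double)total;
}
```

For the sample output in the commit message, `sketch_hit_ratio(2, 8)` gives a stat hit ratio of about 0.2, and `sketch_hit_ratio(4, 3)` gives an xattr hit ratio of about 0.57.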

Comment 93 Worker Ant 2016-08-27 12:18:52 UTC
COMMIT: http://review.gluster.org/15314 committed in master by Raghavendra G (rgowdapp) 
------
commit 3f5273e19af2eaa7bc33c6abaf6b10850f97dcc0
Author: Poornima G <pgurusid>
Date:   Thu Aug 25 10:25:24 2016 +0530

    md-cache: Do not use features.cache-invalidation for both md-cache and upcall
    
    Currently, the volume set option features.cache-invalidation enables the
    upcall feature on the server side and md-cache cache-invalidation on the
    client side. Multiple problems can arise from this:
    1. A user may want to enable upcall for an nfs-ganesha setup without
       enabling md-cache cache-invalidation, as the nfs-clients have
       already cached the metadata and upcall is used to invalidate the
       nfs-client cache. In this case, users should have a way of
       disabling md-cache invalidation without disabling upcall.
    
    2. Upcall requires an op-version of GD_OP_VERSION_3_7_0, whereas
       md-cache invalidation requires an op-version of GD_OP_VERSION_3_9_0.
       Consider a setup where the servers are at op-version GD_OP_VERSION_3_7_0
       and the clients are at op-version GD_OP_VERSION_3_9_0. If there is a
       single volume set option, the user can enable this feature in this
       setup. But it can lead to a stale xattr cache, as xattr invalidation
       was introduced in upcall only in release 3.8. Hence, it should not be
       possible to enable md-cache invalidation unless all the servers and
       clients are at op-version >= GD_OP_VERSION_3_9_0.
    
    To solve the above issues, we have separate volume options
    for enabling md-cache invalidation and upcall. But this can lead to
    issues when a user enables md-cache invalidation and forgets to enable
    upcall. Probably in the next release, these can be enabled by default.
    
    Change-Id: Ie70eff97fe12fcb623eec8f4f5861ac065bf483e
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15314
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: Raghavendra G <rgowdapp>
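
The op-version reasoning in point 2 of the commit above can be sketched as two independent gates. This is a hedged illustration, not glusterd code: the function names are hypothetical, and the numeric values assigned to the two constants are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed numeric values, used here for illustration only. */
#define SKETCH_OP_VERSION_3_7_0 30700
#define SKETCH_OP_VERSION_3_9_0 30900

/* True when every participant (server or client) is at least `required`. */
static bool sketch_all_at_least(const int *op_versions, size_t n, int required)
{
    assert(op_versions != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (op_versions[i] < required)
            return false;
    }
    return true;
}

/* Upcall only needs the older op-version. */
static bool sketch_can_enable_upcall(const int *op_versions, size_t n)
{
    return sketch_all_at_least(op_versions, n, SKETCH_OP_VERSION_3_7_0);
}

/* md-cache invalidation needs every participant at the newer op-version. */
static bool sketch_can_enable_mdc_invalidation(const int *op_versions, size_t n)
{
    return sketch_all_at_least(op_versions, n, SKETCH_OP_VERSION_3_9_0);
}
```

With servers at 3.7.0 and clients at 3.9.0, the first gate passes and the second does not, which is the mixed setup the commit message says a single shared option could not express.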

Comment 94 Worker Ant 2016-08-27 12:48:40 UTC
COMMIT: http://review.gluster.org/15225 committed in master by Raghavendra G (rgowdapp) 
------
commit b85c648a6b236ca03494cb61b97e2e703be0950c
Author: Poornima G <pgurusid>
Date:   Mon Jul 18 21:19:34 2016 +0530

    dht: Implement ipc fop
    
    ipc is used by md-cache to communicate the list of xattrs that
    it is caching to the upcall xlator. Hence implement this in
    dht, such that it winds to all the bricks if the ipc op is
    GF_IPC_MDC_TARGET_UPCALL. The ipc should not fail if any of
    the bricks is down, as md-cache will replay the ipc later when
    the brick comes back up.
    
    Change-Id: Ica551a550c04cbb1240c0d211fe831c2e5eb6017
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15225
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
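
A minimal sketch of the fan-out policy described in this commit, assuming each brick is modelled as a callable that returns 0 or an errno value. The function name is hypothetical, and treating any error other than ENOTCONN as a failure is an assumption the commit message does not spell out for dht.

```python
import errno


def wind_ipc_to_all(bricks, payload):
    """Send payload to every brick; a brick that is down is skipped.

    bricks maps a brick name to a callable returning 0 or an errno value.
    Returns (status, skipped): status is 0 or the first errno that is not
    ENOTCONN; skipped lists the bricks that were down.
    """
    status = 0
    skipped = []
    for name, send in bricks.items():
        ret = send(payload)
        if ret == errno.ENOTCONN:
            # Not a failure: md-cache replays the ipc when the brick is back.
            skipped.append(name)
        elif ret != 0 and status == 0:
            status = ret
    return status, skipped
```

The caller gets a success status even when some bricks were unreachable, which matches the intent that a down brick must not fail the ipc.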

Comment 95 Worker Ant 2016-08-27 12:56:38 UTC
COMMIT: http://review.gluster.org/15045 committed in master by Raghavendra G (rgowdapp) 
------
commit f8b51bef8820142264bdca9cfe0d7106fb045c2a
Author: Poornima G <pgurusid>
Date:   Fri Jul 29 15:03:47 2016 +0530

    md-cache: Fix wrong cache time update for xattrs
    
    In md-cache, the cache has two times:
    1. Time when the stat was last fetched for that inode
    2. Time when the xattrs were last fetched for that inode. This
    time should not be updated when only one of the xattrs is updated.
    If it is updated when one of the cached xattrs is changed, the other
    xattrs can be past their cache timeout but still be served from
    cache.
    
    Solution:
    Do not update the xattr cache time when one of the cached xattrs
    is changed. With this, an xattr may hit the cache timeout even though
    it was updated recently, but that does no harm. The alternative is to
    have a timeout for every xattr that is cached. That is more complicated
    and may not be worth it, as many lookup fops are overloaded to
    get all the xattrs.
    
    Change-Id: Id77e547f403fc792348f1ea56b468b9260a5a34f
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15045
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
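
The timing rule from this commit can be illustrated with a toy cache entry. The class and method names are hypothetical and the real md-cache structures differ; the point is only that a single-xattr update does not restart the xattr clock.

```python
class MdCacheEntry:
    """Toy model of one inode's cache entry with two separate clocks."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.stat_time = None    # when the stat was last fetched
        self.xattr_time = None   # when the full xattr set was last fetched
        self.xattrs = {}

    def set_all_xattrs(self, xattrs, now):
        """Full refresh (for example from a lookup): restart the xattr clock."""
        self.xattrs = dict(xattrs)
        self.xattr_time = now

    def update_one_xattr(self, key, value):
        """One xattr changed: store it, but leave xattr_time untouched."""
        if self.xattr_time is not None:
            self.xattrs[key] = value

    def get_xattr(self, key, now):
        """Return the cached value, or None once the xattr cache timed out."""
        if self.xattr_time is None or now - self.xattr_time > self.timeout:
            return None
        return self.xattrs.get(key)
```

Because `update_one_xattr` never touches `xattr_time`, every xattr in the entry expires together, relative to the last full fetch.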

Comment 96 Worker Ant 2016-08-29 08:54:23 UTC
REVIEW: http://review.gluster.org/15324 (md-cache: Process all the cache invalidation flags) posted (#2) for review on master by Poornima G (pgurusid)

Comment 97 Worker Ant 2016-08-29 08:54:27 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#9) for review on master by Poornima G (pgurusid)

Comment 98 Worker Ant 2016-08-29 08:54:33 UTC
REVIEW: http://review.gluster.org/15334 (md-cache:Send inode invalidate to Fuse when there is unlink/rename) posted (#1) for review on master by Poornima G (pgurusid)

Comment 99 Worker Ant 2016-08-30 11:38:27 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#10) for review on master by Poornima G (pgurusid)

Comment 100 Worker Ant 2016-08-30 16:18:55 UTC
REVIEW: http://review.gluster.org/15002 (md-cache: Register the list of xattrs with cache-invalidation) posted (#11) for review on master by Poornima G (pgurusid)

Comment 101 Worker Ant 2016-08-31 06:07:05 UTC
COMMIT: http://review.gluster.org/15002 committed in master by Raghavendra G (rgowdapp) 
------
commit 8f053f9d7270f1c6d50c0b3ab5d020503ceeb31a
Author: Poornima G <pgurusid>
Date:   Mon Jul 11 15:04:55 2016 +0530

    md-cache: Register the list of xattrs with cache-invalidation
    
    Issue:
    md-cache caches a specified list of xattrs, and when cache invalidation
    is enabled, it makes sense to receive invalidation only when those xattrs
    are modified by other clients. But the current implementation of upcall
    sends an invalidation when any of the on-disk xattrs is modified.
    
    Solution:
    md-cache sends the list of xattrs that it is interested in to upcall by
    issuing an ipc(). The challenge here is to make sure that every time a
    brick goes offline and comes back up, the ipc() is issued to the bricks
    again. Hence ipc() is sent from md-cache every time there is a
    CHILD_UP/CHILD_MODIFIED event.
    
    TODO:
    There will be patches following, in cluster xlators, to implement ipc fop.
    
    Change-Id: I6efcf3df474f5ce6eabd3d6694c00c7bd89bc25d
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15002
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Rajesh Joseph <rjoseph>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    Reviewed-by: Raghavendra G <rgowdapp>
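
The registration flow in this commit can be modelled roughly as follows. Both classes are hypothetical stand-ins for the upcall xlator on a brick and md-cache on a client, and modelling a brick restart as a fresh object with an empty registration is an assumption of this sketch.

```python
class UpcallBrick:
    """Stand-in for upcall on one brick; registration is lost on restart."""

    def __init__(self):
        self.registered = set()

    def ipc_register(self, xattrs):
        self.registered = set(xattrs)

    def should_invalidate(self, changed_xattrs):
        # Notify only if a changed xattr is one the client registered.
        return bool(self.registered & set(changed_xattrs))


class MdCacheClient:
    """Stand-in for md-cache; re-sends its xattr list on child events."""

    def __init__(self, cached_xattrs):
        self.cached_xattrs = list(cached_xattrs)

    def notify(self, event, brick):
        if event in ("CHILD_UP", "CHILD_MODIFIED"):
            brick.ipc_register(self.cached_xattrs)
```

After a brick comes back, a `CHILD_UP` event causes the client to register its list again, so filtering resumes without any extra bookkeeping on the brick.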

Comment 102 Worker Ant 2016-08-31 06:40:56 UTC
COMMIT: http://review.gluster.org/15324 committed in master by Raghavendra G (rgowdapp) 
------
commit fe929224c47d5c82da5650e9e1041645a8d7f244
Author: Poornima G <pgurusid>
Date:   Thu Aug 25 15:43:29 2016 +0530

    md-cache: Process all the cache invalidation flags
    
    Currently, md-cache only processes IATT_UPDATE_FLAGS, UP_XATTR and
    UP_XATTR_RM. We also need to process UP_RENAME_FLAGS, UP_FORGET,
    UP_PARENT_DENTRY_FLAGS and UP_NLINK_FLAGS. Otherwise, files unlinked
    or renamed on one mount will not be reflected on other mounts.
    
    Change-Id: Icb8b03da51482c3fc2e2a7292d16d56e11a341d9
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15324
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
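
One way to picture handling all the flags named in this commit is a bitmask dispatch. The bit values below are placeholders, not the values from the GlusterFS headers, and the mapping from flags to cache actions is an assumption made for illustration.

```python
from enum import IntFlag


class UpFlag(IntFlag):
    # Placeholder bit positions, NOT the real GlusterFS definitions.
    IATT_UPDATE = 1
    XATTR = 2
    XATTR_RM = 4
    RENAME = 8
    FORGET = 16
    PARENT_DENTRY = 32
    NLINK = 64


# Flags that signal a namespace change (unlink, rename and similar).
ENTRY_FLAGS = UpFlag.RENAME | UpFlag.FORGET | UpFlag.PARENT_DENTRY | UpFlag.NLINK


def actions_for(flags):
    """Map an invalidation bitmask to the set of cache actions to take."""
    if flags & ENTRY_FLAGS:
        # Assumed policy: drop everything cached for the inode.
        return {"invalidate_entry"}
    actions = set()
    if flags & UpFlag.IATT_UPDATE:
        actions.add("invalidate_stat")
    if flags & (UpFlag.XATTR | UpFlag.XATTR_RM):
        actions.add("invalidate_xattrs")
    return actions
```

Ignoring the entry-level flags, as the earlier code did, corresponds to dropping the first branch, which is why unlinks and renames went unnoticed on other mounts.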

Comment 103 Worker Ant 2016-09-01 06:51:44 UTC
REVIEW: http://review.gluster.org/15378 (afr: Add IPC fop) posted (#1) for review on master by Poornima G (pgurusid)

Comment 104 Worker Ant 2016-09-03 05:12:06 UTC
REVIEW: http://review.gluster.org/15378 (afr: Add IPC fop) posted (#2) for review on master by Poornima G (pgurusid)

Comment 105 Worker Ant 2016-09-04 03:04:17 UTC
REVIEW: http://review.gluster.org/15378 (afr: Implement IPC fop) posted (#3) for review on master by Poornima G (pgurusid)

Comment 106 Worker Ant 2016-09-07 10:19:58 UTC
REVIEW: http://review.gluster.org/15419 (tests: Fix one of the md-cache test cases) posted (#1) for review on master by Poornima G (pgurusid)

Comment 107 Worker Ant 2016-09-08 04:15:11 UTC
COMMIT: http://review.gluster.org/15419 committed in master by Vijay Bellur (vbellur) 
------
commit 0fd7d0e1c78fdbedfcdb085445c4b0be3c1a97a9
Author: Poornima G <pgurusid>
Date:   Wed Sep 7 15:47:14 2016 +0530

    tests: Fix one of the md-cache test cases
    
    Verify that unlink, rename and other ops are reflected both on
    the current mount and on other mounts.
    
    Change-Id: I5a296cdd557194dcf487e65ee4a14bbeaf4be690
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15419
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 108 Worker Ant 2016-09-14 07:15:06 UTC
REVIEW: http://review.gluster.org/15378 (afr: Implement IPC fop) posted (#4) for review on master by Poornima G (pgurusid)

Comment 109 Worker Ant 2016-09-14 07:15:10 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#2) for review on master by Poornima G (pgurusid)

Comment 110 Worker Ant 2016-09-22 09:13:05 UTC
REVIEW: http://review.gluster.org/15378 (afr: Implement IPC fop) posted (#5) for review on master by Poornima G (pgurusid)

Comment 111 Worker Ant 2016-09-22 09:13:09 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#3) for review on master by Poornima G (pgurusid)

Comment 112 Worker Ant 2016-09-23 06:34:45 UTC
REVIEW: http://review.gluster.org/15387 (ec: Implement ipc fop) posted (#4) for review on master by Poornima G (pgurusid)

Comment 113 Worker Ant 2016-09-25 19:30:23 UTC
COMMIT: http://review.gluster.org/15387 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 359b72a57b7c92fc2a11236ac05f5d740db2f540
Author: Poornima G <pgurusid>
Date:   Fri Sep 2 12:47:15 2016 +0530

    ec: Implement ipc fop
    
    The ipc will be wound to all the bricks, but for it to be
    successful, the fop should succeed on a minimum number of bricks.
    
    Change-Id: I3f8cb6a349e87bafd0773583def9d4e3765aa140
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15387
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Ashish Pandey <aspandey>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
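
The success rule in this commit amounts to a quorum-style count. A minimal sketch, assuming replies are 0 or an errno value and that the required minimum is passed in by the caller (the commit message does not state how ec derives it):

```python
def ec_ipc_succeeded(replies, minimum):
    """Return True if at least `minimum` bricks answered with success (0)."""
    successes = sum(1 for r in replies if r == 0)
    return successes >= minimum
```

For example, with replies from six bricks and a minimum of four, two failures are tolerated and a third is not.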

Comment 114 Worker Ant 2016-09-26 07:38:45 UTC
REVIEW: http://review.gluster.org/15378 (afr: Implement IPC fop) posted (#6) for review on master by Poornima G (pgurusid)

Comment 115 Worker Ant 2016-09-26 07:38:49 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#4) for review on master by Poornima G (pgurusid)

Comment 116 Worker Ant 2016-09-27 04:27:30 UTC
REVIEW: http://review.gluster.org/15378 (afr: Implement IPC fop) posted (#7) for review on master by Poornima G (pgurusid)

Comment 117 Worker Ant 2016-09-27 04:28:14 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#5) for review on master by Poornima G (pgurusid)

Comment 118 Worker Ant 2016-09-29 07:06:29 UTC
REVIEW: http://review.gluster.org/15378 (afr: Implement IPC fop) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 119 Worker Ant 2016-09-29 18:22:09 UTC
COMMIT: http://review.gluster.org/15378 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 9ab5b52dee5be45458fdb5446d3cbf6a1a5306a6
Author: Poornima G <pgurusid>
Date:   Mon Aug 22 12:30:43 2016 +0530

    afr: Implement IPC fop
    
    Currently ipc() is not implemented in afr. md-cache and upcall
    use ipc to register the list of xattrs; see [1] for more details.
    The ipc op GF_IPC_TARGET_UPCALL has to be wound to all
    the replica subvolumes. ipc() fails when any of the
    subvolumes fails with an error other than ENOTCONN, or when all of
    the subvolumes are down.
    
    [1] http://review.gluster.org/#/c/15002/
    
    Change-Id: I0f651330eafda64e4d922043fe53bd0014536247
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15378
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
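
The failure rule stated in this commit can be written down as a small reduction over the per-subvolume replies. The function name is hypothetical, and returning ENOTCONN when every subvolume is down is an assumption about which error is reported.

```python
import errno


def afr_ipc_result(replies):
    """Combine per-subvolume replies (0 or an errno value) into one result."""
    if all(r == errno.ENOTCONN for r in replies):
        # Every replica subvolume is down.
        return errno.ENOTCONN
    for r in replies:
        if r not in (0, errno.ENOTCONN):
            # A real failure on any subvolume fails the whole ipc.
            return r
    return 0
```

A mix of successes and ENOTCONN replies still counts as success, since the down subvolumes get the ipc replayed once they are back up.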

Comment 120 Worker Ant 2016-10-03 09:48:59 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#6) for review on master by Poornima G (pgurusid)

Comment 121 Worker Ant 2016-10-14 06:16:29 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#7) for review on master by Poornima G (pgurusid)

Comment 122 Worker Ant 2016-10-19 09:29:22 UTC
REVIEW: http://review.gluster.org/15398 (md-cache, afr: Reduce the window of stale read) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 123 Worker Ant 2016-10-20 07:08:02 UTC
COMMIT: http://review.gluster.org/15398 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 8d8eded58cd5431a7000a70337444b828cb400d8
Author: Poornima G <pgurusid>
Date:   Sun Sep 4 08:27:47 2016 +0530

    md-cache, afr: Reduce the window of stale read
    
    Problem:
    Consider a replica setup, where one mount writes data to a
    file and the other mount reads the file. In afr, read operations
    are not transaction based. A brick (the read subvolume) is chosen as
    part of lookup or other operations, and reads are always wound only
    to the read subvolume, even if a write from a different client
    failed on this brick. This stale read continues until there is
    a lookup or any write operation from the mount point. Currently, this
    is not a major issue, as a lookup is issued before every read and it will
    switch the read subvolume to a correct one. But with the plan of
    increasing the md-cache timeout to 600s, the stale read problem will be
    more pronounced, i.e. a stale read can continue for 600s (or more if
    cascaded with readdirp), as there will be no lookups.
    
    Solution:
    Afr doesn't have any built-in solution for stale reads (without affecting
    performance). The solution that came up was to use upcall. When a file
    on any brick is marked bad for the first time, upcall sends a notification
    to all the clients that had recently accessed the file. The solution has
    2 parts:
    - Identifying when a file is marked bad, on any of the bricks,
      for the first time
    - Client side actions on receiving the notifications
    
    Identifying when a file is marked bad on any of the bricks for the first time:
    -----------------------------------------------------------------------------
    The idea is to track xattrop in upcall. xattrop currently comes with 2 afr
    xattrs - afr dirty bit and afr pending xattrs.
       Dirty xattr is set to 1 before every write, and is unset if the write
    succeeds. In certain scenarios, the dirty xattr can be 0 and the file could
    still be a bad copy. Hence do not track the dirty xattr.
       Pending xattr is set on the good copy, indicating the other bricks that
    have a bad copy. Still, it is not as simple as notifying whenever any of the
    pending xattrs changes. That could lead to a flood of notifications in case
    the other brick is completely down or consistently failing. Hence it is
    important to notify only once, the first time a good copy is marked bad.
    
    Client side actions on receiving pending xattr change notification:
    -------------------------------------------------------------------
    md-cache will invalidate the cache of that file, so that a further lookup is
    passed down to afr and hence updates the read subvolume. Invalidating only in
    md-cache is not enough; consider the following order of operations:
    - pending xattr invalidation - invalidate md-cache
    - readdirp on the bad read subvolume - fill md-cache
    - lookup (served from md-cache)
    - read - wound to the old read subvol.
    Hence, along with invalidating md-cache, it is very important to reset the
    read subvolume for that file, in afr.
    
    Design Credit: Anuradha Talur, Ravishankar N
    
    1. xattrop doesn't carry info saying whether it is a post-op or a pre-op.
    2. Pre xattrop will have 0 value for all pending xattrs;
       the cbk of pre xattrop carries the on-disk xattr value.
       A non-zero value indicates that healing is required.
    3. Post xattrop will have a non-zero value for any of the
       pending xattrs, if the fop failed on any of the bricks.
    
    Change-Id: I469cbc111714c433984fe1c922be2ef113c25804
    BUG: 1211863
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/15398
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
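
The two halves of this design can be sketched as follows. This is a simplified model under stated assumptions: the xattr key names are made up, `first_time_marked_bad` is a hypothetical helper that infers "first time" from the on-disk values seen in the pre-op callback and the post-op values, and the client class stands in for md-cache plus afr state.

```python
def first_time_marked_bad(on_disk_before, post_op_values):
    """Brick side: notify only when a clean file gets its first pending mark.

    on_disk_before: pending xattr values returned by the pre-op callback.
    post_op_values: pending xattr values carried by the post-op xattrop.
    """
    was_clean = all(v == 0 for v in on_disk_before.values())
    now_bad = any(v != 0 for v in post_op_values.values())
    return was_clean and now_bad


class Client:
    """Client side: state that must be reset when the upcall arrives."""

    def __init__(self):
        self.md_cache = {}      # gfid -> cached metadata
        self.read_subvol = {}   # gfid -> index of the brick reads go to

    def on_pending_xattr_upcall(self, gfid):
        # Drop the cached metadata so the next lookup goes down to afr.
        self.md_cache.pop(gfid, None)
        # Also forget the read subvolume; otherwise a readdirp could refill
        # md-cache and reads would keep going to the bad brick.
        self.read_subvol.pop(gfid, None)
```

A brick that keeps failing produces further non-zero post-op values, but since the on-disk value is no longer zero the helper returns False, so the flood of notifications described above is avoided.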

Comment 124 Shyamsundar 2017-03-06 17:18:18 UTC
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

