Bug 1418091 - [RFE] Support multiple bricks in one process (multiplexing)
Summary: [RFE] Support multiple bricks in one process (multiplexing)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1385758
Blocks:
 
Reported: 2017-01-31 19:52 UTC by Jeff Darcy
Modified: 2017-03-06 17:44 UTC
CC List: 1 user

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1385758
Environment:
Last Closed: 2017-03-06 17:44:58 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Jeff Darcy 2017-01-31 19:52:18 UTC
+++ This bug was initially created as a clone of Bug #1385758 +++

The primary goal is to support more bricks/volumes by reducing port and memory consumption. Secondarily, this could improve performance in the long term, though so far performance is slightly degraded. The feature page is here:

https://github.com/gluster/glusterfs-specs/blob/master/under_review/multiplexing.md

(note that this might move to a different directory as the feature moves through the upstream feature approval/tracking process)

--- Additional comment from Worker Ant on 2016-10-17 13:12:37 EDT ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#4) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-17 19:40:49 EDT ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#5) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-17 20:00:00 EDT ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#6) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-24 15:47:39 EDT ---

REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#4) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-26 12:57:32 EDT ---

REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#5) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-26 19:38:21 EDT ---

REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#6) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-27 08:31:13 EDT ---

REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#7) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-27 12:19:28 EDT ---

REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#8) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-27 22:09:00 EDT ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#7) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-28 15:31:56 EDT ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#22) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-07 11:57:58 EST ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#8) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-07 12:28:12 EST ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#9) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-17 17:16:04 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#23) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-23 17:41:19 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#24) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-23 19:44:49 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#25) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-01 15:25:27 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#26) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-02 16:38:35 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#27) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-08 16:27:10 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#28) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 14:08:54 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#29) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 21:50:47 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#30) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 21:55:29 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#31) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 23:03:30 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#32) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-13 08:08:05 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#33) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 12:29:29 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#34) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 12:33:28 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#35) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 12:55:02 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#36) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 13:59:38 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#37) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 16:42:07 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#38) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 22:00:02 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#39) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-16 16:30:10 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#40) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-20 15:51:13 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#41) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-21 13:16:24 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#42) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-09 11:19:47 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#43) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-09 11:19:56 EST ---

REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#10) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-10 17:28:11 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#44) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-11 13:11:52 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#45) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 13:30:55 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#46) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 14:15:31 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#5) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 17:17:09 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#6) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 19:27:50 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#7) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 23:14:16 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#8) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 23:43:10 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#47) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-13 13:52:23 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#48) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-13 13:53:13 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#9) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 11:21:43 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#49) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 11:22:04 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#10) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 17:42:14 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#50) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 17:44:26 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#11) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 09:04:55 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#51) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 09:05:11 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#12) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 12:30:20 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#52) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 14:29:32 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#13) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-19 16:33:18 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#53) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-19 16:50:50 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#14) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-19 20:29:03 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#15) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 12:47:43 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#54) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 15:23:01 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#55) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 17:33:40 EST ---

REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#56) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 17:34:09 EST ---

REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#16) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 07:29:58 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#59) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 12:48:14 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#60) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 22:24:45 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#61) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 23:46:08 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#62) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 08:54:06 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#63) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 09:34:42 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#64) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 10:19:16 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#65) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 13:25:02 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#66) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 21:27:52 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#67) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 11:15:40 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#68) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 12:44:18 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#69) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 13:05:05 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#70) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 15:18:30 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#71) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 15:25:58 EST ---

REVIEW: https://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#18) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 16:32:53 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#72) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 16:33:16 EST ---

REVIEW: https://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#19) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-27 17:43:36 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#73) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-27 20:04:07 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#74) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-28 07:16:56 EST ---

REVIEW: https://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#11) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-30 13:18:08 EST ---

REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#75) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-30 14:30:52 EST ---

REVIEW: https://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#12) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-30 19:14:02 EST ---

COMMIT: https://review.gluster.org/14763 committed in master by Vijay Bellur (vbellur) 
------
commit 1a95fc3036db51b82b6a80952f0908bc2019d24a
Author: Jeff Darcy <jdarcy>
Date:   Thu Dec 8 16:24:15 2016 -0500

    core: run many bricks within one glusterfsd process
    
    This patch adds support for multiple brick translator stacks running
    in a single brick server process.  This reduces our per-brick memory usage by
    approximately 3x, and our appetite for TCP ports even more.  It also creates
    potential to avoid process/thread thrashing, and to improve QoS by scheduling
    more carefully across the bricks, but realizing that potential will require
    further work.
    
    Multiplexing is controlled by the "cluster.brick-multiplex" global option.  By
    default it's off, and bricks are started in separate processes as before.  If
    multiplexing is enabled, then *compatible* bricks (mostly those with the same
    transport options) will be started in the same process.
    
    Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
    BUG: 1385758
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/14763
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Vijay Bellur <vbellur>
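
For reference, the feature described in the commit above is enabled through the global option it names. A minimal usage sketch (CLI form per the GlusterFS 3.10 option name quoted above; behavior described is from the commit message):

    # Enable brick multiplexing cluster-wide (off by default).
    gluster volume set all cluster.brick-multiplex on

    # With multiplexing on, "gluster volume status" shows compatible
    # bricks sharing a single glusterfsd PID and port.
    gluster volume status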

Comment 1 Worker Ant 2017-01-31 20:04:28 UTC
REVIEW: https://review.gluster.org/16495 (core: run many bricks within one glusterfsd process) posted (#1) for review on master by Jeff Darcy (jdarcy)

Comment 2 Worker Ant 2017-01-31 20:25:06 UTC
REVIEW: https://review.gluster.org/16495 (core: run many bricks within one glusterfsd process) posted (#2) for review on master by Jeff Darcy (jdarcy)

Comment 3 Worker Ant 2017-01-31 20:46:00 UTC
REVIEW: https://review.gluster.org/16496 (core: run many bricks within one glusterfsd process) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 4 Worker Ant 2017-02-02 00:55:02 UTC
COMMIT: https://review.gluster.org/16496 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 83803b4b2d70e9e6e16bb050d7ac8e49ba420893
Author: Jeff Darcy <jdarcy>
Date:   Tue Jan 31 14:49:45 2017 -0500

    core: run many bricks within one glusterfsd process
    
    This patch adds support for multiple brick translator stacks running in
    a single brick server process.  This reduces our per-brick memory usage
    by approximately 3x, and our appetite for TCP ports even more.  It also
    creates potential to avoid process/thread thrashing, and to improve QoS
    by scheduling more carefully across the bricks, but realizing that
    potential will require further work.
    
    Multiplexing is controlled by the "cluster.brick-multiplex" global
    option.  By default it's off, and bricks are started in separate
    processes as before.  If multiplexing is enabled, then *compatible*
    bricks (mostly those with the same transport options) will be started in
    the same process.
    
    Backport of:
    > Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/14763
    
    Change-Id: I4bce9080f6c93d50171823298fdf920258317ee8
    BUG: 1418091
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16496
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>

Comment 5 Worker Ant 2017-02-02 19:41:39 UTC
REVIEW: https://review.gluster.org/16531 (libglusterfs: make memory pools more thread-friendly) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 6 Worker Ant 2017-02-02 20:08:32 UTC
REVIEW: https://review.gluster.org/16532 (glusterd: double-check whether brick is alive for stats) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 7 Worker Ant 2017-02-02 20:24:01 UTC
REVIEW: https://review.gluster.org/16533 (socket: retry connect immediately if it fails) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 8 Worker Ant 2017-02-02 20:48:26 UTC
REVIEW: https://review.gluster.org/16534 (tests: use kill_brick instead of kill -9) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 9 Worker Ant 2017-02-03 00:43:28 UTC
COMMIT: https://review.gluster.org/16533 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 0a0112b2c02a30bcb7eca8fa9ecb7fbbe84aa7f8
Author: Jeff Darcy <jdarcy>
Date:   Wed Feb 1 22:00:32 2017 -0500

    socket: retry connect immediately if it fails
    
    Previously we relied on a complex dance of setting flags, shutting
    down the socket, tearing stuff down, getting an event, tearing more
    stuff down, and waiting for a higher-level retry.  What we really
    need, in the case where we're just trying to connect prematurely e.g.
    to a brick that hasn't fully come up yet, is a simple retry of the
    connect(2) call.
    
    This was discovered by observing failures in ec-new-entry.t with
    multiplexing enabled, but probably fixes other random failures as
    well.
    
    Backport of:
    > Change-Id: Ibedb8942060bccc96b02272a333c3002c9b77d4c
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/16510
    
    BUG: 1418091
    Change-Id: I4bac26929a12cabcee4f9e557c8b4d520948378b
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16533
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
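
The essence of the fix is to retry connect(2) directly instead of driving the whole transport teardown/reconnect cycle. A simplified sketch of the idea in C (illustrative only; the helper name and retry policy are assumptions, not the actual socket.c change):

    #include <errno.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Hypothetical helper: retry a premature connect() a few times.
     * ECONNREFUSED here usually just means the brick process exists
     * but hasn't started listening yet. */
    static int
    connect_with_retry (const struct sockaddr *addr, socklen_t len,
                        int max_tries)
    {
            int sock, err, i;

            for (i = 0; i < max_tries; i++) {
                    sock = socket (addr->sa_family, SOCK_STREAM, 0);
                    if (sock < 0)
                            return -1;
                    if (connect (sock, addr, len) == 0)
                            return sock;     /* connected */
                    err = errno;
                    close (sock);            /* a failed connect taints the fd */
                    if (err != ECONNREFUSED)
                            return -1;       /* real error, give up */
                    usleep (100000);         /* 100 ms, then try again */
            }
            return -1;
    }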

Comment 10 Worker Ant 2017-02-03 00:44:13 UTC
COMMIT: https://review.gluster.org/16531 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 1ed73ffa16cb7fe4415acbdb095da6a4628f711a
Author: Jeff Darcy <jdarcy>
Date:   Fri Oct 14 10:04:07 2016 -0400

    libglusterfs: make memory pools more thread-friendly
    
    Early multiplexing tests revealed *massive* contention on certain
    pools' global locks - especially for dictionaries and secondarily for
    call stubs.  For the thread counts that multiplexing can create, a
    more lock-free solution is clearly needed.  Also, the current mem-pool
    implementation does a poor job releasing memory back to the system,
    artificially inflating memory usage to match whatever the worst case
    was since the process started.  This is bad in general, but especially
    so for multiplexing where there are more pools and a major point of
    the whole exercise is to reduce memory consumption.
    
    The basic ideas for the new design are these:
    
      There is one pool, globally, for each power-of-two size range.
      Every attempt to create a new pool within this range will instead
      add a reference to the existing pool.
    
      Instead of adding pools for each translator within each multiplexed
      brick (potentially infinite and quite possibly thousands), we
      allocate one set of size-based pools per *thread* (hundreds at
      worst).
    
      Each per-thread pool is divided into hot and cold lists.  Every
      allocation first attempts to use the hot list, then the cold list.
      When objects are freed, they always go on the hot list.
    
      There is one global "pool sweeper" thread, which periodically
      reclaims everything in each pool's cold list and then "demotes" the
      current hot list to be the new cold list.
    
      For normal allocation activity, only a per-thread lock need be
      taken, and even that only to guard against very rare contention from
      the pool sweeper.  When threads start and stop, a global lock must
      be taken to add them to the pool sweeper's list.  Lock contention is
      therefore extremely low, and the hot/cold lists also provide good
      locality.
    
    A more complete explanation (of a similar earlier design) can be found
    here:
    
     http://www.gluster.org/pipermail/gluster-devel/2016-October/051160.html
    
    Backport of:
    > Change-Id: I5bc8a1ba57cfb553998f979a498886e0d006e665
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/15645
    
    BUG: 1418091
    Change-Id: Id09bbea41f65fcd245822607bc204f3a34904dc2
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16531
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
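
The hot/cold design described above can be condensed into a small sketch (hypothetical type and function names; the real mem-pool code adds size classes, per-thread registration, and lock setup):

    #include <pthread.h>
    #include <stdlib.h>

    /* Freed objects are threaded through their own first bytes. */
    typedef struct pool_obj {
            struct pool_obj *next;
    } pool_obj_t;

    /* Hypothetical per-thread pool for one power-of-two size class;
     * the lock is assumed initialized with pthread_spin_init(). */
    typedef struct per_thread_pool {
            pthread_spinlock_t lock;  /* contended only by the sweeper */
            pool_obj_t        *hot;   /* recently freed, reused first */
            pool_obj_t        *cold;  /* untouched for one sweep interval */
            size_t             obj_size;
    } per_thread_pool_t;

    static void *
    pool_alloc (per_thread_pool_t *p)
    {
            pool_obj_t *obj = NULL;

            pthread_spin_lock (&p->lock);
            if (p->hot) {                 /* hot list first ... */
                    obj = p->hot;
                    p->hot = obj->next;
            } else if (p->cold) {         /* ... then cold list */
                    obj = p->cold;
                    p->cold = obj->next;
            }
            pthread_spin_unlock (&p->lock);

            return obj ? (void *) obj : malloc (p->obj_size);
    }

    static void
    pool_free (per_thread_pool_t *p, void *ptr)
    {
            pool_obj_t *obj = ptr;

            pthread_spin_lock (&p->lock);
            obj->next = p->hot;           /* frees always go on the hot list */
            p->hot = obj;
            pthread_spin_unlock (&p->lock);
    }

    /* Run periodically by the one global "pool sweeper" thread: release
     * everything still cold, then demote the hot list to cold. */
    static void
    pool_sweep (per_thread_pool_t *p)
    {
            pool_obj_t *reclaim, *next;

            pthread_spin_lock (&p->lock);
            reclaim = p->cold;
            p->cold = p->hot;
            p->hot  = NULL;
            pthread_spin_unlock (&p->lock);

            for (; reclaim; reclaim = next) {
                    next = reclaim->next;
                    free (reclaim);
            }
    }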

Comment 11 Worker Ant 2017-02-03 00:45:07 UTC
COMMIT: https://review.gluster.org/16532 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit da30f79c9e35ab8cca71601a33665af72d0880ff
Author: Jeff Darcy <jdarcy>
Date:   Wed Feb 1 21:54:30 2017 -0500

    glusterd: double-check whether brick is alive for stats
    
    With multiplexing, our tests detach bricks from their host processes
    without glusterd being involved.  Thus, when we ask glusterd to fetch
    profile info, it will try to fetch from a brick that's actually not
    present any more.  While it can handle the process being dead and its
    RPC connection being closed, it barfs if it gets a negative response
    from a live brick process.  This is not a problem in normal use,
    because the brick can't disappear without glusterd seeing it.  The fix
    is to double check that the brick is actually running, by looking for
    its pidfile which the tests *do* clean up as part of killing a brick.
    
    Backport of:
    > Change-Id: I098465b175ecf23538bd7207357c752a2bba8f4e
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/16509
    
    BUG: 1418091
    Change-Id: Ia61e273134520c8ccfa3371ee2370cb9a1920877
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16532
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
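
The liveness double-check boils down to reading the brick's pidfile and probing the PID with signal 0. A simplified sketch (hypothetical helper name; not the actual glusterd code):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>

    /* Return 1 if the pidfile names a live process, else 0.  The tests
     * remove the pidfile when killing a brick, so this stays accurate
     * even when glusterd was never told about the detach. */
    static int
    brick_is_alive (const char *pidfile_path)
    {
            FILE *fp = fopen (pidfile_path, "r");
            long  pid = 0;

            if (!fp)
                    return 0;        /* no pidfile: brick is gone */
            if (fscanf (fp, "%ld", &pid) != 1)
                    pid = 0;
            fclose (fp);

            /* kill() with signal 0 checks existence without signaling. */
            return (pid > 0) && (kill ((pid_t) pid, 0) == 0);
    }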

Comment 12 Worker Ant 2017-02-03 04:02:19 UTC
REVIEW: https://review.gluster.org/16534 (tests: use kill_brick instead of kill -9) posted (#2) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 13 Worker Ant 2017-02-03 04:09:04 UTC
REVIEW: https://review.gluster.org/16537 (glusterd: double-check brick liveness for remove-brick validation) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 14 Worker Ant 2017-02-03 13:46:52 UTC
COMMIT: https://review.gluster.org/16534 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 0692e1c8f40333cc85637a85663b353f20f663b9
Author: Jeff Darcy <jdarcy>
Date:   Thu Feb 2 11:41:06 2017 -0500

    tests: use kill_brick instead of kill -9
    
    The system actually handles this OK, but with multiplexing the result
    of killing the whole process is not what some tests assumed.
    
    Backport of:
    > Change-Id: I89ebf0039ab1369f25b0bfec3710ec4c13725915
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/16528
    
    BUG: 1418091
    Change-Id: I39943e27b4b1a5e56142f48dc9ef2cdebe0a9d5b
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16534
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
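
In test terms the change looks like this ($brick_pid is a stand-in; kill_brick is the existing helper from tests/volume.rc, assumed to take the usual volname/host/brick-path arguments):

    # Old pattern: kill the whole glusterfsd process.  Under multiplexing
    # this takes down every brick hosted by that process, not just one.
    kill -9 $brick_pid

    # New pattern: detach only the targeted brick via the helper.
    kill_brick $V0 $H0 $B0/${V0}1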

Comment 15 Worker Ant 2017-02-03 13:46:57 UTC
COMMIT: https://review.gluster.org/16537 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit ef25dbe39fd5e89a208b9c1698aaab74c014d7d5
Author: Jeff Darcy <jdarcy>
Date:   Thu Feb 2 13:08:04 2017 -0500

    glusterd: double-check brick liveness for remove-brick validation
    
    Same problem as https://review.gluster.org/#/c/16509/ in a different
    place.  Tests detach bricks without glusterd's knowledge, so
    glusterd's internal brick state is out of date and we have to re-check
    (via the brick's pidfile) as well.
    
    Backport of:
    > BUG: 1385758
    > Change-Id: I169538c1c62d72a685a49d57ef65fb6c3db6eab2
    > Reviewed-on: https://review.gluster.org/16529
    
    BUG: 1418091
    Change-Id: Id0b597bc60807ed090f6ecdba549c5cf3d758f98
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16537
    Reviewed-by: Atin Mukherjee <amukherj>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 16 Worker Ant 2017-02-08 14:47:24 UTC
REVIEW: https://review.gluster.org/16565 (tests: fix online_brick_count for multiplexing) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 17 Worker Ant 2017-02-08 20:03:40 UTC
COMMIT: https://review.gluster.org/16565 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 0b3255e0ce1d4d407467b34f7d6ad91161b43cfc
Author: Jeff Darcy <jdarcy>
Date:   Thu Feb 2 10:22:00 2017 -0500

    tests: fix online_brick_count for multiplexing
    
    The number of brick processes no longer matches the number of bricks,
    therefore counting processes doesn't work.  Counting *pidfiles* does.
    Ironically, the fix broke multiplex.t which used this function, so it
    now uses a different function with the old process-counting behavior.
    Also had to fix online_brick_count and kill_node in cluster.rc to be
    consistent with the new reality.
    
    Backport of:
    > Change-Id: I4e81a6633b93227e10604f53e18a0b802c75cbcc
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/16527
    
    Change-Id: I70b5cd169eafe3ad5b523bc0a30d21d864b3036a
    BUG: 1418091
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16565
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
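
The gist of the fix, as a sketch (the pidfile location is an assumption based on glusterd's working-directory layout; the actual volume.rc code differs in details):

    # Before: one glusterfsd process per brick, so processes were countable.
    pgrep -c glusterfsd

    # After: bricks share processes, so count live per-brick pidfiles instead.
    ls $GLUSTERD_WORKDIR/vols/*/run/*.pid 2>/dev/null | wc -l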

Comment 18 Worker Ant 2017-02-10 13:21:39 UTC
REVIEW: https://review.gluster.org/16597 (glusterd: keep snapshot bricks separate from regular ones) posted (#2) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 19 Worker Ant 2017-02-10 13:22:51 UTC
REVIEW: https://review.gluster.org/16598 (tests: reenable trash.t) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)

Comment 20 Worker Ant 2017-02-10 19:23:42 UTC
COMMIT: https://review.gluster.org/16597 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit bbbc6792d58705a1696f53d5e5f41e86c8345f14
Author: Jeff Darcy <jdarcy>
Date:   Fri Feb 3 10:51:21 2017 -0500

    glusterd: keep snapshot bricks separate from regular ones
    
    The problem here is that a volume's transport options can change, but
    any snapshots' bricks don't follow along even though they're now
    incompatible (with respect to multiplexing).  This was causing the
    USS+SSL test to fail.  By keeping the snapshot bricks separate
    (though still potentially multiplexed with other snapshot bricks
    including those for other volumes) we can ensure that they remain
    unaffected by changes to their parent volumes.
    
    Also fixed various issues with how the test waits (or more precisely
    didn't) for various events to complete before it continues.
    
    Backport of:
    > Change-Id: Iab4a8a44fac5760373fac36956a3bcc27cf969da
    > BUG: 1385758
    > Reviewed-on: https://review.gluster.org/16544
    Change-Id: I91c73e3fdf20d23bff15fbfcc03a8a1922acec27
    BUG: 1418091
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16597
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>

Comment 21 Worker Ant 2017-02-13 13:19:43 UTC
COMMIT: https://review.gluster.org/16598 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 12cbaabb16ad1f1e5156c35dafe6a7a29a2027a1
Author: Jeff Darcy <jdarcy>
Date:   Thu Feb 9 09:53:51 2017 -0500

    tests: reenable trash.t
    
    Now that the underlying bug has been fixed (by d97e63d0) we can allow
    the test to run again.
    
    Backport of:
    > Change-Id: If9736d142f414bf9af5481659c2b2673ec797a4b
    > BUG: 1420434
    > Reviewed-on: https://review.gluster.org/16584
    
    Change-Id: I44edf2acfd5f9ab33bdc95cc9fc981e04c3eead1
    BUG: 1418091
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16598
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 22 Shyamsundar 2017-03-06 17:44:58 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

