+++ This bug was initially created as a clone of Bug #1385758 +++

Primary goal is to support more bricks/volumes by reducing port and memory consumption. Secondarily, this could improve performance in the long term, though so far performance is slightly degraded.

Feature page: https://github.com/gluster/glusterfs-specs/blob/master/under_review/multiplexing.md (note that this might move to a different directory as the feature moves through the upstream feature approval/tracking process)

--- Additional comment from Worker Ant on 2016-10-17 13:12:37 EDT ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#4) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-17 19:40:49 EDT ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#5) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-17 20:00:00 EDT ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#6) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-24 15:47:39 EDT ---
REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#4) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-26 12:57:32 EDT ---
REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#5) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-26 19:38:21 EDT ---
REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#6) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-27 08:31:13 EDT ---
REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#7) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-27 12:19:28 EDT ---
REVIEW: http://review.gluster.org/15643 (io-threads: do global scaling instead of per-instance) posted (#8) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-27 22:09:00 EDT ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#7) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-10-28 15:31:56 EDT ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#22) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-07 11:57:58 EST ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#8) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-07 12:28:12 EST ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#9) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-17 17:16:04 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#23) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-23 17:41:19 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#24) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-11-23 19:44:49 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#25) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-01 15:25:27 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#26) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-02 16:38:35 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#27) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-08 16:27:10 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#28) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 14:08:54 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#29) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 21:50:47 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#30) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 21:55:29 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#31) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-12 23:03:30 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#32) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-13 08:08:05 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#33) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 12:29:29 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#34) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 12:33:28 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#35) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 12:55:02 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#36) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 13:59:38 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#37) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 16:42:07 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#38) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-14 22:00:02 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#39) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-16 16:30:10 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#40) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-20 15:51:13 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#41) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2016-12-21 13:16:24 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#42) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-09 11:19:47 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#43) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-09 11:19:56 EST ---
REVIEW: http://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#10) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-10 17:28:11 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#44) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-11 13:11:52 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#45) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 13:30:55 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#46) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 14:15:31 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#5) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 17:17:09 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#6) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 19:27:50 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#7) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 23:14:16 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#8) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-12 23:43:10 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#47) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-13 13:52:23 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#48) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-13 13:53:13 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#9) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 11:21:43 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#49) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 11:22:04 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#10) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 17:42:14 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#50) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-17 17:44:26 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#11) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 09:04:55 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#51) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 09:05:11 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#12) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 12:30:20 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#52) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-18 14:29:32 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#13) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-19 16:33:18 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#53) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-19 16:50:50 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#14) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-19 20:29:03 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#15) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 12:47:43 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#54) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 15:23:01 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#55) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 17:33:40 EST ---
REVIEW: http://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#56) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-20 17:34:09 EST ---
REVIEW: http://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#16) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 07:29:58 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#59) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 12:48:14 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#60) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 22:24:45 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#61) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-24 23:46:08 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#62) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 08:54:06 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#63) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 09:34:42 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#64) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 10:19:16 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#65) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 13:25:02 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#66) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-25 21:27:52 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#67) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 11:15:40 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#68) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 12:44:18 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#69) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 13:05:05 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#70) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 15:18:30 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#71) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 15:25:58 EST ---
REVIEW: https://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#18) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 16:32:53 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#72) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-26 16:33:16 EST ---
REVIEW: https://review.gluster.org/16363 (multiple: combo patch with multiplexing enabled) posted (#19) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-27 17:43:36 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#73) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-27 20:04:07 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#74) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-28 07:16:56 EST ---
REVIEW: https://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#11) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-30 13:18:08 EST ---
REVIEW: https://review.gluster.org/14763 (core: run many bricks within one glusterfsd process) posted (#75) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-30 14:30:52 EST ---
REVIEW: https://review.gluster.org/15645 (libglusterfs: make memory pools more thread-friendly) posted (#12) for review on master by Jeff Darcy (jdarcy)

--- Additional comment from Worker Ant on 2017-01-30 19:14:02 EST ---
COMMIT: https://review.gluster.org/14763 committed in master by Vijay Bellur (vbellur)
------
commit 1a95fc3036db51b82b6a80952f0908bc2019d24a
Author: Jeff Darcy <jdarcy>
Date: Thu Dec 8 16:24:15 2016 -0500

core: run many bricks within one glusterfsd process

This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work.

Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process.

Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Vijay Bellur <vbellur>
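As a usage note (not part of the original comments): the commit message names "cluster.brick-multiplex" as a global option, which suggests toggling it with the CLI's "all" scope roughly as sketched below. This is an illustrative fragment, not taken from the patch itself.

```shell
# Sketch: enable brick multiplexing cluster-wide on a build that
# includes this patch ("all" is assumed here as the scope glusterd
# uses for global options).
gluster volume set all cluster.brick-multiplex on

# And to return to the default one-process-per-brick behavior:
gluster volume set all cluster.brick-multiplex off
```

With multiplexing on, newly started compatible bricks should share a single glusterfsd process rather than each spawning their own.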
REVIEW: https://review.gluster.org/16495 (core: run many bricks within one glusterfsd process) posted (#1) for review on master by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16495 (core: run many bricks within one glusterfsd process) posted (#2) for review on master by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16496 (core: run many bricks within one glusterfsd process) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
COMMIT: https://review.gluster.org/16496 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 83803b4b2d70e9e6e16bb050d7ac8e49ba420893
Author: Jeff Darcy <jdarcy>
Date: Tue Jan 31 14:49:45 2017 -0500

core: run many bricks within one glusterfsd process

This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work.

Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process.

Backport of:
> Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/14763

Change-Id: I4bce9080f6c93d50171823298fdf920258317ee8
BUG: 1418091
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16496
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
REVIEW: https://review.gluster.org/16531 (libglusterfs: make memory pools more thread-friendly) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16532 (glusterd: double-check whether brick is alive for stats) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16533 (socket: retry connect immediately if it fails) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16534 (tests: use kill_brick instead of kill -9) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
COMMIT: https://review.gluster.org/16533 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 0a0112b2c02a30bcb7eca8fa9ecb7fbbe84aa7f8
Author: Jeff Darcy <jdarcy>
Date: Wed Feb 1 22:00:32 2017 -0500

socket: retry connect immediately if it fails

Previously we relied on a complex dance of setting flags, shutting down the socket, tearing stuff down, getting an event, tearing more stuff down, and waiting for a higher-level retry. What we really need, in the case where we're just trying to connect prematurely, e.g. to a brick that hasn't fully come up yet, is a simple retry of the connect(2) call. This was discovered by observing failures in ec-new-entry.t with multiplexing enabled, but probably fixes other random failures as well.

Backport of:
> Change-Id: Ibedb8942060bccc96b02272a333c3002c9b77d4c
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/16510

BUG: 1418091
Change-Id: I4bac26929a12cabcee4f9e557c8b4d520948378b
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16533
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
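The idea of the fix, retry the connect itself instead of tearing everything down and waiting for a higher-level reconnect, can be illustrated with a small shell analogue. The function name and arguments are invented for illustration; bash's /dev/tcp pseudo-device stands in for the connect(2) call.

```shell
# Hypothetical sketch of "retry connect immediately": attempt a TCP
# connect a bounded number of times before giving up, rather than
# propagating the first failure upward.
try_connect() {
    host=$1; port=$2; attempts=$3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # bash's /dev/tcp redirection performs a connect(2) under the hood
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0    # connected on this attempt
        fi
        i=$((i + 1))    # failed (e.g. brick not up yet): retry at once
    done
    return 1
}
```

This mirrors the commit's point that a premature connect to a brick that hasn't fully come up is best handled by simply trying again, not by a full teardown/reconnect cycle.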
COMMIT: https://review.gluster.org/16531 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 1ed73ffa16cb7fe4415acbdb095da6a4628f711a
Author: Jeff Darcy <jdarcy>
Date: Fri Oct 14 10:04:07 2016 -0400

libglusterfs: make memory pools more thread-friendly

Early multiplexing tests revealed *massive* contention on certain pools' global locks - especially for dictionaries and secondarily for call stubs. For the thread counts that multiplexing can create, a more lock-free solution is clearly needed. Also, the current mem-pool implementation does a poor job releasing memory back to the system, artificially inflating memory usage to match whatever the worst case was since the process started. This is bad in general, but especially so for multiplexing, where there are more pools and a major point of the whole exercise is to reduce memory consumption.

The basic ideas for the new design are these:

* There is one pool, globally, for each power-of-two size range. Every attempt to create a new pool within this range will instead add a reference to the existing pool.

* Instead of adding pools for each translator within each multiplexed brick (potentially infinite and quite possibly thousands), we allocate one set of size-based pools per *thread* (hundreds at worst).

* Each per-thread pool is divided into hot and cold lists. Every allocation first attempts to use the hot list, then the cold list. When objects are freed, they always go on the hot list.

* There is one global "pool sweeper" thread, which periodically reclaims everything in each pool's cold list and then "demotes" the current hot list to be the new cold list.

For normal allocation activity, only a per-thread lock need be taken, and even that only to guard against very rare contention from the pool sweeper. When threads start and stop, a global lock must be taken to add them to the pool sweeper's list. Lock contention is therefore extremely low, and the hot/cold lists also provide good locality.

A more complete explanation (of a similar earlier design) can be found here: http://www.gluster.org/pipermail/gluster-devel/2016-October/051160.html

Backport of:
> Change-Id: I5bc8a1ba57cfb553998f979a498886e0d006e665
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/15645

BUG: 1418091
Change-Id: Id09bbea41f65fcd245822607bc204f3a34904dc2
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16531
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
COMMIT: https://review.gluster.org/16532 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit da30f79c9e35ab8cca71601a33665af72d0880ff
Author: Jeff Darcy <jdarcy>
Date: Wed Feb 1 21:54:30 2017 -0500

glusterd: double-check whether brick is alive for stats

With multiplexing, our tests detach bricks from their host processes without glusterd being involved. Thus, when we ask glusterd to fetch profile info, it will try to fetch from a brick that's actually not present any more. While it can handle the process being dead and its RPC connection being closed, it barfs if it gets a negative response from a live brick process. This is not a problem in normal use, because the brick can't disappear without glusterd seeing it. The fix is to double-check that the brick is actually running, by looking for its pidfile, which the tests *do* clean up as part of killing a brick.

Backport of:
> Change-Id: I098465b175ecf23538bd7207357c752a2bba8f4e
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/16509

BUG: 1418091
Change-Id: Ia61e273134520c8ccfa3371ee2370cb9a1920877
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16532
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
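The pidfile-based liveness double-check described above can be sketched in shell. The function name and pidfile path are invented for illustration; glusterd's actual check is in C, but the logic is the same: a brick counts as running only if its pidfile still exists and the recorded pid is still signalable.

```shell
# Hypothetical sketch: treat a brick as alive only if its pidfile is
# present (the tests remove it when killing a brick) AND the pid it
# records answers signal 0 (existence check, delivers no signal).
brick_is_alive() {
    pidfile=$1
    [ -s "$pidfile" ] || return 1             # pidfile gone: brick was detached
    kill -0 "$(cat "$pidfile")" 2>/dev/null   # pid present and signalable?
}
```

Checking the pidfile rather than the process matters under multiplexing because the host process can outlive any individual brick it serves.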
REVIEW: https://review.gluster.org/16534 (tests: use kill_brick instead of kill -9) posted (#2) for review on release-3.10 by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16537 (glusterd: double-check brick liveness for remove-brick validation) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
COMMIT: https://review.gluster.org/16534 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 0692e1c8f40333cc85637a85663b353f20f663b9
Author: Jeff Darcy <jdarcy>
Date: Thu Feb 2 11:41:06 2017 -0500

tests: use kill_brick instead of kill -9

The system actually handles this OK, but with multiplexing the result of killing the whole process is not what some tests assumed.

Backport of:
> Change-Id: I89ebf0039ab1369f25b0bfec3710ec4c13725915
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/16528

BUG: 1418091
Change-Id: I39943e27b4b1a5e56142f48dc9ef2cdebe0a9d5b
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16534
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
COMMIT: https://review.gluster.org/16537 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit ef25dbe39fd5e89a208b9c1698aaab74c014d7d5
Author: Jeff Darcy <jdarcy>
Date: Thu Feb 2 13:08:04 2017 -0500

glusterd: double-check brick liveness for remove-brick validation

Same problem as https://review.gluster.org/#/c/16509/ in a different place. Tests detach bricks without glusterd's knowledge, so glusterd's internal brick state is out of date and we have to re-check (via the brick's pidfile) as well.

Backport of:
> BUG: 1385758
> Change-Id: I169538c1c62d72a685a49d57ef65fb6c3db6eab2
> Reviewed-on: https://review.gluster.org/16529

BUG: 1418091
Change-Id: Id0b597bc60807ed090f6ecdba549c5cf3d758f98
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16537
Reviewed-by: Atin Mukherjee <amukherj>
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
REVIEW: https://review.gluster.org/16565 (tests: fix online_brick_count for multiplexing) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
COMMIT: https://review.gluster.org/16565 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 0b3255e0ce1d4d407467b34f7d6ad91161b43cfc
Author: Jeff Darcy <jdarcy>
Date: Thu Feb 2 10:22:00 2017 -0500

tests: fix online_brick_count for multiplexing

The number of brick processes no longer matches the number of bricks, therefore counting processes doesn't work. Counting *pidfiles* does. Ironically, the fix broke multiplex.t, which used this function, so it now uses a different function with the old process-counting behavior. Also had to fix online_brick_count and kill_node in cluster.rc to be consistent with the new reality.

Backport of:
> Change-Id: I4e81a6633b93227e10604f53e18a0b802c75cbcc
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/16527

Change-Id: I70b5cd169eafe3ad5b523bc0a30d21d864b3036a
BUG: 1418091
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16565
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
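The "count pidfiles, not processes" idea can be sketched as a tiny shell helper. The function name and directory layout are invented for illustration, not the actual online_brick_count from the test framework: the point is only that under multiplexing many bricks share one glusterfsd process, so pidfiles remain the unit that maps one-to-one to bricks.

```shell
# Hypothetical sketch: count online bricks by counting their pidfiles
# in a given directory, since counting glusterfsd processes undercounts
# once multiplexing packs many bricks into one process.
count_brick_pidfiles() {
    ls "$1"/*.pid 2>/dev/null | wc -l
}
```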
REVIEW: https://review.gluster.org/16597 (glusterd: keep snapshot bricks separate from regular ones) posted (#2) for review on release-3.10 by Jeff Darcy (jdarcy)
REVIEW: https://review.gluster.org/16598 (tests: reenable trash.t) posted (#1) for review on release-3.10 by Jeff Darcy (jdarcy)
COMMIT: https://review.gluster.org/16597 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit bbbc6792d58705a1696f53d5e5f41e86c8345f14
Author: Jeff Darcy <jdarcy>
Date: Fri Feb 3 10:51:21 2017 -0500

glusterd: keep snapshot bricks separate from regular ones

The problem here is that a volume's transport options can change, but any snapshots' bricks don't follow along even though they're now incompatible (with respect to multiplexing). This was causing the USS+SSL test to fail. By keeping the snapshot bricks separate (though still potentially multiplexed with other snapshot bricks, including those for other volumes) we can ensure that they remain unaffected by changes to their parent volumes. Also fixed various issues with how the test waits (or more precisely didn't wait) for various events to complete before it continues.

Backport of:
> Change-Id: Iab4a8a44fac5760373fac36956a3bcc27cf969da
> BUG: 1385758
> Reviewed-on: https://review.gluster.org/16544

Change-Id: I91c73e3fdf20d23bff15fbfcc03a8a1922acec27
BUG: 1418091
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16597
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
COMMIT: https://review.gluster.org/16598 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 12cbaabb16ad1f1e5156c35dafe6a7a29a2027a1
Author: Jeff Darcy <jdarcy>
Date: Thu Feb 9 09:53:51 2017 -0500

tests: reenable trash.t

Now that the underlying bug has been fixed (by d97e63d0) we can allow the test to run again.

Backport of:
> Change-Id: If9736d142f414bf9af5481659c2b2673ec797a4b
> BUG: 1420434
> Reviewed-on: https://review.gluster.org/16584

Change-Id: I44edf2acfd5f9ab33bdc95cc9fc981e04c3eead1
BUG: 1418091
Signed-off-by: Jeff Darcy <jdarcy>
Reviewed-on: https://review.gluster.org/16598
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>
CentOS-regression: Gluster Build System <jenkins.org>
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/