Bug 765336 - (GLUSTER-3604) Streamline translator-local structure allocation
Streamline translator-local structure allocation
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: core (Show other bugs)
pre-release
x86_64 Linux
medium Severity low
: ---
: ---
Assigned To: Amar Tumballi
Raghavendra Bhat
:
Depends On:
Blocks: 817967
  Show dependency treegraph
 
Reported: 2011-09-21 13:26 EDT by Jeff Darcy
Modified: 2013-12-18 19:06 EST (History)
2 users (show)

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 13:36:07 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: glusterfs-3.3.0qa43
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Jeff Darcy 2011-09-21 13:26:43 EDT
Right now, the translator API includes a "local" field in the
call_frame_t structure, which many translators use to hold their own
state across asynchronous calls/callbacks.  Typically, they GF_CALLOC a
translator-specific structure and stick the pointer into frame->local.
This does get freed automatically when the frame itself is freed, which
is ever so slightly convenient, but otherwise we have a situation in
which every single translator that touches a request incurs a whole
system-level malloc/free pair - not even benefiting from the pool
allocation used for the frames themselves.  Besides the cycles
consumed, this also has implications for page and cache warmth.

What I'm proposing is that each translator can provide an *optional*
new entry point which the translator infrastructure can use to query
the size of the structure that the translator will be using "local" to
hold.  That infrastructure can then walk the translator graph to
determine the total size of all such structures in the longest path, and
use a single malloc/free for all at once.  This might in some cases
result in allocating space for a structure that won't be needed, but in
the great majority of cases it would be more efficient than what we have
now.  It would also be more convenient, since translators that
implement the new query function wouldn't have to manage the
local-structure life cycle themselves.  Other translators could
continue to use the current method indefinitely.

I'd be willing to write the infrastructure code for this, and modify
some of the most commonly used translators to use it as examples.  Any
interest?
Comment 1 Amar Tumballi 2011-09-21 21:13:43 EDT
We do had plans of bringing in mempools for the local allocations. Currently Shehjar started that effort (ref: http://review.gluster.com/#dashboard,1000032 )

We also want to bring in more important logs in statedump about how many times a given type of structure is allocated (so we can decide if it is high %, make it mempool or variable sized iobuf). (ref: http://review.gluster.com/376)

Considering this, I think starting on this effort right now would be bit of duplicate work. Let me know your thoughts.
Comment 2 Anand Avati 2012-02-21 05:42:15 EST
CHANGE: http://review.gluster.com/2772 (core: utilize mempool for frame->local allocations) merged in master by Anand Avati (avati@redhat.com)
Comment 3 Anand Avati 2012-02-22 07:23:54 EST
CHANGE: http://review.gluster.com/2784 (mempool: adjustments in pool sizes) merged in master by Vijay Bellur (vijay@gluster.com)
Comment 4 Raghavendra Bhat 2012-05-24 06:05:28 EDT
Now the local structure for the xlators are allocated from the mempool instead of doing malloc/free.

pool-name=mirror-replicate-1:afr_local_t
hot-count=1
cold-count=511
padded_sizeof=10612
alloc-count=48865
max-alloc=5
pool-misses=0
max-stdalloc=0
-----=-----
pool-name=mirror-replicate-2:afr_local_t
hot-count=1
cold-count=511
padded_sizeof=10612
alloc-count=42560
max-alloc=7
pool-misses=0
max-stdalloc=0
-----=-----
pool-name=mirror-dht:dht_local_t
hot-count=1
cold-count=511
padded_sizeof=1972
alloc-count=66128
max-alloc=6
pool-misses=0
max-stdalloc=0
-----=-----
pool-name=mirror-quota:quota_local_t
hot-count=1
cold-count=63
padded_sizeof=356
alloc-count=33124
max-alloc=6
pool-misses=0
max-stdalloc=0
-----=-----
pool-name=mirror-write-behind:wb_local_t

Note You need to log in before you can comment on or make changes to this bug.