Bug 1649037 - Translators allocate too much memory in their xlator_mem_acct_init()
Summary: Translators allocate too much memory in their xlator_mem_acct_init()
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-12 18:33 UTC by Yaniv Kaul
Modified: 2020-03-12 12:33 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-12 12:33:35 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Yaniv Kaul 2018-11-12 18:33:28 UTC
Description of problem:
Looking at most xlators, they size their memory accounting with their own end-of-enum value, for example (a random pick):
gf_mt_jbr_end

which is defined in xlators/experimental/jbr-server/src/jbr-internal.h as:
enum {
    gf_mt_jbr_private_t = gf_common_mt_end + 1,
    gf_mt_jbr_fd_ctx_t,
    gf_mt_jbr_inode_ctx_t,
    gf_mt_jbr_dirty_t,
    gf_mt_jbr_end
};


What is the value of gf_common_mt_end? It is the last entry of a long enum in libglusterfs/src/mem-types.h, somewhere around 150 (I did not count exactly, but it is clearly a large list).

So gf_mt_jbr_end ends up being around 155.
Then in xlator_mem_acct_init() I see this:
   xl->mem_acct = MALLOC(sizeof(struct mem_acct) +
                          sizeof(struct mem_acct_rec) * num_types);

So it seems to me that we allocate far more mem_acct_rec structs than we need: an xlator that defines only 5 private memory types still gets ~155 records allocated and memset, most of which are probably never used.
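
To make the waste concrete, here is a minimal standalone sketch of the same allocation pattern. The struct layouts and the value 150 for gf_common_mt_end are assumptions for illustration, not the real GlusterFS definitions:

    /* sketch.c - the allocation pattern, sized by the xlator's end marker */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* simplified stand-ins for the real structs */
    struct mem_acct_rec {
        size_t size;
        size_t max_size;
        unsigned int num_allocs;
        unsigned int max_num_allocs;
    };

    struct mem_acct {
        unsigned int num_types;
        struct mem_acct_rec rec[];   /* one slot per memory type */
    };

    static struct mem_acct *mem_acct_init(unsigned int num_types)
    {
        size_t bytes = sizeof(struct mem_acct) +
                       sizeof(struct mem_acct_rec) * num_types;
        struct mem_acct *acct = malloc(bytes);
        if (acct) {
            memset(acct, 0, bytes);  /* every slot is zeroed up front */
            acct->num_types = num_types;
        }
        return acct;
    }

    int main(void)
    {
        unsigned int gf_common_mt_end = 150;  /* assumed; "150 or so" */
        /* an xlator with 5 private types, as in the jbr enum above */
        unsigned int gf_mt_jbr_end = gf_common_mt_end + 5;

        struct mem_acct *acct = mem_acct_init(gf_mt_jbr_end);
        if (!acct)
            return 1;
        printf("%u slots allocated, %zu bytes of records\n",
               acct->num_types,
               acct->num_types * sizeof(struct mem_acct_rec));
        free(acct);
        return 0;
    }

With this simplified 24-byte record that is already ~3.7 KB per xlator; the real struct mem_acct_rec carries more fields, and the allocation repeats for every xlator in the graph.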


Comment 1 Vijay Bellur 2018-11-15 09:20:13 UTC
(In reply to Yaniv Kaul from comment #0)
> So it seems to me that we allocate far more mem_acct_rec structs than we
> need: an xlator that defines only 5 private memory types still gets ~155
> records allocated and memset, most of which are probably never used.

All memory accounting happens per xlator. When an xlator invokes a libglusterfs function, any allocation that happens there uses a common memory type, and the accounting is charged to the calling xlator's table at xl->mem_acct[common_mt]. Hence it is not easy to determine which memory type records will go unused, and the allocation in init() looks ok to me.
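
To illustrate that with made-up names (this is not the real gf_malloc()/GF_MALLOC() implementation; it reuses the simplified structs from the sketch in comment #0): the type index passed by libglusterfs code is a common type, but the record it updates lives in the calling xlator's own table, which is why every table has to span the common range too:

    #include <stdlib.h>

    struct xlator {
        struct mem_acct *mem_acct;  /* per-xlator table, as sketched above */
    };

    /* illustrative only - not the real allocation API */
    void *xl_malloc(struct xlator *xl, size_t size, unsigned int type)
    {
        void *ptr = malloc(size);
        if (ptr) {
            /* 'type' may be a gf_common_mt_* index (<= gf_common_mt_end) or
             * a private one (>= gf_common_mt_end + 1); both are charged to
             * the same per-xlator rec[] array, so rec[] must cover the full
             * 0..num_types range in every xlator. */
            xl->mem_acct->rec[type].size += size;
            xl->mem_acct->rec[type].num_allocs++;
        }
        return ptr;
    }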

Comment 2 Yaniv Kaul 2018-11-15 09:25:41 UTC
(In reply to Vijay Bellur from comment #1)
> Hence it is not easy to determine which memory type records will go
> unused, and the allocation in init() looks ok to me.

I think the real issue is that "common" is not really common.
If we had 10 or 100 genuinely common types, each actually shared by many xlators, fine, I could understand that. But in reality the common types are abused.
Completely random examples:
gf_common_mt_tw_ctx - used exactly once, in ctx.c
gf_mt_sql_cbk_args_t - not used AT ALL?
gf_common_mt_rdma_arena_mr - used exactly once, in rdma.c

So I think the real issue is the abuse. If the list were genuinely concise, the over-allocation would not be a big deal (we are only talking about a few KBs or tens of KBs, so it is not such a big deal anyway, but it just looks wrong to me).
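
For scale, a rough back-of-the-envelope figure (the per-record size is an assumption; the thread does not state the real sizeof(struct mem_acct_rec)):

    ~155 slots x ~64 bytes per mem_acct_rec ~= 10 KB per xlator table

which is where the few-KBs-to-tens-of-KBs range above comes from, given that most of those slots sit idle in every loaded xlator.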

Comment 3 Vijay Bellur 2018-11-15 11:14:39 UTC
(In reply to Yaniv Kaul from comment #2)
> So I think the real issue is the abuse. If the list were genuinely
> concise, the over-allocation would not be a big deal.

Yeah, agree on that. We could clean up the ones that are not used or seldom used. I am also thinking about a second table for sparingly used memory types. Will update when I have a patch here.
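
One possible shape for that second table, as a rough sketch (all names here are made up, and it reuses the simplified mem_acct_rec from comment #0; the eventual patch may look nothing like this): keep one table covering the common range and give each xlator only the private slots it declares:

    /* sketch of a split accounting table - illustrative names only */
    struct mem_acct_split {
        struct mem_acct_rec *common;  /* slots 0..gf_common_mt_end */
        struct mem_acct_rec *priv;    /* only this xlator's private types */
        unsigned int common_end;      /* == gf_common_mt_end */
    };

    static struct mem_acct_rec *
    rec_for_type(struct mem_acct_split *acct, unsigned int type)
    {
        if (type <= acct->common_end)
            return &acct->common[type];
        /* private types start at gf_common_mt_end + 1 */
        return &acct->priv[type - acct->common_end - 1];
    }

An xlator with 5 private types would then allocate 5 private records instead of ~155, at the cost of one extra branch on every accounted allocation.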

Comment 4 Yaniv Kaul 2019-07-01 06:19:27 UTC
https://review.gluster.org/#/c/glusterfs/+/21657/ removed some unused types and marked those that are used only once and could be moved out of the common list; the move itself will have to happen in a follow-up patch.

Comment 5 Worker Ant 2020-03-12 12:33:35 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/904 and will be tracked there from now on. Visit the GitHub issue for further details.

