Bug 1685715 - glibc: [RFE] Reduce brk(2) syscall usage by implementing scaling heuristic.
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: glibc
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: glibc team
QA Contact: qe-baseos-tools-bugs
 
Reported: 2019-03-05 21:28 UTC by Paolo Bonzini
Modified: 2023-07-18 14:30 UTC
CC List: 7 users

Last Closed: 2019-12-16 14:59:53 UTC
Type: Bug


Attachments
small testcase (58.51 KB, application/x-bzip), uploaded 2019-03-05 21:28 UTC by Paolo Bonzini
medium testcase (265.07 KB, application/x-bzip), uploaded 2019-03-05 21:29 UTC by Paolo Bonzini
large testcase (603.39 KB, application/x-bzip), uploaded 2019-03-05 21:29 UTC by Paolo Bonzini


Links
Sourceware bug 25286 (P2, NEW): [RFE] Reduce brk(2) syscall usage by implementing scaling heuristic. Last updated 2019-12-16 14:59:52 UTC.

Description Paolo Bonzini 2019-03-05 21:28:37 UTC
Created attachment 1541152
small testcase

The workload triggered by the attached synthetic makefiles causes glibc to issue a very large number of brk() calls, so up to 50% of the execution time is spent in the kernel (the target "cleann" does not exist, so the runs below only parse the makefile before failing):

$ time make -f GNUmakefile cleann
make: *** No rule to make target 'cleann'.  Stop.

real	0m0,289s
user	0m0,212s
sys	0m0,072s

$ time make -f BIGmakefile cleann
make: *** No rule to make target 'cleann'.  Stop.

real	0m3,143s
user	0m1,716s
sys	0m1,401s

$ time make -f HUGEmakefile cleann
make: *** No rule to make target 'cleann'.  Stop.

real	0m17,870s
user	0m9,057s
sys	0m8,681s
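
One way to confirm where the kernel time goes (a suggested check, not part of the original report) is to let strace count the syscalls:

$ strace -f -c -e trace=brk make -f HUGEmakefile cleann

With -c, strace prints a per-syscall count and time summary on exit; -f follows any child processes.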

Comment 1 Paolo Bonzini 2019-03-05 21:29:19 UTC
Created attachment 1541153
medium testcase

Comment 2 Paolo Bonzini 2019-03-05 21:29:46 UTC
Created attachment 1541154
large testcase

Comment 3 Florian Weimer 2019-03-06 06:30:20 UTC
This appears to be a workaround:

MALLOC_TRIM_THRESHOLD_=900000 MALLOC_MMAP_THRESHOLD_=900000 make -f HUGEmakefile cleann

It looks like make performs a lot of medium-sized allocations.  The malloc heuristics treat this as a signal to minimize heap size.
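
The same thresholds can also be set from inside a program with mallopt(3); a minimal sketch (the 900000 values simply mirror the environment variables above):

#include <malloc.h>

int
main (void)
{
  /* Equivalent of MALLOC_TRIM_THRESHOLD_=900000: keep heap memory
     instead of returning it to the kernel via brk() unless the free
     block at the top of the heap exceeds ~900 kB.  */
  mallopt (M_TRIM_THRESHOLD, 900000);

  /* Equivalent of MALLOC_MMAP_THRESHOLD_=900000: serve allocations
     below ~900 kB from the heap rather than from individual mmap()s.  */
  mallopt (M_MMAP_THRESHOLD, 900000);

  /* ... rest of the program ... */
  return 0;
}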

The culprit appears to be the f_append case in do_variable_definition:

            /* Paste the old and new values together in VALUE.  */

            unsigned int oldlen, vallen;
            const char *val;
            char *tp = NULL;

            val = value;
            if (v->recursive)
              /* The previous definition of the variable was recursive.
                 The new value is the unexpanded old and new values. */
              flavor = f_recursive;
            else
              /* The previous definition of the variable was simple.
                 The new value comes from the old value, which was expanded
                 when it was set; and from the expanded new value.  Allocate
                 memory for the expansion as we may still need the rest of the
                 buffer if we're looking at a target-specific variable.  */
              val = tp = allocated_variable_expand (val);

            oldlen = strlen (v->value);
            vallen = strlen (val);
            p = alloc_value = xmalloc (oldlen + 1 + vallen + 1);
            memcpy (alloc_value, v->value, oldlen);
            alloc_value[oldlen] = ' ';
            memcpy (&alloc_value[oldlen + 1], val, vallen + 1);

            free (tp);

And the old value is freed here:

  v = define_variable_in_set (varname, strlen (varname), p,
                              origin, flavor == f_recursive,
                              (target_var
                               ? current_variable_set_list->set : NULL),
                              flocp);

This is always going to have bad (quadratic) performance: every += re-copies the entire accumulated value, so N appends copy O(N^2) bytes in total.
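
To make the cost concrete, here is a hypothetical standalone toy (not make's actual code) that distills the pattern; every iteration frees and reallocates the largest block on the heap, which is the allocation behavior driving the brk() traffic above:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (void)
{
  char *value = strdup ("");
  for (int i = 0; i < 100000; i++)
    {
      const char *val = "file.d";          /* one appended word */
      size_t oldlen = strlen (value);
      size_t vallen = strlen (val);
      /* Same shape as the xmalloc/memcpy sequence quoted above.  */
      char *alloc_value = malloc (oldlen + 1 + vallen + 1);
      memcpy (alloc_value, value, oldlen);
      alloc_value[oldlen] = ' ';
      memcpy (&alloc_value[oldlen + 1], val, vallen + 1);
      free (value);                        /* old value freed, as in make */
      value = alloc_value;
    }
  printf ("final length: %zu\n", strlen (value));
  free (value);
  return 0;
}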

If you are auto-generating the makefile anyway, you could set the .depfiles variable at the end (and perhaps the .ruletargets.* variables as well).

Comment 4 Paolo Bonzini 2019-03-06 07:43:38 UTC
> If you are auto-generating the makefile anyway, you could set the .depfiles variable at the end (and perhaps the .ruletargets.* variables as well).

Yes, that's what I am doing in fact (I was surprised to see that Make is this bad at processing +=, but it's easy to work around as you say).  Still, it seemed like a useful testcase; maybe the trim threshold could be increased a little bit for every "brk-down/brk-up" sequence that returns to the original amount of memory, or something like that.
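
In rough code terms, the idea might look like this (a sketch with invented names; nothing here is an existing glibc interface):

#include <stddef.h>

/* Hypothetical sketch only: grow the trim threshold whenever a trim
   turns out to have been wasted because the heap climbed back to its
   pre-trim size.  */
static size_t trim_threshold = 128 * 1024;  /* glibc's default M_TRIM_THRESHOLD */
static size_t pre_trim_size;                /* heap size before the last trim */

static void
heap_trimmed (size_t old_size)   /* called when brk() shrinks the heap */
{
  pre_trim_size = old_size;
}

static void
heap_grown (size_t new_size)     /* called when brk() grows the heap */
{
  if (pre_trim_size != 0 && new_size >= pre_trim_size)
    {
      /* A complete brk-down/brk-up round trip: both syscalls were
         wasted work, so make future trimming less eager.  */
      trim_threshold += trim_threshold / 2;
      pre_trim_size = 0;
    }
}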

Comment 5 Florian Weimer 2019-03-06 08:01:38 UTC
(In reply to Paolo Bonzini from comment #4)
> Yes, that's what I am doing in fact (I was surprised to see that Make is
> this bad at processing +=, but it's easy to work around as you say).
> Still, it seemed like a useful testcase; maybe the trim threshold could be
> increased a little bit for every "brk-down/brk-up" sequence that returns
> to the original amount of memory, or something like that.

We implement something like this for the mmap threshold, but it doesn't work in this case, which is why both heuristics have to be disabled.  We'd need to scale the thresholds with the allocation size, which is not something that will be beneficial in general.

The challenge here is that a different process with a similar allocation sequence could go to sleep at any point, and then we'd be stuck with a large, unused allocation.

Maybe we could scale the trim threshold with live heap size (e.g. if trimming would free less than 5% of heap size, don't consider trimming worthwhile), but this is difficult to measure accurately.  getrusage includes shared mappings as well (which we should not count).  Heap statistics we can gather directly might over-count memory blocks which have never been dirtied by the application and are therefore not actually resident.
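
The relative check described here might look roughly like this (again a sketch; heap_size and top_free stand in for statistics glibc would have to track, subject to the measurement caveats above, and none of this is an existing interface):

#include <stdbool.h>
#include <stddef.h>

static bool
worth_trimming (size_t heap_size, size_t top_free, size_t trim_threshold)
{
  /* Existing absolute threshold.  */
  if (top_free < trim_threshold)
    return false;

  /* Proposed relative threshold: skip the brk() unless it would
     return at least 5% of the live heap to the kernel.  */
  return top_free * 20 >= heap_size;
}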

Comment 6 Carlos O'Donell 2019-12-16 14:59:53 UTC
We are going to track this change upstream here:
https://sourceware.org/bugzilla/show_bug.cgi?id=25286

I'm marking this CLOSED/UPSTREAM.

Paolo, thank you for the reproducers!

