Bug 118574

Summary: malloc exhausts memory too fast in multithreaded program
Product: Red Hat Enterprise Linux 3
Reporter: Sergey Kosenko <skosenko>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: high
Version: 3.0
CC: drepper, tao
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-12-20 18:14:12 UTC
Attachments:
Description   Flags
Reproducer    none

Description Sergey Kosenko 2004-03-17 21:18:42 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; 
H010818; UB1800; .NET CLR 1.0.3705; .NET CLR 1.1.4322)

Description of problem:
malloc fails with just ~1.4 GB of memory allocated on 4 GB IBM blades
with 2 HT processors (4 virtual). The program creates up to 200 threads
by default. Everything works fine in single-threaded programs, though.

Version-Release number of selected component (if applicable):
glibc-2.3.2-95.6, kernel-2.4.21-9.EL

How reproducible:
Always

Steps to Reproduce:
1. Run the sample without parameters.
2. It dumps core at a total of 1443 MB allocated.
3. I wonder if anybody from RH ever reads it. My previous bug posting 
never made it beyond NEW
    

Actual Results:  Allocated ~ 1.4 GB

Expected Results:  Should be able to allocate at least twice as much.

Additional info:

I use the default kernel configuration. I could sbrk about 2880 MB. See
the sample. It is a major showstopper.

Comment 1 Jakub Jelinek 2004-03-17 21:27:26 UTC
200 threads with what thread stack size?
By default the thread stack size is the ulimit -s value (in KB), unless
that is unlimited (in which case it is 2MB on IA-32).  The default
ulimit -s setting is 8MB, so 200 threads occupy ~1.6GB of address space,
plus your ~1.4GB, and unless you're using the -bigmem kernel, 3GB is all
the virtual address space you have.
You can change thread stack sizes via pthread_attr_setstacksize,
or ulimit -s.

Comment 2 Sergey Kosenko 2004-03-17 21:31:50 UTC
ulimit -s 2048. I run no more than 20 threads at a time. I am
getting the attachment ready to post.

Comment 3 Sergey Kosenko 2004-03-17 21:36:59 UTC
I got "# define __PAGE_OFFSET (0xc0000000)" in asm/page.h of the kernel,
so I assume 3 GB of user memory minus VMALLOC_RESERVE.

Comment 4 Sergey Kosenko 2004-03-17 22:25:17 UTC
Created attachment 98632 [details]
Reproducer

Comment 5 Sergey Kosenko 2004-03-17 22:44:06 UTC
Comment on attachment 98632 [details]
Reproducer

Number of threads should be a power of 10

Comment 6 Sergey Kosenko 2004-03-17 23:19:41 UTC
Correction: Number of threads should be a multiple of 10

Comment 7 Jakub Jelinek 2004-03-18 09:43:31 UTC
Why do you consider ranbytes = 100; a // big alloc ?
That sounds like a really small allocation, so your program attempts to do
about 87000 malloc (20) and 87000 malloc (100) calls per thread.

If you fix the ranbytes allocation, so that it does what you probably
meant to do, the test passes just fine.

The problem with the really small allocations from contending threads
is that malloc uses separate arenas for each such thread to avoid
the locking overhead.  For big allocations (>= MMAP_THRESHOLD, which
is by default 128K), each allocation is a separate mmap, but for small
allocations malloc uses arenas with 1MB size which must be aligned to
1MB (so that malloc can quickly find out which arena a particular
object belongs to, etc.).  The OS doesn't provide any way to ask mmap
for such alignment, so arena.c uses (HEAP_MAX_SIZE == 1MB):
  /* A memory region aligned to a multiple of HEAP_MAX_SIZE is needed.
     No swap space needs to be reserved for the following large
     mapping (on Linux, this is the case for all non-writable mappings
     anyway). */
  p1 = (char *)MMAP(0, HEAP_MAX_SIZE<<1, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE);
  if(p1 != MAP_FAILED) {
    p2 = (char *)(((unsigned long)p1 + (HEAP_MAX_SIZE-1)) & ~(HEAP_MAX_SIZE-1));
    ul = p2 - p1;
    munmap(p1, ul);
    munmap(p2 + HEAP_MAX_SIZE, HEAP_MAX_SIZE - ul);
  } else {
    /* Try to take the chance that an allocation of only HEAP_MAX_SIZE
       is already aligned. */
    p2 = (char *)MMAP(0, HEAP_MAX_SIZE, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE);
    if(p2 == MAP_FAILED)
      return 0;
    if((unsigned long)p2 & (HEAP_MAX_SIZE-1)) {
      munmap(p2, HEAP_MAX_SIZE);
      return 0;
    }
  }
This works well if mmap addresses usually grow up (if malloc is the
only user of mmap in a certain timeframe, mmap will most probably return
addresses growing by 1MB), but in RHEL 3 the mmap addresses are allocated
from top to bottom, so that the room left for sbrk is big enough, etc.
When the mmap addresses given by the kernel grow down, the address space
becomes fragmented (alternating 1MB mapped, 1MB unmapped).
If malloc were the only user of mmap, this still wouldn't be a problem:
once there are no more 2MB areas available, p1 = mmap will fail, but
p2 = mmap will most probably succeed and return an aligned address.
But your testcase dies in pthread_create, which, unless you trim the thread
stack size from the default, typically needs an 8MB mmap, and once all of
the address space has become fragmented that is of course no longer available.

Comment 9 Sergey Kosenko 2004-03-18 15:08:06 UTC
Thanks, Jakub.
I will have our sysadmins apply the patch ASAP and then I'll try
it.
Sergey Kosenko,
Banc of America Securities LLC
212-847-5486

Comment 10 Sergey Kosenko 2004-03-19 16:14:55 UTC
I downloaded glibc-2.3.2-95.6.src.rpm from RHN, our SAs patched and
installed it as glibc-2.3.2-95.6.1, and we tested it. The amount of memory
allocated grew to ~2266 MB (from 1433 MB before), but allocation speed
dropped significantly (by several times). Are we using the right glibc
source? If not, where do I get the right one from?
Thanks.

Comment 11 Jakub Jelinek 2004-03-19 16:16:52 UTC
Were you building the i686 glibc?  I.e. rpmbuild --target i686 -ba -v glibc.spec?

Comment 12 Sergey Kosenko 2004-03-19 16:43:13 UTC
No, the SA built it for i386. We will build it for i686 now. Thanks.

Comment 13 Sergey Kosenko 2004-03-19 19:28:03 UTC
The problem is solved! Thanks. I was able to allocate 2555 MB with no
speed penalty.

Comment 14 Sergey Kosenko 2004-06-07 20:50:42 UTC
Why didn't you guys put the fix into Update 2 of RHEL 3.0? I still
could allocate only about 1.7 GB after U2 was applied.

Comment 17 Ulrich Drepper 2004-10-06 05:20:31 UTC
Part of 2.3.3-65 of FC3 now.

Comment 18 Jakub Jelinek 2004-10-06 18:31:54 UTC
The patch is in glibc-2.3.2-95.28 which ought to appear in U4 beta.

Comment 19 John Flanagan 2004-12-20 18:14:12 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-586.html