Bug 640286

Summary: libc per thread allocator in RHEL6: excessive memory usage and leaks
Product: Red Hat Enterprise Linux 6 Reporter: Török Edwin <edwin+bugs>
Component: glibc    Assignee: Andreas Schwab <schwab>
Status: CLOSED NOTABUG QA Contact: qe-baseos-tools-bugs
Severity: medium Docs Contact:
Priority: low    
Version: 6.0    CC: fweimer, ladar
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-12-01 13:25:57 UTC Type: ---
Bug Depends On:    
Bug Blocks: 598498, 640347    
Attachments:
libc_leak.c (flags: none)

Description Török Edwin 2010-10-05 13:21:29 UTC
Created attachment 451664
libc_leak.c

[This is a continuation of bug #598498, but I don't seem to be able to reopen that bug]

Description of problem:
I found three problems with the new per-thread allocator in glibc (--enable-experimental-malloc):
 - Total memory usage is much higher than the memory actually allocated (should be ~88MB, but is >400MB)
 - When threads are joined, the per-thread heaps are not freed
 - malloc_stats() doesn't seem to be aware of those heaps; it only reports ~7MB allocated

Version-Release number of selected component (if applicable):
RHEL 6 beta 2, glibc-2.12-1.4.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Download the attached libc_leak.c
2. gcc libc_leak.c -pthread 
3. (ulimit -v 512000 -d 512000; ./a.out)
  
Actual results:
malloc failed: Cannot allocate memory
Arena 0:
system bytes     =     135168
in use bytes     =       2304
malloc failed: Cannot allocate memory
Arena 0:
system bytes     =     135168
in use bytes     =       2304
Arena 1:
system bytes     =    1183744
in use bytes     =    1050816
Arena 2:
system bytes     =    1183744
in use bytes     =    1050816
Arena 3:
system bytes     =    1183744
in use bytes     =    1050816
Arena 4:
system bytes     =    1183744
in use bytes     =    1050816
Arena 5:
system bytes     =    1183744
in use bytes     =    1050816
Arena 6:
system bytes     =    1183744
in use bytes     =    1050816
Total (incl. mmap):
system bytes     =    7237632
Arena 1:
system bytes     =    1183744
in use bytes     =    1050816
in use bytes     =    6307200
max mmap regions =          0
max mmap bytes   =          0
Arena 2:
system bytes     =    1183744
in use bytes     =    1050816
Arena 3:
system bytes     =    1183744
in use bytes     =    1050816
Arena 4:
system bytes     =    1183744
in use bytes     =    1050816
Arena 5:
system bytes     =    1183744
in use bytes     =    1050816
Arena 6:
system bytes     =    1183744
in use bytes     =    1050816
Total (incl. mmap):
system bytes     =    7237632
in use bytes     =    6307200
max mmap regions =          0
max mmap bytes   =          0
/proc/self/status at exit

VmPeak:	  481288 kB
VmSize:	  430076 kB
VmLck:	       0 kB
VmHWM:	     740 kB
VmRSS:	     704 kB
VmData:	  424148 kB
VmStk:	      84 kB
VmExe:	       4 kB
VmLib:	    1704 kB
VmPTE:	      80 kB
VmSwap:	       0 kB


Expected results:
/proc/self/status at exit

VmPeak:   245696 kB
VmSize:   169916 kB
VmLck:         0 kB
VmHWM:       536 kB
VmRSS:       512 kB
VmData:   164060 kB
VmStk:       136 kB
VmExe:         4 kB
VmLib:      1588 kB
VmPTE:        68 kB
VmSwap:        0 kB


Additional info:
I originally reported this issue as #598498, but there doesn't seem to be a way to reopen that bug (at least not from my login), so I opened this new bug.

I've written a testcase to show the memory usage problems I see with clamd.

The testcase works like this:
 - starts NTHR threads
 - allocates 1MB in each
 - waits until all NTHR threads have allocated
 - frees everything.
 - at program exit Vm* is printed from /proc/self/status

This shouldn't use more than NTHR*(1MB + stacksize) memory.
With 8 threads, 8*(1MB + 10MB) = 88MB, so an ulimit of 512MB should be plenty, yet it is exceeded!
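
For reference, here is a minimal sketch of what the attached libc_leak.c presumably looks like, reconstructed purely from the description above; the thread count, allocation size, and exact barrier usage are assumptions, not the actual attachment:

#define _GNU_SOURCE
#include <malloc.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NTHR      8                  /* worker threads (assumption)            */
#define ALLOCSIZE (1024 * 1024)      /* ~1MB requested per thread              */

static pthread_barrier_t barrier;

static void *worker(void *arg)
{
    char *p = malloc(ALLOCSIZE);     /* the memory is never written, so RSS    */
    if (!p) {                        /* stays tiny; only VmSize grows          */
        perror("malloc failed");     /* matches the "Cannot allocate memory"   */
        malloc_stats();              /* lines in the output above              */
    }
    pthread_barrier_wait(&barrier);  /* wait until every thread has allocated  */
    free(p);                         /* free(NULL) is a no-op                  */
    return arg;
}

int main(void)
{
    pthread_t thr[NTHR];
    char line[256];
    FILE *f;
    int i;

    pthread_barrier_init(&barrier, NULL, NTHR);
    for (i = 0; i < NTHR; i++)
        pthread_create(&thr[i], NULL, worker, NULL);
    for (i = 0; i < NTHR; i++)
        pthread_join(thr[i], NULL);

    /* print the Vm* lines from /proc/self/status at exit */
    puts("/proc/self/status at exit\n");
    if ((f = fopen("/proc/self/status", "r")) != NULL) {
        while (fgets(line, sizeof line, f))
            if (strncmp(line, "Vm", 2) == 0)
                fputs(line, stdout);
        fclose(f);
    }
    return 0;
}

Compile and run it exactly as in the reproduction steps: gcc libc_leak.c -pthread, then (ulimit -v 512000 -d 512000; ./a.out).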

There are three issues here:
 - excessive memory usage (due to the per-thread heaps?)
 - the per-thread heaps are not freed when threads are joined
 - malloc_stats() doesn't seem to be aware of the per-thread heaps: it says only ~7MB was allocated, yet VmSize is >400MB
 
The expected results are from a 2.11.2 eglibc, which was not built with --enable-experimental-malloc, and the actual results are from RHEL6's glibc which was built with --enable-experimental-malloc.

Without the ulimit, the testcase runs and prints VmSize: 561148 kB, i.e. ~67MB per thread, which is far more than was requested with malloc.

I wouldn't mind glibc caching the anonymous mmaps per thread (as it already does with the global heap), but it allocates way more than requested.

Comment 2 Török Edwin 2010-10-05 15:08:45 UTC
FWIW Fedora 13 is affected by this bug too, although it is not affected by the original bug (ClamAV's make check fails). 
So the libc in RHEL6 might have some more bugs, but we can test for that after these ones are fixed.

Should I file a bug for Fedora too, or is there a way to mark a bug as affecting Fedora as well?

Comment 3 Török Edwin 2010-10-05 15:37:51 UTC
(In reply to comment #2)
> FWIW Fedora 13 is affected by this bug too, although it is not affected by the
> original bug (ClamAV's make check fails). 

Actually, it is affected: I've just run ClamAV's 'make check' on Fedora 13 and it failed. I guess the package doesn't run 'make check'?

Comment 4 Andreas Schwab 2010-12-01 13:02:03 UTC
This is by design.  Every thread gets its own heap.  There is no leak.

Comment 5 Andreas Schwab 2010-12-01 13:25:57 UTC
Each heap must be 64MB aligned; the address space is only reserved to guarantee that alignment.  No more than the actual heap size is ever touched in any way.
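
As a quick sanity check of that explanation against the numbers in this report (the 64MB-per-arena reservation and the roughly one-arena-per-thread count are assumptions drawn from this comment and the output above, not measured):

#include <stdio.h>

int main(void)
{
    const long arenas      = 8;   /* roughly one arena per thread (assumption) */
    const long reserved_mb = 64;  /* address space reserved per arena          */
    const long stack_mb    = 10;  /* per-thread stack size from the report     */
    const long malloc_mb   = 1;   /* actually requested per thread             */

    /* ~512MB of reservations alone, which is why "ulimit -v 512000" trips
       even though VmRSS stays under 1MB (the reserved pages are never touched). */
    printf("reserved arena address space: ~%ld MB\n", arenas * reserved_mb);
    printf("requested allocations+stacks: ~%ld MB\n", arenas * (malloc_mb + stack_mb));
    return 0;
}

Eight reservations of 64MB each already account for ~512MB of address space, matching the virtual-memory limit set in the reproduction steps, while VmRSS stays below 1MB because the reserved pages are never touched.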

Comment 6 Ladar Levison 2010-12-01 15:26:44 UTC
Confirmed on RHEL 6 final+updates = glibc-2.12-1.7.el6_0.3.x86_64.

After looking at the code, I agree with Török: it should not be using >400MB.

Is the use of thread barriers causing the problem? In my cursory review of the manual I didn't see anything that indicated their use would affect the behavior of malloc.

malloc failed: Cannot allocate memory
Arena 0:
malloc failed: Cannot allocate memory
system bytes     =     135168
in use bytes     =       2304
Arena 1:
system bytes     =    1183744
in use bytes     =    1050816
Arena 2:
system bytes     =    1183744
in use bytes     =    1050816
Arena 3:
system bytes     =    1183744
in use bytes     =    1050816
Arena 4:
system bytes     =    1183744
in use bytes     =    1050816
Arena 5:
system bytes     =    1183744
in use bytes     =    1050816
Arena 6:
system bytes     =    1183744
in use bytes     =    1050816
Total (incl. mmap):
system bytes     =    7237632
in use bytes     =    6307200
max mmap regions =          0
max mmap bytes   =          0
Arena 0:
system bytes     =     135168
in use bytes     =       2304
Arena 1:
system bytes     =    1183744
in use bytes     =    1050816
Arena 2:
system bytes     =    1183744
in use bytes     =    1050816
Arena 3:
system bytes     =    1183744
in use bytes     =    1050816
Arena 4:
system bytes     =    1183744
in use bytes     =    1050816
Arena 5:
system bytes     =    1183744
in use bytes     =    1050816
Arena 6:
system bytes     =    1183744
in use bytes     =    1050816
Total (incl. mmap):
system bytes     =    7237632
in use bytes     =    6307200
max mmap regions =          0
max mmap bytes   =          0
/proc/self/status at exit

VmPeak:	  481292 kB
VmSize:	  430080 kB
VmLck:	       0 kB
VmHWM:	    2776 kB
VmRSS:	     716 kB
VmData:	  424148 kB
VmStk:	      88 kB
VmExe:	       4 kB
VmLib:	    1704 kB
VmPTE:	      80 kB
VmSwap:	       0 kB

Comment 7 Ladar Levison 2010-12-01 15:27:48 UTC
If it's not a leak, how would you go about releasing the memory?