Bug 129802 - Using Python threads allocates absurd amounts of virtual memory
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: python
Version: 2
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Mihai Ibanescu
QA Contact: Brock Organ
Reported: 2004-08-12 15:37 EDT by Chris Siebenmann
Modified: 2007-11-30 17:10 EST
Doc Type: Bug Fix
Last Closed: 2005-09-30 12:04:43 EDT


Attachments
Python program to show virtual memory usage of threads (1.20 KB, text/plain)
2004-08-12 15:38 EDT, Chris Siebenmann
Description Chris Siebenmann 2004-08-12 15:37:34 EDT
Description of problem:
Any Python program using threads will allocate an absurd amount
of virtual memory: more than 8 megabytes per simultaneously active
thread, no matter how much or how little memory that thread uses.
As far as I can see, this memory is effectively never released.

The root cause of this is that glibc forces every active thread
that uses malloc() to have its own private 8-megabyte arena, with
no way of overriding this. In Python's case these arenas
are completely pointless; almost all memory allocation in a Python
interpreter is single-threaded anyway because of the global
interpreter lock.

Among other effects, this prevents me from starting more than
404 simultaneous threads in a Python program on an i686 machine.
I assume it is simply running out of virtual address space,
as ps shows that a 404-thread program has a VSZ of 4143336
(ps reports VSZ in kilobytes, so just under 4 gigabytes).
This virtual memory also shows up as Committed_AS in
/proc/meminfo.

Version-Release number of selected component (if applicable):
python-2.3.3-6 (Fedora Core 2 Python)

How reproducible:
Always.


Steps to Reproduce:

Run the attached program and watch the output. The number of
command-line arguments is the number of threads created, and
it uses ps to show its size before and after thread creation.
(Observe in particular the large gap between VSZ and RSS.)
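
The attachment itself is not reproduced on this page. A minimal sketch of a
program in the same spirit might look like the following. It is not the
attached program: it assumes Linux, reads VmSize and VmRSS from
/proc/self/status instead of running ps, and takes a hypothetical fixed
thread count rather than counting command-line arguments. The function
names are illustrative only.

```python
import threading


def read_vm_kib():
    """Return (VmSize, VmRSS) in KiB from /proc/self/status, or None off Linux."""
    fields = {}
    try:
        with open("/proc/self/status") as status:
            for line in status:
                name, _, rest = line.partition(":")
                if name in ("VmSize", "VmRSS"):
                    fields[name] = int(rest.split()[0])
    except OSError:
        return None
    if "VmSize" not in fields or "VmRSS" not in fields:
        return None
    return fields["VmSize"], fields["VmRSS"]


def measure_threads(count):
    """Start `count` idle threads; sample memory before and while they run."""
    release = threading.Event()
    threads = [threading.Thread(target=release.wait) for _ in range(count)]
    before = read_vm_kib()
    for thread in threads:
        thread.start()
    during = read_vm_kib()  # all threads are alive and blocked on the event
    release.set()
    for thread in threads:
        thread.join()
    return before, during


if __name__ == "__main__":
    # 20 is an arbitrary, hypothetical thread count for illustration.
    before, during = measure_threads(20)
    print("before (VmSize, VmRSS) KiB:", before)
    print("during (VmSize, VmRSS) KiB:", during)
```

Comparing the two samples should show VmSize growing far faster than VmRSS
when each thread reserves a large stack that it never touches.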
Comment 1 Chris Siebenmann 2004-08-12 15:38:36 EDT
Created attachment 102671
Python program to show virtual memory usage of threads
Comment 2 Chris Siebenmann 2004-08-13 02:11:30 EDT
I was wrong in my diagnosis of this problem; glibc's malloc() is not
at fault. Further investigation shows that it is likely to be an NPTL
behavior change.

By default, NPTL tries to give each new thread a stack as large as
the process's current stack size limit (RLIMIT_STACK). That limit,
in turn, appears to default to 8 megabytes. Thus NPTL will normally
allocate 8 megabytes per simultaneously active thread (it reuses
the stacks when possible).
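
One way to check this on a given system is to read the soft RLIMIT_STACK
value and multiply it out. The sketch below assumes a Unix system where
Python's resource module is available; the helper names are hypothetical,
and the 404-thread, 8-megabyte figures are simply the ones quoted in this
report.

```python
import resource


def default_thread_stack_bytes():
    """Return the soft RLIMIT_STACK in bytes, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft


def address_space_for(threads, stack_bytes):
    """Address space reserved for thread stacks alone, in whole MiB."""
    return threads * stack_bytes // (1024 * 1024)


if __name__ == "__main__":
    print("soft stack limit (bytes):", default_thread_stack_bytes())
    # 404 threads at 8 MiB each, the figures from this report.
    print("MiB reserved for stacks:", address_space_for(404, 8 * 1024 * 1024))
```

With those figures the stacks alone account for 3232 MiB, which is most of
a 32-bit address space before anything else is mapped.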

I assume that the old pthreads library had a much lower default
stack size, and this is why I didn't see this behavior on older systems.

It's hard to see if anything should change in this situation, such
as Python explicitly setting the stack size for new threads. It
certainly can be argued that the current approach gives the user
maximum flexibility, at the cost of surprising memory usage in
default situations.

Thus this may well be a NOTABUG.
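
For illustration only, here is roughly what setting the stack size for new
threads from Python could look like. This relies on threading.stack_size(),
which was added in Python 2.5 and so is not available in the python-2.3.3
package this bug was filed against. The 256 KiB value and the helper name
are hypothetical choices, not a recommendation; a thread that recurses
deeply may need more.

```python
import threading

SMALL_STACK = 256 * 1024  # hypothetical value; the documented minimum is 32 KiB


def run_with_small_stacks(count, stack_bytes=SMALL_STACK):
    """Run `count` trivial threads with a reduced stack; return how many ran."""
    finished = []
    lock = threading.Lock()

    def worker(index):
        with lock:
            finished.append(index)

    # stack_size() applies to threads created afterwards and returns the old value.
    previous = threading.stack_size(stack_bytes)
    try:
        threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting (0 means default)
    return len(finished)


if __name__ == "__main__":
    print("threads completed:", run_with_small_stacks(8))
```

Lowering the RLIMIT_STACK soft limit before starting the program (for
example with the shell's ulimit -s) is the alternative that leaves the
choice with the user, as described above.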
Comment 3 Matthew Miller 2005-04-26 12:21:16 EDT
Fedora Core 2 is now maintained by the Fedora Legacy project for
security updates only. If this problem is a security issue, please
reopen and reassign to the Fedora Legacy product. If it is not a
security issue and hasn't been resolved in the current FC3 updates or
in the FC4 test release, reopen and change the version to match.
