Description of problem:
Python processes continually allocate large numbers of small objects. An optimized memory allocator was added in Python 2.1 and turned on by default in 2.3. This allocator sits in front of malloc, obtaining 256KB "arenas" from malloc. Each arena is then carved up into 4KB pools, which an optimized routine uses to service allocation requests of <= 256 bytes; it is able to do this faster than servicing every request with malloc.
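To make the sizes above concrete, here is a rough sketch of the arithmetic. It is only a model: the constant and function names are made up for illustration, it ignores per-pool bookkeeping overhead and arena alignment, and the treatment of zero-byte requests is an assumption.

```python
# Sizes taken from the description above; the names are illustrative only.
ARENA_SIZE = 256 * 1024          # bytes obtained from malloc per arena
POOL_SIZE = 4 * 1024             # bytes per pool carved out of an arena
SMALL_REQUEST_THRESHOLD = 256    # largest request served from a pool


def pools_per_arena():
    """Number of 4KB pools that fit in one 256KB arena (ignoring overhead)."""
    return ARENA_SIZE // POOL_SIZE


def served_by(nbytes):
    """Return which allocator handles a request of nbytes in this model."""
    # Assumption: zero-byte requests are passed straight to malloc.
    if 0 < nbytes <= SMALL_REQUEST_THRESHOLD:
        return "pool"
    return "malloc"


if __name__ == "__main__":
    print(pools_per_arena())
    print(served_by(64), served_by(4096))
```

In this model only requests of 1 to 256 bytes go through the pools; everything else is handed to malloc as usual.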
For Python 2.3 and 2.4 this arena allocator never actually calls free(), so long-lived Python programs never release arena memory back to the system; the "high-water mark" of memory usage of such a process will just rise and rise, and the process appears to have leaked memory (the memory is still available for use within the specific Python process, but not by the rest of the system). The problem is noticeable for long-running "bursty" processes that occasionally create large numbers of small objects, then release them: after the objects go away, the arenas are not reclaimed.
This was fixed in 2.5a1; fully unused arenas are freed back to the system.
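The difference between the two behaviours can be illustrated with a toy model. This is not the real allocator: the class, its methods and the slot counts are hypothetical, and it only tracks how many arenas are held, not real memory.

```python
class ToyArenaAllocator:
    """Toy model of arenas that each hold a fixed number of object slots."""

    def __init__(self, slots_per_arena, release_empty_arenas):
        self.slots_per_arena = slots_per_arena
        self.release_empty_arenas = release_empty_arenas
        self.arenas = []  # live-object count for each arena currently held

    def allocate(self, count):
        for _ in range(count):
            for i, used in enumerate(self.arenas):
                if used < self.slots_per_arena:
                    self.arenas[i] += 1
                    break
            else:
                # No arena has a free slot, so obtain a new one.
                self.arenas.append(1)

    def free(self, count):
        # Release objects starting from the most recently created arena.
        for i in reversed(range(len(self.arenas))):
            take = min(count, self.arenas[i])
            self.arenas[i] -= take
            count -= take
        if self.release_empty_arenas:
            # Behaviour after the fix: drop arenas that are fully unused.
            self.arenas = [used for used in self.arenas if used > 0]

    def arenas_held(self):
        return len(self.arenas)


def burst(release_empty_arenas):
    """Allocate 1000 objects, free them all, and report (peak, final) arenas."""
    alloc = ToyArenaAllocator(100, release_empty_arenas)
    alloc.allocate(1000)
    peak = alloc.arenas_held()
    alloc.free(1000)
    return peak, alloc.arenas_held()


if __name__ == "__main__":
    print(burst(False))  # (10, 10): arenas stay at the high-water mark
    print(burst(True))   # (10, 0): empty arenas are given back
```

With release_empty_arenas=False the model keeps every arena it ever obtained, which is the 2.3/2.4 behaviour described above.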
A detailed description can be seen in this post to the python-dev mailing list:
It's in the python.org bug tracker as: http://bugs.python.org/issue1123430
The fix was merged to trunk (for Python 2.5a1) in revision 43059:
I have had customers informally tell me that this is causing issues in their environments, where long-running Python processes appear to be leaking memory.
Some possible approaches to solving this issue:
(a) backport the fix to 2.4. The patch seems to apply cleanly to 2.4, but it's non-trivial and would thus require significant testing.
(b) supply a parallel-installable python 2.6 package. Potentially this would involve recreating other parts of the python stack for 2.6 (e.g. database connectors?). This would fix the problem and give us python 2.6, but brings with it other complexity.
Other possible approaches:
(c) enable the WITH_MEMORY_LIMITS macro, which imposes a 64MB limit on the amount of space these arenas take (limiting each process to 256 arenas); further allocations go straight to malloc/free. This would limit the problem, but processes that use many objects (both short-lived and long-lived) would be slowed down by having to go to malloc for all allocations above the limit.
(d) supply an override that bypasses the arena for long-running processes (perhaps a --without-arenas command-line option?); this would allow a per-process workaround, but seems ugly.
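For option (c), the effect of the cap can be sketched as follows. The names are illustrative, and the model treats small-object demand as a simple byte count, which glosses over pool and size-class overhead.

```python
ARENA_SIZE = 256 * 1024   # bytes per arena
MAX_ARENAS = 256          # per-process cap described in option (c)


def arena_limit_bytes():
    """Total bytes the arenas may occupy under the cap."""
    return ARENA_SIZE * MAX_ARENAS


def split_under_limit(small_object_bytes):
    """Split demand into (bytes served from arenas, bytes sent to malloc)."""
    from_arenas = min(small_object_bytes, arena_limit_bytes())
    return from_arenas, small_object_bytes - from_arenas


if __name__ == "__main__":
    print(arena_limit_bytes() // (1024 * 1024), "MB")
    print(split_under_limit(100 * 1024 * 1024))
```

Anything beyond the first 64MB of small objects would fall through to malloc/free, which is where the slowdown mentioned in (c) comes from.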
It's not yet clear to me what the best solution here is.
If this issue is affecting you, please contact Red Hat Support and cite this bug ID.
Version-Release number of selected component (if applicable):
Python 2.3 up to (but not including) 2.5a1; e.g. RHEL5's python-2.4.3-27.el5
Steps to Reproduce:
1. As per Tim Peters' post to the list cited above, a simple way to demonstrate the problem is to copy the following to a .py file and run it; it creates a list containing a million empty lists, waits for user input ("full"), then deletes them, then waits for user input again ("empty"), then finally exits.
x = []
for i in xrange(1000000):
    x.append([])
raw_input("full ")
del x[:]
raw_input("empty ")
2. At the "full" prompt, use Ctrl-Z, then "jobs -l" to identify the PID, then "top" to examine the resident memory of the python process. In my tests on RHEL5 I see approximately 37M resident size (on a 32-bit box; the usage on a 64-bit box is likely to be roughly double).
3. At the "empty" prompt, repeat.
Actual results:
The resident memory used by the python process at step 3 above will be the same as in step 2, even though all of the million inner lists have been deleted.
Expected results:
The resident memory used by the python process at step 3 ought to be much less than in step 2. Experimenting with python 2.6.2 (on a Fedora 12 i386 box), I get 38MB resident at step 2, and this drops to 1.7MB at step 3.
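As an alternative to suspending the process and reading "top", the measurement can be scripted. The following is a sketch only: it is written for a current Python 3 interpreter rather than the 2.4 interpreter under discussion, it assumes a Linux /proc filesystem, and the helper names are made up for illustration.

```python
def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def current_rss_kb():
    """Return this process's resident size in kB, or None if unavailable."""
    try:
        with open("/proc/self/status") as f:
            return parse_vmrss_kb(f.read())
    except OSError:
        return None


def run_reproducer(count=1000000):
    """Create `count` empty lists, delete them, and return (full, empty) RSS."""
    x = []
    for _ in range(count):
        x.append([])
    full = current_rss_kb()
    del x[:]
    empty = current_rss_kb()
    return full, empty


if __name__ == "__main__":
    full, empty = run_reproducer()
    print("full: %s kB, empty: %s kB" % (full, empty))
```

On an interpreter with the fix, the "empty" figure should be well below the "full" one; on an affected interpreter the two would be expected to be about the same.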
Created attachment 396822
Simple reproducer for this, as described by Tim Peters on upstream mailing list
This is the simple reproducer for this issue given by Tim Peters in this python-dev mailing list post:
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.
Typical symptom here: httpd running mod_python, serving up a dynamic website driven by an RDBMS. Queries to the db come back in the form of large numbers of tuples/dicts of objects representing the data. If a "large" query goes through, that can lead to a very large number of Python objects in memory at once, driving up temporary memory usage. This is unsurprising, but the issue is that the memory is not released back to the system as a whole after the page is completed, leading to the httpd process being permanently much larger than it needs to be.
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
Prior to version 2.5, Python's optimized memory allocator never released memory back to the system. The memory usage of a long-running Python process would resemble a "high-water mark". This update backports a fix from Python 2.5a1, which frees unused arenas, and adds a non-standard sys._debugmallocstats() function, which prints diagnostic information to stderr. Finally, when running under Valgrind, the optimized allocator is deactivated, to allow more convenient debugging of Python memory usage issues.
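A minimal sketch of calling the diagnostic function defensively, since it is non-standard and may not exist on every interpreter. The wrapper name is hypothetical; only sys._debugmallocstats itself comes from the note above.

```python
import sys


def dump_malloc_stats():
    """Print allocator statistics to stderr if the interpreter supports it.

    Returns True if the statistics were printed, False otherwise.
    """
    func = getattr(sys, "_debugmallocstats", None)
    if func is None:
        # Interpreter without the function (e.g. an unpatched build).
        return False
    func()
    return True


if __name__ == "__main__":
    if not dump_malloc_stats():
        sys.stderr.write("sys._debugmallocstats() is not available\n")
```

Calling this before and after a burst of allocations is one way to see whether arenas are being handed back.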
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.