Created attachment 1182219
valgrind memory leak output
Description of problem:
I have a Python script that calls rpm.readHeaderListFromFile() to check RPM headers. When this method is called repeatedly over the RPMs in a yum repository, the Python process's heap grows continuously. There appears to be a memory leak in librpm when the header object is created. A simple reproducer:
# cat rpm_hdr_test.py
import rpm
for i in xrange(1000):
    hdr1 = rpm._rpm.hdr()
Running a memory leak test w/ valgrind:
# valgrind --leak-check=full --time-stamp=yes --log-file=rpm_hdr.log python rpm_hdr_test.py
shows the following:
==00:00:00:06.733 19733== 296,000 (40,000 direct, 256,000 indirect) bytes in 1,000 blocks are definitely lost in loss record 1,935 of 1,935
==00:00:00:06.733 19733== at 0x4C2B974: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:00:06.733 19733== by 0xDA8E2EE: rcalloc (rpmmalloc.c:55)
==00:00:00:06.733 19733== by 0xCE38951: headerCreate (header.c:168)
==00:00:00:06.733 19733== by 0xCC14FD4: hdr_new (header-py.c:394)
==00:00:00:06.733 19733== by 0x4ED4E52: type_call (typeobject.c:729)
==00:00:00:06.733 19733== by 0x4E7F0C2: PyObject_Call (abstract.c:2529)
==00:00:00:06.733 19733== by 0x4F1338B: do_call (ceval.c:4316)
==00:00:00:06.733 19733== by 0x4F1338B: call_function (ceval.c:4121)
==00:00:00:06.733 19733== by 0x4F1338B: PyEval_EvalFrameEx (ceval.c:2740)
==00:00:00:06.733 19733== by 0x4F171EC: PyEval_EvalCodeEx (ceval.c:3330)
==00:00:00:06.733 19733== by 0x4F172F1: PyEval_EvalCode (ceval.c:689)
==00:00:00:06.733 19733== by 0x4F3072E: run_mod (pythonrun.c:1374)
==00:00:00:06.733 19733== by 0x4F318ED: PyRun_FileExFlags (pythonrun.c:1360)
==00:00:00:06.733 19733== by 0x4F32B78: PyRun_SimpleFileExFlags (pythonrun.c:952)
Full log file is attached.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1.) Create and deallocate multiple hdr objects in python with rpm.readHeaderListFromFile().
2.) See increasing heap size throughout lifetime of process.
Actual results:
Memory leak when creating hdr objects.
Expected results:
No memory leak.
Additional info:
See attached valgrind log.
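The heap growth from step 2 can also be observed without valgrind. The following is only a minimal sketch, assuming a Unix-like system where the standard resource module is available. peak_rss_growth is a hypothetical helper name and is not part of rpm; the object factory is passed in as an argument so the snippet itself does not depend on the rpm bindings.

```python
import resource


def peak_rss_growth(factory, iterations=1000):
    """Return how much the peak RSS grew while creating and dropping objects.

    The value is in the units of ru_maxrss (kilobytes on Linux).
    peak_rss_growth is a hypothetical helper, not an rpm API.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        obj = factory()  # create the object under test
        del obj          # drop the only Python reference right away
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


# Hypothetical usage against the reproducer above (needs the rpm bindings):
#   import rpm
#   print(peak_rss_growth(rpm._rpm.hdr, 100000))
```

If objects are released correctly, the reported growth should stay roughly flat as the iteration count increases. A value that keeps rising with the iteration count is consistent with a leak, though valgrind is still needed to locate it.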
Created attachment 1183913
In python/header-py.c, hdr_new calls headerNew, which calls headerCreate in lib/header.c. headerCreate calls headerLink to increment h->nrefs once, then hdr_Wrap in python/header-py.c calls headerLink again when wrapping the C header object in a Python object, so h->nrefs is now 2.
When the Python object is deallocated, hdr_dealloc calls headerFree, which calls headerUnlink once and then quits without freeing memory if h->nrefs > 0, leading to a memory leak in Python.
I've attached a test patch that resolves the issue in my testing. It calls headerUnlink once in hdr_new, between the calls to headerNew and hdr_Wrap. I have only verified the "hdr1=rpm.hdr()" case and not the other code paths through hdr_new, so the extra headerUnlink may be needed in only some of the if/else blocks in hdr_new.
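To make the reference counts easier to follow, here is a toy Python model of the sequence described above. It is only a sketch of the behaviour as described in this comment, not the librpm code: the Header class and the snake_case functions are hypothetical stand-ins for the C functions of similar names, and the apply_patch flag models the extra headerUnlink from the test patch.

```python
class Header(object):
    """Toy stand-in for the C header struct; it only tracks nrefs."""

    def __init__(self):
        self.nrefs = 0
        self.freed = False


def header_link(h):
    h.nrefs += 1
    return h


def header_unlink(h):
    h.nrefs -= 1


def header_create():
    # As described above: the new header is returned with nrefs == 1.
    return header_link(Header())


def header_free(h):
    # Drop one reference; release the memory only when none remain.
    header_unlink(h)
    if h.nrefs > 0:
        return
    h.freed = True


def hdr_new(apply_patch):
    h = header_create()
    if apply_patch:
        header_unlink(h)  # the extra unlink proposed in the test patch
    header_link(h)        # hdr_Wrap takes its own reference
    return h


def simulate(apply_patch):
    h = hdr_new(apply_patch)
    header_free(h)        # what hdr_dealloc does on the Python side
    return h.nrefs, h.freed


print(simulate(False))  # (1, False): one reference left over, never freed
print(simulate(True))   # (0, True): counts balance, memory is released
```

Without the extra unlink, the count never returns to zero after deallocation, which matches the 1,000 lost blocks in the valgrind output for 1,000 iterations.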
Thanks for hunting this down. It turns out the reference counting of header objects is wrong in several more places in the Python bindings. This is now fixed upstream as commit 40326b5724b0cd55a21b2d86eeef344e4826f863.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.