Bug 727426

Summary: python-lxml causes memory leak
Product: Red Hat Enterprise Linux 6
Component: python-lxml
Version: 6.0
Hardware: Unspecified
OS: Linux
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: 6.0
Reporter: Divya <dbasant>
Assignee: Jiri Popelka <jpopelka>
QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Doc Type: Bug Fix
Last Closed: 2011-12-16 15:55:01 UTC

Description Divya 2011-08-02 06:14:52 UTC
Description of problem:

A Python application using lxml/python-lxml causes memory leaks and intermittent segfaults.
The following error reports are from valgrind. It reports two separate occurrences of invalid memory reads: one when freeing an XML property, and another when freeing a node.

==14402== Invalid read of size 8
==14402== at 0x35C7A54AED: xmlFreeProp (tree.c:2032)
==14402== by 0x35C7A54DD8: xmlFreePropList (tree.c:2016)
==14402== by 0x35C7A54462: xmlFreeNodeList (tree.c:3617)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54297: xmlFreeDoc (tree.c:1224)
==14402== by 0x10AEBA08: __pyx_tp_dealloc_4lxml_5etree__Document (lxml.etree.c:28182)
==14402== by 0x10AEB8AC: __pyx_tp_dealloc_4lxml_5etree__Element (lxml.etree.c:7079)
==14402== by 0x35CBAA1A15: subtype_dealloc (typeobject.c:1019)
==14402== Address 0xcae87d8 is 152 bytes inside a block of size 176 free'd
==14402== at 0x4A04D72: free (vg_replace_malloc.c:325)
==14402== by 0x35C7A5434A: xmlFreeDoc (tree.c:1231)
==14402== by 0x10AEBA08: __pyx_tp_dealloc_4lxml_5etree__Document (lxml.etree.c:28182)
==14402== by 0x10AF6EDA: __pyx_f_4lxml_5etree_moveNodeToDocument (lxml.etree.c:7182)
==14402== by 0x10AF6FC6: __pyx_f_4lxml_5etree__appendChild (lxml.etree.c:18977)
==14402== by 0x10AF72F2: __pyx_pf_4lxml_5etree_8_Element_append (lxml.etree.c:31584)
==14402== by 0x35CBADF7F8: PyEval_EvalFrameEx (ceval.c:3738)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBAE05A3: PyEval_EvalCodeEx (ceval.c:3000)
==14402==
==14402== Invalid read of size 8
==14402== at 0x35C7A54409: xmlFreeNodeList (tree.c:3602)
==14402== by 0x35C7A54B1F: xmlFreeProp (tree.c:2041)
==14402== by 0x35C7A54DD8: xmlFreePropList (tree.c:2016)
==14402== by 0x35C7A54462: xmlFreeNodeList (tree.c:3617)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54442: xmlFreeNodeList (tree.c:3612)
==14402== by 0x35C7A54297: xmlFreeDoc (tree.c:1224)
==14402== by 0x10AEBA08: __pyx_tp_dealloc_4lxml_5etree__Document (lxml.etree.c:28182)
==14402== by 0x10AEB8AC: __pyx_tp_dealloc_4lxml_5etree__Element (lxml.etree.c:7079)
==14402== Address 0xcae87d8 is 152 bytes inside a block of size 176 free'd
==14402== at 0x4A04D72: free (vg_replace_malloc.c:325)
==14402== by 0x35C7A5434A: xmlFreeDoc (tree.c:1231)
==14402== by 0x10AEBA08: __pyx_tp_dealloc_4lxml_5etree__Document (lxml.etree.c:28182)
==14402== by 0x10AF6EDA: __pyx_f_4lxml_5etree_moveNodeToDocument (lxml.etree.c:7182)
==14402== by 0x10AF6FC6: __pyx_f_4lxml_5etree__appendChild (lxml.etree.c:18977)
==14402== by 0x10AF72F2: __pyx_pf_4lxml_5etree_8_Element_append (lxml.etree.c:31584)
==14402== by 0x35CBADF7F8: PyEval_EvalFrameEx (ceval.c:3738)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBADF52C: PyEval_EvalFrameEx (ceval.c:3836)
==14402== by 0x35CBAE05A3: PyEval_EvalCodeEx (ceval.c:3000)
==14402== 
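
Both reports above name the same freed address (0xcae87d8), released by xmlFreeDoc. A minimal sketch for checking this from a saved log follows. The helper name summarize_invalid_reads and the abbreviated LOG excerpt are illustrative assumptions, not part of valgrind or of the original report.

```python
import re

# Abbreviated excerpt of the report above: two "Invalid read" errors,
# each followed by the description of the block that was already freed.
LOG = """\
==14402== Invalid read of size 8
==14402== at 0x35C7A54AED: xmlFreeProp (tree.c:2032)
==14402== by 0x35C7A54DD8: xmlFreePropList (tree.c:2016)
==14402== Address 0xcae87d8 is 152 bytes inside a block of size 176 free'd
==14402== at 0x4A04D72: free (vg_replace_malloc.c:325)
==14402== by 0x35C7A5434A: xmlFreeDoc (tree.c:1231)
==14402==
==14402== Invalid read of size 8
==14402== at 0x35C7A54409: xmlFreeNodeList (tree.c:3602)
==14402== by 0x35C7A54B1F: xmlFreeProp (tree.c:2041)
==14402== Address 0xcae87d8 is 152 bytes inside a block of size 176 free'd
==14402== at 0x4A04D72: free (vg_replace_malloc.c:325)
==14402== by 0x35C7A5434A: xmlFreeDoc (tree.c:1231)
"""

# A stack frame line: "==PID== at|by 0xADDR: function (file:line)".
FRAME = re.compile(r"==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \(")
# The line describing the block that had already been freed.
FREED = re.compile(
    r"Address (0x[0-9a-fA-F]+) is \d+ bytes inside a block of size \d+ free'd")


def summarize_invalid_reads(log):
    """Return one (reading function, freed address) pair per invalid read."""
    results = []
    lines = log.splitlines()
    for index, line in enumerate(lines):
        if "Invalid read" not in line:
            continue
        reader = None
        address = None
        for follow in lines[index + 1:]:
            if "Invalid read" in follow:
                break  # next error starts here
            frame = FRAME.match(follow)
            if frame and reader is None:
                reader = frame.group(1)  # innermost frame does the read
            freed = FREED.search(follow)
            if freed:
                address = freed.group(1)
                break
        results.append((reader, address))
    return results


summary = summarize_invalid_reads(LOG)
same_block = len({address for _, address in summary}) == 1
```

If same_block is true, both invalid reads touch memory belonging to the one document that was already freed, which is consistent with a use-after-free rather than two unrelated faults.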


Version-Release number of selected component (if applicable):
python-lxml-2.2.3-1.1.el6.x86_64


How reproducible:
Always


Steps to Reproduce:
1. Install the valgrind packages (if not already installed)
2. Create two files on disk, named test.definition and packages.xml respectively.

2a. Paste the following xml content to the test.definition file.
---------------------------------------
<distribution xmlns:xi="http://www.w3.org/2001/XInclude">

<main>
<fullname>Example System</fullname>
<name>example</name>
<version>5</version>
<arch>i386</arch>
</main>

<packages>
<xi:include href="packages.xml"
xpointer="xpointer(/distribution/packages/*)"/>
</packages>

<repos>
<repo id="base">
<name>CentOS-$releasever - Base</name>
<baseurl>http://mirror.centos.org/centos/$releasever/os/$basearch/</baseurl>
</repo>
</repos>

</distribution>
---------------------------------

2b. Paste the following content to the packages.xml file.

---------------------------------
<?xml version="1.0" encoding="utf-8"?>
<distribution xmlns:xi="http://www.w3.org/2001/XInclude">

<packages>
<group repoid='base'>core</group>
</packages>

</distribution>
----------------------------------

3. Follow the instructions at http://www.renditionsoftware.com/systemstudio/source to download and run the application from source code.

4. Run the application as the root user from within valgrind using the following command line. Replace $WORKSPACE with the location of the systemstudio sources from step 3 above.

# valgrind python $WORKSPACE/bin/systemstudio test.definition --force all --debug --log-level 2
  

Actual results:
Memory leak.

Expected results:
Application should get installed without any memory leak or segmentation fault.

Additional info:
* Valgrind may report an error "Syscall param utimes(tvp[1]) points to uninitialised byte(s)". This is a false positive (as mentioned in valgrind documentation) and can be safely ignored.
* The first run of the application can take five minutes or more as megabytes of data are being downloaded over the internet (in this case from the CentOS mirror website - sorry). Be patient. Alternatively, change the baseurl in the test.definition to point to a local RHEL 5 or 6 install tree :-)
* At some point during processing (often within the "depsolve" step) valgrind will report the two errors listed above. Following that, the process may continue to the end. Often, however, it dies with a segfault error.
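
For reference, the XInclude in test.definition (step 2a) can be expanded with lxml alone, without systemstudio. The following is a minimal sketch, assuming lxml is installed and using trimmed copies of the two files. How systemstudio itself loads these files is not shown in this report, so the helper resolve_includes is purely illustrative and is not expected to reproduce the crash by itself.

```python
import os
import tempfile

from lxml import etree

# Trimmed copies of the two files from steps 2a and 2b.
TEST_DEFINITION = """\
<distribution xmlns:xi="http://www.w3.org/2001/XInclude">
<packages>
<xi:include href="packages.xml"
xpointer="xpointer(/distribution/packages/*)"/>
</packages>
</distribution>
"""

PACKAGES_XML = """\
<?xml version="1.0" encoding="utf-8"?>
<distribution xmlns:xi="http://www.w3.org/2001/XInclude">
<packages>
<group repoid='base'>core</group>
</packages>
</distribution>
"""


def resolve_includes(workdir):
    """Write both files into workdir, parse test.definition, expand XIncludes."""
    for name, content in (("test.definition", TEST_DEFINITION),
                          ("packages.xml", PACKAGES_XML)):
        with open(os.path.join(workdir, name), "w", encoding="utf-8") as handle:
            handle.write(content)
    tree = etree.parse(os.path.join(workdir, "test.definition"))
    # Replace each <xi:include> with the nodes selected by its xpointer.
    tree.xinclude()
    return tree


with tempfile.TemporaryDirectory() as workdir:
    tree = resolve_includes(workdir)
    groups = tree.getroot().findall("packages/group")
    # (repoid, text) of every <group> pulled in from packages.xml.
    included = [(group.get("repoid"), group.text) for group in groups]
```

Running this under valgrind on an affected system would be one way to check whether XInclude processing alone is involved, though the traces above point at _Element.append rather than at the include step.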

Comment 1 RHEL Program Management 2011-08-02 06:28:07 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 2 Jiri Popelka 2011-08-12 16:14:26 UTC
Thanks for the description. However, I'm not able to reproduce it at the moment because the systemstudio application crashes during startup with:

.../kickstart.py", line 32, in __init__                                     
    h = list(ts.dbMatch('name', 'pykickstart'))[0]
IndexError: list index out of range

I'll try it again later.

Comment 3 Jiri Popelka 2011-11-02 15:18:36 UTC
(In reply to comment #2)
> .../kickstart.py", line 32, in __init__                                     
>     h = list(ts.dbMatch('name', 'pykickstart'))[0]
> IndexError: list index out of range

The problem was the missing pykickstart package.
I wrote to Rendition Software and they updated
http://www.renditionsoftware.com/systemstudio/source
to reflect that.
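
The IndexError arises because indexing [0] into an empty list of matches fails when the package is not installed. A minimal sketch of a clearer failure mode follows. The helper first_match is hypothetical and uses plain iterables in place of the RPM query from kickstart.py; it is not systemstudio's actual code.

```python
def first_match(matches, package_name):
    """Return the first match, or raise a readable error if there is none."""
    results = list(matches)
    if not results:
        # An empty result means the package is not installed.
        raise RuntimeError(
            "required package {!r} is not installed".format(package_name))
    return results[0]


# Stand-ins for query results: one with a hit, one without.
found = first_match(iter(["pykickstart-header"]), "pykickstart")
try:
    first_match(iter([]), "pykickstart")
    message = None
except RuntimeError as error:
    message = str(error)
```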

Comment 4 Jiri Popelka 2011-11-02 15:58:03 UTC
I've run systemstudio in valgrind as you described many times, but I've never seen the crash or the 'invalid read of size 8' warning. I've been testing on RHEL-6.1, whereas the report is against RHEL-6.0. Is it possible that it no longer occurs on 6.1 (although I'm not sure how, since python-lxml and libxml2 were not updated in 6.1)?

According to the valgrind log you posted, the first idea for a fix would be something like this:

diff src/lxml/lxml.etree.c
static void __pyx_pf_4lxml_5etree_9_Document___dealloc__(PyObject *__pyx_v_self) {
  __Pyx_SetupRefcountContext("__dealloc__");

+  if (((struct LxmlDocument *)__pyx_v_self)->_c_doc != NULL) {
         xmlFreeDoc(((struct LxmlDocument *)__pyx_v_self)->_c_doc);
+      ((struct LxmlDocument *)__pyx_v_self)->_c_doc = NULL;
+  }
  __Pyx_FinishRefcountContext();
}

But I can't test it because I'm not able to reproduce the crash.

Comment 5 Jiri Popelka 2011-11-02 16:06:52 UTC
Regarding the memory leak issue:
I certainly see some memory leaks when running systemstudio via valgrind.
When running with --leak-check=full I see something like:

240 bytes in 1 blocks are possibly lost in loss record 4,920 of 7,015
   at 0x4A04820: memalign (vg_replace_malloc.c:581)
   by 0x4A048D7: posix_memalign (vg_replace_malloc.c:709)
   by 0x1290DF87: slab_allocator_alloc_chunk (gslice.c:1136)
   by 0x1290E80B: g_slice_alloc (gslice.c:661)
   by 0x1290FDBD: g_slist_prepend (gslist.c:160)
   by 0x1291461F: g_string_chunk_insert_len (gstring.c:334)
   by 0x126B0E47: primary_sax_end_element (xml-parser.c:379)
   by 0x3DAE83F58C: xmlParseEndTag1 (parser.c:8228)
   by 0x3DAE84637A: xmlParseElement (parser.c:9568)
   by 0x3DAE846649: xmlParseContent (parser.c:9371)
   by 0x3DAE8461A0: xmlParseElement (parser.c:9542)
   by 0x3DAE846649: xmlParseContent (parser.c:9371)

This tells us that something (probably python-lxml) allocated some memory via libxml2 (parser.c) and didn't free it. Problem is that there's no more info to it so I don't know how to dig into this any deeper.
I'm also not sure that memory leaks are a big deal, given that systemstudio isn't a long-running program or daemon.

Comment 6 Jiri Popelka 2011-11-02 16:11:43 UTC
(In reply to comment #5)
> Problem is that there's no more info to it so I don't know how to dig into this any deeper.

I'll try --num-callers to get some more info.
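
As a sketch, the command line from the description can be extended with both options mentioned so far. The helper valgrind_command, the depth of 30, and the placeholder workspace path are assumptions chosen for illustration; valgrind's default stack depth is 12.

```python
import shlex


def valgrind_command(workspace, num_callers=30):
    """Build the valgrind invocation from the report with leak checking enabled."""
    return [
        "valgrind",
        "--leak-check=full",                      # report each leak with its stack
        "--num-callers={}".format(num_callers),   # show deeper stacks than the default
        "python",
        "{}/bin/systemstudio".format(workspace),
        "test.definition",
        "--force", "all",
        "--debug",
        "--log-level", "2",
    ]


# "/path/to/systemstudio" is a placeholder for $WORKSPACE.
command = valgrind_command("/path/to/systemstudio")
printable = " ".join(shlex.quote(part) for part in command)
```

The list form can be passed to subprocess.run(command) directly, while printable gives the equivalent shell command line.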

Comment 7 Jiri Popelka 2011-11-02 16:34:00 UTC
(In reply to comment #6)
> I'll try --num-callers to get some more info.

Ok, widening the stack has shown that it's not python-lxml but yum-metadata-parser that leaks memory via libxml2.

 ...
 by 0x3DAE8461A0: xmlParseElement (parser.c:9542)
 by 0x3DAE84D131: xmlParseDocument (parser.c:10204)
 by 0x3DAE84DF0E: xmlSAXUserParseFile (parser.c:13591)
 by 0x126B1B39: yum_xml_parse_primary (xml-parser.c:593)
 by 0x126B3EC3: py_update (sqlitecache.c:420)
 by 0x126B4749: py_update_primary (sqlitecache.c:578)

So I'm not aware of any evidence of python-lxml leaking memory.
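
The attribution above amounts to walking the stack outward until the first frame that does not belong to libxml2. A minimal sketch of that reasoning follows. The file-to-component mapping is an assumption taken from the comments in this report, and first_caller_outside is a hypothetical helper.

```python
import re

# Frames from the widened stack above, innermost first.
STACK = """\
by 0x3DAE8461A0: xmlParseElement (parser.c:9542)
by 0x3DAE84D131: xmlParseDocument (parser.c:10204)
by 0x3DAE84DF0E: xmlSAXUserParseFile (parser.c:13591)
by 0x126B1B39: yum_xml_parse_primary (xml-parser.c:593)
by 0x126B3EC3: py_update (sqlitecache.c:420)
by 0x126B4749: py_update_primary (sqlitecache.c:578)
"""

# Assumed mapping from source file to component, based on this report.
COMPONENT_BY_FILE = {
    "parser.c": "libxml2",
    "xml-parser.c": "yum-metadata-parser",
    "sqlitecache.c": "yum-metadata-parser",
}

FRAME = re.compile(r"(?:at|by) 0x[0-9A-Fa-f]+: (\w+) \(([\w.-]+):\d+\)")


def first_caller_outside(stack, library):
    """Return (function, component) of the first frame not owned by `library`."""
    for line in stack.splitlines():
        match = FRAME.search(line)
        if not match:
            continue
        function, source_file = match.groups()
        component = COMPONENT_BY_FILE.get(source_file, "unknown")
        if component != library:
            return function, component
    return None


caller = first_caller_outside(STACK, "libxml2")
```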

Comment 8 Jiri Popelka 2011-11-02 17:14:09 UTC
The summary so far is:

I'm not able to reproduce the crash on RHEL-6.1 (x86_64) and I don't see any memory leaks in python-lxml, so I'm inclined to close this BZ as WORKSFORME. What do you think?

Comment 9 Divya 2011-12-15 12:54:34 UTC
Jiri Popelka, please make sure that you are using the correct revision of the source code, i.e. revision 2139.

"hg clone https://www.renditionsoftware.com/hg/public/systemstudio -r 2139"

Comment 10 Jiri Popelka 2011-12-16 15:55:01 UTC
Yes, I'm able to reproduce the segfault with revision 2139.
According to the systemstudio changelog,
the problem seems to be (actually was) in systemstudio:

# hg log -r 2140
changeset:   2140:37679abc80b7
branch:      trunk
user:        Kay Williams
date:        Tue Jan 25 17:35:24 2011 -0800
summary:     fixed lxml segfaults with rhel6, also fixed schema validation


I'm closing this ticket as NOTABUG because the problem is (was) in
third-party software and, per comment #7, it's not python-lxml that leaks memory.