Bug 1328299

Summary: Memory leak and occasional segfault in qdrouterd when (un)installing a package from Satellite6
Product: Red Hat Satellite
Component: katello-agent
Version: 6.1.6
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Target Milestone: Unspecified
Target Release: Unused
Keywords: Triaged
Reporter: Brad Buckingham <bbuckingham>
Assignee: satellite6-bugs <satellite6-bugs>
QA Contact: Katello QA List <katello-qa-list>
CC: anazmy, bbuckingham, cduryee, egolov, katello-bugs, katello-qa-list, mlinden, pmoravec, sthirugn, tross
Doc Type: Bug Fix
Type: Bug
Clone Of: 1312419
Bug Depends On: 1312419
Last Closed: 2016-09-26 16:01:45 UTC

Comment 2 Pavel Moravec 2016-08-10 12:57:31 UTC
Some valgrind tests showed that:

- valgrind itself did not identify a direct leak; rather, memory accumulated as "still reachable"

- I ran a short test and a longer one; the valgrind reports are attached at https://bugzilla.redhat.com/show_bug.cgi?id=1312419#c61 / https://bugzilla.redhat.com/show_bug.cgi?id=1312419#c62

- comparing the outputs, there are a few "still reachable" areas where the longer test has bigger figures, e.g.:

==14354== 272,640 bytes in 355 blocks are still reachable in loss record 6,134 of 6,135
==14354==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==14354==    by 0x5967FD8: dictresize (dictobject.c:652)
==14354==    by 0x59C5DE3: PyEval_EvalFrameEx (ceval.c:1780)
==14354==    by 0x59CA1EC: PyEval_EvalCodeEx (ceval.c:3330)
==14354==    by 0x59571BC: function_call (funcobject.c:526)
==14354==    by 0x59320C2: PyObject_Call (abstract.c:2529)
==14354==    by 0x59410B4: instancemethod_call (classobject.c:2602)
==14354==    by 0x59320C2: PyObject_Call (abstract.c:2529)
==14354==    by 0x5989186: slot_tp_init (typeobject.c:5692)
==14354==    by 0x5987E9E: type_call (typeobject.c:745)
==14354==    by 0x59320C2: PyObject_Call (abstract.c:2529)
==14354==    by 0x59C638B: do_call (ceval.c:4316)
==14354==    by 0x59C638B: call_function (ceval.c:4121)
==14354==    by 0x59C638B: PyEval_EvalFrameEx (ceval.c:2740)

in the longer test, vs.:

==20371== 83,712 bytes in 109 blocks are still reachable in loss record 6,012 of 6,019
==20371==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20371==    by 0x5967FD8: dictresize (dictobject.c:652)
==20371==    by 0x59C5DE3: PyEval_EvalFrameEx (ceval.c:1780)
==20371==    by 0x59CA1EC: PyEval_EvalCodeEx (ceval.c:3330)
==20371==    by 0x59571BC: function_call (funcobject.c:526)
==20371==    by 0x59320C2: PyObject_Call (abstract.c:2529)
==20371==    by 0x59410B4: instancemethod_call (classobject.c:2602)
==20371==    by 0x59320C2: PyObject_Call (abstract.c:2529)
==20371==    by 0x5989186: slot_tp_init (typeobject.c:5692)
==20371==    by 0x5987E9E: type_call (typeobject.c:745)
==20371==    by 0x59320C2: PyObject_Call (abstract.c:2529)
==20371==    by 0x59C638B: do_call (ceval.c:4316)
==20371==    by 0x59C638B: call_function (ceval.c:4121)
==20371==    by 0x59C638B: PyEval_EvalFrameEx (ceval.c:2740)

in the shorter test.
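
To make the comparison concrete, the two summary lines above can be reduced to a bytes-per-block figure and a growth figure. This is only a minimal sketch: the regular expression and the helper name parse_still_reachable are illustrative assumptions, not part of valgrind or of the attached reports.

```python
import re

# Matches valgrind memcheck summary lines of the form quoted above, e.g.
# ==14354== 272,640 bytes in 355 blocks are still reachable in loss record ...
SUMMARY_RE = re.compile(
    r"^==\d+==\s+([\d,]+) bytes in ([\d,]+) blocks are still reachable"
)


def parse_still_reachable(line):
    """Return (total_bytes, block_count) for a summary line, or None."""
    match = SUMMARY_RE.match(line)
    if match is None:
        return None
    total_bytes = int(match.group(1).replace(",", ""))
    blocks = int(match.group(2).replace(",", ""))
    return total_bytes, blocks


# The two summary lines quoted in this comment.
longer = parse_still_reachable(
    "==14354== 272,640 bytes in 355 blocks are still reachable "
    "in loss record 6,134 of 6,135"
)
shorter = parse_still_reachable(
    "==20371== 83,712 bytes in 109 blocks are still reachable "
    "in loss record 6,012 of 6,019"
)

# Bytes per block: equal values suggest the same allocation repeated.
longer_per_block = longer[0] // longer[1]
shorter_per_block = shorter[0] // shorter[1]

# Growth between the shorter and the longer run.
extra_blocks = longer[1] - shorter[1]
extra_bytes = longer[0] - shorter[0]
```

Both records work out to 768 bytes per block, and the longer run holds 246 more blocks (188,928 bytes) from the same stack. That suggests one fixed-size allocation in dictresize being repeated and kept alive, rather than a few allocations growing. A block of 768 bytes would be consistent with a 32-slot dict table at 24 bytes per entry on 64-bit CPython 2.7, but that is an inference and not something valgrind reports.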

Comment 3 Pavel Moravec 2016-08-10 13:04:24 UTC
Also, a memory leak is seen in the following scenario:

Python client (just sending messages via qdrouterd to a qpidd exchange via link routing):

#!/usr/bin/python
# Reproducer: keep sending messages through qdrouterd to the "pulp.ex"
# exchange and reconnect whenever the connection is reported as broken.

from time import sleep
from uuid import uuid4

from proton import ConnectionException
from proton import Message

from proton.utils import BlockingConnection

import traceback
import random

conn = BlockingConnection("proton+amqp://0.0.0.0:5648", ssl_domain=None, heartbeat=2)

while True:
  try:
    # Create a new, uniquely named sender link on every pass.
    snd = conn.create_sender("pulp.ex", name=str(uuid4()))
    while True:
      snd.send(Message(body="test", durable=True))
      print("msg sent")
      sleep(random.uniform(0.01,0.1))
  except ConnectionException:
    # Only ConnectionException is handled here; any other exception
    # propagates and ends the script.
    try:
      if conn:
        conn.close()
        conn = BlockingConnection("proton+amqp://0.0.0.0:5648", ssl_domain=None, heartbeat=2)
    except Exception as e:
      print(e)
      print(traceback.format_exc())


Run it in one terminal and, in another one, restart the qpidd service from time to time (that drops the links and breaks the script, so run the script in a loop).

I am not sure whether this is the same leak or not; feel free to create a new BZ if it is a different leak.
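
To see whether qdrouterd's memory actually grows while the scenario runs, its resident set size can be sampled from /proc. The following is a rough sketch under the assumption of a Linux host; the helper names parse_vm_rss_kb and sample_rss are hypothetical and not part of any qdrouterd or proton tooling.

```python
import re
import time


def parse_vm_rss_kb(status_text):
    """Extract VmRSS in kB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, samples, interval_s):
    """Return `samples` VmRSS readings (kB) for `pid`, one per interval."""
    readings = []
    for _ in range(samples):
        # Linux-only: the kernel exposes per-process memory stats here.
        with open("/proc/%d/status" % pid) as status_file:
            readings.append(parse_vm_rss_kb(status_file.read()))
        time.sleep(interval_s)
    return readings
```

For example, calling sample_rss(pid, 60, 10) with the qdrouterd PID while the client runs and qpidd is restarted a few times gives ten minutes of readings. A steadily rising series that does not level off would point to a leak, although RSS alone cannot tell which allocation is responsible.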