Bug 1754314
Summary: | memory leak in qpid-proton 0.28.0-1 libraries used by goferd when connection to qdrouterd is bounced | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Pavel Moravec <pmoravec> | ||||
Component: | Qpid | Assignee: | Mike Cressman <mcressma> | ||||
Status: | CLOSED ERRATA | QA Contact: | Radovan Drazny <rdrazny> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 6.5.0 | CC: | aeladawy, bkearney, bvassova, christian.klier, cjansen, dsynk, fhirtz, gkadam, gmurthy, gpadholi, gpayelka, greartes, hmore, jalviso, kagarwal, ktordeur, kupadhya, mawerner, mcressma, mkalyat, mmccune, momran, mschibli, mvanderw, patalber, pcreech, pdwyer, rcavalca, sadas, saydas, shisingh, skudupud, spetrosi, sraut, vmeghana, wclark, whitedm | ||||
Target Milestone: | 6.7.0 | Keywords: | Regression, Triaged | ||||
Target Release: | Unused | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | hotfix_delivered | ||||||
Fixed In Version: | qpid-proton-0.28.0-2.{el6,el7,el8} | Doc Type: | Known Issue | ||||
Doc Text: |
Satellite hosts that use katello-agent might experience a memory leak caused by the qpid-proton package.
|
Story Points: | --- | ||||
Clone Of: | |||||||
: | 1769895 1774268 (view as bug list) | Environment: | |||||
Last Closed: | 2020-04-14 13:25:37 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Pavel Moravec
2019-09-22 18:22:28 UTC
Reproducer script outside Satellite:

1) Have qdrouterd link-route everything (or at least the prefix pulp.*) to qpidd.
2) Have the queue pulp.agent.TEST.2 on qpidd.
3) Scenarios:
 - A: use SSL in both qdrouterd and the client program (its code is below), run the client and restart qdrouterd frequently. The client will reconnect automatically.
 - B: disable SSL in qdrouterd, leave it enabled in the client, and run the client; it will repeatedly fail to connect, as qdrouterd rejects the "SSL rubbish" on a plain AMQP connection.
 - C: disable SSL in the client as well (just set "SSL = False" in the client code), run the client and restart qdrouterd frequently. The client will reconnect automatically.

In any of the A, B or C scenarios:
 - with 0.26.0-3.el7 proton libraries on the client, there is no memory leak (a tiny memory growth is observed, sometimes stabilising after 15 minutes)
 - with 0.28.0-1.el7 proton libraries on the client, an evident memory leak is observed

The script itself:

    from proton import Timeout
    from proton.utils import BlockingConnection
    from proton import SSLDomain
    from time import sleep
    from uuid import uuid4
    from gofer.config import Config

    RHSM_CONFIG_PATH = '/etc/rhsm/rhsm.conf'
    SSL = True
    SSL_S = 'amqps' if SSL else 'proton+amqp'

    # client certificate setup, used only when SSL is True
    domain = SSLDomain(SSLDomain.MODE_CLIENT)
    domain.set_trusted_ca_db('/etc/rhsm/ca/katello-default-ca.pem')
    domain.set_credentials('/etc/pki/consumer/bundle.pem', '/etc/pki/consumer/bundle.pem', None)
    domain.set_peer_authentication(SSLDomain.ANONYMOUS_PEER)

    rhsm_conf = Config(RHSM_CONFIG_PATH)
    ROUTER_ADDRESS = '%s://pmoravec-sat65-on-rhev.gsslab.brq2.redhat.com:5647' % SSL_S
    ADDRESS = "pulp.agent.TEST.2"
    HEARTBEAT = 5
    SLEEP = 5
    recv = None
    conn = None

    while True:
        subscribed = False
        # keep trying to connect and subscribe until it succeeds
        while not subscribed:
            try:
                conn = BlockingConnection(ROUTER_ADDRESS, ssl_domain=domain if SSL else None, heartbeat=HEARTBEAT)
                recv = conn.create_receiver(ADDRESS, name=str(uuid4()), dynamic=False, options=None)
                subscribed = True
            except Exception, e:
                print "received exception %s on connect/subscribe, trying again in 0.5s" % e
                sleep(0.5)
        print "connected => running"
        # receive until an error occurs, then clean up and reconnect
        while subscribed:
            try:
                print recv.receive(SLEEP)
            except Timeout:
                pass
            except Exception, e:
                print e
                try:
                    recv.close()
                    recv = None
                except:
                    pass
                try:
                    conn.close()
                    conn = None
                except:
                    pass
                subscribed = False

(The A and C reproducer scenarios differ only in the use of SSL, which shows that the proton memory leak is not in the SSL part of the proton code.)

Good morning! How goes progress on a test build and/or candidate?

Thanks again,
Frank.

Created attachment 1632607
RHEL7 Hotfix RPMs
Hotfix is available for RHEL7. To install:
1. Download attached file qpid-proton-HF1754314-RHEL7.tar.gz and extract it
2. Copy the two RPMs inside the archive to each affected RHEL7 gofer client
3. on each client, # yum localinstall ./python-qpid-proton-0.28.0-2.el7.x86_64.rpm ./qpid-proton-c-0.28.0-2.el7.x86_64.rpm
4. on each client, # systemctl restart goferd
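After the install steps above, it can help to confirm that a client really carries the fixed build. The following is a minimal sketch, not part of the hotfix: the helper names are hypothetical, and it only checks "at least 0.28.0-2" as listed under "Fixed In Version". Note that 0.26.0-3, which the report describes as not leaking, also compares as older than the fixed build.

```python
import re
import subprocess

# qpid-proton 0.28.0-2, taken from the "Fixed In Version" field of this bug
FIXED_BUILD = (0, 28, 0, 2)


def parse_version_release(text):
    """Turn a string such as '0.28.0-2.el7' into (0, 28, 0, 2); None if unparseable."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", text.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def at_least_fixed_build(text):
    """True when the version-release string is 0.28.0-2 or newer."""
    parsed = parse_version_release(text)
    return parsed is not None and parsed >= FIXED_BUILD


def installed_version_release(package="python-qpid-proton"):
    """Ask rpm for VERSION-RELEASE of an installed package (raises if not installed)."""
    output = subprocess.check_output(
        ["rpm", "-q", "--qf", "%{VERSION}-%{RELEASE}", package])
    return output.decode("utf-8")
```

On a client, `at_least_fixed_build(installed_version_release())` should return True once the hotfix RPMs are installed; the same check can be repeated for `qpid-proton-c`.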
Tested with python-qpid-proton-0.28.0-2.el7.x86_64 from the Sat 6.7 Snap 5 Sat Tools, using the reproducer from the initial report with option 3 and lowered DELAY and MAX_DELAY vars. After an initial memory initialisation, memory usage settled and remained completely constant, even after a few hundred failed attempts to connect.

VERIFIED

If you report a memory leak on the 0.28.0-2 version:
1) confirm what the symptoms are (was qdrouterd restarted? are the goferd logs as described?)
2) check whether https://bugzilla.redhat.com/show_bug.cgi?id=1810549 is being hit instead (a different scenario, present in any recent qpid-proton version)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1454
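To tell a genuine leak from the initial growth mentioned in the verification comment above, one option is to sample the resident memory of the client (or goferd) process over time. This is a rough sketch under stated assumptions: it is Linux only, reads VmRSS from /proc/<pid>/status, and the function names, sampling interval and sample count are illustrative choices rather than anything used in this bug.

```python
import time


def rss_kib(pid):
    """Return the VmRSS of a process in KiB, read from /proc/<pid>/status."""
    with open("/proc/%d/status" % pid) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return None  # no VmRSS line, e.g. for kernel threads


def sample_rss(pid, interval=60, count=30):
    """Collect (timestamp, rss_kib) pairs, sleeping `interval` seconds between samples."""
    samples = []
    for index in range(count):
        if index:
            time.sleep(interval)
        samples.append((time.time(), rss_kib(pid)))
    return samples


def growth_per_minute(samples):
    """Average RSS growth in KiB per minute between the first and last sample."""
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (r1 - r0) * 60.0 / (t1 - t0)
```

Since the original report saw small growth that sometimes stabilised only after about 15 minutes even on the unaffected 0.26.0-3 build, it seems reasonable to discard the first samples and judge the trend on the remainder while qdrouterd is being restarted.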