Bug 1366232
| Field | Value |
|---|---|
| Summary | qdrouterd segfault with "double free or corruption" in pn_class_decref |
| Product | Red Hat Satellite |
| Component | katello-agent |
| Version | 6.2.0 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | medium |
| Reporter | Pavel Moravec <pmoravec> |
| Assignee | Ted Ross <tross> |
| QA Contact | Perry Gagne <pgagne> |
| CC | adam, bkearney, cduryee, chartwel, chrobert, egolov, erinn.looneytriggs, jberry86, jbubeck, jhutar, ktordeur, mcressma, mmccune, nmiao, omaciel, oshtaier, pgagne, pmoravec, prsharma, psuriset, sramacha, tross |
| Target Milestone | Unspecified |
| Target Release | Unused |
| Hardware | x86_64 |
| OS | Linux |
| Fixed In Version | qpid-dispatch-0.4-17 |
| Type | Bug |
| Last Closed | 2016-11-10 08:13:35 UTC |

Description
Pavel Moravec
2016-08-11 10:38:48 UTC
Standalone reproducer:

1) Set up link routing to qpidd to route pulp.*

2) Run the script below 10 times in parallel. It tries to create a receiver via qdrouterd/qpidd, but the broker does not have such a queue (i.e. qpidd prints a "Node not found" error):

#!/usr/bin/python
from time import sleep
from uuid import uuid4
from proton.utils import BlockingConnection, LinkDetached

routerURL = "proton+amqp://0.0.0.0:5648"

conn = BlockingConnection(routerURL, ssl_domain=None, heartbeat=2)
while True:
    sleep(0.05)
    try:
        rcv = conn.create_receiver("pulp."+str(uuid4()), name=str(uuid4()))
        rcv.close()
    except LinkDetached, e:
        print e
        if conn:
            conn.close()
        conn = BlockingConnection(routerURL, ssl_domain=None, heartbeat=2)
<end-of-the-script>

This segfault is usually not expected to happen in a Sat6 environment, since it relies on a _missing_ pulp.agent.* queue that goferd tries to subscribe to. Usually, goferd should create its queue during startup.

*** Bug 1366231 has been marked as a duplicate of this bug. ***

May need to keep this assigned to tross. The mitigation possible by goferd is to re-create the queue when getting LinkDetached with condition = amqp:not-found. This means goferd could still try to create a receiver (Link) when the queue does not exist and crash the router.

Note: This can only happen in cases where the queue existed (or was created by goferd on startup) and then disappeared.

(In reply to Jeff Ortel from comment #8)
> May need to keep this assigned to tross. The mitigation possible by goferd
> is to re-create the queue when getting LinkDetached with condition =
> amqp:not-found. This means goferd could still try to create a receiver
> (Link) when the queue does not exist and crash the router.
>
> Note: This can only happen in cases where the queue existed (or was created
> by goferd on startup) and then disappeared.

+1. The primary problem is qdrouterd segfaulting in some scenario.

goferd can be improved as Jeff suggests, since repeated link failures from the same agent increase the probability of the failure/segfault.

Created attachment 1193891
(gdb) thread apply all bt
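
The goferd-side mitigation discussed above (re-create the queue when a receiver attach is rejected with amqp:not-found, instead of retrying the attach over and over) could look roughly like the sketch below. This is not goferd's actual code. `LinkDetachedStub`, `FakeBroker` and `attach_with_queue_recreate` are hypothetical names invented for illustration, and the stub exception only assumes that the real LinkDetached exposes the detach condition name in some form. The sketch uses no proton calls so that it can run on its own.

```python
AMQP_NOT_FOUND = "amqp:not-found"


class LinkDetachedStub(Exception):
    """Hypothetical stand-in for a link-detached error carrying a condition name."""

    def __init__(self, condition):
        super().__init__("link detached: %s" % condition)
        self.condition = condition


def attach_with_queue_recreate(create_receiver, create_queue, address, max_attempts=3):
    """Attach a receiver; on amqp:not-found, re-create the queue before retrying.

    create_receiver(address) and create_queue(address) are caller-supplied
    callables (hypothetical; in a real agent they would wrap the messaging API).
    """
    for _ in range(max_attempts):
        try:
            return create_receiver(address)
        except LinkDetachedStub as e:
            if e.condition != AMQP_NOT_FOUND:
                raise  # unrelated detach reason: do not mask it
            create_queue(address)  # queue disappeared: declare it again, then retry
    raise RuntimeError("could not attach to %s after %d attempts" % (address, max_attempts))


class FakeBroker(object):
    """In-memory broker used only to exercise the retry logic."""

    def __init__(self):
        self.queues = set()
        self.failed_attaches = 0

    def create_queue(self, address):
        self.queues.add(address)

    def create_receiver(self, address):
        if address not in self.queues:
            self.failed_attaches += 1
            raise LinkDetachedStub(AMQP_NOT_FOUND)
        return "receiver:" + address


broker = FakeBroker()
receiver = attach_with_queue_recreate(
    broker.create_receiver, broker.create_queue, "pulp.agent.demo")
print(receiver, broker.failed_attaches)
```

As noted in the comments, this only reduces the number of failed link attaches coming from one agent. The first attach against a missing queue still reaches the router, so the router-side fix remains the primary one.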
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:2699

*** Bug 1385890 has been marked as a duplicate of this bug. ***