Bug 1713560 - qpidd segfault when processing QMF message from closed connection
Summary: qpidd segfault when processing QMF message from closed connection
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 3.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: 3.2.13
Target Release: ---
Assignee: Mike Cressman
QA Contact: Zdenek Kraus
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-05-24 06:50 UTC by Pavel Moravec
Modified: 2019-07-15 07:54 UTC (History)

Fixed In Version: qpid-cpp-1.36.0-22
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-15 07:54:52 UTC
Target Upstream Version:
Embargoed:


Attachments
reproducer client (1.27 KB, application/gzip)
2019-05-24 06:55 UTC, Pavel Moravec


Links
System ID Private Priority Status Summary Last Updated
Apache JIRA QPID-8319 0 None None None 2019-05-31 18:27:31 UTC
Red Hat Product Errata RHBA-2019:1770 0 None None None 2019-07-15 07:54:57 UTC

Description Pavel Moravec 2019-05-24 06:50:21 UTC
Description of problem:
User story: run two concurrent instances of a program that:
1) Creates a queue on the broker "HelloQueue"
2) Creates a second queue called "HelloQueue.AutoDelete" with auto-delete set and alternate exchange set to "qmf.default.direct", and holds open the Receiver that is subscribed to it.
3) Puts a QMF message into the "HelloQueue.AutoDelete" queue that will delete the "HelloQueue" queue when it is processed.
4) Waits 10 seconds.
5) Closes the receiver, triggering the auto-delete of "HelloQueue.AutoDelete".

When "HelloQueue.AutoDelete" is deleted, the QMF message is rerouted to "qmf.default.direct" because of the alternate exchange, which results in the deletion of "HelloQueue" regardless of whether other subscribers are still connected to it. With high probability, the broker then attempts to process the second QMF request, which comes from the connection that was just dropped, and segfaults.


Version-Release number of selected component (if applicable):
qpid-cpp 1.36.0-15 (also -21 and -21+hf2); I expect any version is affected


How reproducible:
75% in my case


Steps to Reproduce:
1. Compile attached program.
2. qpidd &
3. ./QmfBrokerCrashRepro localhost:5672 & ./QmfBrokerCrashRepro localhost:5672 &


Actual results:
The client program aborts every time (unhandled exception, which is not a concern here), but very often qpidd segfaults as well, with this backtrace:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f9b5cdca752 in qpid::management::(anonymous namespace)::ScopedManagementContext::getUserId (this=<value optimized out>)
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/management/ManagementAgent.cpp:105
#2  0x00007f9b5cde8055 in qpid::management::ManagementAgent::dispatchAgentCommand (this=0x1680930, msg=..., viaLocal=true)
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/management/ManagementAgent.cpp:2347
#3  0x00007f9b5cde8958 in qpid::management::ManagementAgent::dispatchCommand (this=0x1680930, deliverable=<value optimized out>, routingKey="broker", topic=false, qmfVersion=2)
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/management/ManagementAgent.cpp:1255
#4  0x00007f9b5cdfb219 in qpid::broker::ManagementDirectExchange::route (this=0x168b6f0, msg=...) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/management/ManagementDirectExchange.cpp:48
#5  0x00007f9b5cccfa2a in qpid::broker::Exchange::routeWithAlternate (this=0x168b768, msg=...) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Exchange.cpp:410
#6  0x00007f9b5ccfddb5 in qpid::broker::Queue::reroute (e=<value optimized out>, m=<value optimized out>) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1761
#7  0x00007f9b5ccfe006 in qpid::broker::Queue::abandoned (this=0x16ba740, message=<value optimized out>) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1156
#8  0x00007f9b5ccf16cd in operator() (this=0x16ba740, maxCount=0, p=..., f=..., type=<value optimized out>, triggerAutoDelete=false, maxTests=0)
    at /usr/include/boost/function/function_template.hpp:1013
#9  qpid::broker::Queue::remove (this=0x16ba740, maxCount=0, p=..., f=..., type=<value optimized out>, triggerAutoDelete=false, maxTests=0)
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:795
#10 0x00007f9b5ccf49d5 in qpid::broker::Queue::destroyed (this=0x16ba740) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1167
#11 0x00007f9b5cd73b09 in qpid::broker::QueueRegistry::destroyIfUntouched (this=0x167f2f8, targetQ=<value optimized out>, version=<value optimized out>, connectionId="", userId="")
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/QueueRegistry.cpp:156
#12 0x00007f9b5ccee336 in qpid::broker::Queue::tryAutoDelete (this=0x16ba740, expectedVersion=1) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1358
#13 0x00007f9b5ccee834 in qpid::broker::Queue::scheduleAutoDelete (this=0x16ba740, immediate=false) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1342
#14 0x00007f9b5ccef626 in qpid::broker::Queue::cancel (this=0x16ba740, c=..., connectionId="qpid.[::1]:5672-[::1]:54658", userId="anonymous@QPID")
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:638
#15 0x00007f9b5cd90eca in qpid::broker::SemanticState::cancel (this=0x7f9b4c00a078, c=...) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/SemanticState.cpp:475
#16 0x00007f9b5cd98775 in qpid::broker::SemanticState::closed (this=0x7f9b4c00a078) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/SemanticState.cpp:111
#17 0x00007f9b5cdb0301 in qpid::broker::SessionState::~SessionState (this=0x7f9b4c009eb0, __in_chrg=<value optimized out>)
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/SessionState.cpp:107
#18 0x00007f9b5cdb08a9 in qpid::broker::SessionState::~SessionState (this=0x7f9b4c009eb0, __in_chrg=<value optimized out>)
    at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/SessionState.cpp:110
#19 0x00007f9b5cdb5c44 in ~auto_ptr (this=0x7f9b4c009d00) at /usr/include/c++/4.4.7/backward/auto_ptr.h:168
#20 qpid::broker::SessionHandler::handleDetach (this=0x7f9b4c009d00) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/SessionHandler.cpp:110
#21 0x00007f9b5cd1b564 in qpid::broker::amqp_0_10::Connection::closed (this=0x7f9b4c003e30) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/amqp_0_10/Connection.cpp:378
#22 0x00007f9b5c7f374d in qpid::sys::AsynchIOHandler::disconnect (this=0x168f270) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/sys/AsynchIOHandler.cpp:201
#23 0x00007f9b5c7f3ca9 in qpid::sys::AsynchIOHandler::eof (this=0x168f270, a=<value optimized out>) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/sys/AsynchIOHandler.cpp:184
#24 0x00007f9b5c770e3a in operator() (this=0x168fc90, h=...) at /usr/include/boost/function/function_template.hpp:1013
#25 qpid::sys::posix::AsynchIO::readable (this=0x168fc90, h=...) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/sys/posix/AsynchIO.cpp:486
#26 0x00007f9b5c7f79e3 in boost::function1<void, qpid::sys::DispatchHandle&>::operator() (this=<value optimized out>, a0=<value optimized out>)
    at /usr/include/boost/function/function_template.hpp:1013
#27 0x00007f9b5c7f6676 in qpid::sys::DispatchHandle::processEvent (this=0x168fc98, type=qpid::sys::Poller::READABLE) at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/sys/DispatchHandle.cpp:280
..

Here, the context (of type qpid::broker::amqp_0_10::Connection) points to the second client connection, which was dropped. Qpid trace logs show that the connection was already closed and its management object deleted, but is a reference still being kept because of this QMF method?
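
The suspected mechanism can be illustrated with a minimal, self-contained C++ sketch. The types Connection and Message and the functions userIdOf and rerouteAfterClose are hypothetical stand-ins, not the real qpid-cpp classes. The sketch only assumes that a queued message remembers its publishing connection and that the user id is looked up after that connection is gone. A std::weak_ptr is used here so that the "publisher already closed" case is detectable instead of leaving a dangling pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins; none of these are the real qpid-cpp classes.
struct Connection {
    std::string userId;
};

struct Message {
    std::string body;
    // Who published the message. A weak_ptr makes "publisher already gone"
    // observable instead of leaving a dangling raw pointer behind.
    std::weak_ptr<Connection> publisher;
};

// Returns the publisher's user id, or an empty string when the publishing
// connection has already been destroyed.
std::string userIdOf(const Message& m) {
    if (std::shared_ptr<Connection> c = m.publisher.lock()) {
        return c->userId;
    }
    return std::string();
}

// Simulates the order of events in this report: a message is queued, its
// connection closes, and only afterwards is the user id looked up.
std::string rerouteAfterClose() {
    std::shared_ptr<Connection> conn = std::make_shared<Connection>();
    conn->userId = "anonymous@QPID";

    Message queued;
    queued.body = "QMF delete request";
    queued.publisher = conn;

    conn.reset();             // the client connection is closed and destroyed
    return userIdOf(queued);  // lookup happens later, at reroute time
}
```

In this model the late lookup yields an empty user id rather than a crash; with a plain non-owning pointer the same sequence would dereference freed memory.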


Expected results:
no segfault


Additional info:

Comment 1 Pavel Moravec 2019-05-24 06:55:01 UTC
Created attachment 1572802
reproducer client

Comment 2 Pavel Moravec 2019-05-24 07:19:43 UTC
An even simpler reproducer:

- give the auto-delete queue a timeout (in the code, use ".. auto-delete:True, arguments:{'qpid.auto_delete_timeout':10}")
- run the client program just once

Explanation:
- the client's connection will already have been gone for some time when the auto-delete happens
- so the message rerouted to the QMF exchange will refer to an invalid connection

In short, the handling of QMF methods and requests does not account for already closed connections.
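
For illustration, the queue declaration for this simpler reproducer might be built as below. This is a sketch under assumptions: the attached reproducer's actual address string is not shown in this report, the helper name autoDeleteAddress is hypothetical, and the surrounding options (create, node, type, x-declare, alternate-exchange) follow the qpid::messaging address-string convention rather than anything quoted here. Only the auto-delete and qpid.auto_delete_timeout parts come from the fragment above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds an address string for an auto-delete queue
// with an alternate exchange and an auto-delete timeout (in seconds).
std::string autoDeleteAddress(const std::string& queue,
                              const std::string& altExchange,
                              int timeoutSeconds) {
    return queue +
           "; {create: always, node: {type: queue, x-declare: {"
           "auto-delete: True, "
           "alternate-exchange: '" + altExchange + "', "
           "arguments: {'qpid.auto_delete_timeout': " +
           std::to_string(timeoutSeconds) + "}}}}";
}
```

Calling autoDeleteAddress("HelloQueue.AutoDelete", "qmf.default.direct", 10) gives a string that could be passed to a receiver's address, assuming the broker accepts these options as described.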

Comment 3 Chuck Rolke 2019-05-31 17:04:18 UTC
Research shows that function Queue::remove (stack frame #9) contains a comment written by Gordon Sim in 2012:
         
    if (f) f(*i);//ERROR? need to clear old persistent context?

Clearing the message's publisher context seems to avoid the crash.
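
A minimal sketch of that idea, using hypothetical stand-in types (Publisher, Message) and hypothetical functions (drain, userIdOrEmpty) rather than the real qpid-cpp code: the publisher context is cleared on each removed message before the callback sees it, and the consumer treats a missing publisher as "no user id". This only models the first proposal described here; it is not a copy of the upstream change.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

// Hypothetical stand-ins, not the real qpid-cpp types.
struct Publisher {
    std::string userId;
};

struct Message {
    std::string body;
    const Publisher* publisher = nullptr;  // non-owning; may go stale while queued
};

// Removes every message from the queue and hands each one to `f`.
// The publisher context is cleared first, so `f` can never follow a stale pointer.
std::size_t drain(std::deque<Message>& queue,
                  const std::function<void(Message&)>& f) {
    std::size_t removed = 0;
    while (!queue.empty()) {
        Message m = queue.front();
        queue.pop_front();
        m.publisher = nullptr;  // drop the old context before rerouting
        if (f) f(m);
        ++removed;
    }
    return removed;
}

// A callback-side helper that tolerates a missing publisher.
std::string userIdOrEmpty(const Message& m) {
    return m.publisher ? m.publisher->userId : std::string();
}
```

The trade-off in this model is that the rerouted message loses its original user id, so anything downstream that authorizes by publisher would see an anonymous request.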

Comment 4 Mike Cressman 2019-06-11 21:17:19 UTC
The fix is now upstream; it differs a bit from the first proposal. See https://issues.apache.org/jira/browse/QPID-8319

Comment 5 Pavel Moravec 2019-06-17 07:10:00 UTC
Testing scratch build http://brew-task-repos.usersys.redhat.com/repos/scratch/mcressma/qpid-cpp/1.36.0/22.el6/qpid-cpp-1.36.0-22.el6-scratch.repo :

No segfault hit in repeated tests; the BZ seems fixed.

Comment 8 Zdenek Kraus 2019-07-08 11:31:19 UTC
Tested on RHEL 6 and 7 with the following package:

qpid-cpp-server-1.36.0-22

The fix works as expected.

->VERIFIED

Comment 10 errata-xmlrpc 2019-07-15 07:54:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1770

