Bug 601277

Summary: qpidd broker crash
Product: Red Hat Enterprise MRG
Reporter: Pete MacKinnon <pmackinn>
Component: qpid-cpp
Assignee: Ted Ross <tross>
Status: CLOSED CURRENTRELEASE
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: high
Priority: high
Version: Development
CC: freznice, kgiusti, pmackinn, tross
Target Milestone: 1.3
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-10-13 13:37:40 UTC

Description Pete MacKinnon 2010-06-07 15:38:16 UTC
Core file:

#0  0x00000038af030265 in raise () from /lib64/libc.so.6
(gdb) where
#0  0x00000038af030265 in raise () from /lib64/libc.so.6
#1  0x00000038af031d10 in abort () from /lib64/libc.so.6
#2  0x00000038af06a84b in __libc_message () from /lib64/libc.so.6
#3  0x00000038af0722ef in _int_free () from /lib64/libc.so.6
#4  0x00000038af07273b in free () from /lib64/libc.so.6
#5  0x00000038c1a9db6a in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string () from /usr/lib64/libstdc++.so.6
#6  0x0000003ddb2053c2 in ~RemoteAgent (this=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:97
#7  0x0000003ddb20cd67 in qpid::management::ManagementAgent::deleteOrphanedAgentsLH (this=<value optimized out>) at qpid/management/ManagementAgent.cpp:1410
#8  0x0000003ddb20cf9d in qpid::management::ManagementAgent::handleAttachRequestLH (this=<value optimized out>, inBuffer=<value optimized out>, 
    replyToKey=<value optimized out>, sequence=<value optimized out>, 
    connToken=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:1429
#9  0x0000003ddb211f4b in qpid::management::ManagementAgent::dispatchAgentCommandLH (this=<value optimized out>, msg=<value optimized out>, 
    viaLocal=<value optimized out>) at qpid/management/ManagementAgent.cpp:1933
#10 0x0000003ddb212576 in qpid::management::ManagementAgent::dispatchCommand (
    this=<value optimized out>, deliverable=<value optimized out>, 
    routingKey=<value optimized out>, topic=<value optimized out>, 
    qmfVersion=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:905
#11 0x0000003ddb21a99c in qpid::broker::ManagementTopicExchange::route (
    this=<value optimized out>, msg=<value optimized out>, 
    routingKey=<value optimized out>, args=<value optimized out>)
    at qpid/management/ManagementTopicExchange.cpp:50
#12 0x0000003ddb1bb273 in qpid::broker::SemanticState::route (
    this=<value optimized out>, msg=<value optimized out>, 
    strategy=<value optimized out>) at qpid/broker/SemanticState.cpp:461
#13 0x0000003ddb1bbf6d in qpid::broker::SemanticState::handle (
    this=<value optimized out>, msg=<value optimized out>)
    at qpid/broker/SemanticState.cpp:415
#14 0x0000003ddb1ddd2e in qpid::broker::SessionState::handleContent (
    this=<value optimized out>, frame=<value optimized out>, 
    id=<value optimized out>) at qpid/broker/SessionState.cpp:249
#15 0x0000003ddb1de280 in qpid::broker::SessionState::handleIn (
    this=<value optimized out>, frame=<value optimized out>)
    at qpid/broker/SessionState.cpp:327
#16 0x0000003ddabbd5f6 in qpid::amqp_0_10::SessionHandler::handleIn (
    this=<value optimized out>, f=<value optimized out>)
    at qpid/amqp_0_10/SessionHandler.cpp:93
#17 0x0000003ddb1225b9 in qpid::broker::Connection::received (
    this=<value optimized out>, frame=<value optimized out>)
    at qpid/framing/Handler.h:42
#18 0x0000003ddb0fec44 in qpid::amqp_0_10::Connection::decode (
    this=<value optimized out>, buffer=<value optimized out>, 
    size=<value optimized out>) at qpid/amqp_0_10/Connection.cpp:58
#19 0x0000003ddabef942 in qpid::sys::AsynchIOHandler::readbuff (
    this=<value optimized out>, buff=<value optimized out>)



/var/log/messages:
<snip>
Jun  7 11:31:47 mrg31 sesame[6946]: error Exception caught in sendBuffer: resource-limit-exceeded: resource-limit-exceeded: Policy exceeded on topic-nicaea.usersys.redhat.com.28048.1, policy: size: max=104857600, current=104857593; count: unlimited; type=reject (qpid/broker/QueuePolicy.cpp:85)
Jun  7 11:31:47 mrg31 sesame[6946]: warning Connection to the broker has been lost
Jun  7 11:31:52 mrg31 sesame[6946]: warning Exception received from broker: resource-limit-exceeded: resource-limit-exceeded: Policy exceeded on topic-nicaea.usersys.redhat.com.28048.1, policy: size: max=104857600, current=104857593; count: unlimited; type=reject (qpid/broker/QueuePolicy.cpp:85) [caused by 13 \x00:\x00]
Jun  7 11:31:52 mrg31 sesame[6946]: error Exception caught in sendBuffer: resource-limit-exceeded: resource-limit-exceeded: Policy exceeded on topic-nicaea.usersys.redhat.com.28048.1, policy: size: max=104857600, current=104857593; count: unlimited; type=reject (qpid/broker/QueuePolicy.cpp:85)
Jun  7 11:31:52 mrg31 sesame[6946]: warning Connection to the broker has been lost
Jun  7 11:31:53 mrg31 qpidd: *** glibc detected *** qpidd: double free or corruption (out): 0x00002aaab75ac9b0 ***

Comment 1 Pete MacKinnon 2010-06-07 15:40:23 UTC
mrg31
qpid-cpp-server-0.7.946106-2.el5

Comment 3 Pete MacKinnon 2010-06-09 13:01:16 UTC
Maybe the RemoteAgent destructor needs a lock?

ManagementAgent::RemoteAgent::~RemoteAgent ()
{
    QPID_LOG(trace, "Remote Agent removed bank=[" << brokerBank << "." << agentBank << "]");
    if (mgmtObject != 0) {
        mgmtObject->resourceDestroy();
        agent.deleteObjectNowLH(mgmtObject->getObjectId());
    }
}

Comment 4 Ted Ross 2010-06-09 18:20:25 UTC
Possibly fixed upstream at revision 953107.

There was a window of opportunity where, if an exception was thrown, a pointer
to deleted heap memory could have been used for a second delete.
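
The kind of window described above can be illustrated with a minimal C++ sketch. This is not the change made in revision 953107 and does not use the real qpid classes: Agent, AgentMap, notifyConsoles and removeAgent are hypothetical names chosen for illustration. The assumed unsafe order is "delete the object, do something that may throw, then erase the map entry"; if the middle step throws, the map still holds a pointer to freed memory and a later cleanup pass deletes it again. The sketch shows one way to close such a window by detaching ownership from the container before anything can throw.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a remote agent record (not the real qpid class).
struct Agent {
    std::string name;
    explicit Agent(const std::string& n) : name(n) {}
};

// Hypothetical notification step that can fail, for example because a
// console's queue is full and the broker rejects the message.
void notifyConsoles(bool congested) {
    if (congested) throw std::runtime_error("resource-limit-exceeded");
}

typedef std::map<int, std::unique_ptr<Agent> > AgentMap;

// Removes the agent registered under 'bank'. Returns false if no such agent.
// Ownership is moved out of the map and the entry is erased BEFORE the step
// that may throw, so the map never refers to memory that has been freed.
bool removeAgent(AgentMap& agents, int bank, bool congested) {
    AgentMap::iterator it = agents.find(bank);
    if (it == agents.end()) return false;

    std::unique_ptr<Agent> doomed(std::move(it->second)); // take ownership
    agents.erase(it);                                     // no stale entry remains
    assert(doomed.get() != 0);

    notifyConsoles(congested); // may throw; 'doomed' is still freed exactly once
    return true;
}
```

With this ordering, a call such as removeAgent(agents, 1, true) propagates the exception but leaves the map without the entry, so a second cleanup pass finds nothing to delete.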

Comment 5 Frantisek Reznicek 2010-06-14 08:43:25 UTC
Pete, Ted,
this bug does not describe the scenario (the conditions) under which the crash was observed.

Raising needinfo for you.

Comment 6 Ted Ross 2010-06-14 12:37:06 UTC
Frantisek,

I was unable to reproduce the failure but I think I know what was happening...

My theory is that an agent (it must be QMFv1, such as the Ruby agent or the Wallaby agent) disconnected and then immediately reconnected while a management console (e.g. mint or qpid-tool) was experiencing congestion (i.e. the console's private queue was full).

Here's a possible reproducer:

1) Run a broker and a QMFv1 agent (you could use an MRG 1.2 sesame for this).
2) Create a queue with a specific queue limit.
3) Bind the queue to exchange "qpid.management" with key "console.obj.1.0.org.apache.qpid.broker.agent".
4) Produce messages directly to the queue in sufficient quantity to fill the queue.
5) Disconnect and reconnect the agent.

-Ted
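
The role of the queue limit in steps 2 and 4 of the reproducer can be illustrated with a toy C++ model. This is only a sketch under stated assumptions, not Qpid's QueuePolicy implementation: BoundedQueue and fillUntilRejected are hypothetical names, and the rule "reject when the new message would push the total above the maximum" is assumed from the "type=reject" policy and the max/current values in the log above. It shows how a queue that is nearly full starts throwing on enqueue, which is the congestion condition the theory depends on.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <string>

// Toy model of a size-limited queue with a "reject" policy (hypothetical).
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t maxBytes)
        : maxBytes_(maxBytes), currentBytes_(0) {}

    // Accepts the message only if it fits under the size limit;
    // otherwise throws, mimicking a rejected publish.
    void enqueue(const std::string& msg) {
        if (currentBytes_ + msg.size() > maxBytes_)
            throw std::runtime_error("resource-limit-exceeded");
        messages_.push_back(msg);
        currentBytes_ += msg.size();
    }

    std::size_t currentBytes() const { return currentBytes_; }

private:
    std::size_t maxBytes_;
    std::size_t currentBytes_;
    std::deque<std::string> messages_;
};

// Step 4 of the reproducer in miniature: keep producing the same message
// until the queue rejects one. Returns the number of accepted messages.
std::size_t fillUntilRejected(BoundedQueue& q, const std::string& msg) {
    if (msg.empty()) return 0; // an empty message would never hit the limit
    std::size_t accepted = 0;
    try {
        for (;;) {
            q.enqueue(msg);
            ++accepted;
        }
    } catch (const std::runtime_error&) {
        // Queue is full for messages of this size.
    }
    return accepted;
}
```

In this model, once fillUntilRejected returns, any further message larger than the remaining space is rejected with an exception, so whatever code path publishes to that queue must be prepared for the throw.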