Bug 603839 - Concurrent tagging of message with trace id while message is delivered from another queue causes segfault
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.3
Target Release: ---
Assignee: Gordon Sim
QA Contact: Jiri Kolar
URL:
Whiteboard:
Depends On:
Blocks: 619919
Reported: 2010-06-14 16:44 UTC by Gordon Sim
Modified: 2018-10-27 12:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, federation routes that annotated a message with a trace ID could have caused a segmentation fault if that message was concurrently delivered from another queue. This situation has been fixed so that it no longer causes a segmentation fault if it arises.
Clone Of:
Clones: 619919
Environment:
Last Closed: 2010-10-14 16:00:02 UTC
Target Upstream Version:
Embargoed:


Attachments
Backported fix for 1.2 tree (5.12 KB, patch)
2010-06-15 17:36 UTC, Gordon Sim


Links
System: Red Hat Product Errata
ID: RHSA-2010:0773
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3
Last Updated: 2010-10-14 15:56:44 UTC

Description Gordon Sim 2010-06-14 16:44:21 UTC
Description of problem:

Federation routes that annotate a message with a trace id can cause segfaults if that message is concurrently delivered from another queue.

Version-Release number of selected component (if applicable):

qpidd-0.5.752581-34.el5

How reproducible:

Readily

Steps to Reproduce:
1. start two brokers (I used ports 5672 and 5673 in this example)
2. create several queues and bind them to an exchange with a given key (I only managed to trigger this with durable queues, though the segfault is not actually related to durability; it may just be a matter of timing). E.g.
    for q in `seq 1 10`; do qpid-config add queue queue-$q --durable; qpid-config bind amq.topic queue-$q my-key; done
3. create a federation route from the broker those queues are on to the other broker, using the same exchange and key. E.g.
    qpid-route route add localhost:5673 localhost:5672 amq.topic my-key
4. start receivers for all these queues and then send some durable messages to the exchange with the appropriate routing key
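
Steps 2 and 3 can be wrapped in a small script. This is a minimal sketch: the qpid-config and qpid-route invocations are copied from the steps above, while the function names, the run wrapper and the DRY_RUN switch are hypothetical additions for illustration. With DRY_RUN left at its default of 1 the commands are only printed. Starting the two brokers (step 1) and the receivers and sender (step 4) is left out, since the report does not give commands for them.

```shell
#!/bin/sh
# Sketch of steps 2 and 3 of the reproducer.
# DRY_RUN=1 (default, an assumption of this sketch) prints the commands;
# DRY_RUN=0 executes them against the running brokers.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print or execute a command line.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 2: create N durable queues (default 10) and bind each one to
# amq.topic with the key my-key.
setup_queues() {
    count=${1:-10}
    for q in $(seq 1 "$count"); do
        run qpid-config add queue "queue-$q" --durable
        run qpid-config bind amq.topic "queue-$q" my-key
    done
}

# Step 3: federation route using the same exchange and key,
# exactly as given in the steps above.
setup_route() {
    run qpid-route route add localhost:5673 localhost:5672 amq.topic my-key
}
```

Calling setup_queues followed by setup_route with DRY_RUN=0 runs the commands against the brokers from step 1; setup_queues takes the number of queues as an optional argument.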
  
Actual results:

Segfault

Expected results:

No segfault

Additional info:
Core was generated by `/usr/sbin/qpidd'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003d067ae09b in qpid::framing::FieldTable::encodedSize (this=0x2aaab0186810) at /usr/include/c++/4.1.2/memory:286

warning: Source file is more recent than executable.
286		return _M_ptr; 
(gdb) bt
#0  0x0000003d067ae09b in qpid::framing::FieldTable::encodedSize (this=0x2aaab0186810) at /usr/include/c++/4.1.2/memory:286
#1  0x0000003d066f5b51 in qpid::framing::MessageProperties::bodySize (this=0x2aaab018d518) at gen/qpid/framing/MessageProperties.cpp:191
#2  0x0000003d066f5bb9 in qpid::framing::MessageProperties::encode (this=0x2aaab0186810, buffer=...) at gen/qpid/framing/MessageProperties.cpp:135
#3  0x0000003d067a271e in encode (this=<value optimized out>, buffer=...) at qpid/framing/AMQHeaderBody.h:50
#4  qpid::framing::AMQHeaderBody::encode (this=<value optimized out>, buffer=...) at qpid/framing/AMQHeaderBody.cpp:30
#5  0x0000003d070c551c in qpid::amqp_0_10::Connection::encode (this=0x2aaaac048070, buffer=0x2aaaac79f900 "\v\001", size=<value optimized out>)
    at qpid/amqp_0_10/Connection.cpp:87
#6  0x0000003d067d4164 in qpid::sys::cyrus::CyrusSecurityLayer::encode (this=0x2aaaac78c000, buffer=0x2aaaac75bb30 "", size=65536)
    at qpid/sys/cyrus/CyrusSecurityLayer.cpp:76
#7  0x0000003d067c64b1 in qpid::sys::AsynchIOHandler::idle (this=0x2aaaac047fe0) at qpid/sys/AsynchIOHandler.cpp:206
#8  0x0000003d06772fca in boost::function1<void, qpid::sys::AsynchIO&, std::allocator<boost::function_base> >::operator() (this=0x0, a0=...)
    at /usr/include/boost/function/function_template.hpp:576
#9  0x0000003d0677171f in qpid::sys::posix::AsynchIO::writeable (this=0x2aaaac048530, h=...) at qpid/sys/posix/AsynchIO.cpp:562
#10 0x0000003d067cc8c7 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() (this=0x0, a0=...)
    at /usr/include/boost/function/function_template.hpp:576
#11 0x0000003d067ca5b9 in qpid::sys::DispatchHandle::processEvent (this=0x2aaaac048538, type=WRITABLE) at qpid/sys/DispatchHandle.cpp:439
#12 0x0000003d06780b93 in process (this=0x1b809da0) at qpid/sys/Poller.h:122
#13 qpid::sys::Poller::run (this=0x1b809da0) at qpid/sys/epoll/EpollPoller.cpp:409
#14 0x0000003d06776bca in qpid::sys::(anonymous namespace)::runRunnable (p=0x2aaab0186810) at qpid/sys/posix/Thread.cpp:35
#15 0x0000003d05a0673d in start_thread () from /lib64/libpthread.so.0
#16 0x0000003d04ed3d1d in clone () from /lib64/libc.so.6
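
The trace above covers only the faulting thread. Since the crash involves two threads touching the same message, a backtrace of every thread in the core may help show the other side of the race. A minimal sketch, assuming a gdb that supports -batch and -ex; the function name, the DRY_RUN switch and the core file path are hypothetical, and only /usr/sbin/qpidd comes from the output above.

```shell
#!/bin/sh
# DRY_RUN=1 (default, an assumption of this sketch) prints the gdb
# command line; DRY_RUN=0 runs it.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: dump backtraces of all threads from a core file.
#   $1  path to the executable that produced the core
#   $2  path to the core file (location depends on the system setup)
all_thread_backtrace() {
    exe=$1
    core=$2
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "gdb -batch -ex 'thread apply all bt' $exe $core"
    else
        gdb -batch -ex 'thread apply all bt' "$exe" "$core"
    fi
}
```

For example, all_thread_backtrace /usr/sbin/qpidd followed by the path of the core file prints the command that would be run.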

Comment 1 Gordon Sim 2010-06-15 16:38:41 UTC
Fixed on trunk (r954933) and in release repo (http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=a1cdf640e11415c3376c3e420d40113d2bcc723a).

Comment 2 Gordon Sim 2010-06-15 17:36:28 UTC
Created attachment 424245
Backported fix for 1.2 tree

Comment 3 Gordon Sim 2010-06-15 17:48:26 UTC
FYI: I could only reproduce this easily on an 8-core box.

Comment 7 Jiri Kolar 2010-08-03 11:36:43 UTC
Tested:
On 752581 the problem shows up, but it is important to use an 8-core machine and to send big batches of messages; sending messages one by one does not trigger it.

On 946106-11 it is fixed.

Validated on RHEL5.5/RHEL4, i386/x86_64.

packages:

# rpm -qa | grep -E '(qpid|openais|rhm)' | sort -u
openais-0.80.6-16.el5_5.2
openais-devel-0.80.6-16.el5_5.2
python-qpid-0.7.946106-11.el5
qpid-cpp-client-0.7.946106-11.el5
qpid-cpp-client-devel-0.7.946106-11.el5
qpid-cpp-client-devel-docs-0.7.946106-11.el5
qpid-cpp-client-ssl-0.7.946106-11.el5
qpid-cpp-mrg-debuginfo-0.7.946106-8.el5
qpid-cpp-server-0.7.946106-11.el5
qpid-cpp-server-cluster-0.7.946106-11.el5
qpid-cpp-server-devel-0.7.946106-11.el5
qpid-cpp-server-ssl-0.7.946106-11.el5
qpid-cpp-server-store-0.7.946106-11.el5
qpid-cpp-server-xml-0.7.946106-11.el5
qpid-java-client-0.7.946106-7.el5
qpid-java-common-0.7.946106-7.el5
qpid-tools-0.7.946106-8.el5
rhm-docs-0.7.946106-4.el5
rh-tests-distribution-MRG-Messaging-qpid_common-1.6-52


->VERIFIED

Comment 9 Martin Prpič 2010-10-10 09:18:03 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, federation routes that annotate a message with a trace ID could cause a segmentation fault if that message was concurrently delivered from another queue. With this update, a segmentation fault no longer occur in the aforementioned case.

Comment 10 Douglas Silas 2010-10-11 13:56:11 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, federation routes that annotate a message with a trace ID could cause a segmentation fault if that message was concurrently delivered from another queue. With this update, a segmentation fault no longer occur in the aforementioned case.
+Previously, federation routes that annotated a message with a trace ID could have caused a segmentation fault if that message was concurrently delivered from another queue. This situation has been fixed so that it no longer causes a segmentation fault if it arises.

Comment 12 errata-xmlrpc 2010-10-14 16:00:02 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html

