Bug 503760 - qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType): Assertion `state!=CALLING' failed
qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType): Assert...
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp (Show other bugs)
Development
All Linux
urgent Severity urgent
: 1.3
: ---
Assigned To: Andrew Stitcher
Frantisek Reznicek
:
: 514036 (view as bug list)
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2009-06-02 11:29 EDT by Gordon Sim
Modified: 2015-11-15 19:07 EST (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Gordon Sim 2009-06-02 11:29:19 EDT
Description of problem:

broker and c++ client occasionally crahsing during perftesting:


[gordon@mrg15 cpp]$ ./src/qpidd --auth no -p 5673
2009-06-01 05:53:54 notice SASL disabled: No Authentication Performed
2009-06-01 05:53:54 notice Listening on TCP port 5673
2009-06-01 05:53:54 notice Broker running
2009-06-01 06:48:39 warning CLOSING [127.0.0.1:55531] unsent data (probably due to client disconnect)
lt-qpidd: qpid/sys/DispatchHandle.cpp:260: virtual void qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType): Assertion `state!=CALLING' failed.
Aborted (core dumped)

And on the client side:

Core was generated by `/home/gordon/qpid/cpp/src/tests/.libs/lt-perftest --qt 1 --summary'.
Program terminated with signal 6, Aborted.
[New process 31016]
[New process 31002]
#0  0x0000003c52e30215 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003c52e30215 in raise () from /lib64/libc.so.6
#1  0x0000003c52e31cc0 in abort () from /lib64/libc.so.6
#2  0x0000003c52e29696 in __assert_fail () from /lib64/libc.so.6
#3  0x00002b30282fc331 in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::markForDeletion (handle=0x18bafd50)
    at ./qpid/sys/posix/Mutex.h:116
#4  0x00002b30282f7969 in ~Poller (this=0x18bafb70) at qpid/sys/epoll/EpollPoller.cpp:189
#5  0x00002b3027efdf83 in ~TCPConnector (this=0x18baf490) at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145
#6  0x00002b3027ef308a in ~ConnectionImpl (this=0x18baec70) at /usr/include/boost/checked_delete.hpp:34
#7  0x00002b3027eff1bf in qpid::client::TCPConnector::run (this=0x18baf490)
    at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145
#8  0x00002b30282f0a6a in runRunnable (p=0x791a) at qpid/sys/posix/Thread.cpp:35
#9  0x0000003c53a06367 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003c52ed30ad in clone () from /lib64/libc.so.6
(gdb)


Version-Release number of selected component (if applicable):

r779262 of qpid trunk

How reproducible:

Fairly rare
Comment 1 Gordon Sim 2009-07-15 02:51:28 EDT
Also see QPID-1984 for what looks like a much more frequent occurence of the same assertion as listed above for the broker.
Comment 2 Andrew Stitcher 2009-07-30 11:29:55 EDT
*** Bug 514036 has been marked as a duplicate of this bug. ***
Comment 3 Andrew Stitcher 2009-07-30 12:11:25 EDT
I don't think the stack trace here makes much sense. However the stack trace in Bug 514036 makes sense. I'm treating this bug as if it had that stack trace!
Comment 4 Carl Trieloff 2009-07-31 10:45:40 EDT

I can't reproduce if tracing is enabled, but can reliably reproduce like this

2009-07-27 13:55:45 notice SASL disabled: No Authentication Performed
2009-07-27 13:55:45 notice Listening on TCP port 5672
2009-07-27 13:55:45 notice Broker running
lt-qpidd: qpid/sys/DispatchHandle.cpp:259: virtual void
qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType):
Assertion `state!=CALLING' failed.
Aborted (core dumped)

from trunk revision 798241

to reproduce:
 ./qpidd --auth no --tcp-nodelay

[localhost tests]$ ./latencytest --rate 1000000 --tcp-nodelay 
Latency(ms): min=0.560546, max=523.206, avg=305.966
Latency(ms): min=508.735, max=1009.36, avg=782.974
Latency(ms): min=998.361, max=1501.98, avg=1299.21
^C
[localhost tests]$ ./latencytest --rate 100 --tcp-nodelay 
Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Note that setting tcp-nodelay on latencytest is key. setting or not setting on broker both exhibit the issue.

Carl.
Comment 5 Andrew Stitcher 2009-08-12 14:43:40 EDT
This bug turned out to be an issue with both the DispatchHandle and Poller implementations and so has been fixed in both of them.

The PollerTest program has been updated so that it now would have picked up the original Poller bug.
Comment 7 Frantisek Reznicek 2010-06-09 11:27:46 EDT
The issue has been fixed, tested on RHEL 5.5 / 4.8 i386 / x86_64 on packages:
qpid-tools-0.7.946106-4.el5
qpid-cpp-server-0.7.946106-2.el5
python-qpid-0.7.946106-1.el5
qpid-cpp-client-devel-0.7.946106-2.el5
qpid-cpp-client-0.7.946106-2.el5
qpid-cpp-server-cluster-0.7.946106-2.el5

-> VERIFIED

Note You need to log in before you can comment on or make changes to this bug.