Bug 483927 - perftest hangs when --npubs >1 && --nsubs >1 && --iterations >1
perftest hangs when --npubs >1 && --nsubs >1 && --iterations >1
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp (Show other bugs)
1.1
All Linux
medium Severity medium
: 1.1.1
: ---
Assigned To: Gordon Sim
Frantisek Reznicek
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2009-02-04 05:03 EST by Frantisek Reznicek
Modified: 2015-11-15 19:06 EST (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-06-27 16:04:47 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Frantisek Reznicek 2009-02-04 05:03:45 EST
Description of problem:
perftest hangs very frequently when runwith --npubs >1 && --nsubs >1 && --iterations >1 switches.


Version-Release number of selected component (if applicable):
[freznice@dhcp-lab-200 perftest_issue]$ rpm -qa | grep qpid
qpidc-perftest-0.4.738274-1.el5
qpidc-0.4.738274-1.el5
qpidd-ssl-0.4.738274-1.el5
qpidc-rdma-0.4.738274-1.el5
qpid-java-common-0.4.738568-1.el5
qpidd-acl-0.4.738274-1.el5
qpidd-0.4.738274-1.el5
python-qpid-0.4.738182-1.el5
qpidd-rdma-0.4.738274-1.el5
qpidc-ssl-0.4.738274-1.el5
qpid-java-client-0.4.738568-1.el5
qpidd-cluster-0.4.738274-1.el5
qpidc-devel-0.4.738274-1.el5
qpidd-xml-0.4.738274-1.el5
qpidd-devel-0.4.738274-1.el5


How reproducible:
>60%

Steps to Reproduce:
1. mkdir data && /usr/sbin/qpidd --auth no --log-enable debug+ --data-dir data > qpidd.log 2>&1 &
2. perftest --summary --nsubs 2 --npubs 2 --mode shared --unique-data yes --iterations 2
3. wait and watch qpidd.log at some point it stops growing - that's the hang state
  
Actual results:
perftest hangs in --summary --nsubs 2 --npubs 2 --mode shared --unique-data yes --iterations 2 configuration

Expected results:
perftest should not hang.

Additional info: (transcript+pstack)

- main transcript
[freznice@dhcp-lab-200 perftest_issue]$ rm -rf data
[freznice@dhcp-lab-200 perftest_issue]$ mkdir data
[freznice@dhcp-lab-200 perftest_issue]$ /usr/sbin/qpidd --auth no --log-enable debug+ --data-dir data > qpidd.log 2>&1 &
[1] 25588
[freznice@dhcp-lab-200 perftest_issue]$
[freznice@dhcp-lab-200 perftest_issue]$ time -p perftest --summary --nsubs 1 --npubs 1 --mode shared --unique-data yes --iterations 2
7443.12 7215.57 14431.6 14.0933
6939.09 6725.81 13451.6 13.1363
Averages:
7191.11 6970.69 13941.6 13.6148
real 143.91
user 3.85
sys 1.40
[freznice@dhcp-lab-200 perftest_issue]$ time -p perftest --summary --nsubs 1 --npubs 2 --mode shared --unique-data yes --iterations 2
7584.63 7357.03 14714.3 14.3694
7867.62 7652.51 15305   14.9463
Averages:
7726.13 7504.77 15009.7 14.6579
real 266.71
user 5.96
sys 1.82
[freznice@dhcp-lab-200 perftest_issue]$ time -p perftest --summary --nsubs 2 --npubs 2 --mode shared --unique-data yes --iterations 2

real 2279.18
user 3.64
sys 1.34

- end of qpidd logfile in hang state
[freznice@dhcp-lab-200 perftest_issue]$ tail qpidd.log
2009-feb-04 10:27:10 debug Credit available for 'perftest0' on 0x2aaab02f5188 bytes: 4294967295 msgs: 498356
2009-feb-04 10:27:10 debug DeliveryRecord::setEnded() id=498354
2009-feb-04 10:27:10 debug Credit allocated for 'perftest0' on 0x2aaab02f5188, was  bytes: 4294967295 msgs: 498356 now bytes: 4294967295 msgs: 498355
2009-feb-04 10:27:10 debug No messages to dispatch on queue 'perftest0'
2009-feb-04 10:27:10 debug fc8f691e-5e20-4c6e-a446-0ad83c032a3b@anonymous: sent cmd 501644: {MessageTransferBody: destination=perftest0; accept-mode=1; acquire-mode=0; }
2009-feb-04 10:27:10 debug No messages to dispatch on queue 'perftest0'
2009-feb-04 10:27:10 debug DeliveryRecord::setEnded() id=501644
2009-feb-04 10:27:10 debug No messages to dispatch on queue 'perftest0'
2009-feb-04 10:27:10 debug No messages to dispatch on queue 'perftest0'
2009-feb-04 10:27:10 debug No messages to dispatch on queue 'perftest0'

- perftest pstack in hang state
[freznice@dhcp-lab-200 perftest_issue]$ pstack 25697
Thread 10 (Thread 1105344832 (LWP 25699)):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003976f6e82d in qpid::sys::Poller::wait ()
#2  0x0000003976f6f357 in qpid::sys::Poller::run ()
#3  0x0000003977464beb in qpid::client::TCPConnector::run ()
#4  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#5  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 9 (Thread 1089673536 (LWP 25700)):
#0  0x000000350bc0a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1  0x0000003977487dfa in qpid::client::LocalQueue::get ()
#2  0x00000039774885d9 in qpid::client::LocalQueue::get ()
#3  0x0000003977488899 in qpid::client::LocalQueue::pop ()
#4  0x000000000041b9fd in std::operator+<char, std::char_traits<char>, std::allocator<char> > ()
#5  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#6  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#7  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 8 (Thread 1115834688 (LWP 25701)):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003976f6e82d in qpid::sys::Poller::wait ()
#2  0x0000003976f6f357 in qpid::sys::Poller::run ()
#3  0x0000003977464beb in qpid::client::TCPConnector::run ()
#4  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#5  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 7 (Thread 1126324544 (LWP 25702)):
#0  0x000000350bc0a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1  0x0000003977487dfa in qpid::client::LocalQueue::get ()
#2  0x00000039774885d9 in qpid::client::LocalQueue::get ()
#3  0x0000003977488899 in qpid::client::LocalQueue::pop ()
#4  0x000000000041b9fd in std::operator+<char, std::char_traits<char>, std::allocator<char> > ()
#5  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#6  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#7  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 6 (Thread 1136814400 (LWP 25703)):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003976f6e82d in qpid::sys::Poller::wait ()
#2  0x0000003976f6f357 in qpid::sys::Poller::run ()
#3  0x0000003977464beb in qpid::client::TCPConnector::run ()
#4  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#5  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 5 (Thread 1147304256 (LWP 25704)):
#0  0x000000350bc0a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1  0x0000003977487dfa in qpid::client::LocalQueue::get ()
#2  0x00000039774885d9 in qpid::client::LocalQueue::get ()
#3  0x0000003977488899 in qpid::client::LocalQueue::pop ()
#4  0x00000000004198a5 in boost::optional_detail::optional_base<qpid::framing::MessageProperties>::assign ()
#5  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#6  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#7  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 4 (Thread 1157794112 (LWP 25705)):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003976f6e82d in qpid::sys::Poller::wait ()
#2  0x0000003976f6f357 in qpid::sys::Poller::run ()
#3  0x0000003977464beb in qpid::client::TCPConnector::run ()
#4  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#5  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 3 (Thread 1168283968 (LWP 25706)):
#0  0x000000350bc0a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1  0x0000003977487dfa in qpid::client::LocalQueue::get ()
#2  0x00000039774885d9 in qpid::client::LocalQueue::get ()
#3  0x0000003977488899 in qpid::client::LocalQueue::pop ()
#4  0x00000000004198a5 in boost::optional_detail::optional_base<qpid::framing::MessageProperties>::assign ()
#5  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#6  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#7  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 2 (Thread 1178773824 (LWP 25707)):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003976f6e82d in qpid::sys::Poller::wait ()
#2  0x0000003976f6f357 in qpid::sys::Poller::run ()
#3  0x0000003977464beb in qpid::client::TCPConnector::run ()
#4  0x0000003976f6617a in qpid::sys::AbsTime::AbsTime ()
#5  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x000000350b0d1b6d in clone () from /lib64/libc.so.6
Thread 1 (Thread 47161452111824 (LWP 25697)):
#0  0x000000350bc0a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1  0x0000003977487dfa in qpid::client::LocalQueue::get ()
#2  0x00000039774885d9 in qpid::client::LocalQueue::get ()
#3  0x0000003977488899 in qpid::client::LocalQueue::pop ()
#4  0x0000000000417286 in qpid::client::Message::~Message$delete ()
#5  0x00000000004180a1 in qpid::client::Message::~Message$delete ()
#6  0x000000000040c91c in qpid::framing::TransferContent::getData ()
#7  0x000000350b01d8b4 in __libc_start_main () from /lib64/libc.so.6
#8  0x000000000040b7e9 in qpid::framing::TransferContent::getData ()
#9  0x00007fff091415a8 in ?? ()
#10 0x0000000000000000 in ?? ()
#0  0x000000350bc0a496 in pthread_cond_wait@@GLIBC_2.3.2 ()
[freznice@dhcp-lab-200 perftest_issue]$
Comment 1 Gordon Sim 2009-02-18 13:17:37 EST
Fixed by r745590.
Comment 2 Martin Kudlej 2009-04-09 08:47:49 EDT
I tested on RHEL 5.3/4.7 i386/x86_64 on r752581. It works without hang.
I run perftest with these parameters:
perftest --summary --nsubs 10 --npubs 10 --mode shared --unique-data yes --iterations 10

perftest --summary --nsubs 2 --npubs 2 --mode shared --unique-data yes --iterations 2

perftest --summary --nsubs 1 --npubs 2 --mode shared --unique-data yes --iterations 2

perftest --summary --nsubs 1 --npubs 1 --mode shared --unique-data yes --iterations 2

perftest --summary --nsubs 1 --npubs 1 --mode shared --unique-data yes --iterations 1

-->VERIFIED
Comment 3 Justin Ross 2011-06-27 16:04:47 EDT
Fixed and verified; closing.

Note You need to log in before you can comment on or make changes to this bug.