Bug 692132
| Summary: | Assertion in AsyncCompletion during scalability tests | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Kim van der Riet <kim.vdriet> | ||||||
| Component: | qpid-cpp | Assignee: | Ken Giusti <kgiusti> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | MRG Quality Engineering <mrgqe-bugs> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | Development | CC: | gsim, jneedle, mgoulish, ppecka | ||||||
| Target Milestone: | 2.0 | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | qpid-cpp-mrg-0.10-4 | Doc Type: | Bug Fix | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2011-06-23 15:43:39 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
Created attachment 488771 [details]
Second stack trace
This stack trace shows that the broker was almost idle at the time of the failure.
Most likely caused by a code path that is "completing" the enqueue more than once. No obvious candidate in the stack trace - will have to reproduce.

Please note the similarity to the stack trace that I found in https://bugzilla.redhat.com/show_bug.cgi?id=692546 . (Thanks for noticing it, Gordon!) The stacks are identical from level 9 up! I have an ultra-low-frequency (almost useless) reproducer. I would rather not close that bug yet, just in case that avenue leads to an idea.

*** Bug 692546 has been marked as a duplicate of this bug. ***

Upstream JIRA: https://issues.apache.org/jira/browse/QPID-3174

Potential fix submitted upstream: Committed revision 1087868.
http://svn.apache.org/viewvc?view=revision&revision=1087868

Further similar change committed as http://svn.apache.org/viewvc?rev=1088539&view=rev and this, plus the change above, merged to the 0.10 release branch: http://svn.apache.org/viewvc?rev=1088634&view=rev

This BZ was fixed prior to the -4 release - I missed setting the state to "MODIFIED"...

Verified on rhel5 / rhel6 - both i686 / x86_64 (1000 runs)

rpm -qa | grep qpid
python-qpid-0.10-1.el5
qpid-cpp-server-xml-0.10-7.el5
qpid-qmf-devel-0.10-10.el5
qpid-cpp-client-0.10-7.el5
qpid-java-client-0.10-6.el5
qpid-cpp-client-devel-0.10-7.el5
qpid-cpp-server-devel-0.10-7.el5
qpid-java-common-0.10-6.el5
qpid-qmf-0.10-10.el5
qpid-cpp-client-ssl-0.10-7.el5
qpid-cpp-server-cluster-0.10-7.el5
qpid-cpp-server-0.10-7.el5
qpid-java-example-0.10-6.el5
python-qpid-qmf-0.10-10.el5
qpid-cpp-client-devel-docs-0.10-7.el5
qpid-cpp-server-ssl-0.10-7.el5
qpid-tools-0.10-5.el5
qpid-cpp-server-store-0.10-7.el5

--> VERIFIED

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHEA-2011-0890.html
Created attachment 488764 [details]
Stack trace

While running scalability tests against in-tree builds of r.1085065/r.4447, the following assertion was observed:

lt-qpidd: ./qpid/broker/AsyncCompletion.h:167: void qpid::broker::AsyncCompletion::end(qpid::broker::AsyncCompletion::Callback&): Assertion `completionsNeeded.get() > 0' failed.

Two test loops in a row have resulted in this failure at around iteration 8 or 9 (see below), but other test loops have succeeded without showing this problem, so it may be probabilistic in nature.

Steps to reproduce:

Two boxes, mrg42 and mrg43, with 10g interfaces enabled as 20.0.10.{42,43}.

Both boxes have modified environments:
limits.conf: nofile: 65536
syscfg.conf: fs.aio-max-nr: 262144

In addition, mrg42 (which runs qpid-perftest) has:
ulimits.conf: nproc: 65536

The broker is run as follows on mrg43:
rm -rf /tmp/rhm; ./qpidd --auth no -m no --max-connections 65100 --load-module /home/kpvdr/mrg/store/lib/.libs/msgstore.so --store-dir /tmp --jfile-size-pgs 48 --num-jfiles 16 --log-enable info+

The client is run 10 times in a row against the broker in a bash loop on mrg42 using the 10g interface:
./qpid-perftest --mode shared --summary --pub-confirm no --sync-publish no --sub-ack 0 -b 20.0.10.43 --npubs 1 --qt 10000 --nsubs 1 --count 100

Note that although the store is loaded, the test is a transient test.
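
To illustrate the hypothesis in the comments (a code path "completing" the enqueue more than once), here is a minimal sketch of how an over-completion could trip a check like `completionsNeeded.get() > 0`. This is not the qpid-cpp AsyncCompletion class: `CompletionCounter`, its methods, and `doubleCompletionDetected` are hypothetical names made up for this example, the counter is a plain int in a single thread, and the check throws instead of asserting so the effect can be observed without aborting.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a completion counter. This is NOT the real
// qpid::broker::AsyncCompletion; it only mimics the "count must be > 0" check.
class CompletionCounter {
public:
    // Assumption: the initiator holds one outstanding completion from the start.
    CompletionCounter() : needed_(1), callbacks_(0) {}

    // Register one more asynchronous operation that must finish.
    void startCompleter() { ++needed_; }

    // Signal that one asynchronous operation has finished.
    void finishCompleter() { release(); }

    // The initiator gives up its own outstanding completion.
    void end() { release(); }

    int needed() const { return needed_; }
    int callbacksFired() const { return callbacks_; }

private:
    void release() {
        // Analogue of the failed assertion: nothing may be released
        // once the count has already reached zero.
        if (needed_ <= 0) {
            throw std::logic_error("completed more times than started");
        }
        if (--needed_ == 0) {
            ++callbacks_;  // last completion: the "done" callback would run here
        }
    }

    int needed_;     // plain int: this sketch is single-threaded
    int callbacks_;  // how many times the completion callback would have fired
};

// Returns true if the extra completion is caught by the check in end().
bool doubleCompletionDetected() {
    CompletionCounter c;
    assert(c.needed() == 1);  // only the initiator's completion is outstanding
    c.startCompleter();       // one async operation in flight, count is 2
    c.finishCompleter();      // legitimate completion, count is 1
    c.finishCompleter();      // BUG: same operation completed again, count is 0
    try {
        c.end();              // count is already 0, so the check fails here
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

In this sketch the extra `finishCompleter()` call does not fail by itself; the failure only surfaces later, when `end()` runs against a count that has already dropped to zero. That would be consistent with the assertion being reported from `end()` without an obvious culprit in the stack trace, though whether the real broker code behaves this way is an assumption.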