Bug 676627
| Summary: | persistent clustered qpidd broker unpredictably throws a journal exception (Enqueue capacity threshold exceeded on queue "qpid-perftest0". (JournalImpl.cpp:616)) |
|---|---|
| Product: | Red Hat Enterprise MRG |
| Component: | qpid-cpp |
| Version: | 1.3 |
| Status: | CLOSED UPSTREAM |
| Severity: | low |
| Priority: | low |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Reporter: | Frantisek Reznicek <freznice> |
| Assignee: | messaging-bugs <messaging-bugs> |
| QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
| CC: | gsim, kim.vdriet |
| Doc Type: | Bug Fix |
| Last Closed: | 2025-02-10 03:13:38 UTC |
| Attachments: | 478056 (journals, logs and terminal transcripts), 478244 (analysis of journals from comment #1) |
Description by Frantisek Reznicek, 2011-02-10 14:03:37 UTC
Created attachment 478056 [details]
The journals, logs and terminal transcripts
The attachment captures the same scenario run on two identical machines with different results: on one machine the journal exception is thrown, on the other it is not.
The qpidd journals from both machines are included so that the contents of the two data directories can be compared directly.
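For orientation: the queue name in the exception, "qpid-perftest0", is the default queue created by the qpid-perftest client (it names queues `<base-name>0..N` with default base name "qpid-perftest"), so the failing load was presumably of roughly the following shape. The exact options of the original run are not recorded in this report, so the flag values below are illustrative assumptions only:

```
# Hypothetical reproduction sketch (option values are assumptions, not the
# original run): durable messages driven at a clustered broker node via
# qpid-perftest, which creates the queue "qpid-perftest0".
qpid-perftest --broker <cluster-node> --durable yes --count 50000 --size 8
```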
Interesting observation: I can easily reproduce the problem as described, but if I set "auth=no" in the configuration the problem goes away. So it appears to be related to authentication in some way, but I don't know what the connection might be.

Host info where I reproduced: mrg32.lab.bos.redhat.com, 2.6.18-238.el5 x86_64, 16050 MB RAM, 2493 MHz 8-core/2-cpu Intel(R) Xeon(R) CPU E5420 @ 2.50GHz, Red Hat Enterprise Linux Server release 5.6 (Tikanga).

Created attachment 478244 [details]
Analysis of journals from comment #1

I have examined the two journals from mrg-qe-09 and mrg-qe-10, and neither shows any irregularity in the journal itself. I checked the enqueue threshold calculation from the mrg-qe-10 journal and found it to be correct. All analysis details are in the attached file. There is a distinct difference in the enqueue/dequeue patterns in the two journals: the journal from mrg-qe-09 reached a maximum depth of 27311 records, while the journal from mrg-qe-10 had a depth of 36548 records at the time of the enqueue failure. This analysis shows that the enqueue/dequeue patterns are very different on these two machines, but it does not shed any light on why that might be the case. Setting NEEDINFO for aconway.

Alan, any further thoughts on this? It seems that the two nodes see very different patterns of enqueuing/dequeuing, hence one node triggers an enqueue threshold exceeded (ETE) failure that is not seen on the other.

I ran this against a stand-alone broker:

```
qpid-send --durable yes --messages 50000 --content-size 8 -a 'q;{create:always,node:{durable:1}}'
```

and the store overflowed. So the message load here is bigger than the default store capacity, and it is therefore a matter of timing whether the store overflows or not. In the clustered configuration it appears that messages are produced much faster than they are consumed.
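To make the capacity point concrete, here is a rough calculation under the legacystore defaults (8 journal files of 24 pages at 64 KiB each, with an enqueue threshold of roughly 80%). The per-message journal footprint is an assumption inferred from the observed overflow, since each record stores the fully encoded message plus record headers and data-block padding, not just the 8-byte body; the store module path shown is the usual MRG location and may differ per installation:

```
# Rough journal capacity math (a sketch, assuming legacystore defaults):
#   8 files * 24 pages/file * 64 KiB/page = 12 MiB total journal
#   enqueue threshold ~80%               => ~9.6 MiB usable before the ETE
# For the overflow above to occur, 50000 durable messages must exceed
# ~9.6 MiB, i.e. more than ~200 bytes of journal footprint per message
# once encoding, headers and block padding are included.
#
# A larger journal can be requested when the journal is first created
# (an existing journal keeps its geometry):
qpidd --load-module /usr/lib64/qpid/daemon/msgstore.so \
      --data-dir /var/lib/qpidd \
      --num-jfiles 16 --jfile-size-pgs 48
```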
I think this is a performance issue, not a correctness issue. I would still like to find out why the differences arise, but I think it is low priority/urgency.
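One way to gather the missing evidence, assuming the standard qpid-tools are installed, would be to sample the queue counters on each cluster member during the run and compare the enqueue/dequeue rates; the host names below are the two machines from this report:

```
# Sample queue depth on both cluster nodes once per second during the test.
# qpid-stat -q lists per-queue message counts; the broker address is given
# as a positional argument.
while true; do
    for host in mrg-qe-09 mrg-qe-10; do
        echo "== $host =="
        qpid-stat -q "$host" | grep qpid-perftest0
    done
    sleep 1
done
```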
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.