Bug 451011
| Summary: | durable perftest in fanout mode fails with sync store | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Gordon Sim <gsim> | ||||||||||
| Component: | qpid-cpp | Assignee: | Kim van der Riet <kim.vdriet> | ||||||||||
| Status: | CLOSED WONTFIX | QA Contact: | Kim van der Riet <kim.vdriet> | ||||||||||
| Severity: | high | Docs Contact: | |||||||||||
| Priority: | urgent | ||||||||||||
| Version: | beta | CC: | aconway | ||||||||||
| Target Milestone: | --- | ||||||||||||
| Target Release: | --- | ||||||||||||
| Hardware: | ia64 | ||||||||||||
| OS: | Linux | ||||||||||||
| Whiteboard: | |||||||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||||||
| Doc Text: | Story Points: | --- | |||||||||||
| Clone Of: | Environment: | ||||||||||||
| Last Closed: | 2008-06-13 11:26:17 UTC | Type: | --- | ||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||
| Documentation: | --- | CRM: | |||||||||||
| Verified Versions: | Category: | --- | |||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
| Embargoed: | |||||||||||||
| Attachments: |
Description
Gordon Sim
2008-06-12 11:41:37 UTC
Correction to title: only fails on sync store.

I've seen two modes of failure, only with sync store. Client exits with:

Processing 2 messages from pub_done .SubscribeThread exception: Closed:

I've also seen the test hang.

There is a second issue: the client is not displaying the correct error message. The client is receiving the error:

2008-jun-12 07:52:07 warning Broker closed connection: 541, Error dequeing message, persistence id not set (BdbMessageStore.cpp:1208)

but it is getting overwritten with the generic Closed: message.

The sync store incorrectly assumes that a message will be enqueued on all matching queues before any dequeues can occur. This is no longer correct, and concurrent enqueues and dequeues on the sync store are unsafe as the code stands.

Created attachment 309111 [details]
Fix option 1
One possible fix. This has the advantage of being only two lines, but the
disadvantage of affecting all codepaths, not just sync stored msgs.
Created attachment 309115 [details]
Option 2: change to qpid
This is another option that minimises the impact on transient and async store
codepaths, but requires changes to both qpid and rhm store.
Created attachment 309116 [details]
Option 2: change to rhm store
This is the corresponding change to the store for option 2.
The third option would be locking within the store. However, as well as being less efficient, that would be quite involved, especially to avoid impacting the standard async path, so I feel it's not worth pursuing.

Committed fix to client error message problem: revision 667205.

We really need to perf test the impact of these fixes. Fix 1 looks risky to me: it puts two explicit mallocs and a new variable-size data structure on the critical path. Fix 2 has no impact for transient messages, but I don't know what impact it might have on async persistence; we would need to test that on Shaks rig before going forward.

I'd strongly lean towards disabling sync store immediately. If there are still situations where async store is broken then we need to fix it pronto. If we support sync store for GA we may be dogged by it for months/years to come.

Created attachment 309139 [details]
remove sync store as option
I agree removing the sync store option is the best option (though I don't see where the mallocs are added); patch attached above to remove the option.

Agreed solution was to disable the option to run the store in sync mode.