Bug 536155 - (RHQ-536) replacing db-based jbossmq with jboss messaging 2.0 with async journaling
Product: RHQ Project
Classification: Other
Component: Alerts
Severity: medium
Assigned To: Joseph Marques
Improvement
RHQ-1347
Reported: 2008-06-02 12:17 EDT by Joseph Marques
Modified: 2010-08-18 11:40 EDT
1 user

Doc Type: Enhancement
Last Closed: 2010-08-10 13:59:56 EDT

Attachments: None
Description Joseph Marques 2008-06-02 12:17:00 EDT
JBoss Messaging 2.0.0.alpha was released recently and has some promising performance improvements over JBoss MQ (which is the default JMS impl in JBoss 4.2):


so i'm thinking we should use the JBM configuration file to expose JMS-compliant destinations via JEE standard practices - namely, JNDI - and then use the special JBM deployer that reads these config files, creates the destination objects (as necessary), and binds them into JNDI.
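to make the deployer pattern concrete, here's a sketch of what such a destination descriptor could look like. the exact schema changed between JBoss Messaging releases, so this fragment just follows the familiar JBoss `*-service.xml` convention; the queue name, JNDI name, and MBean names are all hypothetical:

```xml
<!-- hypothetical alerts-destinations-service.xml; the messaging deployer
     would read this, create the queue, and bind it at queue/AlertsQueue -->
<server>
  <mbean code="org.jboss.jms.server.destination.QueueService"
         name="jboss.messaging.destination:service=Queue,name=AlertsQueue">
    <attribute name="JNDIName">queue/AlertsQueue</attribute>
    <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
  </mbean>
</server>
```

the alerts code would then look up `queue/AlertsQueue` through plain JNDI and send to it with the standard JMS API, with no provider-specific code on our side.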

even if we wanted to stay with blocking persistence, we'd still roughly double the capacity the alerts subsystem can handle with a simple drop-in replacement of the backing impl for JMS.  however, async is really the way to go, because it'll give an additional 10-11x speedup on top of that - i.e., 20-22x faster than our current impl overall.
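a quick back-of-envelope check of how those multipliers compose (the 2x and 10-11x figures are the claims above, not measurements of ours):

```java
// Multiplies the claimed speedup factors: ~2x from the drop-in JMS
// replacement alone, times a further 10-11x from async journaling.
public class SpeedupEstimate {
    public static void main(String[] args) {
        int dropIn = 2;                    // blocking JBM vs JBossMQ
        int asyncLow = 10, asyncHigh = 11; // async journal vs blocking
        System.out.println("total: " + (dropIn * asyncLow) + "x-"
                + (dropIn * asyncHigh) + "x");
    }
}
```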

aside from that, it's a much better choice architecturally, because the in-band and out-of-band processing halves of the alerts engine won't need to go back to our database bottleneck to persist matched alerting data.  each server instance in the rhq server-cluster can use non-database-based journaling, which gives us many more options; for instance, local journaling would allow the (current design of the) alerts engine to scale linearly with the size of the rhq server-cluster.
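the local-journaling idea reduces to something like the following minimal sketch (this is illustrative only, not RHQ code or the JBM journal): each server hands matched-alert records to a background writer thread that appends them to a local file, so the in-band alert path never blocks on the shared database.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative local async journal: append() returns immediately and a
// single background thread performs the actual file writes in order.
public class LocalAlertJournal implements AutoCloseable {
    private static final String STOP = "\u0000STOP"; // internal shutdown marker
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Thread writer;

    public LocalAlertJournal(Path journalFile) {
        writer = new Thread(() -> {
            try {
                while (true) {
                    String record = pending.take();   // wait for work
                    if (record.equals(STOP)) return;  // clean shutdown
                    Files.write(journalFile,
                            Collections.singletonList(record),
                            StandardCharsets.UTF_8,
                            StandardOpenOption.CREATE,
                            StandardOpenOption.APPEND);
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();
    }

    // Callers return immediately; the disk write happens asynchronously.
    public void append(String record) {
        pending.add(record);
    }

    @Override
    public void close() throws InterruptedException {
        pending.add(STOP);
        writer.join(); // drain remaining records before shutdown
    }
}
```

because each journal is purely local, adding a server to the cluster adds journal capacity in the same proportion, which is where the linear-scaling claim comes from.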
Comment 1 Joseph Marques 2008-06-02 12:57:35 EDT
the operative phrase in all of this is the parenthetical "current design of the [alerts engine]" that makes this proposed improvement possible.  

this solution has immediate and tremendous benefit for local-only alerts - where all of the alert conditions refer to the same resource in inventory.  once we move into the realm of composite alerts, where you can aggregate different alert conditions across completely arbitrary resources (even those on different agents), the benefit decreases because data locality becomes important.

technically, the benefit still holds as long as the conditions (across different resources) only involve resources that are co-located (managed by the same agent).  but once a single composite alert has conditions that refer back to resources from different agents, we can no longer process data from a single journal; there would need to be another controller layer above that one which knew how to process the aggregate conditions of composite alerts across the entire rhq server-cluster (not just a single server-collector node).

but that reminds me of a question Jay Shaughnessy asked a few months ago: do we really need durability for unmatched alert condition logs and alert events?  i think the answer for local-only alerts is 'no', but for composite alerts with conditions that refer back to resources from different agents it's 'yes'.  so, if we want to squeeze as much performance out as possible, we'll likely be dropping data into different journals with different persistence guarantees, depending on whether the matched alert data refers back to a local-only or a composite alert.
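the split-journal idea boils down to a tiny routing decision; here's a sketch under the assumptions above (all names are hypothetical, not RHQ or JBM APIs - the lists just stand in for a non-durable and a durable journal):

```java
import java.util.ArrayList;
import java.util.List;

// Routes matched alert data by scope: local-only data goes to a fast
// journal with weak durability, composite-alert data to a durable one.
public class JournalRouter {
    public enum AlertScope { LOCAL_ONLY, COMPOSITE }

    private final List<String> fastJournal = new ArrayList<>();    // no fsync needed
    private final List<String> durableJournal = new ArrayList<>(); // stands in for fsync'd storage

    public void route(AlertScope scope, String matchedData) {
        (scope == AlertScope.LOCAL_ONLY ? fastJournal : durableJournal)
                .add(matchedData);
    }

    public int fastCount()    { return fastJournal.size(); }
    public int durableCount() { return durableJournal.size(); }
}
```

the point of the sketch is just that the durability decision can be made per record at routing time, so local-only alerts never pay the composite alerts' persistence cost.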
Comment 2 Red Hat Bugzilla 2009-11-10 16:11:21 EST
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-536
Comment 3 Corey Welton 2010-08-10 13:59:56 EDT
Closing this bug per triage.  If this is still considered an issue, it can be reopened.
Comment 4 Corey Welton 2010-08-18 11:40:22 EDT
*** Bug 534562 has been marked as a duplicate of this bug. ***
