Bug 535435 (RHQ-2130)
Summary: | Make it possible to query availability state in the alert conditions | | |
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Lukas Krejci <lkrejci> |
Component: | Alerts | Assignee: | Jay Shaughnessy <jshaughn> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | unspecified | CC: | cwelton, jshaughn, rbs |
Target Milestone: | --- | Keywords: | FutureFeature |
Target Release: | RHQ 4.4.0 | | |
Hardware: | All | | |
OS: | All | | |
URL: | http://jira.rhq-project.org/browse/RHQ-2130 | | |
Doc Type: | Enhancement | | |
Last Closed: | 2013-09-01 10:09:43 UTC | | |
Bug Blocks: | 741450 | | |
Description
Lukas Krejci
2009-06-05 19:43:00 UTC
Relevant case: https://enterprise.redhat.com/issue-tracker/302566

There are two parts to this:

1) The availability report from the agent (99% of the time) only reports deltas up to the server. This was done because of the RLE (run-length encoded) nature of availability data.
2) You can only alert on deltas (the cache only checks for deltas). This was done specifically because of #1.

However, there are a few other things to keep in mind. We have the live availability precomputed for every resource in the rhq_resource_avail table, so crafting an in-memory cache of the current availabilities /could/ be done (a single query), but availability data doesn't always come from the agent (the suspect-agent / backfiller job can mark resources as down too), so we'd need to implement a cache reloading mechanism for when availability data becomes stale.

An alternate solution could be a system-level configuration that turns RLE on or off. If RLE is off, then the agent will always report availability for all of its managed resources, which would enable availability-based alerting to have 4 possible options: goes down, comes up, is down, is up (the last two being possible when RLE is off).

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-2130

This has basically been addressed with Availability Duration alerting in the jshaughn/avail branch. See: http://rhq-project.org/display/RHQ/Design-Availability+Checking#Design-AvailabilityChecking-AddAvailabilityDurationAlerting

This is in master.

Bulk closing of items that are on_qa and in old RHQ releases, which have been out for a long time and where the issue has not been re-opened since.
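
For readers following the description, the four alert options it proposes (goes down, comes up, is down, is up) can be illustrated with a minimal sketch. This is not RHQ code and not how the issue was resolved, since the fix that landed was Availability Duration alerting. The class `AvailabilityCache`, the method `process_report`, and the `full_report` flag are hypothetical names chosen for illustration. The sketch assumes an in-memory map of the last known state per resource, which could be seeded from something like the rhq_resource_avail table, and it leaves out the cache reloading that the description says would be needed when the backfiller job changes availability.

```python
from enum import Enum


class Avail(Enum):
    UP = "UP"
    DOWN = "DOWN"


class AvailabilityCache:
    """Hypothetical in-memory cache of the last known availability per resource."""

    def __init__(self, initial=None):
        # Assumption: seeded once from a single query of current availabilities.
        self._current = dict(initial or {})

    def process_report(self, report, full_report=False):
        """Return the (resource_id, condition) pairs matched by one report.

        report      -- dict mapping resource id to Avail
        full_report -- True when the agent reported every resource (RLE off),
                       False when the report only carries deltas (RLE on)
        """
        matched = []
        for resource_id, new_state in report.items():
            old_state = self._current.get(resource_id)

            # Transition conditions: detectable from deltas alone.
            if old_state is not None and old_state != new_state:
                if new_state is Avail.DOWN:
                    matched.append((resource_id, "goes down"))
                else:
                    matched.append((resource_id, "comes up"))

            # State conditions: only checked when every resource is reported.
            if full_report:
                if new_state is Avail.DOWN:
                    matched.append((resource_id, "is down"))
                else:
                    matched.append((resource_id, "is up"))

            self._current[resource_id] = new_state
        return matched
```

With a delta-only report, only the transition conditions can match. With a full report, an unchanged resource still matches "is down" or "is up", which is the behaviour the description says becomes possible when RLE is off.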