Bug 813914

Summary:	RFE: Notify on a subsystem falling behind
Product:	[Other] RHQ Project	Reporter:	Charles Crouch <ccrouch>
Component:	No Component	Assignee:	Nobody <nobody>
Status:	NEW ---	QA Contact:
Severity:	unspecified	Docs Contact:
Priority:	medium
Version:	3.0.0	CC:	hrupp
Target Milestone:	---	Keywords:	FutureFeature
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Enhancement
Last Closed:		Type:	Bug

Description Charles Crouch 2012-04-18 18:28:51 UTC

This is an RFE to provide better issue escalation when the alert subsystem is falling behind (e.g. alert definition evaluation can't keep up with the rate at which items such as measurement reports and events are coming in). This could be because there are either too many items or too many alert definitions.
It arises from case 00470214 (https://c.na7.visual.force.com/apex/Case_View?id=500A0000007AhjH&srKp=500&sfdc.override=1&srPos=0)

The idea is that the user is notified (message center? mail to rhqadmin?) when the alert subsystem is falling behind. The notification should include diagnostic data (e.g. the row count of the alert condition history table, or the row counts of the raw, hourly, and daily metric tables) that the customer could pass on to support, to help support diagnose the underlying issue.
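
To make the request more concrete, here is a minimal sketch of the kind of check being asked for. It is not RHQ code: `BacklogMonitor`, `max_backlog`, and the diagnostics callback are hypothetical names, and the threshold and table labels are assumptions chosen only for illustration. The monitor compares items received against items processed and, once the backlog exceeds the limit, builds a notification message that carries the diagnostic row counts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class BacklogMonitor:
    """Hypothetical helper: tracks how far a subsystem lags behind its input."""
    subsystem: str
    max_backlog: int  # assumed threshold; a real value would need tuning
    received: int = 0
    processed: int = 0

    def record_received(self, count: int = 1) -> None:
        self.received += count

    def record_processed(self, count: int = 1) -> None:
        self.processed += count

    @property
    def backlog(self) -> int:
        # Items that have arrived but have not been evaluated yet.
        return self.received - self.processed

    def check(self, diagnostics: Callable[[], Dict[str, int]]) -> Optional[str]:
        """Return a notification message if the subsystem is falling behind, else None."""
        if self.backlog <= self.max_backlog:
            return None
        lines = [
            f"Subsystem '{self.subsystem}' is falling behind: "
            f"backlog={self.backlog} (limit {self.max_backlog})"
        ]
        # Diagnostics are gathered only when a notification is actually sent,
        # since counting rows in large tables may be expensive.
        for name, rows in sorted(diagnostics().items()):
            lines.append(f"  {name}: {rows} rows")
        return "\n".join(lines)


# Example use with made-up numbers and placeholder table labels.
monitor = BacklogMonitor("alerts", max_backlog=1000)
monitor.record_received(5000)
monitor.record_processed(3500)
message = monitor.check(lambda: {"alert_condition_history": 120000, "metrics_raw": 9000000})
if message is not None:
    print(message)
```

The returned message could then be delivered through whichever channel is chosen (message center, mail to rhqadmin); the delivery mechanism is deliberately left out of the sketch.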

Comment 1 Charles Crouch 2012-04-18 18:54:39 UTC

Made the title more generic. The sort of notification we're looking for here applies to all subsystems, e.g. metric collection, events, alerts.