Bug 602123

Summary: alert notification processing exceptions prevent alerts audit trail from being created
Product: [Other] RHQ Project
Reporter: Joseph Marques <jmarques>
Component: Alerts
Assignee: Joseph Marques <jmarques>
Status: CLOSED CURRENTRELEASE
QA Contact: Sudhir D <sdharane>
Severity: medium
Priority: high    
Version: 3.0.0   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 2.4
Doc Type: Bug Fix
Last Closed: 2010-08-12 16:45:29 UTC
Bug Blocks: 593121    

Description Joseph Marques 2010-06-09 08:08:29 UTC
Description of problem:

When an alert fires, it needs to capture information at the time it was triggered.  This information tells the user what specific monitored values made this alert fire.  It also needs to keep a record of which notifications it sent out at the time of firing (or whether there was any error sending notifications).  It appears, however, that if there is any sort of uncaught error during notification processing, the alert does not get inserted into the database at all.

How reproducible:
only through error pathways / workflows

Steps to Reproduce:
various methods, need to research this further to find them all.

Additional info:
any fix for this bug will require meticulous code review: it's not practical to try to invoke every error pathway, yet we still need reasonably high confidence that the improved exception handling is correct.

Comment 1 Charles Crouch 2010-06-09 12:34:54 UTC
Needs Triage

Comment 2 Charles Crouch 2010-06-18 13:22:52 UTC
Joseph
I think we need more info on the sorts of errors that could cause this. If mainstream errors are supported fine, e.g. a bad email address, or resources not existing in the environment when they are called from an alert, then I think this should be pushed.

Comment 3 Joseph Marques 2010-06-30 07:30:38 UTC
commit 894b6792a5f614d63b10703a726a29e4aab603e0
Author: Joseph Marques <joseph>
Date:   Wed Jun 30 03:26:37 2010 -0400

BZ-602123: BZ-602270: various tweaks for alert notification processing
    
* never allow any exceptions to bubble up during notification processing
    
* add new UNKNOWN status for AlertNotificationLogs
** use the UNKNOWN state to catch misbehaving plugins during alert processing
    
* add new DEFERRED status for AlertNotificationLogs
** use the DEFERRED state for operations
** (since we don't wait for the operation to complete, we don't know its termination state)
    
* add new PARTIAL status for AlertNotificationLogs
** use it to represent mixed success/failures during email sending
** do not defer email sending anymore, execute during each notification processing
** by processing them all upfront, easier to reason about terminating ResultState
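
The first bullet above (never letting exceptions bubble up during notification processing) could look roughly like the following. This is only a minimal sketch, not the code from the commit: the class, the Sender interface, the LogEntry type and the SUCCESS/FAILURE states are hypothetical names made up for illustration, and mapping every thrown error to UNKNOWN is an assumption based on the commit message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are taken from the RHQ code base.
class NotificationContainmentSketch {

    // PARTIAL, DEFERRED and UNKNOWN come from the commit message;
    // SUCCESS and FAILURE are assumed to exist alongside them.
    enum ResultState { SUCCESS, FAILURE, PARTIAL, DEFERRED, UNKNOWN }

    // Hypothetical plugin contract: a sender returns a state or throws.
    interface Sender {
        ResultState send(String alertName) throws Exception;
    }

    // One audit-trail entry per notification attempt.
    static final class LogEntry {
        final ResultState state;
        final String message;

        LogEntry(ResultState state, String message) {
            this.state = state;
            this.message = message;
        }
    }

    // Runs every sender; a failing sender yields an UNKNOWN entry
    // instead of aborting the loop, so the caller can still persist the alert.
    static List<LogEntry> processAll(String alertName, List<Sender> senders) {
        List<LogEntry> logs = new ArrayList<>();
        for (Sender sender : senders) {
            try {
                ResultState state = sender.send(alertName);
                if (state == null) {
                    logs.add(new LogEntry(ResultState.UNKNOWN, "sender returned no result"));
                } else {
                    logs.add(new LogEntry(state, "completed"));
                }
            } catch (Throwable t) {
                // Deliberately broad: nothing thrown by a sender may escape.
                logs.add(new LogEntry(ResultState.UNKNOWN, "sender threw: " + t));
            }
        }
        return logs;
    }

    public static void main(String[] args) {
        List<Sender> senders = new ArrayList<>();
        senders.add(name -> ResultState.SUCCESS);
        senders.add(name -> { throw new IllegalStateException("plugin bug"); });
        senders.add(name -> ResultState.DEFERRED);

        for (LogEntry entry : processAll("example-alert", senders)) {
            System.out.println(entry.state + ": " + entry.message);
        }
    }
}
```

Because processAll always returns one entry per sender, the code that inserts the alert and its notification logs no longer depends on every sender behaving well.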

Comment 4 Sudhir D 2010-07-02 13:21:50 UTC
Direct Emails  	PARTIAL  	Target addresses were: [rhquser, ^&$*((@)#$.$@]
Successfully sent to: [rhquser]
Failed to send to: [^&$*((@)#$.$@]

Resource Operations 	DEFERRED 	Executed 'Store Configuration' on the ( dhcp6-150.pnq.redhat.com-embedded > Tomcat (8080) ) resource
Check the corresponding operation history for more details.

I don't see any exceptions in the logs either. Marking this bug as verified.
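
For reference, the PARTIAL result in the Direct Emails entry above can be thought of as an aggregation over per-address outcomes. The sketch below is an assumption about the shape of that logic, not the actual RHQ implementation: EmailResultSketch, sendAll and the trySend predicate are hypothetical, and the predicate is a stub standing in for a real mail call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; names are illustrative, not the RHQ API.
class EmailResultSketch {

    enum ResultState { SUCCESS, FAILURE, PARTIAL }

    static final class EmailResult {
        final ResultState state;
        final List<String> sent;
        final List<String> failed;

        EmailResult(ResultState state, List<String> sent, List<String> failed) {
            this.state = state;
            this.sent = sent;
            this.failed = failed;
        }
    }

    // trySend is a stand-in for the real mail call: true means delivered.
    // An exception from trySend is counted as a failure for that address.
    static EmailResult sendAll(List<String> addresses, Predicate<String> trySend) {
        List<String> sent = new ArrayList<>();
        List<String> failed = new ArrayList<>();
        for (String address : addresses) {
            boolean ok;
            try {
                ok = trySend.test(address);
            } catch (RuntimeException e) {
                ok = false;
            }
            if (ok) {
                sent.add(address);
            } else {
                failed.add(address);
            }
        }

        // No failures: SUCCESS (this sketch also treats an empty list as SUCCESS).
        // No successes: FAILURE. Anything mixed: PARTIAL.
        ResultState state;
        if (failed.isEmpty()) {
            state = ResultState.SUCCESS;
        } else if (sent.isEmpty()) {
            state = ResultState.FAILURE;
        } else {
            state = ResultState.PARTIAL;
        }
        return new EmailResult(state, sent, failed);
    }

    public static void main(String[] args) {
        // Addresses copied from the verification output; the predicate is a stub.
        List<String> targets = Arrays.asList("rhquser", "^&$*((@)#$.$@");
        EmailResult result = sendAll(targets, address -> address.equals("rhquser"));

        System.out.println(result.state);
        System.out.println("Target addresses were: " + targets);
        System.out.println("Successfully sent to: " + result.sent);
        System.out.println("Failed to send to: " + result.failed);
    }
}
```

Sending to every address up front, rather than deferring, is what makes the final state easy to compute in one place.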

Comment 5 Corey Welton 2010-08-12 16:45:29 UTC
Mass-closure of verified bugs against JON.