Bug 602123 - alert notification processing exceptions prevent alerts audit trail from being created
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Alerts   
Version: 3.0.0
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Joseph Marques
QA Contact: Sudhir D
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: jon-sprint11-bugs
Reported: 2010-06-09 08:08 UTC by Joseph Marques
Modified: 2010-08-12 16:45 UTC (History)
0 users

Fixed In Version: 2.4
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-08-12 16:45:29 UTC


Description Joseph Marques 2010-06-09 08:08:29 UTC
Description of problem:

When an alert fires, it needs to capture information at the time it was triggered.  This information tells the user which specific monitored values made the alert fire.  It also needs to keep a record of which notifications it sent out at the time of firing (or whether there was any error sending notifications).  It appears, however, that if there is any sort of uncaught error during notification processing, the alert does not get inserted into the database at all.

How reproducible:
only through error pathways / workflows

Steps to Reproduce:
various methods, need to research this further to find them all.

Additional info:
Any fix for this bug will require meticulous code review: it's not practical to try to invoke every error pathway, yet we still need reasonably high confidence that the improved exception handling is correct.

Comment 1 Charles Crouch 2010-06-09 12:34:54 UTC
Needs Triage

Comment 2 Charles Crouch 2010-06-18 13:22:52 UTC
Joseph
I think we need more info on the sorts of errors that could cause this. If mainstream errors are handled fine (e.g. a bad email address, or resources not existing in the environment when they are called from an alert), then I think this should be pushed.

Comment 3 Joseph Marques 2010-06-30 07:30:38 UTC
commit 894b6792a5f614d63b10703a726a29e4aab603e0
Author: Joseph Marques <joseph@redhat.com>
Date:   Wed Jun 30 03:26:37 2010 -0400

BZ-602123: BZ-602270: various tweaks for alert notification processing
    
* never allow any exceptions to bubble up during notification processing
    
* add new UNKNOWN status for AlertNotificationLogs
** use the UNKNOWN state to catch misbehaving plugins during alert processing
    
* add new DEFERRED status for AlertNotificationLogs
** use the DEFERRED state for operations
** (since we don't wait for the operation to complete, we don't know its termination state)
    
* add new PARTIAL status for AlertNotificationLogs
** use it to represent mixed success/failures during email sending
** do not defer email sending anymore, execute during each notification processing
** by processing them all upfront, easier to reason about terminating ResultState
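
The first two points of the commit message can be illustrated with a minimal sketch. This is not the RHQ code: the `Sender` interface, the `LogEntry` class, and the `processNotifications` method are hypothetical names made up for the example, and the SUCCESS and FAILURE constants are assumed (only UNKNOWN, DEFERRED, and PARTIAL are named in the commit). The idea is that each sender runs inside its own try/catch, so a misbehaving plugin yields an UNKNOWN log entry rather than an exception that stops the alert from being stored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NotificationSketch {

    // UNKNOWN, DEFERRED and PARTIAL come from the commit message;
    // SUCCESS and FAILURE are assumed for the sake of the example.
    enum ResultState { SUCCESS, FAILURE, PARTIAL, DEFERRED, UNKNOWN }

    // Hypothetical stand-in for a notification sender plugin.
    interface Sender {
        ResultState send(String alertName) throws Exception;
    }

    // Hypothetical audit record: one entry per sender, kept with the alert.
    static final class LogEntry {
        final String senderName;
        final ResultState state;
        final String message;

        LogEntry(String senderName, ResultState state, String message) {
            this.senderName = senderName;
            this.state = state;
            this.message = message;
        }
    }

    // Runs every sender; nothing thrown by a sender escapes this method.
    static List<LogEntry> processNotifications(String alertName, Map<String, Sender> senders) {
        List<LogEntry> logs = new ArrayList<>();
        for (Map.Entry<String, Sender> entry : senders.entrySet()) {
            try {
                ResultState state = entry.getValue().send(alertName);
                // A sender that returns nothing usable is also treated as UNKNOWN.
                logs.add(new LogEntry(entry.getKey(),
                        state == null ? ResultState.UNKNOWN : state, "completed"));
            } catch (Throwable t) {
                // Catching Throwable is a deliberate choice in this sketch:
                // the audit trail must survive any plugin failure.
                logs.add(new LogEntry(entry.getKey(), ResultState.UNKNOWN, String.valueOf(t)));
            }
        }
        return logs;
    }

    public static void main(String[] args) {
        Map<String, Sender> senders = new LinkedHashMap<>();
        senders.put("email", alert -> ResultState.SUCCESS);
        senders.put("broken-plugin", alert -> { throw new IllegalStateException("boom"); });
        // An operation is only scheduled, so its final state is not known yet.
        senders.put("operation", alert -> ResultState.DEFERRED);

        for (LogEntry log : processNotifications("High CPU", senders)) {
            System.out.println(log.senderName + " -> " + log.state);
        }
    }
}
```

Because the method always returns a complete list of log entries, the caller can persist the alert and its notification logs together regardless of what any individual sender did.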

Comment 4 Sudhir D 2010-07-02 13:21:50 UTC
Direct Emails  	PARTIAL  	Target addresses were: [rhquser@redhat.com, ^&$*((@)#$.$@]
Successfully sent to: [rhquser@redhat.com]
Failed to send to: [^&$*((@)#$.$@]

Resource Operations 	DEFERRED 	Executed 'Store Configuration' on the ( dhcp6-150.pnq.redhat.com-embedded > Tomcat (8080) ) resource
Check the corresponding operation history for more details.

I don't see any exceptions in the logs either. Marking this bug as verified.
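
For reference, the PARTIAL result shown above for Direct Emails can be sketched roughly as follows. This is an illustration only, not the RHQ implementation: `EmailResult`, `sendAll`, and the `deliver` predicate are hypothetical names, and treating an empty target list as SUCCESS is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class EmailResultSketch {

    enum ResultState { SUCCESS, FAILURE, PARTIAL }

    // Hypothetical holder for the per-address outcome of one email notification.
    static final class EmailResult {
        final List<String> sent = new ArrayList<>();
        final List<String> failed = new ArrayList<>();

        ResultState state() {
            if (failed.isEmpty()) return ResultState.SUCCESS; // assumption: no targets counts as success
            if (sent.isEmpty()) return ResultState.FAILURE;
            return ResultState.PARTIAL;                        // mixed successes and failures
        }
    }

    // Attempts every address immediately; 'deliver' is a hypothetical
    // callback returning true when delivery to that address succeeded.
    static EmailResult sendAll(List<String> targets, Predicate<String> deliver) {
        EmailResult result = new EmailResult();
        for (String address : targets) {
            boolean ok;
            try {
                ok = deliver.test(address);
            } catch (RuntimeException e) {
                ok = false; // a failed delivery must not abort the remaining addresses
            }
            (ok ? result.sent : result.failed).add(address);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> targets = Arrays.asList("rhquser@redhat.com", "^&$*((@)#$.$@");
        // Toy validity check standing in for a real mail transport.
        EmailResult result = sendAll(targets, a -> a.matches("[\\w.]+@[\\w.]+"));

        System.out.println(result.state());
        System.out.println("Successfully sent to: " + result.sent);
        System.out.println("Failed to send to: " + result.failed);
    }
}
```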

Comment 5 Corey Welton 2010-08-12 16:45:29 UTC
Mass-closure of verified bugs against JON.
