Bug 1028487 - Recovery alert cache refresh needs to happen prior to alert notification processing
Status: ON_QA
Product: RHQ Project
Classification: Other
Component: Alerts
: GA
Assigned To: Jay Shaughnessy
Mike Foley
Depends On:
Blocks: 1030111
Reported: 2013-11-08 10:00 EST by Jay Shaughnessy
Modified: 2013-11-13 18:06 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1030111 (view as bug list)
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description Jay Shaughnessy 2013-11-08 10:00:54 EST
Currently, alert notifications are processed in the same transaction as alert creation, recovery alert activation, etc.  Processing notifications inside this transaction causes two issues:

1) It lengthens an already complicated transaction, potentially holding locks longer, delaying the alert commit, and delaying the commit of the global cache refresh flags (which HA servers need in order to pick up the change).

2) Notifications can initiate recovery actions such as executing resource operations, invoking CLI scripts, etc.  These actions should not run before we have updated the global alert condition cache, which must happen before condition matching can begin on activated recovery alert definitions.  Otherwise we risk an actual recovery occurring before the recovery alert definition is ready to detect it.

The alert notification processing should happen outside of the alert creation transaction and after cache refresh.
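
The ordering hazard in issue 2 can be illustrated with a small standalone simulation. This is only a sketch under assumptions: `conditionCache`, `processNotifications`, `refreshCache` and the definition name `recovery-def` are hypothetical stand-ins, not RHQ classes or methods. It assumes a notification triggers a recovery operation that succeeds immediately, so the resource reports UP right away.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecoveryOrderingDemo {
    // Hypothetical stand-in for the global alert condition cache:
    // only definitions present in this set are matched against incoming data.
    static final Set<String> conditionCache = new HashSet<>();
    static final List<String> firedAlerts = new ArrayList<>();

    // Stand-in for condition matching when the resource reports UP.
    static void onResourceUp() {
        if (conditionCache.contains("recovery-def")) {
            firedAlerts.add("recovery-alert");
        }
    }

    // Stand-in for a notification whose recovery operation succeeds
    // immediately, so the UP report arrives during notification processing.
    static void processNotifications() {
        onResourceUp();
    }

    // Stand-in for the cache refresh that activates the recovery definition.
    static void refreshCache() {
        conditionCache.add("recovery-def");
    }

    // Runs one scenario and returns true if the recovery alert fired.
    static boolean run(boolean notifyBeforeRefresh) {
        conditionCache.clear();
        firedAlerts.clear();
        if (notifyBeforeRefresh) {
            processNotifications();
            refreshCache();
        } else {
            refreshCache();
            processNotifications();
        }
        return firedAlerts.contains("recovery-alert");
    }

    public static void main(String[] args) {
        System.out.println("notify first:  recovery alert fired = " + run(true));
        System.out.println("refresh first: recovery alert fired = " + run(false));
    }
}
```

In this toy model the recovery alert is missed when notifications run before the refresh, and fires when the refresh comes first.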
Comment 1 Jay Shaughnessy 2013-11-08 10:57:06 EST
master commit 28becd282f8cd3ed4327a56ea0c08f8431845dba
Author: Jay Shaughnessy <jshaughn@redhat.com>
Date:   Fri Nov 8 10:42:46 2013 -0500

Restructure SLSB methods (locals only, no remote changes) to process alert
notifications later in the workflow, after the alert is committed and after
we have a chance to update the condition caches (for more reliable recovery
alerting).  Needed to be able to pass back the new alert through the call chain.
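
A minimal sketch of the call-chain shape this commit message describes, not the actual RHQ code: the transactional step returns the new alert to its caller, which then refreshes the cache and only afterwards processes notifications. The class `AlertWorkflowSketch`, its nested `Alert` type and all method names are hypothetical, and the transaction is represented only by log entries.

```java
import java.util.ArrayList;
import java.util.List;

public class AlertWorkflowSketch {
    // Hypothetical minimal alert type; not the RHQ Alert entity.
    static final class Alert {
        final int id;
        Alert(int id) { this.id = id; }
    }

    // Records the order in which the workflow steps ran.
    final List<String> log = new ArrayList<>();

    // Stands in for the transactional method: it creates the alert,
    // commits, and passes the new alert back instead of notifying.
    Alert createAlert(int id) {
        log.add("begin tx");
        log.add("create alert " + id);
        log.add("commit tx");
        return new Alert(id);
    }

    // Stands in for updating the condition caches after the commit.
    void refreshConditionCache() {
        log.add("refresh condition cache");
    }

    // Stands in for notification processing, which may start recovery actions.
    void sendNotifications(Alert alert) {
        log.add("notify for alert " + alert.id);
    }

    // The caller owns the ordering: commit, then refresh, then notify.
    List<String> fireAlert(int id) {
        Alert alert = createAlert(id);
        refreshConditionCache();
        sendNotifications(alert);
        return log;
    }

    public static void main(String[] args) {
        new AlertWorkflowSketch().fireAlert(1).forEach(System.out::println);
    }
}
```

Returning the alert from the transactional step is what lets the caller defer notifications without a second lookup.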
