Bug 986393 - [RFE] Alarm audit/history API
Summary: [RFE] Alarm audit/history API
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ceilometer
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: Upstream M3
Target Release: 4.0
Assignee: Eoghan Glynn
QA Contact: Kevin Whitney
URL: https://blueprints.launchpad.net/ceil...
Whiteboard:
Depends On: 988358
Blocks: 973191 RHOS40RFE 1055813
Reported: 2013-07-19 16:22 UTC by Eoghan Glynn
Modified: 2016-04-26 16:19 UTC

Fixed In Version: openstack-ceilometer-2013.2-0.10.1.b3.el6ost
Doc Type: Enhancement
Doc Text:
A feature has been added in OpenStack Metering (Ceilometer) which allows the retention of alarm history in terms of lifecycle events, rule changes and state transformations. This was required because alarms encapsulate a transient state and a snapshot of their current evaluation rule, but users also need the capability of inspecting how the alarm state and rules changed over a longer timespan, including the period after the alarm no longer exists. Now, alarm history is configurably retained for lifecycle events, rule changes and state transformations.
Clone Of:
Environment:
Last Closed: 2013-12-20 00:14:24 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 41065 0 None MERGED Reorg alarms controller to facilitate history API 2020-05-11 13:56:15 UTC
OpenStack gerrit 41135 0 None MERGED Skeletal alarm history API 2020-05-11 13:56:15 UTC
OpenStack gerrit 43848 0 None MERGED Base Alarm history persistence model 2020-05-11 13:56:15 UTC
OpenStack gerrit 43849 0 None MERGED Plug alarm history logic into the API 2020-05-11 13:56:15 UTC
OpenStack gerrit 43850 0 None MERGED Alarm history storage implementation for mongodb 2020-05-11 13:56:15 UTC
OpenStack gerrit 44908 0 None MERGED Add query support to alarm history API 2020-05-11 13:56:15 UTC
OpenStack gerrit 45244 0 None MERGED Alarm history storage implementation for sqlalchemy 2020-05-11 13:56:16 UTC
Red Hat Product Errata RHEA-2013:1859 0 normal SHIPPED_LIVE Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory 2013-12-21 00:01:48 UTC

Description Eoghan Glynn 2013-07-19 16:22:00 UTC
We need to persist a limited period of alarm history and expose it to users.

For each alarm, this would be composed of lifecycle events (creation, deletion), state transitions (in and out of alarm), and attribute updates (especially those attributes that pertain to threshold evaluation).

The retention period must be limited, because alarm state may flap rapidly and produce a high volume of history records.

The history retrieval API should be:

* paginated with a limit and next marker
* constrainable by timestamp
* filterable by lifecycle event, state transition, attribute update
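To make the three requirements above concrete, here is a minimal in-memory sketch of the intended retrieval semantics. It is not the Ceilometer implementation: the AlarmChange record, the query_history function, its parameter names, and the choice of ascending timestamp order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AlarmChange:
    """Hypothetical history record; field names are illustrative only."""
    event_id: str
    type: str  # e.g. "creation", "deletion", "state transition", "rule change"
    timestamp: datetime
    detail: str


def query_history(
    history: List[AlarmChange],
    event_type: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    limit: int = 10,
    marker: Optional[str] = None,
) -> Tuple[List[AlarmChange], Optional[str]]:
    """Return one page of matching changes plus the marker for the next page.

    Changes are ordered by ascending timestamp (an assumption of this sketch).
    The marker is the event_id of the last item on the previous page.
    """
    ordered = sorted(history, key=lambda c: (c.timestamp, c.event_id))
    matching = [
        c for c in ordered
        if (event_type is None or c.type == event_type)
        and (start is None or c.timestamp >= start)
        and (end is None or c.timestamp < end)
    ]
    begin = 0
    if marker is not None:
        ids = [c.event_id for c in matching]
        begin = ids.index(marker) + 1  # ValueError if the marker is unknown
    page = matching[begin:begin + limit]
    has_more = begin + limit < len(matching)
    next_marker = page[-1].event_id if page and has_more else None
    return page, next_marker
```

A caller would keep passing the returned marker back in until it comes back as None, which signals the last page.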

Upstream blueprint: https://blueprints.launchpad.net/ceilometer/+spec/alarm-audit-api

Comment 4 Eoghan Glynn 2013-10-21 15:36:31 UTC
How To Test
===========

0. Install packstack allinone, then spin up an instance in the usual way. 

Ensure the compute agent is gathering metrics at a reasonable cadence (for example every 60s, rather than the default of every 10 minutes):

  sudo sed -i '/^ *name: cpu_pipeline$/ { n ; s/interval: 600$/interval: 60/ }' /etc/ceilometer/pipeline.yaml
  sudo service openstack-ceilometer-compute restart


1. Create an alarm with a threshold sufficiently low that it's guaranteed to go into alarm:

  ceilometer alarm-threshold-create --name cpu_high --description 'instance running hot'  \
     --meter-name cpu_util  --threshold 0.01 --comparison-operator gt  --statistic avg \
     --period 60 --evaluation-periods 1 \
     --alarm-action 'log://' \
     --query resource_id=$INSTANCE_ID


2. Update the alarm:

  ceilometer alarm-update --threshold 75.0 -a $ALARM_ID



3. Wait a while, then delete the alarm:

  ceilometer alarm-delete -a $ALARM_ID


4. Ensure that the alarm-history reports the following events:

  * creation
  * rule change
  * state transition
  * deletion

  ceilometer alarm-history -a $ALARM_ID
 +------------------+----------------------------+---------------------------------------+
 | Type             | Timestamp                  | Detail                                |
 +------------------+----------------------------+---------------------------------------+
 | creation         | 2013-10-01T16:20:29.238000 | name: cpu_high                        |
 |                  |                            | description: instance running hot     |
 |                  |                            | type: threshold                       |
 |                  |                            | rule: cpu_util > 0.01 during 1 x 60s  |
 | state transition | 2013-10-01T16:20:40.626000 | state: alarm                          |
 | rule change      | 2013-10-01T16:22:40.718000 | rule: cpu_util > 75.0 during 3 x 600s |
 | deletion         | 2013-10-01T16:20:29.238000 | name: cpu_high                        |
 |                  |                            | description: instance running hot     |
 |                  |                            | type: threshold                       |
 |                  |                            | rule: cpu_util > 75.0 during 1 x 60s  |
 +------------------+----------------------------+---------------------------------------+
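
The check in the last step can be automated rather than eyeballed. The following is a rough sketch that parses the table text printed by the alarm-history command and reports which of the expected event types are absent. The helper names (history_event_types, missing_event_types) are hypothetical, and the parser assumes the three-column table layout shown above.

```python
from typing import List

# Event types the test procedure expects to see in the history.
REQUIRED_TYPES = ("creation", "rule change", "state transition", "deletion")


def history_event_types(table_text: str) -> List[str]:
    """Extract the Type column from the printed history table.

    Border lines, the header row, and continuation rows (empty Type cell)
    are skipped.
    """
    types = []
    for raw in table_text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # border line or unrelated output
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue  # not a row of the Type/Timestamp/Detail table
        if cells[0] and cells[0] != "Type":
            types.append(cells[0])
    return types


def missing_event_types(table_text: str) -> List[str]:
    """Return the required event types that do not appear in the table."""
    seen = set(history_event_types(table_text))
    return [t for t in REQUIRED_TYPES if t not in seen]
```

An empty list from missing_event_types means all four event types were reported. The table text could be captured from the CLI with subprocess, or simply pasted into a string.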

Comment 5 Ami Jeain 2013-10-28 11:48:01 UTC
QANAK'ing due to QE capacity

Comment 11 errata-xmlrpc 2013-12-20 00:14:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1859.html

