986393 – [RFE] Alarm audit/history API

Summary: [RFE] Alarm audit/history API

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-ceilometer
Sub Component:
Version:	4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	low
Target Milestone:	Upstream M3
Target Release:	4.0
Assignee:	Eoghan Glynn
QA Contact:	Kevin Whitney
Docs Contact:
URL:	https://blueprints.launchpad.net/ceil...
Whiteboard:
Depends On:	988358
Blocks:	973191 RHOS40RFE 1055813

Reported:	2013-07-19 16:22 UTC by Eoghan Glynn
Modified:	2016-04-26 16:19 UTC
CC List:	8 users
Fixed In Version:	openstack-ceilometer-2013.2-0.10.1.b3.el6ost
Doc Type:	Enhancement
Doc Text:	A feature has been added in OpenStack Metering (Ceilometer) which allows the retention of alarm history in terms of lifecycle events, rule changes and state transformations. This was required because alarms encapsulate a transient state and a snapshot of their current evaluation rule, but users also need the capability of inspecting how the alarm state and rules changed over a longer timespan, including the period after the alarm no longer exists. Now, alarm history is configurably retained for lifecycle events, rule changes and state transformations.
Clone Of:
Environment:
Last Closed:	2013-12-20 00:14:24 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
OpenStack gerrit	41065	None	MERGED	Reorg alarms controller to facilitate history API	2020-05-11 13:56:15 UTC
OpenStack gerrit	41135	None	MERGED	Skeletal alarm history API	2020-05-11 13:56:15 UTC
OpenStack gerrit	43848	None	MERGED	Base Alarm history persistence model	2020-05-11 13:56:15 UTC
OpenStack gerrit	43849	None	MERGED	Plug alarm history logic into the API	2020-05-11 13:56:15 UTC
OpenStack gerrit	43850	None	MERGED	Alarm history storage implementation for mongodb	2020-05-11 13:56:15 UTC
OpenStack gerrit	44908	None	MERGED	Add query support to alarm history API	2020-05-11 13:56:15 UTC
OpenStack gerrit	45244	None	MERGED	Alarm history storage implementation for sqlalchemy	2020-05-11 13:56:16 UTC
Red Hat Product Errata	RHEA-2013:1859	normal	SHIPPED_LIVE	Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory	2013-12-21 00:01:48 UTC

Description Eoghan Glynn 2013-07-19 16:22:00 UTC

We need to persist and expose a limited period of alarm history to users.

For each alarm, this would be composed of lifecycle events (creation, deletion), state transitions (in and out of alarm), and attribute updates (especially those attributes that pertain to threshold evaluation).

The retention period must be limited, since alarm state may flap rapidly and produce high volumes of history.
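
To illustrate the retention constraint, here is a minimal Python sketch of pruning history by age. It is not the Ceilometer implementation: the `prune_history` function, the dictionary layout of an event, and the sample timestamps are all hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta


def prune_history(events, now, retention):
    """Return only the events that fall inside the retention window.

    Hypothetical helper: `events` is a list of dicts with a `timestamp`
    key, `retention` is a timedelta. Not part of any Ceilometer API.
    """
    cutoff = now - retention
    return [e for e in events if e["timestamp"] >= cutoff]


# Made-up sample history for a single alarm.
events = [
    {"type": "creation", "timestamp": datetime(2013, 10, 1, 16, 20, 29)},
    {"type": "state transition", "timestamp": datetime(2013, 10, 1, 16, 20, 40)},
    {"type": "rule change", "timestamp": datetime(2013, 10, 1, 16, 22, 40)},
]

# With a two-minute window ending at 16:23:00, only events at or after
# 16:21:00 survive.
kept = prune_history(
    events,
    now=datetime(2013, 10, 1, 16, 23, 0),
    retention=timedelta(minutes=2),
)
```

A rapidly flapping alarm adds one state transition per flap, so a bound of this kind keeps the stored volume proportional to the window rather than to the lifetime of the alarm.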

The history retrieval API should be:

* paginated with a limit and next marker
* constrainable by timestamp
* filterable by lifecycle event, state transition, attribute update
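
The three requirements above can be sketched in Python against an in-memory list. This is only an illustration under stated assumptions, not the API that was eventually merged: `query_history`, its parameter names, the `event_id` field used as the marker, and the sample data are all hypothetical.

```python
from datetime import datetime


def query_history(events, limit, marker=None, since=None, until=None, types=None):
    """Return (page, next_marker) for one page of alarm history.

    Hypothetical sketch: events are dicts with `event_id`, `type` and
    `timestamp` keys. `marker` is the event_id of the last item of the
    previous page; `next_marker` is None when no further page exists.
    """
    ordered = sorted(events, key=lambda e: (e["timestamp"], e["event_id"]))
    matching = [
        e for e in ordered
        if (since is None or e["timestamp"] >= since)
        and (until is None or e["timestamp"] <= until)
        and (types is None or e["type"] in types)
    ]
    start = 0
    if marker is not None:
        ids = [e["event_id"] for e in matching]
        # Raises ValueError if the marker is unknown or filtered out.
        start = ids.index(marker) + 1
    page = matching[start:start + limit]
    has_more = start + limit < len(matching)
    next_marker = page[-1]["event_id"] if page and has_more else None
    return page, next_marker


# Made-up sample history for a single alarm.
history = [
    {"event_id": "e1", "type": "creation", "timestamp": datetime(2013, 10, 1, 16, 20, 29)},
    {"event_id": "e2", "type": "state transition", "timestamp": datetime(2013, 10, 1, 16, 20, 40)},
    {"event_id": "e3", "type": "rule change", "timestamp": datetime(2013, 10, 1, 16, 22, 40)},
    {"event_id": "e4", "type": "deletion", "timestamp": datetime(2013, 10, 1, 16, 25, 0)},
]

first_page, marker = query_history(history, limit=2)
second_page, end_marker = query_history(history, limit=2, marker=marker)
transitions, _ = query_history(history, limit=10, types={"state transition"})
recent, _ = query_history(history, limit=10, since=datetime(2013, 10, 1, 16, 22, 0))
```

Passing the marker back together with the same filters lets a client walk the full history without the server holding any per-client state.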

Upstream blueprint: https://blueprints.launchpad.net/ceilometer/+spec/alarm-audit-api

Comment 4 Eoghan Glynn 2013-10-21 15:36:31 UTC

How To Test
===========

0. Install packstack allinone, then spin up an instance in the usual way. 

Ensure the compute agent is gathering metrics at a reasonable cadence (for example every 60s, instead of the default of every 10 minutes):

  sudo sed -i '/^ *name: cpu_pipeline$/ { n ; s/interval: 600$/interval: 60/ }' /etc/ceilometer/pipeline.yaml
  sudo service openstack-ceilometer-compute restart


1. Create an alarm with a threshold sufficiently low that it's guaranteed to go into alarm:

  ceilometer alarm-threshold-create --name cpu_high --description 'instance running hot'  \
     --meter-name cpu_util  --threshold 0.01 --comparison-operator gt  --statistic avg \
     --period 60 --evaluation-periods 1 \
     --alarm-action 'log://' \
     --query resource_id=$INSTANCE_ID


2. Update the alarm:

  ceilometer alarm-update --threshold 75.0 -a $ALARM_ID



3. Wait a while, then delete the alarm:

  ceilometer alarm-delete -a $ALARM_ID


4. Ensure that the alarm-history reports the following events:

  * creation
  * rule change
  * state transition
  * deletion

  ceilometer alarm-history -a $ALARM_ID
 +------------------+----------------------------+---------------------------------------+
 | Type             | Timestamp                  | Detail                                |
 +------------------+----------------------------+---------------------------------------+
 | creation         | 2013-10-01T16:20:29.238000 | name: cpu_high                        |
 |                  |                            | description: instance running hot     |
 |                  |                            | type: threshold                       |
 |                  |                            | rule: cpu_util > 0.01 during 1 x 60s  |
 | state transition | 2013-10-01T16:20:40.626000 | state: alarm                          |
 | rule change      | 2013-10-01T16:22:40.718000 | rule: cpu_util > 75.0 during 1 x 60s  |
 | deletion         | 2013-10-01T16:20:29.238000 | name: cpu_high                        |
 |                  |                            | description: instance running hot     |
 |                  |                            | type: threshold                       |
 |                  |                            | rule: cpu_util > 75.0 during 1 x 60s  |
 +------------------+----------------------------+---------------------------------------+
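
The final check can be automated by parsing the table that the CLI prints. The following Python sketch assumes the output has the three-column layout shown in this comment; the helper names `event_types` and `missing_events` and the shortened sample table are hypothetical, and the script does not call the ceilometer client itself.

```python
EXPECTED = ("creation", "rule change", "state transition", "deletion")


def event_types(table_text):
    """Extract the non-empty values of the first column of a CLI table."""
    types = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blank lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Continuation rows have an empty first cell; the header says "Type".
        if cells and cells[0] and cells[0] != "Type":
            types.append(cells[0])
    return types


def missing_events(table_text, expected=EXPECTED):
    """Return the expected event types that do not appear in the table."""
    seen = set(event_types(table_text))
    return [t for t in expected if t not in seen]


# Shortened, made-up sample with placeholder timestamps (t0, t1).
sample = """
+------------------+-----------+-----------------+
| Type             | Timestamp | Detail          |
+------------------+-----------+-----------------+
| creation         | t0        | name: cpu_high  |
|                  |           | type: threshold |
| state transition | t1        | state: alarm    |
+------------------+-----------+-----------------+
"""

absent = missing_events(sample)
```

In a real run the text would come from the captured output of the alarm-history command, and an empty result from `missing_events` would mean all four event types were reported.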

Comment 5 Ami Jeain 2013-10-28 11:48:01 UTC

QANAK'ing due to QE capacity

Comment 11 errata-xmlrpc 2013-12-20 00:14:24 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1859.html
