Bug 1398339

Summary: AODH: composite alarm - transition to insufficient data, due to rules: state evaluated to unknown
Product: Red Hat OpenStack Reporter: Yurii Prokulevych <yprokule>
Component: openstack-aodh Assignee: Mehdi ABAAKOUK <mabaakou>
Status: CLOSED ERRATA QA Contact: Sasha Smolyak <ssmolyak>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 10.0 (Newton) CC: apevec, fbaudin, jdanjou, jschluet, lhh, mlopes, nlevinki, pkilambi
Target Milestone: z2 Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-aodh-3.0.1-6.el7ost Doc Type: Bug Fix
Doc Text:
Prior to this update, the aodh composite alarm did not compute the final alarm state correctly. This occurred when one alarm had only a trending state and not a hard state. The trending state occurs only when the alarm monitors new data, which is usually just after an alarm is created, or when the resource that the alarm monitors is created. Consequently, the aodh composite alarm reported an incorrect state at the beginning of the alarm's life. With this update, aodh has been updated to openstack-aodh-3.0.1-6.el7ost. As a result, the aodh composite alarm no longer reports an incorrect state.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-01 13:38:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yurii Prokulevych 2016-11-24 13:33:42 UTC
Description of problem:
-----------------------
AODH failed to properly evaluate a composite alarm and transitioned it to an unknown state:

aodh.evaluator [-] alarm ecdeee1d-2aa0-4de5-9208-dbad6fde23fa transitioning to insufficient data because Composite rule alarm with composition form: (rule1 or rule2) transition to insufficient data, due to rules:  state evaluated to unknown.

Excerpt from aodh/evaluator.log
-------------------------------

2016-11-24 10:05:12.138 58404 DEBUG aodh.evaluator.composite [-] Evaluating composite rule alarm ecdeee1d-2aa0-4de5-9208-dbad6fde23fa ... evaluate /usr/lib/python2.7/site-packages/aodh/evaluator/composite.py:213
2016-11-24 10:05:12.138 58404 DEBUG aodh.evaluator.composite [-] Evaluating gnocchi_resources_threshold rule: {u'evaluation_periods': 3, u'metric': u'radosgw.objects.containers', u'resource_id': u'alarm-resource-1', u'aggregation_method': u'mean', u'granularity': 60, u'threshold': 5.0, u'type': u'gnocchi_resources_threshold', u'comparison_operator': u'ge', u'resource_type': u'ceph_account'} evaluate /usr/lib/python2.7/site-packages/aodh/evaluator/composite.py:45
2016-11-24 10:05:12.139 58404 DEBUG aodh.evaluator.threshold [-] query stats from 2016-11-24 10:01:12.139154 to 2016-11-24 10:05:12.139154 _bound_duration /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:90
2016-11-24 10:05:12.602 58404 DEBUG aodh.evaluator.gnocchi [-] sanitize stats [[u'2016-11-24T10:00:00+00:00', 3600.0, 3.0], [u'2016-11-24T10:00:00+00:00', 900.0, 3.0], [u'2016-11-24T10:00:00+00:00', 300.0, 3.0], [u'2016-11-24T10:01:00+00:00', 60.0, 2.0], [u'2016-11-24T10:02:00+00:00', 60.0, 3.0], [u'2016-11-24T10:03:00+00:00', 60.0, 4.0], [u'2016-11-24T10:04:00+00:00', 60.0, 5.0]] _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:48
2016-11-24 10:05:12.602 58404 DEBUG aodh.evaluator.gnocchi [-] pruned statistics to 3 _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:52
2016-11-24 10:05:12.602 58404 DEBUG aodh.evaluator.threshold [-] comparing value 3.0 against threshold 5.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:181
2016-11-24 10:05:12.603 58404 DEBUG aodh.evaluator.threshold [-] comparing value 4.0 against threshold 5.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:181
2016-11-24 10:05:12.603 58404 DEBUG aodh.evaluator.threshold [-] comparing value 5.0 against threshold 5.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:181
2016-11-24 10:05:12.603 58404 DEBUG aodh.evaluator.composite [-] Evaluating gnocchi_resources_threshold rule: {u'evaluation_periods': 3, u'metric': u'radosgw.objects.containers', u'resource_id': u'alarm-resource-2', u'aggregation_method': u'mean', u'granularity': 60, u'threshold': 5.0, u'type': u'gnocchi_resources_threshold', u'comparison_operator': u'ge', u'resource_type': u'ceph_account'} evaluate /usr/lib/python2.7/site-packages/aodh/evaluator/composite.py:45
2016-11-24 10:05:12.603 58404 DEBUG aodh.evaluator.threshold [-] query stats from 2016-11-24 10:01:12.603516 to 2016-11-24 10:05:12.603516 _bound_duration /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:90
2016-11-24 10:05:13.073 58404 DEBUG aodh.evaluator.gnocchi [-] sanitize stats [[u'2016-11-24T10:00:00+00:00', 3600.0, 3.0], [u'2016-11-24T10:00:00+00:00', 900.0, 3.0], [u'2016-11-24T10:00:00+00:00', 300.0, 3.0], [u'2016-11-24T10:01:00+00:00', 60.0, 2.0], [u'2016-11-24T10:02:00+00:00', 60.0, 3.0], [u'2016-11-24T10:03:00+00:00', 60.0, 4.0], [u'2016-11-24T10:04:00+00:00', 60.0, 5.0]] _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:48
2016-11-24 10:05:13.074 58404 DEBUG aodh.evaluator.gnocchi [-] pruned statistics to 3 _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:52
2016-11-24 10:05:13.074 58404 DEBUG aodh.evaluator.threshold [-] comparing value 3.0 against threshold 5.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:181
2016-11-24 10:05:13.074 58404 DEBUG aodh.evaluator.threshold [-] comparing value 4.0 against threshold 5.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:181
2016-11-24 10:05:13.074 58404 DEBUG aodh.evaluator.threshold [-] comparing value 5.0 against threshold 5.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:181
2016-11-24 10:05:13.075 58404 INFO aodh.evaluator [-] alarm ecdeee1d-2aa0-4de5-9208-dbad6fde23fa transitioning to insufficient data because Composite rule alarm with composition form: (rule1 or rule2) transition to insufficient data, due to rules:  state evaluated to unknown.
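
To make the log excerpt easier to follow, here is a minimal Python sketch of the evaluation it appears to show. It is a simplified model, not aodh's actual code: the function names (prune_stats, evaluate_rule, combine_or) and the state strings are hypothetical, and the logic is only inferred from the log lines above. For each rule the statistics are pruned to the last three 60-second values (3.0, 4.0, 5.0) and compared against the threshold 5.0 with "ge". Only the newest value matches, so the rule has a trending state but no hard state, and the "or" composition of two such rules ends up as insufficient data.

```python
import operator

# Hypothetical names throughout; a simplified model, not aodh's implementation.
COMPARATORS = {
    "ge": operator.ge, "gt": operator.gt,
    "le": operator.le, "lt": operator.lt,
    "eq": operator.eq, "ne": operator.ne,
}


def prune_stats(stats, granularity, evaluation_periods):
    """Keep values of the rule's granularity, then the newest N of them."""
    values = [value for (_ts, gran, value) in stats if gran == granularity]
    return values[-evaluation_periods:]


def evaluate_rule(values, threshold, comparison_operator):
    """Return (hard_state, trending_state) for one threshold rule."""
    if not values:
        return None, None
    compare = COMPARATORS[comparison_operator]
    compared = [compare(value, threshold) for value in values]
    if all(compared):
        return "alarm", "alarm"
    if not any(compared):
        return "ok", "ok"
    # Mixed results: no hard state, only a trend based on the newest value.
    return None, ("alarm" if compared[-1] else "ok")


def combine_or(states):
    """Three-valued 'or': any alarm wins, otherwise an unknown dominates."""
    if "alarm" in states:
        return "alarm"
    if None in states:
        return "insufficient data"
    return "ok"


# Statistics copied from the 'sanitize stats' log line.
stats = [
    ("2016-11-24T10:00:00+00:00", 3600.0, 3.0),
    ("2016-11-24T10:00:00+00:00", 900.0, 3.0),
    ("2016-11-24T10:00:00+00:00", 300.0, 3.0),
    ("2016-11-24T10:01:00+00:00", 60.0, 2.0),
    ("2016-11-24T10:02:00+00:00", 60.0, 3.0),
    ("2016-11-24T10:03:00+00:00", 60.0, 4.0),
    ("2016-11-24T10:04:00+00:00", 60.0, 5.0),
]

values = prune_stats(stats, granularity=60, evaluation_periods=3)
hard, trending = evaluate_rule(values, threshold=5.0, comparison_operator="ge")

# Both rules in the log see the same data, so both lack a hard state.
composite_state = combine_or([hard, hard])
print(values, hard, trending, composite_state)
```

In this model, combining the trending states instead of the missing hard states (combine_or([trending, trending])) would yield "alarm". Whether the shipped fix works exactly that way is an assumption based on the Doc Text, not something confirmed by this report.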


aodh alarm show ecdeee1d-2aa0-4de5-9208-dbad6fde23fa
+---------------------------+----------------------------------------------------------------------------------------+
| Field                     | Value                                                                                  |
+---------------------------+----------------------------------------------------------------------------------------+
| alarm_actions             | [u'log://']                                                                            |
| alarm_id                  | ecdeee1d-2aa0-4de5-9208-dbad6fde23fa                                                   |
| composite_rule            | {                                                                                      |
|                           |   "or": [                                                                              |
|                           |     {                                                                                  |
|                           |       "evaluation_periods": 3,                                                         |
|                           |       "metric": "radosgw.objects.containers",                                          |
|                           |       "resource_id": "alarm-resource-1",                                               |
|                           |       "aggregation_method": "mean",                                                    |
|                           |       "granularity": 60,                                                               |
|                           |       "threshold": 5.0,                                                                |
|                           |       "type": "gnocchi_resources_threshold",                                           |
|                           |       "comparison_operator": "ge",                                                     |
|                           |       "resource_type": "ceph_account"                                                  |
|                           |     },                                                                                 |
|                           |     {                                                                                  |
|                           |       "evaluation_periods": 3,                                                         |
|                           |       "metric": "radosgw.objects.containers",                                          |
|                           |       "resource_id": "alarm-resource-2",                                               |
|                           |       "aggregation_method": "mean",                                                    |
|                           |       "granularity": 60,                                                               |
|                           |       "threshold": 5.0,                                                                |
|                           |       "type": "gnocchi_resources_threshold",                                           |
|                           |       "comparison_operator": "ge",                                                     |
|                           |       "resource_type": "ceph_account"                                                  |
|                           |     }                                                                                  |
|                           |   ]                                                                                    |
|                           | }                                                                                      |
| description               | composite alarm converted from combination alarm: 8ec67ca9-374d-4ab2-8997-3591b538db7f |
| enabled                   | True                                                                                   |
| insufficient_data_actions | [u'log://']                                                                            |
| name                      | composite-GRT-OR-GRT                                                                   |
| ok_actions                | [u'log://']                                                                            |
| project_id                | 1dfd1dbd29d9443ba6c57869637e3ab3                                                       |
| repeat_actions            | False                                                                                  |
| severity                  | low                                                                                    |
| state                     | insufficient data                                                                      |
| state_timestamp           | 2016-11-23T09:26:14.646000                                                             |
| time_constraints          | []                                                                                     |
| timestamp                 | 2016-11-24T09:36:26.243821                                                             |
| type                      | composite                                                                              |
| user_id                   | adbf1eeca557477e85777e660307f0c4                                                       |
+---------------------------+----------------------------------------------------------------------------------------+


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-aodh-api-3.0.1-4.el7ost.noarch
openstack-aodh-listener-3.0.1-4.el7ost.noarch
puppet-aodh-9.4.1-1.el7ost.noarch
python-aodh-3.0.1-4.el7ost.noarch
openstack-aodh-evaluator-3.0.1-4.el7ost.noarch
openstack-aodh-common-3.0.1-4.el7ost.noarch
openstack-aodh-notifier-3.0.1-4.el7ost.noarch
python-aodhclient-0.7.0-1.el7ost.noarch


Steps to Reproduce:
-------------------
1. Create composite alarm
2. Send samples to trigger alarm transition
3. Observe the state transition and aodh/evaluator.log
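
For step 1, the composite rule shown in the "aodh alarm show" output can be rebuilt as JSON with a short Python sketch. The helper name threshold_rule is hypothetical; the field values are copied from the alarm above. The resulting JSON is the kind of document a composite alarm definition carries, for example as the value of the aodh client's --composite-rule option, assuming the installed client provides it.

```python
import json


def threshold_rule(resource_id):
    """Build one gnocchi_resources_threshold sub-rule (hypothetical helper)."""
    return {
        "type": "gnocchi_resources_threshold",
        "metric": "radosgw.objects.containers",
        "resource_type": "ceph_account",
        "resource_id": resource_id,
        "aggregation_method": "mean",
        "granularity": 60,
        "evaluation_periods": 3,
        "comparison_operator": "ge",
        "threshold": 5.0,
    }


# "rule1 or rule2", as in the composition form reported by the evaluator.
composite_rule = {
    "or": [
        threshold_rule("alarm-resource-1"),
        threshold_rule("alarm-resource-2"),
    ]
}

composite_rule_json = json.dumps(composite_rule, indent=2)
print(composite_rule_json)
```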

Comment 1 Jon Schlueter 2017-01-12 12:36:52 UTC
Landed on upstream/stable/newton; updating the external tracker with that patch.

Comment 4 nlevinki 2017-02-28 13:50:14 UTC
https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS10/job/qe-phase2-10_director-rhel-7.3-virthost-1cont_1comp_1ceph-ipv4-vxlan-ceph/76/

All regression tests passed with no errors and the right rpm is in the puddle.

Comment 6 errata-xmlrpc 2017-03-01 13:38:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0355.html