| Summary: | Composite alarm uses last value from the evaluation for alarm evaluation | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Yurii Prokulevych <yprokule> |
| Component: | openstack-aodh | Assignee: | Mehdi ABAAKOUK <mabaakou> |
| Status: | CLOSED CANTFIX | QA Contact: | Sasha Smolyak <ssmolyak> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 10.0 (Newton) | CC: | apevec, jschluet, lhh, pkilambi, tvignaud |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | 12.0 (Pike) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
When an individual alarm is evaluated and its previous state is unknown, we compute a trending state, and we change the real state to this trending state if we have enough datapoints but not all of them cross the threshold. Currently the "trending state" is calculated from the last datapoint only; because enough datapoints are available, it should be possible to compute a better trending state. Upstream has no plan to fix this. It is a corner case that occurs when the metric doesn't have enough data yet.
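The trending-state behaviour described above can be illustrated with a small sketch. This is not the aodh source: the function names and the `ALARM`/`OK` constants are hypothetical, and the majority-vote variant is only one possible reading of "a better trending state", not something upstream has proposed. The input values are the three datapoints from the evaluator log in this report.

```python
import operator

# Hypothetical state names, used only for this illustration.
ALARM, OK = "alarm", "ok"


def trending_state_last(values, threshold, op=operator.ge):
    """Trending state taken from the last datapoint only.

    Returns None when all datapoints agree, because the state is then
    unequivocal and no trending state is needed.
    """
    compared = [op(v, threshold) for v in values]
    if all(compared) or not any(compared):
        return None
    return ALARM if compared[-1] else OK


def trending_state_majority(values, threshold, op=operator.ge):
    """Assumed alternative: follow the majority of datapoints."""
    compared = [op(v, threshold) for v in values]
    if all(compared) or not any(compared):
        return None
    outside = sum(compared)  # number of datapoints crossing the threshold
    return ALARM if outside * 2 > len(compared) else OK


values = [3.0, 3.5, 4.0]  # datapoints compared against threshold 4.0 in the log
print(trending_state_last(values, 4.0))      # alarm
print(trending_state_majority(values, 4.0))  # ok
```

With only the last datapoint considered, a single value at the threshold is enough to move an alarm whose previous state is unknown into the alarm state, which matches the transition reported here.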
Description of problem:
-----------------------
The composite alarm uses only the last value from the evaluation periods, causing a false-positive alarm transition.

2016-10-03 08:49:35.884 11794 DEBUG aodh.evaluator [-] evaluating alarm c9891f7b-42ac-40fc-8b30-9631d21d228e _evaluate_alarm /usr/lib/python2.7/site-packages/aodh/evaluator/__init__.py:257
2016-10-03 08:49:35.885 11794 DEBUG aodh.evaluator.composite [-] Evaluating composite rule alarm c9891f7b-42ac-40fc-8b30-9631d21d228e ... evaluate /usr/lib/python2.7/site-packages/aodh/evaluator/composite.py:213
2016-10-03 08:49:35.885 11794 DEBUG aodh.evaluator.composite [-] Evaluating gnocchi_aggregation_by_metrics_threshold rule: {u'evaluation_periods': 3, u'metrics': [u'b6ba3db7-78d5-4e66-a592-d999a2988a91', u'7f16dccc-92e6-43a9-a87e-009f078b9b55'], u'threshold': 6.0, u'granularity': 60, u'aggregation_method': u'mean', u'type': u'gnocchi_aggregation_by_metrics_threshold', u'comparison_operator': u'ge'} evaluate /usr/lib/python2.7/site-packages/aodh/evaluator/composite.py:45
2016-10-03 08:49:35.885 11794 DEBUG aodh.evaluator.threshold [-] query stats from 2016-10-03 08:45:35.885525 to 2016-10-03 08:49:35.885525 _bound_duration /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:89
2016-10-03 08:49:36.097 11794 DEBUG aodh.evaluator.gnocchi [-] sanitize stats [] _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:48
2016-10-03 08:49:36.098 11794 DEBUG aodh.evaluator.gnocchi [-] pruned statistics to 0 _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:52
2016-10-03 08:49:36.098 11794 DEBUG aodh.evaluator.composite [-] Evaluating gnocchi_aggregation_by_resources_threshold rule: {u'evaluation_periods': 3, u'metric': u'radosgw.api.request', u'threshold': 4.0, u'granularity': 60, u'aggregation_method': u'mean', u'query': u'{"or":[{"=":{"id":"alarm-resource-3"}},{"=":{"id":"alarm-resource-4"}}]}', u'type': u'gnocchi_aggregation_by_resources_threshold', u'comparison_operator': u'ge', u'resource_type': u'ceph_account'} evaluate /usr/lib/python2.7/site-packages/aodh/evaluator/composite.py:45
2016-10-03 08:49:36.098 11794 DEBUG aodh.evaluator.threshold [-] query stats from 2016-10-03 08:45:36.098557 to 2016-10-03 08:49:36.098557 _bound_duration /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:89
2016-10-03 08:49:36.598 11794 DEBUG aodh.evaluator.gnocchi [-] sanitize stats [[u'2016-10-03T08:00:00+00:00', 3600.0, 3.1875], [u'2016-10-03T08:45:00+00:00', 900.0, 3.3], [u'2016-10-03T08:45:00+00:00', 300.0, 3$ 3], [u'2016-10-03T08:45:00+00:00', 60.0, 3.0], [u'2016-10-03T08:46:00+00:00', 60.0, 3.0], [u'2016-10-03T08:47:00+00:00', 60.0, 3.0], [u'2016-10-03T08:48:00+00:00', 60.0, 3.5], [u'2016-10-03T08:49:00+00:00', 60.$ , 4.0]] _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:48
2016-10-03 08:49:36.599 11794 DEBUG aodh.evaluator.gnocchi [-] pruned statistics to 3 _sanitize /usr/lib/python2.7/site-packages/aodh/evaluator/gnocchi.py:52
2016-10-03 08:49:36.599 11794 DEBUG aodh.evaluator.threshold [-] comparing value 3.0 against threshold 4.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:175
2016-10-03 08:49:36.600 11794 DEBUG aodh.evaluator.threshold [-] comparing value 3.5 against threshold 4.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:175
2016-10-03 08:49:36.600 11794 DEBUG aodh.evaluator.threshold [-] comparing value 4.0 against threshold 4.0 _compare /usr/lib/python2.7/site-packages/aodh/evaluator/threshold.py:175
2016-10-03 08:49:36.601 11794 INFO aodh.evaluator [-] alarm c9891f7b-42ac-40fc-8b30-9631d21d228e transitioning to alarm because Composite rule alarm with composition form: (rule1 or rule2) transition to alarm, $ ue to rules: rule2 outside their threshold.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-aodh-evaluator-3.0.0-0.20160907221145.3990c5b.el7ost.noarch
openstack-aodh-notifier-3.0.0-0.20160907221145.3990c5b.el7ost.noarch
openstack-aodh-common-3.0.0-0.20160907221145.3990c5b.el7ost.noarch
puppet-aodh-9.2.0-0.20160902115754.16ea22a.el7ost.noarch
openstack-aodh-listener-3.0.0-0.20160907221145.3990c5b.el7ost.noarch
python-aodhclient-0.6.0-0.20160826150744.65d2e62.el7ost.noarch
openstack-aodh-api-3.0.0-0.20160907221145.3990c5b.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create composite alarm
aodh --debug alarm create \
    --type composite \
    --name Composite-OR-Alarm \
    --description 'Composite OR Alarm' \
    --severity critical \
    --enabled True \
    --alarm-action 'log://' \
    --ok-action 'log://' \
    --insufficient-data-action 'log://' \
    --evaluation-periods 3 \
    --composite-rule '{"or": [{"type":"gnocchi_aggregation_by_metrics_threshold","threshold": 6, "metrics":["b6ba3db7-78d5-4e66-a592-d999a2988a91", "7f16dccc-92e6-43a9-a87e-009f078b9b55"], "evaluation_periods": 3, "granularity": 60, "comparison_operator": "ge", "aggregation_method":"mean"}, { "type":"gnocchi_aggregation_by_resources_threshold", "query": "{\"or\":[{\"=\":{\"id\":\"alarm-resource-3\"}},{\"=\":{\"id\":\"alarm-resource-4\"}}]}", "metric": "radosgw.api.request", "evaluation_periods":3, "granularity":60, "comparison_operator": "ge", "threshold":"4", "resource_type":"ceph_account", "aggregation_method":"mean"}]}'

2. Trigger alarm transition
for i in {1..9}
do
    ceilometer sample-create --resource-id alarm-resource-3 --meter-name radosgw.api.request --meter-type gauge --meter-unit unit1 --sample-volume ${i};
    ceilometer sample-create --resource-id alarm-resource-4 --meter-name radosgw.api.request --meter-type gauge --meter-unit unit1 --sample-volume 5;
    sleep 60;
done

3. Assert alarm transitions to new state
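
For reference, the values the reproduction loop is expected to produce can be worked out by hand. The sketch below assumes that each loop iteration lands in its own 60-second granularity bucket and that the `mean` aggregation across the two resources is simply the average of the two sample volumes; the helper names are hypothetical. It computes the per-period means and the first iteration at which all three evaluation periods meet the `ge 4` threshold.

```python
def window_means(iterations=9, fixed_volume=5):
    """Mean of the two samples created in each loop iteration (i and 5)."""
    return [(i + fixed_volume) / 2 for i in range(1, iterations + 1)]


def first_full_breach(means, threshold=4.0, periods=3):
    """1-based iteration at which the last `periods` means are all >= threshold."""
    for end in range(periods, len(means) + 1):
        window = means[end - periods:end]
        if all(value >= threshold for value in window):
            return end
    return None


means = window_means()
print(means)                     # [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
print(first_full_breach(means))  # 5
```

Under these assumptions, rule2 would have three consecutive periods at or above the threshold only from the fifth iteration (4.0, 4.5, 5.0), whereas the log above shows the transition to alarm already when the window is 3.0, 3.5, 4.0, that is, at the third iteration.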