Bug 1715114 - tests.scenario.test_aodh_alarm.AodhAlarmTest.test_alarm fails to receive expected alarm
Summary: tests.scenario.test_aodh_alarm.AodhAlarmTest.test_alarm fails to receive expe...
Keywords:
Status: CLOSED DUPLICATE of bug 1716350
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-aodh
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ga
Target Release: ---
Assignee: Vinay Kapalavai
QA Contact: Nataf Sharabi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-29 15:29 UTC by Victor Voronkov
Modified: 2019-06-17 14:37 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-17 14:37:45 UTC
Target Upstream Version:
Embargoed:



Description Victor Voronkov 2019-05-29 15:29:57 UTC
Description of problem:
heat_tempest_plugin.tests.scenario.test_aodh_alarm.AodhAlarmTest.test_alarm[id-fc0f18a6-f65c-4df1-b9c5-e160dea59849] fails to receive expected alarm

http://staging-jenkins2-qe-playground.usersys.redhat.com/view/DFG/view/cloud_apps/view/heat/job/DFG-cloud_apps-heat-15_director-rhel-virthost-3cont_1comp_3ceph-ipv4-geneve-poc/12/testReport/heat_tempest_plugin.tests.scenario.test_aodh_alarm/AodhAlarmTest/test_alarm_id_fc0f18a6_f65c_4df1_b9c5_e160dea59849_/

WARNING [heat_tempest_plugin.tests.scenario.test_aodh_alarm] check_instance_count exp:2, act:1
 WARNING [heat_tempest_plugin.tests.scenario.test_aodh_alarm] check_instance_count exp:2, act:1
 WARNING [heat_tempest_plugin.tests.scenario.test_aodh_alarm] check_instance_count exp:2, act:1

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/heat_tempest_plugin/tests/scenario/test_aodh_alarm.py", line 64, in test_alarm
    120, 2, self.check_instance_count, stack_identifier, 2))
  File "/usr/lib/python3.6/site-packages/unittest2/case.py", line 705, in assertTrue
    raise self.failureException(msg)
AssertionError: False is not true
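
For context, the assertion above is what a timed-out polling loop looks like from the test's side. The following is a minimal sketch of that pattern, assuming the arguments 120 and 2 in the traceback are a timeout and a polling interval in seconds. The helper name `poll_until_true` and its signature are hypothetical and are not taken from the plugin's code.

```python
import time


def poll_until_true(duration, sleep_for, func, *args, **kwargs):
    """Hypothetical helper: call func until it returns a truthy value
    or until `duration` seconds have elapsed.

    Returns True on success and False on timeout, so a caller that wraps
    it in assertTrue() reports "False is not true" when the timeout hits.
    """
    deadline = time.monotonic() + duration
    while True:
        if func(*args, **kwargs):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(sleep_for)


def check_count(expected, actual):
    # Stand-in for a check such as "exp:2, act:1" in the warnings above.
    return expected == actual


# The count never reaches the expected value, so polling gives up.
timed_out_result = poll_until_true(0.05, 0.01, check_count, 2, 1)
```

With a check that keeps reporting one instance where two are expected, the helper returns False after the timeout, which is consistent with the repeated `check_instance_count exp:2, act:1` warnings followed by the assertion failure.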

Version-Release number of selected component (if applicable):


How reproducible:
Run the regression tempest job on heat

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Rabi Mishra 2019-05-30 08:52:51 UTC
This seems to be a Python 3 issue in aodh: the member list contains b'...' entries because the member IDs come back as bytes in Python 3, while the agent's own ID ("Me") is a plain string, as you can see in the traceback below. The alarm evaluators can't join the group, hence alarms are not evaluated.

I don't know whether we still support aodh in OSP15. Anyway, this should go to DFG:MetMon

: 3a2912f6-031f-45f2-9bc4-0ef0e1b49845 extract_my_subset /usr/lib/python3.6/site-packages/aodh/coordination.py:227
2019-05-30 07:41:43.998 20 WARNING aodh.coordination [-] Cannot extract tasks because agent failed to join group properly. Rejoining group.
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator [-] alarm evaluation cycle failed: aodh.coordination.MemberNotInGroupError: Group ID: alarm_evaluator, Members: {b'3a2912f6-031f-45f2-9bc4-0ef0e1b49845', b'd58cfc59-daaf-4b7d-8476-19f7e18ddb77', b'0d5e8674-1c2e-4884-910f-887dd62fc53c'}, Me: 3a2912f6-031f-45f2-9bc4-0ef0e1b49845: Current agent is not part of group and cannot take tasks
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator Traceback (most recent call last):
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/aodh/evaluator/__init__.py", line 249, in _evaluate_assigned_alarms
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     alarms = self._assigned_alarms()
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/aodh/evaluator/__init__.py", line 278, in _assigned_alarms
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     self.PARTITIONING_GROUP_NAME, all_alarm_ids)
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     return self.call(f, *args, **kw)
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     do = self.iter(retry_state=retry_state)
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 331, in iter
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     raise retry_exc.reraise()
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 167, in reraise
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     raise self.last_attempt.result()
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     return self.__get_result()
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     raise self._exception
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     result = fn(*args, **kwargs)
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator   File "/usr/lib/python3.6/site-packages/aodh/coordination.py", line 234, in extract_my_subset
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator     raise MemberNotInGroupError(group_id, members, self._my_id)
2019-05-30 07:41:44.004 20 ERROR aodh.evaluator aodh.coordination.MemberNotInGroupError: Group ID: alarm_evaluator, Members: {b'3a2912f6-031f-45f2-9bc4-0ef0e1b49845', b'd58cfc59-daaf-4b7d-8476-19f7e18ddb77', b'0d5e8674-1c2e-4884-910f-887dd62fc53c'}, Me: 3a2912f6-031f-45f2-9bc4-0ef0e1b49845: Current agent is not part of group and cannot take tasks
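
The mismatch reported in this error can be reproduced in isolation. The sketch below is not aodh code. It only reuses the IDs from the log message to show that, in Python 3, a `str` is never equal to a `bytes` object, so a membership test against a set of bytes fails even when the text matches. The variable names are illustrative.

```python
# Member IDs as they appear in the error message: bytes objects.
members = {
    b'3a2912f6-031f-45f2-9bc4-0ef0e1b49845',
    b'd58cfc59-daaf-4b7d-8476-19f7e18ddb77',
    b'0d5e8674-1c2e-4884-910f-887dd62fc53c',
}

# The agent's own ID ("Me" in the message): a str.
my_id = '3a2912f6-031f-45f2-9bc4-0ef0e1b49845'

# str and bytes never compare equal in Python 3, so this lookup fails.
found_as_str = my_id in members

# Comparing like with like (here by encoding the local ID) succeeds.
found_as_bytes = my_id.encode('ascii') in members

print(found_as_str, found_as_bytes)  # prints: False True
```

Decoding the member IDs to `str` instead would work equally well for the comparison. Which side to convert is a design choice that depends on what the coordination backend expects.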


I worked around this issue in your setup by encoding the UUIDs, but then it failed with another error:


2019-05-30 08:38:37.181 19 DEBUG aodh.coordination [-] My subset: ['0fa4768d-6793-4bb7-9410-db47f807ebdf'] extract_my_subset /usr/lib/python3.6/site-packages/aodh/coordination.py:240
2019-05-30 08:38:37.181 19 INFO aodh.evaluator [-] initiating evaluation cycle on 1 alarms
2019-05-30 08:38:37.181 19 DEBUG aodh.evaluator [-] evaluating alarm 0fa4768d-6793-4bb7-9410-db47f807ebdf _evaluate_alarm /usr/lib/python3.6/site-packages/aodh/evaluator/__init__.py:263
2019-05-30 08:38:37.182 19 DEBUG aodh.evaluator.threshold [-] query stats from 2019-05-30 08:36:37.182117 to 2019-05-30 08:38:37.182117 _bound_duration /usr/lib/python3.6/site-packages/aodh/evaluator/threshold.py:71
2019-05-30 08:38:37.726 19 WARNING aodh.evaluator.gnocchi [-] alarm statistics retrieval failed: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator at 
 [no address given] to inform them of the time this error occurred,
 and the actions you performed just before this error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>
 (HTTP 500): gnocchiclient.exceptions.ClientException: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
2019-05-30 08:38:37.727 19 DEBUG aodh.queue [-] alarm 0fa4768d-6793-4bb7-9410-db47f807ebdf has no action configured for state transition from insufficient data to state insufficient data, skipping the notification. notify /usr/lib/python3.6/site-packages/aodh/queue.py:48

This tells me that there is some configuration issue, since the evaluator cannot retrieve alarm statistics: the Gnocchi request returns HTTP 500.

Comment 4 Martin Magr 2019-06-17 14:37:45 UTC

*** This bug has been marked as a duplicate of bug 1716350 ***

