Bug 1496838

Summary: [PRD][RFE][Alerts] Add CloudForms Alerts for OpenShift Provider based on Hourly Timer
Product: Red Hat CloudForms Management Engine
Reporter: Beni Paskin-Cherniavsky <cben>
Component: Control
Assignee: Greg McCullough <gmccullo>
Status: CLOSED ERRATA
QA Contact: juwatts
Severity: medium
Docs Contact:
Priority: medium
Version: 5.8.0
CC: agrare, cben, dlamotta, gblomqui, jhardy, jnovotni, jocarter, lavenel, obarenbo, oourfali, simaishi, smallamp
Target Milestone: GA
Keywords: FutureFeature, Reopened, RFE
Target Release: 5.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: container
Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-07 23:02:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: Container Management
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1555371
Attachments:
Description Flags
an exported alert that doesn't work none

Description Beni Paskin-Cherniavsky 2017-09-28 14:47:59 UTC
Created attachment 1332025 [details]
an exported alert that doesn't work

Description of problem:
Closely related to bug 1494599. That BZ focuses on alerts whose Driving Event is an event from OpenShift.

This BZ is about the ability to define alerts for OpenShift entities with:
Driving Event: Hourly Timer
What to Evaluate: Expression


Unfortunately, the UI already allows defining such alerts for container nodes, but this was never really implemented or tested, and they don't work.
https://access.redhat.com/support/cases/#/case/01925440 is an example of a customer who tried creating such an alert (not sure whether they need it or are just experimenting).

Version-Release number of selected component (if applicable):
5.8.1.5-20170725160636_e433fc0

Steps to Reproduce:
1. Create such an alert (exported YAML attached)
2. Assign it to an alert profile that is assigned to The Enterprise

Actual results:
periodic [NoMethodError]: undefined method `container_nodes' for #<Zone:0x0000000af8bf88> error
https://bugzilla.redhat.com/show_bug.cgi?id=1494599#c3 has full stacktrace

Expected results:
alert evaluated periodically

Comment 2 Dave Johnson 2017-09-28 15:04:09 UTC
Please assess the impact of this issue and update the severity accordingly.  Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity for a reminder on each severity's definition.

If it's something like a tracker bug where it doesn't matter, please set it to Low/Low.

Comment 3 Beni Paskin-Cherniavsky 2017-09-30 21:47:35 UTC
Setting severity to medium because it looks like it works in the UI but doesn't.

Comment 6 Ari Zellner 2018-01-29 13:48:55 UTC
https://github.com/ManageIQ/manageiq/pull/16902

Comment 12 juwatts 2018-09-25 15:53:02 UTC
Verified in 5.10.0.16.20180919151347_a0c9e02

Verification Steps:
1) Navigated to Control->Explorer
2) Created a new alert with the following:
Description: OSE Node CPU > 0
Active: Checked
Based On: Container Node
What to evaluate: Expression
Notification Frequency: 1 Hour
Expression: Container Node : Cpu Usage Rate Average Avg Over Time Period > 0
Send an E-mail: checked
from: cfadmin
To:  <my email address>

3) Clicked Save
4) Navigated to Alert Profiles, created a new profile and assigned it to "The Enterprise"

Verified the alert was created:
[----] I, [2018-09-25T09:13:58.839388 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [5]
[----] I, [2018-09-25T09:13:58.864617 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [5] Queuing evaluation of Alert: [OSE Node CPU > 0]
[----] I, [2018-09-25T09:13:58.872602 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-infra-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [6]
[----] I, [2018-09-25T09:13:58.879234 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-infra-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [6] Queuing evaluation of Alert: [OSE Node CPU > 0]
[----] I, [2018-09-25T09:13:58.886662 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-compute1-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [4]
[----] I, [2018-09-25T09:13:58.893708 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-compute1-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [4] Queuing evaluation of Alert: [OSE Node CPU > 0]
[----] I, [2018-09-25T09:13:58.902050 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [7]
[----] I, [2018-09-25T09:13:58.909385 #11969:129cf88]  INFO -- : MIQ(MiqAlert.evaluate_alerts) [_hourly_timer_] Target: ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode Name: [epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com], Id: [7] Queuing evaluation of Alert: [OSE Node CPU > 0]


Verified that the expression was evaluated every hour for 2 hours and that the alert was triggered:

[root@dhcp-8-197-248 log]# cat evm.log | grep -i Result:
[----] I, [2018-09-25T09:14:04.567136 #11961:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T09:14:04.610162 #11961:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-infra-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T09:14:04.633313 #11961:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute1-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T09:14:04.655535 #11961:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T10:14:10.028526 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T10:14:10.069535 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-infra-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T10:14:10.091800 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute1-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T10:14:10.112144 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T11:14:16.912056 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-infra-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T11:14:16.957084 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute1-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T11:14:16.960319 #11961:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[----] I, [2018-09-25T11:14:16.997464 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]
[root@dhcp-8-197-248 log]# 
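
The hourly cadence in the grep output above can also be checked mechanically. The following is a minimal sketch, not part of the original verification: it assumes the "Evaluating Alert ... Result: [...]" line format shown above, and the helper names (parse_evaluations, gaps_by_target, all_hourly) and the 60-second tolerance are hypothetical choices made for illustration. The pattern also accepts "Result: [false]", which is an assumption, since only true results appear in this log.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the "MiqAlert#evaluate" lines shown in the grep output above.
# Accepting "false" as a result value is an assumption; only "true" appears here.
EVAL_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) #[^\]]*\]"
    r".*MiqAlert#evaluate\) Evaluating Alert \[(?P<alert>.*?)\]"
    r" for target: \[(?P<target>.*?)\]\.\.\. Result: \[(?P<result>true|false)\]"
)


def parse_evaluations(lines):
    """Return (timestamp, alert, target, result) tuples for matching lines."""
    evaluations = []
    for line in lines:
        match = EVAL_RE.search(line)
        if match is None:
            continue  # not an evaluation line
        evaluations.append((
            datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%f"),
            match["alert"],
            match["target"],
            match["result"] == "true",
        ))
    return evaluations


def gaps_by_target(evaluations):
    """Map each target to the seconds between its consecutive evaluations."""
    times = defaultdict(list)
    for ts, _alert, target, _result in evaluations:
        times[target].append(ts)
    gaps = {}
    for target, stamps in times.items():
        stamps.sort()
        gaps[target] = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return gaps


def all_hourly(gaps, tolerance=60.0):
    """True if every gap is within `tolerance` seconds of one hour (hypothetical threshold)."""
    return all(abs(gap - 3600.0) <= tolerance
               for target_gaps in gaps.values() for gap in target_gaps)


# Two lines copied from the log excerpt above.
SAMPLE = [
    "[----] I, [2018-09-25T09:14:04.567136 #11961:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]",
    "[----] I, [2018-09-25T10:14:10.028526 #11969:129cf88]  INFO -- : MIQ(MiqAlert#evaluate) Evaluating Alert [OSE Node CPU > 0] for target: [epacific-ocp-compute2-v3.cmqe.lab.eng.rdu2.redhat.com]... Result: [true]",
]

if __name__ == "__main__":
    sample_gaps = gaps_by_target(parse_evaluations(SAMPLE))
    for target, target_gaps in sample_gaps.items():
        print(target, target_gaps)
    print("hourly:", all_hourly(sample_gaps))
```

To run it against a real log, pass the lines of evm.log to parse_evaluations instead of SAMPLE.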

Example of the alert trigger from the log:

[----] I, [2018-09-25T11:14:26.620780 #11961:129cf88]  INFO -- : MIQ(GenericMailer#prepare_generic_email) options: {:to=>"juwatts", :from=>"cfadmin", :subject=>"Alert Triggered: OSE Node CPU > 0, for (MANAGEIQ::PROVIDERS::KUBERNETES::CONTAINERMANAGER::CONTAINERNODE) epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com", :miq_action_hash=>{:header=>"Alert Triggered", :policy_detail=>"Alert 'OSE Node CPU > 0', triggered", :event_description=>"Alert condition met", :event_details=>nil, :entity_type=>"ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode", :entity_name=>"epacific-ocp-master-v3.cmqe.lab.eng.rdu2.redhat.com"}, :miq_task_id=>nil, :sent_on=>2018-09-25 11:14:26 -0400, :attachment=>[]}

Comment 13 errata-xmlrpc 2019-02-07 23:02:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0212