Bug 1494599 - [PRD][RFE] Add CloudForms Alerts for OpenShift Provider based on Kubernetes Event
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.8.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: GA
Target Release: 5.10.0
Assignee: Oved Ourfali
QA Contact: Einat Pacifici
URL:
Whiteboard:
Duplicates: 1438002
Depends On:
Blocks: 1480786 1511957
Reported: 2017-09-22 15:20 UTC by Loic Avenel
Modified: 2018-03-26 10:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-26 10:51:34 UTC
Category: Feature
Cloudforms Team: Container Management
Target Upstream Version:


Description Loic Avenel 2017-09-22 15:20:13 UTC
Description of problem: CloudForms should allow customers to create Alerts based on Kubernetes Events.

Alerts should be based on the following categories of Events:

Pods
Containers
Container Nodes

For each category, it should be possible to define the frequency of notification and a custom expression for filtering.

Alert Profile Assignment should offer:

Pods: Providers, Tagged Providers, Tagged Projects
Containers: Providers, Tagged Providers, Projects, Tagged Projects
Container Nodes: Providers, Tagged Providers, Nodes, Tagged Nodes

Comment 2 Dave Johnson 2017-09-22 15:44:49 UTC
Please assess the impact of this issue and update the severity accordingly.  Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity for a reminder on each severity's definition.

If it's something like a tracker bug where it doesn't matter, please set it to Low/Low.

Comment 3 Beni Paskin-Cherniavsky 2017-09-26 11:46:50 UTC
The UI does allow creating regular Node alerts with What to Evaluate: Expression (custom) and Driving Event: Hourly timer.

Such an alert doesn't work. Here is the log of the evaluation being queued:

[----] I, [2017-09-14T04:44:33.455577 #5795:737134]  INFO -- : MIQ(MiqScheduleWorker::Runner#do_work) Number of scheduled items to be processed: 2.
[----] I, [2017-09-14T04:44:33.462695 #5795:737134]  INFO -- : MIQ(MiqQueue.put) Message id: [101000021485536],  id: [], Zone: [default], Role: [], Server: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [MiqAlert.evaluate_hourly_timer], Timeout: [600], Priority: [90], State: [ready], Deliver On: [], Data: [], Args: []

And the stack trace of the evaluation crashing:

[----] I, [2017-09-14T04:44:36.582339 #22039:737134]  INFO -- : MIQ(MiqQueue#deliver) Message id: [101000021485536], Delivering...
[----] I, [2017-09-14T04:44:36.582666 #22039:737134]  INFO -- : MIQ(MiqAlert.evaluate_hourly_timer) Starting
[----] E, [2017-09-14T04:44:36.644221 #22039:737134] ERROR -- : MIQ(MiqQueue#deliver) Message id: [101000021485536], Error: [undefined method `container_nodes' for #<Zone:0x0000000af8bf88>]
[----] E, [2017-09-14T04:44:36.644701 #22039:737134] ERROR -- : [NoMethodError]: undefined method `container_nodes' for #<Zone:0x0000000af8bf88>  Method:[rescue in deliver]
[----] E, [2017-09-14T04:44:36.645494 #22039:737134] ERROR -- : /opt/rh/cfme-gemset/gems/activemodel-5.0.3/lib/active_model/attribute_methods.rb:433:in `method_missing'
/var/www/miq/vmdb/app/models/miq_alert.rb:137:in `public_send'
/var/www/miq/vmdb/app/models/miq_alert.rb:137:in `block (2 levels) in evaluate_hourly_timer'
/var/www/miq/vmdb/app/models/miq_alert.rb:131:in `each'
/var/www/miq/vmdb/app/models/miq_alert.rb:131:in `block in evaluate_hourly_timer'
/var/www/miq/vmdb/app/models/miq_alert.rb:130:in `each'
/var/www/miq/vmdb/app/models/miq_alert.rb:130:in `evaluate_hourly_timer'
/var/www/miq/vmdb/app/models/miq_queue.rb:347:in `block in deliver'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:343:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:107:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:135:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:153:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:147:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:147:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:340:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:337:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:337:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:160:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:134:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:21:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:339:in `block in start_runner'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
...


This is all from 5.8.1, found in customer logs from bug 1489616. I have the DB dump restored locally, if anyone wants to debug further.
I didn't test on master.
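
For readers following the traceback, here is a minimal sketch of the failure mode, using plain Ruby stand-ins rather than the actual ManageIQ models. The `Zone` body, the `TARGETS_BY_BASE` mapping, and the `targets_for` helper are all hypothetical; the only thing taken from the log is that `evaluate_hourly_timer` ends up calling `public_send` with `container_nodes` on a `Zone` that does not define it.

```ruby
# Hypothetical stand-in for the Zone model; NOT the real ManageIQ class.
class Zone
  def vms
    [:vm1, :vm2]
  end
end

# Hypothetical mapping from an alert's base model to the method
# used to look up evaluation targets on a zone.
TARGETS_BY_BASE = {
  "Vm"            => :vms,
  "ContainerNode" => :container_nodes
}.freeze

# Looks up targets the way the traceback suggests: through public_send.
def targets_for(zone, base)
  zone.public_send(TARGETS_BY_BASE.fetch(base))
end

zone = Zone.new
vm_targets = targets_for(zone, "Vm")

# Zone has no container_nodes method, so this lookup raises NoMethodError.
begin
  targets_for(zone, "ContainerNode")
  node_error = nil
rescue NoMethodError => e
  node_error = e
end
```

In this sketch the Vm lookup succeeds while the ContainerNode lookup raises, which matches the shape of the error in the log: the alert definition is accepted, and the failure only shows up when the timer tries to resolve targets.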

Comment 4 Loic Avenel 2017-09-26 21:17:30 UTC
I was looking at different parameters:

Based On: Node
What to Evaluate: Nothing
Driving Event: Kubernetes Event..

I think the issue is not in the definition of the Alert but in the Alert Profiles: you cannot assign a Node Alert Profile to a Node or a Provider; these are not available for Node Alert Profiles.

We can try at the Provider level.

Comment 5 Beni Paskin-Cherniavsky 2017-09-27 07:29:07 UTC
Sorry, maybe I commented on the wrong BZ (or should I open a new one)?

What I'm talking about are "good old" MiqExpression-based, periodically evaluated alerts, not externally evaluated ones. We only worked on external alerts for container providers, but didn't make sure internally evaluated MiqExpression-based alerts work in the backend. AFAIK we didn't intend the UI to allow those; it's an accidental side effect of allowing externally evaluated Node alerts.
(Not sure about the profile UI, but I see in the customer DB that they did define MiqExpression Node alerts, and the log proves the evaluation tries to run and fails with an error.)

The error in the traceback sounds very easy to fix (add a few lines in zone.rb so it can find targets to evaluate).
Of course, that could reveal more work to do. (Specifically, I heard that if an expression looks at metrics, there is a complex mechanism to collect those metrics frequently, and that's something we wanted to neglect in the upcoming effort to scale metric collection?)

The other option is to forbid these in the UI for now.
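
To illustrate what "a few lines in zone.rb" could look like, here is a rough sketch in plain Ruby. It is an assumption, not the actual patch: the real `Zone` is an ActiveRecord model and would more likely use an association, and `ContainerProvider` and the `providers` accessor here are hypothetical stand-ins.

```ruby
# Plain-Ruby stand-ins; the real models are ActiveRecord classes and may differ.
ContainerProvider = Struct.new(:name, :container_nodes)

class Zone
  attr_reader :providers

  def initialize(providers)
    @providers = providers
  end

  # The "few lines" in question: expose the nodes of every provider in the
  # zone so an alert evaluator can ask the zone for its ContainerNode targets.
  def container_nodes
    providers.flat_map(&:container_nodes)
  end
end

zone = Zone.new([
  ContainerProvider.new("ocp-a", ["node-1", "node-2"]),
  ContainerProvider.new("ocp-b", ["node-3"])
])

# The same dynamic call that fails in the traceback now resolves.
nodes = zone.public_send(:container_nodes)
```

This only addresses target lookup. Whether the expression itself can be evaluated against a node (for example, when it depends on metrics) is a separate question, as noted above.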

Comment 11 Loic Avenel 2018-01-25 13:28:45 UTC
*** Bug 1438002 has been marked as a duplicate of this bug. ***

