Bug 1610449 - CloudForms tries to collect metrics from OCP despite not being configured for it
Summary: CloudForms tries to collect metrics from OCP despite not being configured for it
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.9.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.10.0
Assignee: Greg Blomquist
QA Contact: Matouš Mojžíš
URL:
Whiteboard:
Depends On:
Blocks: 1618805
Reported: 2018-07-31 15:52 UTC by David Luong
Modified: 2021-12-10 16:50 UTC (History)

Fixed In Version: 5.10.0.11
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1618805
Environment:
Last Closed: 2019-02-12 16:52:11 UTC
Category: Bug
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:
izapolsk: automate_bug+


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3548701 0 None None None 2018-07-31 17:00:35 UTC

Description David Luong 2018-07-31 15:52:54 UTC
Description of problem:
Podified CloudForms tries to collect metrics from OCP Providers, even if they have metrics disabled.

Version-Release number of selected component (if applicable):
5.9.3

How reproducible:
Always

Steps to Reproduce:
1.  Add OCP Provider
2.  Turn on C&U roles
3.  Wait until appliance performs metrics collection

Actual results:
CloudForms tries to collect metrics anyway, and the resulting errors fill the logs with up to 9 GB of data in one day.

Expected results:
CloudForms does not attempt to collect metrics from a container provider that is not configured for it.

Additional info:
[----] I, [2018-07-31T15:42:37.661824 #774:d1d10c]  INFO -- : MIQ(ManageIQ::Providers::Openshift::ContainerManager::MetricsCollectorWorker::Runner#get_message_via_drb) Message id: [259], MiqWorker id: [14], Zone: [default], Role: [ems_metrics_collector], Server: [], Ident: [openshift], Target id: [], Instance id: [10], Task id: [], Command: [ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode.perf_capture_realtime], Timeout: [600], Priority: [100], State: [dequeue], Deliver On: [], Data: [], Args: [2018-07-31 00:00:00 UTC], Dequeued in: [27.914560916] seconds
[----] I, [2018-07-31T15:42:37.662860 #774:d1d10c]  INFO -- : MIQ(MiqQueue#deliver) Message id: [259], Delivering...
[----] I, [2018-07-31T15:42:37.843357 #782:d1d10c]  INFO -- : MIQ(ManageIQ::Providers::Openshift::ContainerManager::MetricsCollectorWorker::Runner#get_message_via_drb) Message id: [260], MiqWorker id: [15], Zone: [default], Role: [ems_metrics_collector], Server: [], Ident: [openshift], Target id: [], Instance id: [9], Task id: [], Command: [ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode.perf_capture_realtime], Timeout: [600], Priority: [100], State: [dequeue], Deliver On: [], Data: [], Args: [2018-07-31 00:00:00 UTC], Dequeued in: [28.07026418] seconds
[----] I, [2018-07-31T15:42:37.844282 #782:d1d10c]  INFO -- : MIQ(MiqQueue#deliver) Message id: [260], Delivering...
[----] I, [2018-07-31T15:42:37.902635 #774:d1d10c]  INFO -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode#just_perf_capture) [realtime] Capture for ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode name: [node-5.cloudforms.lab.rdu2.cee.redhat.com], id: [10], start_time: [2018-07-31 00:00:00 UTC]...
[----] I, [2018-07-31T15:42:38.191383 #782:d1d10c]  INFO -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode#just_perf_capture) [realtime] Capture for ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode name: [node-4.cloudforms.lab.rdu2.cee.redhat.com], id: [9], start_time: [2018-07-31 00:00:00 UTC]...
[----] I, [2018-07-31T15:42:38.443311 #774:d1d10c]  INFO -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture#perf_collect_metrics) Collecting metrics for ContainerNode(10) [realtime] [2018-07-31 00:00:00 UTC] []
[----] W, [2018-07-31T15:42:38.550293 #774:d1d10c]  WARN -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture#perf_collect_metrics) [ContainerNode(10)] no metrics endpoint found for ContainerNode(10)
[----] I, [2018-07-31T15:42:38.608412 #774:d1d10c]  INFO -- : Exception in realtime_block :total_time - Timings: {:capture_state=>0.5117576122283936, :total_time=>0.7051336765289307}
[----] E, [2018-07-31T15:42:38.609509 #774:d1d10c] ERROR -- : MIQ(MiqQueue#deliver) Message id: [259], Error: [no metrics endpoint found for ContainerNode(10)]
[----] E, [2018-07-31T15:42:38.609970 #774:d1d10c] ERROR -- : [ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture::TargetValidationWarning]: no metrics endpoint found for ContainerNode(10)  Method:[block in method_missing]
[----] E, [2018-07-31T15:42:38.610163 #774:d1d10c] ERROR -- : /opt/rh/cfme-gemset/bundler/gems/cfme-providers-kubernetes-1748f9b993cf/app/models/manageiq/providers/kubernetes/container_manager/metrics_capture.rb:102:in `perf_collect_metrics'
/var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:6:in `perf_collect_metrics'
/var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:193:in `block in just_perf_capture'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-3457c5b58220/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-3457c5b58220/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:189:in `just_perf_capture'
/var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:135:in `perf_capture'
/var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:117:in `perf_capture_realtime'
/var/www/miq/vmdb/app/models/miq_queue.rb:449:in `block in dispatch_method'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:448:in `dispatch_method'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `block in deliver'
/var/www/miq/vmdb/app/models/user.rb:261:in `with_user_group'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:104:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:152:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:329:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:127:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:22:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:376:in `block in start_runner_via_fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:374:in `start_runner_via_fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:368:in `start_runner'
/var/www/miq/vmdb/app/models/miq_worker.rb:415:in `start'
/var/www/miq/vmdb/app/models/miq_worker.rb:266:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:153:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_worker.rb:153:in `times'
/var/www/miq/vmdb/app/models/miq_worker.rb:153:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:53:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:22:in `monitor_workers'
/var/www/miq/vmdb/app/models/miq_server.rb:338:in `block in monitor'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-3457c5b58220/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-3457c5b58220/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:338:in `monitor'
/var/www/miq/vmdb/app/models/miq_server.rb:377:in `block (2 levels) in monitor_loop'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-3457c5b58220/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-3457c5b58220/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:377:in `block in monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:376:in `loop'
/var/www/miq/vmdb/app/models/miq_server.rb:376:in `monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:239:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:27:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:48:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'
[----] I, [2018-07-31T15:42:38.611092 #774:d1d10c]  INFO -- : MIQ(MiqQueue#delivered) Message id: [259], State: [error], Delivered in [0.948315115] seconds
[----] I, [2018-07-31T15:42:38.650455 #774:d1d10c]  INFO -- : MIQ(ManageIQ::Providers::Openshift::ContainerManager::MetricsCollectorWorker::Runner#get_message_via_drb) Message id: [261], MiqWorker id: [14], Zone: [default], Role: [ems_metrics_collector], Server: [], Ident: [openshift], Target id: [], Instance id: [8], Task id: [], Command: [ManageIQ::Providers::Kubernetes::ContainerManager::ContainerNode.perf_capture_realtime], Timeout: [600], Priority: [100], State: [dequeue], Deliver On: [], Data: [], Args: [2018-07-31 00:00:00 UTC], Dequeued in: [28.859538532] seconds
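
To gauge how much of the log volume comes from these failures, the errors can be tallied per target. The following is only a rough sketch: the regular expression is an assumption derived from the `MiqQueue#deliver` error line in the excerpt above, and `count_endpoint_errors` is a hypothetical helper, not part of CloudForms.

```ruby
# Hypothetical helper: tally "no metrics endpoint found" errors per target.
# The pattern is an assumption based on the MiqQueue#deliver ERROR line
# shown in the log excerpt; other log formats will not match.
ENDPOINT_ERROR = /ERROR -- : MIQ\(MiqQueue#deliver\) Message id: \[(\d+)\], Error: \[no metrics endpoint found for (\w+\(\d+\))\]/

def count_endpoint_errors(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = ENDPOINT_ERROR.match(line)
    # Capture group 2 is the target, e.g. "ContainerNode(10)".
    counts[match[2]] += 1 if match
  end
  counts
end

# Two lines copied from the excerpt: one matching error, one unrelated INFO.
sample = [
  "[----] E, [2018-07-31T15:42:38.609509 #774:d1d10c] ERROR -- : MIQ(MiqQueue#deliver) Message id: [259], Error: [no metrics endpoint found for ContainerNode(10)]",
  "[----] I, [2018-07-31T15:42:38.611092 #774:d1d10c]  INFO -- : MIQ(MiqQueue#delivered) Message id: [259], State: [error], Delivered in [0.948315115] seconds"
]

p count_endpoint_errors(sample)
```

On an appliance, the same helper could be fed the lines of the relevant log file (for example via `File.foreach`), with the file location depending on the installation.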

Comment 2 Alex Mayberry 2018-07-31 20:59:52 UTC
For what it's worth, I'm also seeing this on an appliance version of CFME:
5.9.0.22.20180221205805_f93a675

Comment 3 David Luong 2018-07-31 22:00:17 UTC
Changing component from pod to appliance due to new information from comment 2 and customer case update.

Comment 4 Joe Rafaniello 2018-08-07 21:52:01 UTC
Bronagh, this looks to be related to the provider configuration and the enabling of C&U. Can you take a look? Thank you.

Comment 9 CFME Bot 2018-08-09 15:01:25 UTC
New commit detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/d78c0766e438d6d1d6157546fefe54cafc297699
commit d78c0766e438d6d1d6157546fefe54cafc297699
Author:     Adam Grare <agrare>
AuthorDate: Wed Aug  8 10:54:54 2018 -0400
Commit:     Adam Grare <agrare>
CommitDate: Wed Aug  8 10:54:54 2018 -0400

    Don't queue metrics capture if metrics unsupported

    If metrics capture is unsupported by the provider then do not queue
    perf_capture for targets on that EMS.

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1610449

 app/models/manageiq/providers/container_manager.rb | 5 +
 app/models/metric/targets.rb | 2 +
 2 files changed, 7 insertions(+)
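
The idea described in the commit message can be illustrated with a small standalone sketch. This is not the actual ManageIQ change: `Provider`, `Target`, `supports_metrics?`, and `capture_targets` are hypothetical stand-ins, and the assumption that a provider supports metrics only when a metrics endpoint is set is made purely for illustration.

```ruby
# Hypothetical stand-ins for illustration; these are NOT the ManageIQ classes.
Provider = Struct.new(:name, :metrics_endpoint) do
  # Assumption: metrics are supported only when an endpoint is configured.
  def supports_metrics?
    !metrics_endpoint.nil?
  end
end

Target = Struct.new(:name, :provider)

# Keep only targets whose provider can serve metrics, so that no capture
# work is queued for providers without a metrics endpoint.
def capture_targets(targets)
  targets.select { |target| target.provider.supports_metrics? }
end

with_metrics    = Provider.new("ocp-with-metrics", "metrics.example.com")
without_metrics = Provider.new("ocp-without-metrics", nil)

targets = [
  Target.new("node-1", with_metrics),
  Target.new("node-2", without_metrics)
]

queued = capture_targets(targets).map(&:name)
puts queued.inspect
```

Filtering before queuing, rather than failing inside the worker, means the unsupported targets never produce a queue message and therefore never produce the error and backtrace seen in the description.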

Comment 12 Matouš Mojžíš 2018-10-22 17:14:24 UTC
Verified in 5.10.0.20. With the C&U roles enabled, the logs contain no entries about collecting metrics from an OCP provider that has metrics disabled.

