Bug 1416918

Summary: CloudForms can no longer connect to Hawkular metrics in OpenShift
Product: Red Hat CloudForms Management Engine Reporter: myoder
Component: C&U Capacity and UtilizationAssignee: Ari Zellner <azellner>
Status: CLOSED CURRENTRELEASE QA Contact: Einat Pacifici <epacific>
Severity: high Docs Contact:
Priority: unspecified    
Version: 5.7.0CC: cpelland, epacific, erich, fsimonce, jhardy, mtayer, myoder, obarenbo, simaishi, snaim
Target Milestone: GAKeywords: TestOnly
Target Release: 5.7.2Flags: snaim: automate_bug+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: container
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-06-08 19:55:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: Bug
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: Container Management Target Upstream Version:
Embargoed:

Description myoder 2017-01-26 19:19:26 UTC
Description of problem:

Getting the following error repeatedly in the logs:

[----] E, [2017-01-26T03:46:05.981866 #17175:e11130] ERROR -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture#perf_collect_metrics) Hawkular metrics service unavailable: Failed to perform operation due to an error: All host(s) tried for query failed (no host was tried)

The evm.log has about 350k lines, and this Hawkular metrics service error appears about 140k times.  The evm.log contains about 8.5 hours of logs.


In the Web UI I am seeing "Credential validation was not successful: Failed to perform operation due to an error: All host(s) tried for query failed (no host was tried)



Also seeing two stack trace errors related to the refresh:

[----] E, [2017-01-26T03:46:07.961636 #12423:e11130] ERROR -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager::Refresher#refresh) EMS: [OpenShift Enterprise (Production)], id: [123000000000001] Refresh failed
[----] E, [2017-01-26T03:46:07.962865 #12423:e11130] ERROR -- : [NoMethodError]: undefined method `[]' for nil:NilClass  Method:[rescue in block in refresh]
[----] E, [2017-01-26T03:46:07.963082 #12423:e11130] ERROR -- : /var/www/miq/vmdb/app/models/manageiq/providers/openshift/container_manager/refresh_parser.rb:193:in `parse_openshift_image'
/var/www/miq/vmdb/app/models/manageiq/providers/openshift/container_manager/refresh_parser.rb:17:in `block in get_openshift_images'
/var/www/miq/vmdb/app/models/manageiq/providers/kubernetes/container_manager/refresh_parser.rb:136:in `process_collection_item'
/var/www/miq/vmdb/app/models/manageiq/providers/kubernetes/container_manager/refresh_parser.rb:130:in `block in process_collection'
/opt/rh/rh-ruby23/root/usr/share/ruby/delegate.rb:341:in `each'
/opt/rh/rh-ruby23/root/usr/share/ruby/delegate.rb:341:in `block in delegating_block'
/var/www/miq/vmdb/app/models/manageiq/providers/kubernetes/container_manager/refresh_parser.rb:130:in `process_collection'
/var/www/miq/vmdb/app/models/manageiq/providers/openshift/container_manager/refresh_parser.rb:17:in `get_openshift_images'
/var/www/miq/vmdb/app/models/manageiq/providers/openshift/container_manager/refresh_parser.rb:11:in `ems_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/kubernetes/container_manager/refresh_parser.rb:7:in `ems_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/openshift/container_manager/refresher.rb:24:in `parse_legacy_inventory'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:122:in `block in parse_targeted_inventory'
/var/www/miq/vmdb/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/var/www/miq/vmdb/gems/pending/util/extensions/miq-benchmark.rb:30:in `realtime_block'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:122:in `parse_targeted_inventory'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:87:in `block in refresh_targets_for_ems'
/var/www/miq/vmdb/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/var/www/miq/vmdb/gems/pending/util/extensions/miq-benchmark.rb:30:in `realtime_block'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:86:in `refresh_targets_for_ems'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:24:in `block (2 levels) in refresh'
/var/www/miq/vmdb/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/var/www/miq/vmdb/gems/pending/util/extensions/miq-benchmark.rb:30:in `realtime_block'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:24:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:14:in `each'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:14:in `refresh'
/var/www/miq/vmdb/app/models/manageiq/providers/base_manager/refresher.rb:10:in `refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:91:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:90:in `each'
/var/www/miq/vmdb/app/models/ems_refresh.rb:90:in `refresh'
/var/www/miq/vmdb/app/models/miq_queue.rb:347:in `block in deliver'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:343:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:106:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:152:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:334:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:331:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:331:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:128:in `start'


[----] E, [2017-01-26T03:46:07.963151 #12423:e11130] ERROR -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager::Refresher#refresh) EMS: [OpenShift Enterprise (Production)], id: [123000000000001] 
Unable to perform refresh for the following targets:
[----] E, [2017-01-26T03:46:07.963250 #12423:e11130] ERROR -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager::Refresher#refresh)  --- ManageIQ::Providers::OpenshiftEnterprise::ContainerManager 
[OpenShift Enterprise (Production)] id [123000000000001]
[----] E, [2017-01-26T03:46:07.983341 #12423:e11130] ERROR -- : MIQ(MiqQueue#deliver) Message id: [123000022269693], Error: [undefined method `[]' for nil:NilClass]
[----] E, [2017-01-26T03:46:07.983510 #12423:e11130] ERROR -- : [EmsRefresh::Refreshers::EmsRefresherMixin::PartialRefreshError]: undefined method `[]' for nil:NilClass  Method:[rescue in deliver]
[----] E, [2017-01-26T03:46:07.983655 #12423:e11130] ERROR -- : /var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:50:in `refresh'
/var/www/miq/vmdb/app/models/manageiq/providers/base_manager/refresher.rb:10:in `refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:91:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:90:in `each'
/var/www/miq/vmdb/app/models/ems_refresh.rb:90:in `refresh'
/var/www/miq/vmdb/app/models/miq_queue.rb:347:in `block in deliver'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:343:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:106:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:152:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:334:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:331:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:331:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:128:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:21:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:343:in `block in start'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:341:in `start'
/var/www/miq/vmdb/app/models/miq_worker.rb:270:in `start_worker'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:68:in `start_worker_for_ems'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:46:in `block in sync_workers'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:45:in `each'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:45:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:52:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:22:in `monitor_workers'



Version-Release number of selected component (if applicable):
CFME 5.7.0.17

How reproducible: 
Since 4am this morning


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 11 Pavel Zagalsky 2017-03-27 14:50:38 UTC
Verified on 5.7.2