Bug 1196853

Summary: AWS API availability Issues can cause log thrashing
Product: Red Hat CloudForms Management Engine Reporter: Greg Blomquist <gblomqui>
Component: ProvidersAssignee: Matthew Draper <mdraper>
Status: CLOSED DUPLICATE QA Contact: Dave Johnson <dajohnso>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.4.0CC: dclarizi, gblomqui, jfrey, jhardy, jvlcek, obarenbo
Target Milestone: GA   
Target Release: 5.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-08-19 15:01:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Greg Blomquist 2015-02-26 23:19:13 UTC
As reported in github issues: https://github.com/ManageIQ/manageiq/issues/1871

Except from github issue:


EC2 availability issues caused our MIQ server to flood the logs with error messages and then crash itself because MIQ kept getting AWS API errors.

AWS Status Issue

> Amazon Elastic Compute Cloud (N. Virginia)  Increased API Error Rates     less 
> 12:22 PM PST We are investigating increased API error rates in the US-EAST-1 Region.
> 12:53 PM PST We can confirm increased API error rates for EC2 APIs in the US-EAST-1 Region. Running instances are not impacted.
> 1:58 PM PST We continue to work toward resolving increased API error rates for EC2 APIs in the US-EAST-1 Region. Running instances are not impacted.
> 2:51 PM PST We are beginning to see recovery for EC2 API error rates in the US-EAST-1 Region.

EVM logs

> [----] E, [2015-02-24T20:47:30.038345 #20069:80beac] ERROR -- : /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture/amazon.rb:79:in `block (2 levels) in perf_capture_data_amazon'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture/amazon.rb:79:in `collect'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture/amazon.rb:79:in `block in perf_capture_data_amazon'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture/amazon.rb:78:in `each'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture/amazon.rb:78:in `perf_capture_data_amazon'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture/amazon.rb:13:in `perf_collect_metrics_amazon'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture.rb:11:in `perf_collect_metrics'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture.rb:161:in `block in perf_capture'
> /opt/manageiq/lib/util/extensions/miq-benchmark.rb:12:in `realtime_store'
> /opt/manageiq/lib/util/extensions/miq-benchmark.rb:31:in `realtime_block'
> /opt/manageiq/vmdb/app/models/metric/ci_mixin/capture.rb:158:in `perf_capture'
> /opt/manageiq/vmdb/app/models/miq_queue.rb:360:in `block in deliver'
> /home/miqbuilder/.rvm/rubies/ruby-2.0.0-p598/lib/ruby/2.0.0/timeout.rb:66:in `timeout'
> /opt/manageiq/vmdb/app/models/miq_queue.rb:356:in `deliver'
> /opt/manageiq/vmdb/lib/workers/queue_worker_base.rb:107:in `deliver_queue_message'
> /opt/manageiq/vmdb/lib/workers/queue_worker_base.rb:135:in `deliver_message'
> /opt/manageiq/vmdb/lib/workers/queue_worker_base.rb:152:in `block in do_work'
> /opt/manageiq/vmdb/lib/workers/queue_worker_base.rb:146:in `loop'
> /opt/manageiq/vmdb/lib/workers/queue_worker_base.rb:146:in `do_work'
> /opt/manageiq/vmdb/lib/workers/worker_base.rb:317:in `block in do_work_loop'
> /opt/manageiq/vmdb/lib/workers/worker_base.rb:314:in `loop'
> /opt/manageiq/vmdb/lib/workers/worker_base.rb:314:in `do_work_loop'
> /opt/manageiq/vmdb/lib/workers/worker_base.rb:135:in `run'
> /opt/manageiq/vmdb/lib/workers/worker_base.rb:123:in `start'
> /opt/manageiq/vmdb/lib/workers/worker_base.rb:23:in `start_worker'
> /opt/manageiq/vmdb/lib/workers/bin/worker.rb:5:in `<top (required)>'
> /home/miqbuilder/.rvm/gems/ruby-2.0.0-p598/bundler/gems/rails-8f014fba21f9/railties/lib/rails/commands/runner.rb:52:in `eval'
> /home/miqbuilder/.rvm/gems/ruby-2.0.0-p598/bundler/gems/rails-8f014fba21f9/railties/lib/rails/commands/runner.rb:52:in `<top (required)>'
> /home/miqbuilder/.rvm/gems/ruby-2.0.0-p598/bundler/gems/rails-8f014fba21f9/railties/lib/rails/commands.rb:64:in `require'
> /home/miqbuilder/.rvm/gems/ruby-2.0.0-p598/bundler/gems/rails-8f014fba21f9/railties/lib/rails/commands.rb:64:in `<top (required)>'
> script/rails:30:in `require'
> script/rails:30:in `<main>'
> [----] E, [2015-02-24T20:47:30.038694 #20069:80beac] ERROR -- : MIQ(MiqQueue.deliver)    Message id: [4135889], Error: [undefined method `keys' for nil:NilClass]
> [----] E, [2015-02-24T20:47:30.038828 #20069:80beac] ERROR -- : [NoMethodError]: undefined method `keys' for nil:NilClass  Method:[rescue in deliver]


AWS logs

> [----] W, [2015-02-24T20:48:50.925473 #20086:31b8994]  WARN -- : MIQ(AmazonEventMonitor#find_or_create_queue) Unable to find the AWS Config Topic. Cannot collect Amazon events for AWS Access Key ID MYAWSACCESSKEYWASHEREHAHA
> [----] W, [2015-02-24T20:48:50.925526 #20086:31b8994]  WARN -- : MIQ(AmazonEventMonitor#find_or_create_queue) Contact Amazon to create the AWS Config service and topic for Amazon events.
> ... Repeated endlessly

******************************************************

There are a couple issues here:

1) We should figure out what's going on with the "undefined method `keys' for nil:NilClass" during perf_capture_data_amazon

2) When the event catcher worker for Amazon fails to start because the event monitor is unable to reach the Config Service, we should avoid flooding the logs with the useless message that the service is not configured.  It could be logged once, and that should be enough.

Comment 1 Joe Vlcek 2015-08-19 15:01:08 UTC

*** This bug has been marked as a duplicate of bug 1250444 ***