Bug 1429020 - An exception in a worker's sync_workers can cause the server process to exit with fatal error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: GA
Target Release: 5.8.0
Assignee: Joe Rafaniello
QA Contact: Tasos Papaioannou
URL:
Whiteboard: worker
Depends On:
Blocks: 1429648
Reported: 2017-03-03 20:48 UTC by Joe Rafaniello
Modified: 2020-09-10 10:16 UTC (History)

Fixed In Version: 5.8.0.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1429648
Environment:
Last Closed: 2017-06-12 16:23:24 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:



Description Joe Rafaniello 2017-03-03 20:48:06 UTC
Description of problem: If a worker class's sync_workers raises an exception, the server process exits with a fatal error and all workers exit with 'Error heartbeating to MiqServer because DRb::DRbConnError: Connection reset by peer Worker exiting.' The server process should not exit with a fatal error when a worker class fails to sync_workers.
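
The expected behavior described above can be illustrated with a minimal Ruby sketch. It is not the ManageIQ code or the merged fix: GoodWorker, BrokenWorker, and sync_all are hypothetical names made up for illustration. It only shows the general idea of rescuing per worker class so that one failing sync_workers is logged and skipped instead of ending the loop.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-ins for worker classes; these are not the ManageIQ models.
class GoodWorker
  def self.sync_workers
    :synced
  end
end

class BrokenWorker
  def self.sync_workers
    raise "Unexpected type"
  end
end

# Sync each worker class in turn. A StandardError raised by one class is
# logged and recorded as :failed, and the loop continues with the next class.
def sync_all(worker_classes, logger)
  worker_classes.each_with_object({}) do |klass, results|
    begin
      results[klass.name] = klass.sync_workers
    rescue => err
      logger.error("Failed to sync_workers for class: #{klass.name}")
      logger.error("[#{err.class}]: #{err.message}")
      results[klass.name] = :failed
    end
  end
end

# Usage: the broken class comes first, yet the good class is still synced.
log_output = StringIO.new
results = sync_all([BrokenWorker, GoodWorker], Logger.new(log_output))
puts results.inspect
```

Without the begin/rescue inside the loop, the first exception would propagate out of sync_all to its caller, which is the kind of failure this bug describes.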


Comment 2 Joe Rafaniello 2017-03-03 20:49:08 UTC
Merged upstream in: https://github.com/ManageIQ/manageiq/pull/13976

Comment 3 Joe Rafaniello 2017-03-03 20:49:27 UTC
Related to https://github.com/ManageIQ/manageiq/issues/13958

Comment 5 Tasos Papaioannou 2017-06-02 16:37:28 UTC
Verified on 5.8.0.17.

Forced an exception by editing /var/www/miq/vmdb/app/models/miq_reporting_worker.rb and changing self.required_roles to an invalid type, e.g.,

  # self.required_roles       = ["reporting"]
  self.required_roles = 0

/var/www/miq/vmdb/log/evm.log shows the error and backtrace:

****
[----] E, [2017-06-02T12:14:34.917810 #2596:12c1130] ERROR -- : MIQ(MiqServer#sync_workers) Failed to sync_workers for class: MiqReportingWorker
[----] E, [2017-06-02T12:14:34.918088 #2596:12c1130] ERROR -- : [RuntimeError]: Unexpected type: <self.required_roles.class.name>  Method:[rescue in block in sync_workers]
[----] E, [2017-06-02T12:14:34.918241 #2596:12c1130] ERROR -- : /var/www/miq/vmdb/app/models/miq_worker.rb:133:in `has_required_role?'
/var/www/miq/vmdb/app/models/miq_worker.rb:144:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:53:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:22:in `monitor_workers'
/var/www/miq/vmdb/app/models/miq_server.rb:348:in `block in monitor'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-e0f3ea8755bf/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-e0f3ea8755bf/lib/gems/pending/util/extensions/miq-benchmark.rb:30:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:348:in `monitor'
/var/www/miq/vmdb/app/models/miq_server.rb:370:in `block (2 levels) in monitor_loop'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-e0f3ea8755bf/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-e0f3ea8755bf/lib/gems/pending/util/extensions/miq-benchmark.rb:30:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:370:in `block in monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:369:in `loop'
/var/www/miq/vmdb/app/models/miq_server.rb:369:in `monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:252:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:65:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:91:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'
****

On version 5.7.0.17, without the fix, the exception is not caught, and the EVM server process dies.
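
For readers wondering why setting required_roles to 0 produces an "Unexpected type" RuntimeError, here is a rough sketch of the kind of type check that could raise it. The method name required_roles_satisfied? and its logic are assumptions made for illustration. The actual has_required_role? in miq_worker.rb is not shown in this bug and may differ.

```ruby
# Hypothetical role check, loosely modeled on the error message in the log
# above. It accepts a String or an Array of role names and raises a
# RuntimeError for any other type.
def required_roles_satisfied?(required_roles, active_roles)
  case required_roles
  when String
    active_roles.include?(required_roles)
  when Array
    required_roles.empty? || required_roles.any? { |role| active_roles.include?(role) }
  else
    raise "Unexpected type: #{required_roles.class.name}"
  end
end

# Usage: a valid Array works, while an Integer such as 0 raises.
puts required_roles_satisfied?(["reporting"], ["reporting", "scheduler"])
begin
  required_roles_satisfied?(0, ["reporting"])
rescue RuntimeError => err
  puts "#{err.class}: #{err.message}"
end
```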

