Bug 1247200 - OpenStack Event Catcher doesn't reconnect if RabbitMQ server restarted
Status: CLOSED DUPLICATE of bug 1222005
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: GA
Target Release: 5.6.0
Assigned To: Greg Blomquist
QA Contact: Pete Savage
Whiteboard: openstack:event
Depends On:
Blocks: 1291721
Reported: 2015-07-27 10:37 EDT by Pete Savage
Modified: 2016-02-19 15:51 EST
CC List: 7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1291721
Environment:
Last Closed: 2016-02-19 15:51:14 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Pete Savage 2015-07-27 10:37:35 EDT
Description of problem: If the RabbitMQ server is restarted while the event catcher is running, the event catcher loops continually and never reconnects until either the worker or the service is restarted.


Version-Release number of selected component (if applicable): 5.4.1.0


How reproducible: 100%


Steps to Reproduce:
1. Add an OpenStack provider with RabbitMQ as the message queue
2. Restart the RabbitMQ server
3.

Actual results: Event Catcher doesn't catch any more events


Expected results: Event Catcher should handle the rabbit restart


Additional info:
[----] E, [2015-07-27T10:36:43.033493 #3261:3298b0c] ERROR -- : MIQ(EventCatcherOpenstack) EMS [xx.xx.xx.xx] as [admin] Event Monitor Thread aborted because [Connection reset by peer]
[----] E, [2015-07-27T10:36:43.033693 #3261:3298b0c] ERROR -- : [Errno::ECONNRESET]: Connection reset by peer  Method:[rescue in block in start_event_monitor]
[----] E, [2015-07-27T10:36:43.033815 #3261:3298b0c] ERROR -- : /opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/cruby/socket.rb:41:in `read_nonblock'
/opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/cruby/socket.rb:41:in `block in read_fully'
/opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/cruby/socket.rb:40:in `loop'
/opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/cruby/socket.rb:40:in `read_fully'
/opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/transport.rb:196:in `read_next_frame'
/opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/session.rb:876:in `init_connection'
/opt/rh/cfme-gemset/gems/bunny-1.0.7/lib/bunny/session.rb:247:in `start'
/var/www/miq/lib/openstack/amqp/openstack_rabbit_event_monitor.rb:58:in `start'
/var/www/miq/vmdb/lib/workers/mixins/event_catcher_openstack_mixin.rb:41:in `monitor_events'
/var/www/miq/vmdb/lib/workers/event_catcher.rb:99:in `block in start_event_monitor'
/var/www/miq/vmdb/lib/extensions/ar_thread.rb:11:in `block in new_with_release'
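
For illustration only, the expected behavior (surviving a broker restart) can be sketched as a bounded retry loop around the connection attempt. This is a minimal sketch and not the ManageIQ code: the helper name connect_with_retry, its parameters, and the injected sleeper are all hypothetical, and the block stands in for whatever actually opens the AMQP connection.

```ruby
# Hypothetical helper, not part of ManageIQ or bunny.
# Runs the given block, retrying on connection errors with exponential backoff.
# The sleeper is injectable so the delays can be observed without waiting.
def connect_with_retry(max_attempts:, base_delay:, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue Errno::ECONNRESET, Errno::ECONNREFUSED
    # Give up and re-raise the original error once the attempt budget is spent.
    raise if attempt >= max_attempts
    # Wait base_delay, then 2x, 4x, ... before trying again.
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end
```

A caller might wrap the monitor's connection step in such a helper so that a transient "Connection reset by peer" is retried rather than aborting the monitor thread outright.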
Comment 2 Greg Blomquist 2015-07-27 16:45:41 EDT
https://github.com/ManageIQ/manageiq/pull/3616
Comment 3 CFME Bot 2015-07-30 10:30:01 EDT
New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/81cd635b10e46765c613fb31426c18ae1f1678db

commit 81cd635b10e46765c613fb31426c18ae1f1678db
Author:     Greg Blomquist <gblomqui@redhat.com>
AuthorDate: Mon Jul 27 16:20:01 2015 -0400
Commit:     Greg Blomquist <gblomqui@redhat.com>
CommitDate: Wed Jul 29 18:20:06 2015 -0400

    Change caching for OpenstackEventMonitor
    
    OpenstackEventMonitor implementation classes were cached to be sure that we
    didn't do expensive connection tests each time events were gathered for an
    openstack provider.
    
    However, the cache was a permanent cache and was only cleared when the appliance
    was restarted.  The new cache will invalidate every 5 minutes.
    
    This cache invalidation will allow the OpenstackEventMonitor to recover from
    communication failures with the AMQP service.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1247200

 gems/pending/openstack/openstack_event_monitor.rb | 47 +++++++++++------------
 1 file changed, 22 insertions(+), 25 deletions(-)
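
As a rough illustration of the time-based invalidation described in this commit message, the sketch below caches a value and recomputes it once it is older than a time-to-live. It is not the code from the commit: the class name ExpiringCache, the fetch method, and the injectable clock are assumptions made for the example; only the 5 minute interval comes from the commit message.

```ruby
# Hypothetical sketch of a time-limited cache; not the ManageIQ implementation.
class ExpiringCache
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock   # injectable so expiry can be exercised without sleeping
    @entries = {}    # key => [value, stored_at]
  end

  # Returns the cached value for key, recomputing it via the block
  # when the entry is missing or at least ttl_seconds old.
  def fetch(key)
    value, stored_at = @entries[key]
    now = @clock.call
    if stored_at.nil? || now - stored_at >= @ttl
      value = yield
      @entries[key] = [value, now]
    end
    value
  end
end
```

With a TTL of 300 seconds, a monitor class chosen while the AMQP service was unreachable would be re-evaluated on the first fetch after 5 minutes, instead of persisting until the appliance restarts.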
Comment 4 CFME Bot 2015-07-30 10:30:04 EDT
New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/728aa7b01221d4ef3426f1bf751eb1cade0471d2

commit 728aa7b01221d4ef3426f1bf751eb1cade0471d2
Author:     Greg Blomquist <gblomqui@redhat.com>
AuthorDate: Mon Jul 27 16:22:34 2015 -0400
Commit:     Greg Blomquist <gblomqui@redhat.com>
CommitDate: Wed Jul 29 18:20:07 2015 -0400

    Implement OpenstackNullEventMonitor methods
    
    Originally, OpenstackNullEventMonitor raised NotImplementedErrors when the
    standard start, stop, and each_batch methods were called.  It turns out that
    this was killing the OpenstackEventCatcher worker thread.  In turn, this
    resulted in tons of messages in the logs showing the event catcher dying and
    restarting.
    
    By changing these methods to be implemented and empty, it will allow the event
    catcher thread to do nothing when the event monitor is the
    OpenstackNullEventMonitor.
    
    This coupled with better cache invalidation will allow the OpenstackEventMonitor
    to recover from communication failures with the AMQP service.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1247200

 gems/pending/openstack/amqp/openstack_null_event_monitor.rb | 13 +++++--------
 lib/workers/mixins/event_catcher_openstack_mixin.rb         |  6 ++++--
 2 files changed, 9 insertions(+), 10 deletions(-)
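
The change described in this commit message follows the null-object pattern: the fallback monitor responds to the same methods as a real one but does nothing. The sketch below illustrates the idea under assumed names (NullEventMonitor, RaisingEventMonitor, and run_once are hypothetical and are not the ManageIQ classes); only the method names start, stop, and each_batch come from the commit message.

```ruby
# Hypothetical names; a sketch of the null-object idea, not the ManageIQ classes.
class NullEventMonitor
  def start; end

  def stop; end

  # Never yields: a null monitor has no events to deliver.
  def each_batch
    return enum_for(:each_batch) unless block_given?
  end
end

# The earlier behavior, for contrast: calling a method raises.
class RaisingEventMonitor
  def start
    raise NotImplementedError, "start is not implemented"
  end
end

# Runs one monitoring pass and returns how many batches were seen.
def run_once(monitor)
  batches = 0
  monitor.start
  monitor.each_batch { |_events| batches += 1 }
  monitor.stop
  batches
end
```

Passing a NullEventMonitor to run_once completes quietly with zero batches, while the raising variant aborts the caller. Note that in Ruby NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue clause does not catch it.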
Comment 6 John Prause 2016-02-19 15:51:14 EST

*** This bug has been marked as a duplicate of bug 1222005 ***
