Summary: | AMQP connection doesn't recover after rabbitmq-server restart | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat CloudForms Management Engine | Reporter: | Marius Cornea <mcornea> | ||||
Component: | Providers | Assignee: | Ladislav Smola <lsmola> | ||||
Status: | CLOSED ERRATA | QA Contact: | Ola Pavlenko <opavlenk> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 5.4.0 | CC: | brant.evans, clasohm, cpelland, gblomqui, gfa, jfrey, jhardy, jocarter, jprause, mfeifer, mfuruta, obarenbo, psavage, vstinner | ||||
Target Milestone: | GA | Keywords: | ZStream | ||||
Target Release: | 5.6.0 | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | retest:openstack:event | ||||||
Fixed In Version: | 5.6.0.0 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | |||||||
: | 1310245 1310248 (view as bug list) | Environment: | |||||
Last Closed: | 2016-06-29 14:55:22 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Bug Depends On: | |||||||
Bug Blocks: | 1291721, 1310245, 1310248 | ||||||
Attachments: |
|
I forgot to mention the rabbitmq-server version I tested against:
rabbitmq-server-3.3.5-3.el7ost.noarch
rabbitmq-server-3.3.5-4.el7.noarch

@Ladas, can you work with @Marius to reproduce this? John Eckersberg found https://github.com/ruby-amqp/bunny/pull/251, which was released in Bunny 1.5.0+. So, we might just want to bump the version of Bunny to see if that fixes this problem. According to John, the libraries should be handling the reconnect for us. At least, that's his experience on the Python side.

@Greg but the gem update can't be backported, right? So should we target this to 5.6?

@ladas, I'll check with Jason and John Prause regarding the backport of the gem. In the meantime, work with Marius to see if bumping the gem has any impact.

*** Bug 1247200 has been marked as a duplicate of this bug. ***

*** Bug 1291721 has been marked as a duplicate of this bug. ***

https://github.com/ManageIQ/manageiq/pull/6857

It seems the Bunny update fixes this. After systemctl restart rabbitmq-server, I don't see the frame-related error and ManageIQ continues to receive new events.

We need to use Bunny 2.1.0 because of https://github.com/ruby-amqp/bunny/issues/383

New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/0b3328db34a5cca44124a5496fac1c092e407eda

commit 0b3328db34a5cca44124a5496fac1c092e407eda
Author: Ladislav Smola <lsmola>
AuthorDate: Mon Feb 22 15:43:07 2016 +0100
Commit: Ladislav Smola <lsmola>
CommitDate: Tue Mar 1 15:01:49 2016 +0100

    Update Bunny gem

    Update Bunny gem, the old gem couldn't handle reconnect when amqp service got restarted. Seems like new bunny works the same, so no changes in using it are needed.

    Fixes BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1222005

 gems/pending/Gemfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=b1ab1307cb92e0ccff433ac4a394e8174e3f574c

commit b1ab1307cb92e0ccff433ac4a394e8174e3f574c
Author: Ladislav Smola <lsmola>
AuthorDate: Mon Feb 22 15:43:07 2016 +0100
Commit: Ladislav Smola <lsmola>
CommitDate: Mon Mar 7 13:02:02 2016 +0100

    Update Bunny gem

    Update Bunny gem, the old gem couldn't handle reconnect when amqp service got restarted. Seems like new bunny works the same, so no changes in using it are needed.

    Fixes BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1222005

 gems/pending/Gemfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=564108ee801d389b433777936eb9824b1dad81c2

commit 564108ee801d389b433777936eb9824b1dad81c2
Merge: 934d6fa b1ab130
Author: Greg Blomquist <gblomqui>
AuthorDate: Fri Mar 11 09:41:15 2016 -0500
Commit: Greg Blomquist <gblomqui>
CommitDate: Fri Mar 11 09:41:15 2016 -0500

    Merge branch 'bz1310245' into '5.5.z'

    Update Bunny gem

    Update Bunny gem, the old gem couldn't handle reconnect when amqp service got restarted. Seems like new bunny works the same, so no changes in using it are needed.

    Fixes BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1222005

    Clean cherry-pick of: https://github.com/ManageIQ/manageiq/pull/6857
    Fixes 5.5.z BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1310245

    See merge request !839

 gems/pending/Gemfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Verified on 5.6.0.8-rc1-nightly: the errors did not appear, and the AMQP connection could be re-established.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2016:1348

> External Bug ID: OpenStack gerrit 436958
Sorry, I commented the wrong bugzilla issue :-/
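
For anyone checking an appliance against the fix discussed in this thread, the following is a minimal sketch of a version check, assuming the ">= 2.1.0" floor mentioned in the comments. The exact constraint written to gems/pending/Gemfile is not shown in this report, so `MIN_BUNNY` and the helper `bunny_recent_enough?` are hypothetical names used only for illustration.

```ruby
require "rubygems"

# Assumed minimum, based on the comment that Bunny 2.1.0 is needed.
# The real Gemfile constraint is not quoted in this bug report.
MIN_BUNNY = Gem::Requirement.new(">= 2.1.0")

# Hypothetical helper: true when the version string meets the assumed minimum.
def bunny_recent_enough?(version_string)
  MIN_BUNNY.satisfied_by?(Gem::Version.new(version_string))
end

# The two versions named in the thread.
%w[1.5.0 2.1.0].each do |v|
  puts "bunny #{v}: #{bunny_recent_enough?(v) ? 'ok' : 'too old'}"
end
```

The version string to test could be taken from the output of `gem list bunny` on the appliance.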
|
Created attachment 1025839 [details]
logs on server and cfme

Description of problem:
I've hit this with an OpenStack Infra provider when restarting rabbitmq-server on the undercloud. The AMQP connection doesn't seem to recover after the rabbitmq-server restart.

Version-Release number of selected component (if applicable):
5.4.0.1.20150512111354_4368716

How reproducible:
Add a new OpenStack Infrastructure provider.

Steps to Reproduce:
1. systemctl restart rabbitmq-server on the undercloud node
2. Check evm.log

Actual results:
AMQP connection results in {frame_too_large,1342177289,131064} error.

Expected results:
AMQP connection is re-established after server restart.

Additional info:
I am attaching all relevant logs.
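
A note on the numbers in the frame_too_large error, offered as one possible reading rather than something confirmed in this report. An AMQP 0-9-1 frame starts with a 1-byte type, a 2-byte channel and a 4-byte payload size. If the broker reads the 8-byte protocol header ("AMQP" followed by 0, 0, 9, 1) as if it were a frame, the size field comes out as exactly 1342177289. That would be consistent with the client sending the protocol header again on a connection the broker still treats as open. The second number, 131064, appears to be the frame size limit. The sketch below only checks the arithmetic:

```ruby
# The 8-byte AMQP 0-9-1 protocol header a client sends when opening a connection.
PROTOCOL_HEADER = "AMQP\x00\x00\x09\x01".b

# Read its first 7 bytes using the general frame layout:
# type (1 byte), channel (2 bytes, big-endian), payload size (4 bytes, big-endian).
frame_type, channel, size = PROTOCOL_HEADER.unpack("CnN")

# Pull the two numbers out of the error tuple quoted in this report.
error = "{frame_too_large,1342177289,131064}"
reported_size, frame_max = error.scan(/\d+/).map(&:to_i)

puts format("type=%d channel=%d size=%d", frame_type, channel, size)
puts "size matches reported value: #{size == reported_size}"
puts "size exceeds limit #{frame_max}: #{size > frame_max}"
```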