1312912 – Option "rabbit_retry_backoff" from component config has no effect when autoretrying connection

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	python-oslo-messaging
Sub Component:
Version:	8.0 (Liberty)
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	ga
Target Release:	8.0 (Liberty)
Assignee:	Flavio Percoco
QA Contact:	Leonid Natapov
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1313522

Reported:	2016-02-29 14:03 UTC by Marian Krcmarik
Modified:	2016-04-07 21:31 UTC
CC List:	6 users
Fixed In Version:	python-oslo-messaging-2.5.0-5.el7ost
Doc Type:	Bug Fix
Doc Text:	When the RabbitMQ service fails to deliver an AMQP message from one OpenStack service to another, it reconnects and retries delivery. The "rabbit_retry_backoff" option, whose default is 2 seconds, is supposed to control the pace of retries; however, retries were previously done every second irrespective of the configured value of this option. The consequence of this problem was excessive retries, for example, when an endpoint was not available. This problem has now been fixed, and the "rabbit_retry_backoff" option, as explicitly configured or with the default value of two seconds, properly controls message delivery retries.
Clone Of:
Clones:	1313522
Environment:
Last Closed:	2016-04-07 21:31:44 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2016:0603	0	normal	SHIPPED_LIVE	Red Hat OpenStack Platform 8 Enhancement Advisory	2016-04-08 00:53:53 UTC

Description Marian Krcmarik 2016-02-29 14:03:24 UTC

Description of problem:
Code in _drivers/impl_rabbit.py causes the autoretry reconnection to be triggered every second, regardless of the value of the rabbit_retry_backoff option (default: 2).

Specifically:
    autoretry_method = self.connection.autoretry(
        execute_method, channel=self.channel,
        max_retries=retry,
        errback=on_error,
        interval_start=self.interval_start or 1,
        interval_step=self.interval_stepping,
        on_revive=on_reconnection,
    )
The call does not pass interval_max=self.interval_max (self.interval_max is hardcoded to 30).
The method kombu.connection.autoretry() calls kombu.connection.ensure(), which sets interval_max to 1 by default, so backing off cannot have any effect: the interval between retries can never exceed 1 second.
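
To illustrate why a cap of 1 flattens the backoff, here is a minimal sketch in plain Python. It is a simplified model of a capped linear backoff, not kombu's actual implementation; the helper names retry_intervals and first_intervals are hypothetical and exist only for this illustration.

```python
from itertools import islice


def retry_intervals(interval_start, interval_step, interval_max):
    """Yield successive wait times that grow by interval_step, capped at interval_max.

    Simplified model for illustration only; kombu's real schedule may differ in detail.
    """
    current = interval_start
    while True:
        yield min(current, interval_max)
        current += interval_step


def first_intervals(interval_start, interval_step, interval_max, count=5):
    """Return the first `count` wait times as a list."""
    schedule = retry_intervals(interval_start, interval_step, interval_max)
    return list(islice(schedule, count))


# Cap left at 1 (the default this report attributes to ensure()):
# every wait is clamped to 1 second, so the step of 2 never shows.
buggy = first_intervals(interval_start=1, interval_step=2, interval_max=1)

# Cap of 30, as the report expects: the wait grows by 2 seconds per attempt.
expected = first_intervals(interval_start=1, interval_step=2, interval_max=30)

print(buggy)     # [1, 1, 1, 1, 1]
print(expected)  # [1, 3, 5, 7, 9]
```

Under this model, passing interval_max=self.interval_max to autoretry() would be enough for the configured step to take effect.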

It would also be nice to be able to set interval_max in the component config, rather than having it hardcoded to 30.

Version-Release number of selected component (if applicable):
python-oslo-messaging-2.5.0-4.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set rabbit_retry_backoff to 2 or more
2. Take down the RabbitMQ server

Actual results:
Reconnection is attempted every second, instead of the interval being backed off by 2+ seconds on each try, up to 30 seconds.

Expected results:


Additional info:

Comment 2 Victor Stinner 2016-03-17 14:10:11 UTC

python-oslo-messaging-2.5.0-5.el7ost is ready for tests.

Comment 5 errata-xmlrpc 2016-04-07 21:31:44 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html
