Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1060772

Summary: nova qpid reconnection delay must be more accurate
Product: Red Hat OpenStack
Reporter: Fabio Massimo Di Nitto <fdinitto>
Component: openstack-nova
Assignee: Flavio Percoco <fpercoco>
Status: CLOSED ERRATA
QA Contact: Toure Dunnon <tdunnon>
Severity: urgent
Priority: urgent
Version: 4.0
CC: apevec, breeler, dallan, eharney, fpercoco, ndipanov, sclewis, sgordon, xqueralt, yeylon
Target Milestone: z4
Keywords: Rebase, Triaged, ZStream
Target Release: 4.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-nova-2013.2.3-1.el6ost
Doc Type: Rebase: Bug Fixes and Enhancements
Doc Text:
The Qpid driver's reconnection delay is not configurable. This, together with the high hard-coded delay, became a blocker from an HA perspective. Making the value configurable is not possible in this version, so the hard-coded delay was lowered to a value that is reasonable for HA. The new delay cap is 5 seconds.
Clone Of: 1060689
Last Closed: 2014-05-29 20:35:07 UTC
Type: Bug

Description Fabio Massimo Di Nitto 2014-02-03 14:53:07 UTC
+++ This bug was initially created as a clone of Bug #1060689 +++

The current loop is:

        delay = 1
        while True:
            # Close the session if necessary
            if self.connection.opened():
                try:
                    self.connection.close()
                except qpid_exceptions.ConnectionError:
                    pass

            broker = self.brokers[attempt % len(self.brokers)]
            attempt += 1

            try:
                self.connection_create(broker)
                self.connection.open()
            except qpid_exceptions.ConnectionError, e:
                msg_dict = dict(e=e, delay=delay)
                msg = _("Unable to connect to AMQP server: %(e)s. "
                        "Sleeping %(delay)s seconds") % msg_dict
                LOG.error(msg)
                time.sleep(delay)
                delay = min(2 * delay, 60)

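This can lead to a waiting time of over 60 seconds if the qpid server is not immediately available at startup: the delay doubles after each failed attempt (1, 2, 4, 8, 16, 32 seconds, 63 seconds in total after six failures) and is then capped at 60 seconds per retry.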
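
The arithmetic behind that waiting time can be sketched as follows. This is only an illustration: `backoff_delays` is a hypothetical helper, not part of nova, and the second call assumes the doubling rule is kept unchanged with the 5-second cap mentioned in the Doc Text, which this report does not confirm.

```python
def backoff_delays(attempts, cap, start=1):
    """Return the sleep time (in seconds) before each retry.

    Mirrors the rule in the loop above: sleep for `delay`, then
    set delay = min(2 * delay, cap). Hypothetical helper for
    illustration only.
    """
    delays = []
    delay = start
    for _ in range(attempts):
        delays.append(delay)
        delay = min(2 * delay, cap)
    return delays


# Current behaviour: doubling with a 60-second cap.
current = backoff_delays(8, cap=60)
print(current, sum(current))  # [1, 2, 4, 8, 16, 32, 60, 60] 183

# Assumption: same doubling rule, but capped at 5 seconds.
capped = backoff_delays(8, cap=5)
print(capped, sum(capped))    # [1, 2, 4, 5, 5, 5, 5, 5] 32
```

With the 60-second cap, eight consecutive failures add up to 183 seconds of sleeping. Under the assumed 5-second cap the same eight failures add up to 32 seconds, and no single wait exceeds 5 seconds.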

60 seconds is too long for an HA environment, where timers need to be very aggressive to keep downtime to a minimum.

This is a blocker for HA deployments.

Comment 9 errata-xmlrpc 2014-05-29 20:35:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0578.html