Bug 1102910 - TTL never set on messages, causes messages to live forever
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 4.0
Hardware: x86_64 Linux
Priority: high
Severity: urgent
Target Milestone: z5
Target Release: 4.0
Assigned To: Ihar Hrachyshka
QA Contact: Nir Magnezi
Keywords: ZStream
Depends On:
Blocks: 1081488 1147618
Reported: 2014-05-29 15:36 EDT by Mark Wagner
Modified: 2016-04-26 23:35 EDT

See Also:
Fixed In Version: openstack-neutron-2013.2.4-4.el6ost
Doc Type: Bug Fix
Doc Text:
In the previous version, a Qpid OpenStack Networking (neutron) client created a new queue instead of reusing the old one. This meant that old Qpid queues could be left abandoned, using precious broker resources, and that messages piled up in the queue were never consumed. With this update, the Qpid queue name is reused on reconnect. This ensures that old Qpid queues are no longer abandoned by OpenStack AMQP clients, and all existing messages are consumed.
Story Points: ---
Clone Of:
Clones: 1147618
Environment:
Last Closed: 2014-10-22 13:23:10 EDT
Type: Bug


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 105494 | None | None | None | Never
OpenStack gerrit | 124597 | None | None | None | Never
Red Hat Product Errata | RHSA-2014:1686 | normal | SHIPPED_LIVE | Moderate: openstack-neutron security and bug fix update | 2014-10-22 17:21:18 EDT

Description Mark Wagner 2014-05-29 15:36:26 EDT
Description of problem:

The core neutron routines never set a timeout, and thus never a TTL (Time To Live), on any of the qpid messages. Meanwhile, the waiters use rpc_response_timeout to determine when to give up on a response. As a result, messages that no waiter is still expecting can build up in the queues. This causes additional, wasted processing overhead and contributes to a longer backlog.
 
Version-Release number of selected component (if applicable):
openstack-neutron.noarch 2013.2.3-6.el6ost

How reproducible:
Every time; this is a design flaw.

Steps to Reproduce:
1. Use the product.

Actual results:
Messages never get a TTL, so big backlogs build up when things go awry.

Expected results:
Messages that no one is waiting for any longer should get deleted.

Additional info:

A simple two-line change in /usr/lib/python2.6/site-packages/neutron/openstack/common/rpc/impl_qpid.py

At the top of topic_send():
        if timeout is None:
            timeout = self.conf.rpc_response_timeout
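
For context, here is a minimal, self-contained sketch of what that default does. The Conf and Connection classes below are hypothetical stand-ins, not the real impl_qpid.py classes, and the 60-second value is assumed for illustration only.

```python
class Conf(object):
    """Hypothetical stand-in for the service configuration object."""
    rpc_response_timeout = 60  # seconds; value assumed for illustration


class Connection(object):
    """Hypothetical stand-in for the connection class in impl_qpid.py."""

    def __init__(self, conf):
        self.conf = conf

    def topic_send(self, topic, msg, timeout=None):
        # Proposed change: fall back to the RPC response timeout so that
        # every message is sent with a TTL.
        if timeout is None:
            timeout = self.conf.rpc_response_timeout
        # The real method would publish the message; this sketch only
        # returns the TTL that would be attached to it.
        return {"topic": topic, "msg": msg, "ttl": timeout}


conn = Connection(Conf())
default_send = conn.topic_send("network", {"method": "ping"})
explicit_send = conn.topic_send("network", {"method": "ping"}, timeout=5)
```

A caller that passes an explicit timeout keeps it; only calls without one pick up the configured default.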
Comment 1 Mark Wagner 2014-05-29 15:38:44 EDT
This should be a RHEL-OSP Havana-A5 target.
Comment 2 Mark Wagner 2014-05-29 15:46:29 EDT
Also note that this proposed change is not the ideal solution. It does not correctly factor in the overall TTL. In fact, it uses two different TTLs: one for the message from the client to the server, and another for the response from the server to the client.

The ideal solution will factor in the entire elapsed time and use a single TTL.

For example, the client sends a request with a TTL of 60 (sec). The server pulls the request off the queue, processes the request, and prepares the response. At the end, the server needs to determine how much time is left on the original TTL and use that value as the TTL of the response before putting it into the queue. This is because the client will time out when its original TTL expires.
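
The arithmetic could be sketched roughly as follows. The function name and parameters are hypothetical, and the sketch assumes the server knows when the request was sent (for example from a timestamp carried in the message) and that client and server clocks are close enough to compare.

```python
def remaining_ttl(original_ttl, sent_at, now):
    """Return the seconds left on the request's TTL, never below zero."""
    elapsed = now - sent_at
    return max(0.0, original_ttl - elapsed)


# Request sent at t=100.0 with a 60-second TTL; server finishes at t=112.5.
reply_ttl = remaining_ttl(60, sent_at=100.0, now=112.5)

# If processing overruns the TTL, the reply is already stale.
expired_ttl = remaining_ttl(60, sent_at=100.0, now=175.0)
```

A remaining TTL of zero means the client has already given up, so the response does not need to be queued at all.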
Comment 4 Ihar Hrachyshka 2014-09-26 12:03:13 EDT
I've talked to the oslo.messaging cores; they say setting a TTL for reply messages is not needed. Instead, we should make sure queues are auto-deleted (bug 1099657) and clients reuse the same queue on reconnect (see the launchpad bug in the external tracker list).
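
To illustrate the queue reuse idea only: the class below is a hypothetical sketch, not the oslo.messaging or neutron code, and it does not talk to a broker. It shows the name being chosen once per client and kept across reconnects, instead of being regenerated each time.

```python
import uuid


class ReplyQueueClient(object):
    """Hypothetical client that keeps one reply queue name for its lifetime."""

    def __init__(self):
        # Generate the queue name once, when the client is created.
        self.queue_name = "reply_" + uuid.uuid4().hex
        self.connect_count = 0

    def reconnect(self):
        # A real client would re-establish the broker session here.
        # The important part is that the queue name is NOT regenerated,
        # so messages already sitting in the old queue are still consumed.
        self.connect_count += 1
        return self.queue_name


client = ReplyQueueClient()
first_name = client.reconnect()
second_name = client.reconnect()
```

Generating a fresh name inside reconnect() would reproduce the abandoned-queue behaviour described in the Doc Text.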
Comment 5 Ihar Hrachyshka 2014-09-29 09:22:56 EDT
Moving to A6 as per Livnat's request. The reasoning is that the bug is internal and is not something any known customer is waiting for.
Comment 7 Nir Magnezi 2014-10-08 04:15:17 EDT
Verified NVR: openstack-neutron-2013.2.4-4.el6ost.noarch

Verified that the fix was incorporated in the package.
Comment 9 errata-xmlrpc 2014-10-22 13:23:10 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2014-1686.html
