Description of problem:
The core neutron routines never set a timeout, and thus never set a TTL (Time To Live), on any of the qpid messages. Meanwhile, the waiters use rpc_response_timeout to determine when to give up on a response. As a result, messages that no one is waiting for anymore can build up in the queues. This causes additional, wasted processing overhead and contributes to a longer backlog.

Version-Release number of selected component (if applicable):
openstack-neutron.noarch 2013.2.3-6.el6ost

How reproducible:
Design flaw, so every time.

Steps to Reproduce:
1. Use the product.

Actual results:
Messages never get a TTL, and big backlogs build up when things go awry.

Expected results:
Messages that are no longer being waited for should get deleted.

Additional info:
Simple two-liner in /usr/lib/python2.6/site-packages/neutron/openstack/common/rpc/impl_qpid.py, at the top of topic_send():

    if timeout is None:
        timeout = self.conf.rpc_response_timeout
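To illustrate the effect of the proposed two-liner outside of neutron, here is a minimal sketch. The classes SketchConf and SketchConnection are hypothetical stand-ins, not the real impl_qpid classes, and the sketch only records the timeout that would be used as the TTL instead of talking to qpid.

```python
class SketchConf(object):
    """Hypothetical stand-in for the config object; only the one option used here."""
    rpc_response_timeout = 60  # seconds, assumed value for illustration


class SketchConnection(object):
    """Hypothetical stand-in for the qpid connection class (not the real one)."""

    def __init__(self, conf):
        self.conf = conf
        self.sent = []  # records (topic, msg, ttl) instead of publishing

    def topic_send(self, topic, msg, timeout=None):
        # The proposed change: fall back to rpc_response_timeout so that
        # every message gets a TTL even when the caller passes none.
        if timeout is None:
            timeout = self.conf.rpc_response_timeout
        self.sent.append((topic, msg, timeout))
        return timeout


conn = SketchConnection(SketchConf())
default_ttl = conn.topic_send("sketch-topic", {"method": "ping"})
explicit_ttl = conn.topic_send("sketch-topic", {"method": "ping"}, timeout=5)
```

An explicit timeout from the caller is left untouched; only the None case picks up the configured default.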
This should be a RHEL-OSP Havana-A5 target.
Also note that this proposed change is not the ideal solution. It does not correctly account for the overall TTL. In fact, it uses two different TTLs: one for the message from the client to the server, and another for the response from the server to the client. The ideal solution would account for the entire elapsed time and use a single TTL. For example, the client sends a request with a TTL of 60 seconds. The server pulls the request off the queue, processes it, and prepares the response. At the end, the server needs to determine how much time is left on the original TTL and use that value as the TTL when putting the response into the queue. This is because the client will time out when its original TTL expires.
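The single-TTL idea can be sketched as follows. This is only an illustration under assumptions: remaining_ttl is a hypothetical helper, and it assumes the request carries the time it was sent (sent_at) and that client and server clocks are comparable, neither of which the current code provides.

```python
import time


def remaining_ttl(original_ttl, sent_at, now=None):
    """Return the seconds left on the client's original TTL.

    original_ttl: TTL the client put on the request, in seconds.
    sent_at: timestamp when the client sent the request (assumed to
             travel with the message; hypothetical field).
    now: current timestamp; defaults to time.time().
    """
    if now is None:
        now = time.time()
    elapsed = now - sent_at
    # Never return a negative TTL; zero means the client has already given up.
    return max(0.0, original_ttl - elapsed)


# Request sent at t=100 with a 60 s TTL; server finishes at t=125.
reply_ttl = remaining_ttl(60, 100.0, now=125.0)
# Server finishes long after the client timed out.
expired_ttl = remaining_ttl(60, 100.0, now=200.0)
```

A result of zero means the reply would arrive after the client's deadline, so the server could skip sending it at all.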
I've talked to the oslo.messaging cores; they say setting a TTL for reply messages is not needed. Instead, we should make sure queues are auto-deleted (bug 1099657) and that clients reuse the same queue on reconnect (see the launchpad bug in the external tracker list).
Moving to A6 as per Livnat's request. The reasoning is that the bug is internal and is not something any known customer is waiting for.
Verified NVR: openstack-neutron-2013.2.4-4.el6ost.noarch
Verified that the fix was incorporated in the package.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2014-1686.html