Bug 1038717

Summary: [oslo] With QPID, RPC calls to a topic are always fanned-out to all subscribers.
Product: Red Hat OpenStack Reporter: Perry Myers <pmyers>
Component: openstack-neutron Assignee: Ihar Hrachyshka <ihrachys>
Status: CLOSED ERRATA QA Contact: Ofer Blaut <oblaut>
Severity: high Docs Contact:
Priority: high    
Version: unspecified CC: apevec, breeler, chrisw, dallan, fpercoco, hateya, kgiusti, lpeer, mlopes, ndipanov, oblaut, twilson, yeylon
Target Milestone: async Keywords: TestOnly
Target Release: 4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-neutron-2013.2-14.el6ost Doc Type: Bug Fix
Doc Text:
Prior to this update, QPID topic consumer re-connection logic (under the v2 topology) incorrectly resulted in the creation of duplicate RPC notifications delivered to every subscribed consumer. Consequently, samples derived from RPC notifications were duplicated to the extent that the collector service made multiple subscriptions to the topic control exchanges for individual services (e.g. Compute). With this release, QPID creates a single queue per topic and shares it among all corresponding consumers. This ensures that each RPC notification is only received by a single consumer, and prevents any unnecessary duplication of samples.
Story Points: ---
Clone Of: 1038641
: 1045067 Environment:
Last Closed: 2013-12-20 00:42:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1045067    

Description Perry Myers 2013-12-05 16:39:05 UTC
+++ This bug was initially created as a clone of Bug #1038641 +++

Description of problem:


  See upstream bug:

  https://bugs.launchpad.net/oslo/+bug/1257293

  Note well: this bug _only_ affects those QPID configurations that have applied the fix of the following bug:

   https://bugs.launchpad.net/oslo/+bug/1178375


  _AND_ have configured "qpid_topology_version=2"
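
  To check whether a given service configuration falls into the affected case, the topology option can be inspected. The following is a minimal sketch, assuming the option is set in the [DEFAULT] section of the service's configuration file and that an unset option means topology version 1 (both are assumptions, not taken from this report). The helper name `uses_topology_v2` and the sample text are hypothetical.

```python
import configparser


def uses_topology_v2(conf_text):
    """Return True if the config text sets qpid_topology_version to 2."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in other values from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # Assumption: an unset option falls back to topology version 1.
    value = parser.get("DEFAULT", "qpid_topology_version", fallback="1")
    return value.strip() == "2"


# Hypothetical sample configuration, for illustration only.
sample = """
[DEFAULT]
qpid_topology_version=2
"""
print(uses_topology_v2(sample))
```

  This only covers the second condition; whether the fix for bug 1178375 is present still has to be checked against the installed package.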



Additional info:

--- Additional comment from RHEL Product and Program Management on 2013-12-05 09:44:55 EST ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Dave Allan on 2013-12-05 10:11:18 EST ---

Checking into risk of fix; may want to take for 4.0 GA

--- Additional comment from Perry Myers on 2013-12-05 11:30:41 EST ---

This seems important enough to be an RC blocker for 4.0. But doesn't this bug need to be cloned to Nova, Ceilometer, Neutron, etc. (every component that copies and pastes the RPC code from Oslo)?

Comment 4 Terry Wilson 2013-12-13 20:27:32 UTC
Copied doc-text from oslo issue where the bug was ultimately fixed. Not sure if this should really be requires_doc_text- or not.

Comment 5 Ihar Hrachyshka 2013-12-17 16:08:58 UTC
Ofer,

It's not clear how to verify the bug. It seems it was originally detected against the oslo library, as per comment https://bugs.launchpad.net/oslo/+bug/1178375/comments/26. It was then fixed there, and the fix was backported to the multiple modules that have the failing code copy-pasted into their source trees.

This bug is for the openstack-neutron package, meaning it should be verified against that package, not against the oslo.messaging library used in the comment referenced above. And there are no clear steps for verifying it against Neutron.

BTW, I've checked whether the steps in that comment still produce the incorrect behaviour, and they do: I still see the issue (duplicate messages when using topology=2).

We could try to extrapolate the (still incorrect) observed behaviour to openstack-neutron and conclude that the fix didn't fix the issue, but that does not seem strictly correct.

Can you elaborate on how to properly verify the bug?

===

For your reference, here are the steps I used to test the upstream fix.

1. Install RHEL 6.5.
2. Install RHOS-4.0.
3. Install pip (e.g. easy_install pip).
4. git clone https://github.com/openstack/oslo.messaging.git (needed for testing clients/servers from below).
5. cd oslo.messaging && pip install -r requirements.txt && python ./setup.py install
6. git clone https://github.com/kgiusti/oslo-messaging-clients.git
7. Run two servers with topology=2 and send a message to them -> got duplicate delivery.

(More details on step 7 at: https://bugs.launchpad.net/oslo/+bug/1178375/comments/26)
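
The difference between the observed and the expected delivery in step 7 can be illustrated with a toy model. This is a pure-Python sketch that does not use Qpid or oslo.messaging; the function names, server names and the round-robin choice are assumptions made for illustration only.

```python
from itertools import cycle


def deliver_per_consumer_queues(messages, consumers):
    """Observed behaviour: every consumer has its own queue bound to the
    topic, so each message is copied to every consumer."""
    received = {name: [] for name in consumers}
    for msg in messages:
        for name in consumers:
            received[name].append(msg)
    return received


def deliver_shared_queue(messages, consumers):
    """Expected behaviour: one queue per topic shared by all consumers,
    so each message is taken by exactly one consumer."""
    received = {name: [] for name in consumers}
    # Assumption: round-robin; a real broker may distribute differently.
    turn = cycle(consumers)
    for msg in messages:
        received[next(turn)].append(msg)
    return received


msgs = ["m1", "m2", "m3", "m4"]
servers = ["server-a", "server-b"]
print(deliver_per_consumer_queues(msgs, servers))
print(deliver_shared_queue(msgs, servers))
```

In the first model the total number of deliveries is the number of messages times the number of servers; in the second it equals the number of messages, which is what a verification run should show.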

===

Comment 12 Ihar Hrachyshka 2013-12-19 17:12:20 UTC
(I'm new to Qpid and OpenStack in general, so read my comment with caution.)

I guess we could set up a deployment with multiple Neutron servers (how?) and check that a new dhcp_agent is registered by only one of those servers (meaning the new-agent notification is delivered round-robin).

Comment 14 errata-xmlrpc 2013-12-20 00:42:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1859.html