Bug 669343 - Inconsistency in management object ids due to disambiguation
Summary: Inconsistency in management object ids due to disambiguation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.3
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: high
Target Milestone: 1.3.2-RC2
Assignee: Ken Giusti
QA Contact: Frantisek Reznicek
URL:
Whiteboard:
Depends On:
Blocks: 654872
Reported: 2011-01-13 12:39 UTC by Gordon Sim
Modified: 2015-11-16 01:13 UTC (History)

Fixed In Version: qpid-cpp-mrg-0.7.946106-27
Doc Type: Bug Fix
Doc Text:
A client rapidly reconnecting to the broker caused a name collision between the old connection data, which had not yet been cleaned, and the new connection. This new connection was assigned a new name, but objects referencing the new data used the previous name, thus preventing the management tools from being able to find and display attributes of the new connection. With this update, object names which might collide with older, deleted object names are not assigned, with the result that new objects are correctly named and can be displayed by the management tools.
Clone Of:
Environment:
Last Closed: 2011-02-15 12:16:16 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0217 0 normal SHIPPED_LIVE Red Hat Enterprise MRG Messaging and Grid bug fix and enhancement update 2011-02-15 12:10:15 UTC

Description Gordon Sim 2011-01-13 12:39:38 UTC
Description of problem:

The mechanism by which management object ids are disambiguated can leave id-based 'references' between objects inconsistent: the referencing object holds a non-disambiguated oid, while the agent holds the referenced object under a disambiguated key.

Version-Release number of selected component (if applicable):

1.3.0.1

How reproducible:

100%

Steps to Reproduce:
1. start broker
2. ./examples/messaging/drain -f 'amq.topic/my-subject; {link:{name:my-subscription}}'
3. then kill above client and restart it immediately
4. wait for a short while (management publish interval)
5. qpid-config -b queues
  
Actual results:

qpid-config -b queues
Queue 'amq.topic_my-subscription'
Queue 'qmfc-v2-GRST500.17756.1'
...

I.e. no bindings are shown, not even the binding to the default exchange. This is because the oid of the queue has been disambiguated (it has '_' appended), but the oid referencing the queue in the binding has not been.

Expected results:

qpid-config -b queues
Queue 'amq.topic_my-subscription'
    bind [amq.topic_my-subscription] => ''
    bind [my-subject] => amq.topic
...

Additional info:

ObjectIds are value objects, but ManagementAgent (in addObject() and, more significantly in the above case, in moveNewObjectsLH()) disambiguates the copy of the oid held by the agent, not the one set on the object.
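The mismatch described above can be modelled in a few lines. This is only an illustrative sketch, not the qpid-cpp code: ObjectId, ManagedObject, Agent and add_object() are hypothetical stand-ins, and the only behaviour taken from this report is that the agent appends '_' to its own copy of a clashing id while the object keeps the original.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ObjectId:
    """Hypothetical stand-in for a management object id (a value object)."""
    name: str


class ManagedObject:
    """Hypothetical managed object that carries its own copy of its oid."""
    def __init__(self, oid):
        self.oid = oid
        self.deleted = False


class Agent:
    """Hypothetical agent that disambiguates only its own map key."""
    def __init__(self):
        self.objects = {}

    def add_object(self, obj):
        key = obj.oid
        # On a clash, append '_' to the agent's copy of the id.
        # The oid stored on the object itself is left unchanged.
        while key in self.objects:
            key = replace(key, name=key.name + "_")
        self.objects[key] = obj
        return key


agent = Agent()

# First queue: deleted, but not yet purged from the agent's map.
old_queue = ManagedObject(ObjectId("amq.topic_my-subscription"))
agent.add_object(old_queue)
old_queue.deleted = True

# Client restarts immediately: same queue name, so the ids clash.
new_queue = ManagedObject(ObjectId("amq.topic_my-subscription"))
key = agent.add_object(new_queue)

# A binding refers to the queue through the oid set on the object,
# which does not match the key the agent stored the new queue under.
binding_ref = new_queue.oid
resolved = agent.objects.get(binding_ref)

print("agent key:", key.name)
print("binding reference:", binding_ref.name)
print("reference resolves to the new queue:", resolved is new_queue)
```

In this model the binding's reference resolves to the old, deleted entry rather than to the new queue, which is the kind of inconsistency that would leave a tool with no bindings to show for the new queue.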

Note that this can also cause inconsistencies in a clustered broker. A new node joining a cluster where this problem has occurred will not have the old clashing object recorded, and will thus have a consistent oid between queue and binding. The new node therefore has a different oid for the same queue, and a query by oid will not produce the same results against each cluster node.

Comment 1 Ken Giusti 2011-01-13 18:00:30 UTC
The need for "disambiguation" is a bit of a hack to deal with the rapid creation, deletion, and recreation of an object.

If done within the periodic mgmt update interval, this results in two objects with the same identifier (objectid) being tracked by the mgmt agent.  One object has been marked as deleted, the other object is newly created.  Disambiguation is used to keep the object ids from clashing in the various hash maps used to store objects.

Instead, we should enable the mgmt agent to detect this scenario and deal with it cleanly.  One possibility is to cache the deleted object separately, then send all delete notifications before sending any non-deleted object notifications.  This would correctly cause the clients to see the delete event, followed by a new create event (effectively keeping clients in sync with the agent).
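A rough sketch of that possibility, assuming a much simplified agent: DeleteFirstAgent and its create(), delete() and publish() methods are hypothetical names chosen for illustration and do not reflect the change that was eventually committed. Deleted ids are kept in their own list, so a recreated object can reuse the original id, and each publish emits all deletes before any creates.

```python
class DeleteFirstAgent:
    """Hypothetical agent that tracks deleted ids apart from live objects."""

    def __init__(self):
        self.live = {}          # live objects, keyed by their real id
        self.new_ids = []       # created since the last publish
        self.deleted_ids = []   # deleted since the last publish

    def create(self, oid, obj):
        # No disambiguation needed: a deleted object with the same id
        # is no longer in the live map.
        self.live[oid] = obj
        self.new_ids.append(oid)

    def delete(self, oid):
        del self.live[oid]
        if oid in self.new_ids:
            # Created and deleted within one interval: clients never
            # saw it, so there is nothing to announce.
            self.new_ids.remove(oid)
        else:
            self.deleted_ids.append(oid)

    def publish(self):
        # Send every delete notification before any create notification.
        events = [("delete", oid) for oid in self.deleted_ids]
        events += [("create", oid) for oid in self.new_ids]
        self.deleted_ids.clear()
        self.new_ids.clear()
        return events


agent = DeleteFirstAgent()
agent.create("q", "first queue")
agent.publish()                    # clients now know about "q"

# Rapid delete and recreate within one publish interval.
agent.delete("q")
agent.create("q", "second queue")
events = agent.publish()
print(events)
```

With this ordering a client would see the delete of the old object followed by the create of the new one under the same id, so references by id stay valid.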

Comment 2 Ken Giusti 2011-01-13 18:05:25 UTC
Upstream JIRA:

https://issues.apache.org/jira/browse/QPID-2997

Comment 3 Ken Giusti 2011-01-18 14:55:19 UTC
Pushed upstream:
http://svn.apache.org/viewvc?view=revision&revision=1060401
Committed revision 1060401.

Comment 5 Ken Giusti 2011-01-24 18:01:56 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
    Rapidly reconnecting a client to the broker.  This causes a name collision between the old connection's data, which has not been cleaned up, and the new connection.
Consequence
    To avoid a naming conflict, the new connection would be assigned a new name.  However, other objects referencing the new data would use the old name. This prevented the management tools from being able to find, and thus display, attributes of the new connection.
Fix
    Do not assign a new name to objects that collide with older, deleted objects.
Result
    New objects are now correctly named, and can be displayed by the management tools.

Comment 6 Frantisek Reznicek 2011-01-27 14:20:41 UTC
The issue has been fixed, tested on RHEL 4.9 / 5.6 i386 / x86_64 on packages:
python-qpid-0.7.946106-15.el5
qpid-cpp-client-0.7.946106-27.el5
qpid-cpp-client-devel-0.7.946106-27.el5
qpid-cpp-client-devel-docs-0.7.946106-27.el5
qpid-cpp-client-ssl-0.7.946106-27.el5
qpid-cpp-mrg-debuginfo-0.7.946106-27.el5
qpid-cpp-server-0.7.946106-27.el5
qpid-cpp-server-cluster-0.7.946106-27.el5
qpid-cpp-server-devel-0.7.946106-27.el5
qpid-cpp-server-ssl-0.7.946106-27.el5
qpid-cpp-server-store-0.7.946106-27.el5
qpid-cpp-server-xml-0.7.946106-27.el5
qpid-java-client-0.7.946106-14.el5
qpid-java-common-0.7.946106-14.el5
qpid-java-example-0.7.946106-14.el5
qpid-tools-0.7.946106-12.el5


-> VERIFIED

Comment 7 Douglas Silas 2011-02-09 17:12:34 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,8 +1 @@
-Cause
+A client rapidly reconnecting to the broker caused a name collision between the old connection data, which had not yet been cleaned, and the new connection. This new connection was assigned a new name, but objects referencing the new data used the previous name, thus preventing the management tools from being able to find and display attributes of the new connection. With this update, object names which might collide with older, deleted object names are not assigned, with the result that new objects are correctly named and can be displayed by the management tools.
-    Rapidly reconnecting a client to the broker.  This cause a name collision between the old connection's data - which has not been cleaned up, and the new connection.
-Consequence
-    To avoid a naming conflict, the new connection would be assigned a new name.  However, other objects referencing the new data would use the old name. This prevented the management tools from being able to find, and thus display, attributes of the new connection.
-Fix
-    Do not assign a new name to objects that collide with older, deleted objects.
-Result
-    New objects are now correctly named, and can be displayed by the management tools.

Comment 8 errata-xmlrpc 2011-02-15 12:16:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0217.html

