Bug 662765
| Summary: | Management broker ID should be the same for members of a cluster. | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Alan Conway <aconway> | ||||
| Component: | qpid-cpp | Assignee: | Alan Conway <aconway> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Frantisek Reznicek <freznice> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | beta | CC: | esammons, freznice, gsim, iboverma, jneedle, mcressma, tross | ||||
| Target Milestone: | 1.3.2-RC2 | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qpid-cpp-mrg-0.7.946106-27 | Doc Type: | Bug Fix | ||||
| Doc Text: |
Cause: Management broker ID was not replicated in a cluster.
Consequence: Each broker in a cluster had a different broker ID.
Fix: Broker ID is replicated.
Result: All brokers in a cluster have the same broker ID.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-02-15 12:11:35 UTC | Type: | --- | ||||
Description
Alan Conway
2010-12-13 20:01:13 UTC
Fix on trunk r1049566
In build for 1.3.2 RC 2
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: Management broker ID was not replicated in a cluster.
Consequence: Each broker in a cluster had a different broker ID.
Fix: Broker ID is replicated.
Result: All brokers in a cluster have the same broker ID.
Although this defect looks trivial, there are a couple of UUIDs for different things, and the logging in broker -22 and -27 is a bit different.
To finish verifying this defect I need to know which ID to look at, preferably as a QMF path like 'org.apache.qpid.broker:system'.
So far I can see that:
clusterID is OK: it is unique to the specific cluster and does not change.
federationTag is announced, kept unchanged, and unique per broker.
Raising NEEDINFO.
[root@dhcp-26-233 tmp]# awk 'BEGIN{f=1}{if(f==1){print} if($0 ~ /Broker running/){f=0} }' /tmp/qpidd.log | egrep "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
org.apache.qpid.broker:system:6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57
org.apache.qpid.broker:system:6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57
2011-01-27 13:56:54 info ManagementAgent restored broker ID: a06dd2a8-298e-421a-9239-b8c33ad53b73
2011-01-27 13:56:54 debug Management object (V1) added: org.apache.qpid.broker:system:6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57
2011-01-27 13:56:54 notice Cluster store state: clean cluster-id=43a19418-3be4-4f46-b9f7-017368fde6c5 shutdown-id=3e72a02a-d3ae-4037-a4db-acc6ee9dbf36
2011-01-27 13:56:55 notice cluster(192.168.1.4:15861 INIT) cluster-uuid = 43a19418-3be4-4f46-b9f7-017368fde6c5
[root@dhcp-26-233 tmp]# python ./qmf_list_objects.py | egrep "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
clusterID 43a19418-3be4-4f46-b9f7-017368fde6c5
federationTag a06dd2a8-298e-421a-9239-b8c33ad53b73
systemRef 0-0-1-0-org.apache.qpid.broker:system:6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57
6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57 (0-0-1-0-org.apache.qpid.broker:system:6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57)
systemId 6e3a6ddc-ac00-4de7-94ab-b4b25ede7f57
I don't think it's a qmf property. Look for info log messages like:
ManagementAgent has no data directory, generated new broker ID:
No stored broker ID found - ManagementAgent generated broker ID:
ManagementAgent restored broker ID:
ManagementAgent generated broker ID:
Created attachment 476165
The clustered broker log files grepped for UUIDs
Hi Alan,
thanks for the update above; unfortunately I'm still unable to tell the original (bad) behavior from the fixed behavior.
I checked the broker logs multiple times in the following configurations:
- standard qpidd as service with full set of modules
- standard qpidd as service with reduced set of modules (cluster.so + watchdog.so only)
- unconfined run with full set of modules
- unconfined run with reduced set of modules (cluster.so + watchdog.so only)
In all cases I saw just one of the messages mentioned above, i.e.
broker-22 (cluster X) 2 nodes:
[root@dhcp-26-227 tmp]# grep 'broker' /tmp/qpidd.log | grep ID
2011-01-31 10:35:59 info ManagementAgent has no data directory, generated new broker ID: 44f1f8c4-8e60-4335-b66b-492bee4d40a9
[root@dhcp-27-151 tmp]# grep 'broker' /tmp/qpidd.log | grep ID
2011-01-31 11:36:53 info ManagementAgent has no data directory, generated new broker ID: 6b66d4c1-1f60-4ca4-aede-43fddbee2c5d
broker-27 (cluster XX) 2 nodes:
[root@dhcp-26-234 tmp]# grep 'broker' /tmp/qpidd.log | grep ID
2011-01-31 11:44:09 info ManagementAgent has no data directory, generated new broker ID: 94f1c542-21c9-4412-aff5-f5c67538d62f
[root@dhcp-26-233 ~]# grep 'broker' /tmp/qpidd.log | grep ID
2011-01-31 11:42:48 info ManagementAgent has no data directory, generated new broker ID: 4b1c89a9-0d8d-415b-98c2-bc4383a99fd6
even when I ran multiple clients such as qpid-tool / qpid-perftest / qpid-latency-test.
I cannot find a changed UUID during the periodic management processing.
Attached you can find the UUID-grepped data from two clusters, X (old packages) and XX (latest packages).
I believe I must be overlooking something, or an important step is missing from this BZ.
The data is included, so it should now be easy for you to highlight the matching lines.
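For comparing nodes, the broker ID reported in each member's log can be pulled out with a small script. This is only a sketch: it assumes the log lines have the shape quoted in this report (a ManagementAgent message ending in "broker ID: <uuid>"), and the function name broker_ids is made up for illustration.

```python
import re

# Matches the ManagementAgent info messages quoted in this report, e.g.
# "... ManagementAgent restored broker ID: <uuid>". Assumes lowercase hex UUIDs.
BROKER_ID_RE = re.compile(
    r"ManagementAgent .*broker ID: "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def broker_ids(log_lines):
    """Return the broker IDs found in an iterable of qpidd log lines, in order."""
    ids = []
    for line in log_lines:
        match = BROKER_ID_RE.search(line)
        if match:
            ids.append(match.group(1))
    return ids
```

Running it over each node's /tmp/qpidd.log (for example with broker_ids(open("/tmp/qpidd.log"))) gives one list per node, which can then be compared by eye or with set().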
The brokers were run in the following way:
:>/tmp/qpidd.log ;\
qpidd --no-data-dir --cluster-name=X[X] --log-enable=info+ \
--log-enable=trace+:management --log-to-file=/tmp/qpidd.log \
--mgmt-pub-interval 5 &>/dev/null
Sorry, those verify instructions are no good. Here's how to verify:
- delete old data directories
- start a cluster of 2 or more
- check <data-dir>/.mbrokerdata for each member
Prior to the fix, they will all have different UUIDs; after the fix they should be the same.
The issue has been resolved, tested on RHEL 5.6 i386 / x86_64 on packages:
python-qpid-0.7.946106-15.el5
qpid-cpp-client-0.7.946106-27.el5
qpid-cpp-client-devel-0.7.946106-27.el5
qpid-cpp-client-devel-docs-0.7.946106-27.el5
qpid-cpp-client-ssl-0.7.946106-27.el5
qpid-cpp-mrg-debuginfo-0.7.946106-27.el5
qpid-cpp-server-0.7.946106-27.el5
qpid-cpp-server-cluster-0.7.946106-27.el5
qpid-cpp-server-devel-0.7.946106-27.el5
qpid-cpp-server-ssl-0.7.946106-27.el5
qpid-cpp-server-store-0.7.946106-27.el5
qpid-cpp-server-xml-0.7.946106-27.el5
qpid-java-client-0.7.946106-14.el5
qpid-java-common-0.7.946106-14.el5
qpid-java-example-0.7.946106-14.el5
qpid-tools-0.7.946106-12.el5
All members of the cluster now have a common UUID stored in <data-dir>/.mbrokerdata, tested for 2...4 members.
-> VERIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2011-0217.html
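The .mbrokerdata check from the verification steps above could be scripted roughly as follows. This is a sketch under assumptions, not the procedure QA actually ran: it assumes each member's data directory is readable from one machine (copied or mounted), and that the broker UUID appears as plain text somewhere in .mbrokerdata. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: the broker ID is stored as a textual, lowercase-hex UUID in the file.
UUID_RE = re.compile(r"[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}")


def stored_broker_id(data_dir):
    """Return the first UUID found in <data_dir>/.mbrokerdata, or None."""
    text = Path(data_dir, ".mbrokerdata").read_text()
    match = UUID_RE.search(text)
    return match.group(0) if match else None


def same_broker_id(data_dirs):
    """Return (ids_by_dir, ok); ok is True when every member stores the same UUID."""
    ids = {str(d): stored_broker_id(d) for d in data_dirs}
    values = set(ids.values())
    return ids, len(values) == 1 and None not in values
```

Prior to the fix, ok would be expected to come back False for a freshly started cluster; with the -27 packages it should be True.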