| Summary: | clustered broker leaves cluster with 'catch-up connection closed prematurely' | | |
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Frantisek Reznicek <freznice> |
| Component: | qpid-cpp | Assignee: | messaging-bugs <messaging-bugs> |
| Status: | CLOSED NOTABUG | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 1.3 | CC: | aconway, esammons, jross |
| Target Milestone: | --- | Target Release: | --- |
| Hardware: | athlon | OS: | Unspecified |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-08-01 15:05:30 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
Description
Frantisek Reznicek
2011-07-28 19:44:32 UTC
Created attachment 515788 [details]
The last few hundred log lines from each broker (debug+)
Created attachment 515789 [details]
The automated test for reproducing the issue
The test can be run as:
./ctests.py --broker-maximize-uptime \
--broker-options=mgmt-pub-interval~1,log-enable~debug+ \
--log-to-file=110728_16h.log \
--qmf-data-timeout=120 \
--stop-on-error \
--exec-tests=testset_cluster_toggle:16h \
host_a host_b host_c host_d
If you have not set up password-less SSH access to host_a host_b host_c host_d, you also have to pass --ssh-pass=...
The following scenario seems to trigger the issue more easily: A PPPP ! B SSSSS! C SSSSS! D SSSSS!, where 'PPPP' is the progress of a perftest client (client A against a broker chosen at random from A, B, C, D), 'SSSSS' is cluster synchronization, and '!' is a cluster reconfiguration (toggling nodes). Another broker option which increases reproducibility: mgmt-pub-interval=1. The issue was observed on RHEL5.6 x86_64 & -33.
This is the expected behaviour if the broker giving the update is killed before it completes the update. When you are changing the configuration, there is a chance that after starting a broker you proceed to kill the broker that is providing it with the update.
You can fix this by connecting a client to the broker you have just started. The client doesn't have to do anything; its connection won't complete until the broker is ready. E.g.:
c = messaging.Connection.establish(...broker address...)
try: c.session()
finally: c.close()
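To make the idea concrete, here is a minimal self-contained sketch (not from the original comment) using the qpid.messaging Python API. The broker URL, timeout, and retry loop are illustrative assumptions; in particular, a freshly started broker may refuse connections for a moment before it even begins the catch-up, so the sketch retries:

import time
from qpid import messaging

def wait_until_broker_ready(url="dhcp-26-114:5672", timeout=120, interval=2):
    # Returns once a client connection completes, i.e. once the newly
    # started broker has received its full cluster state update.
    deadline = time.time() + timeout
    while True:
        try:
            c = messaging.Connection.establish(url)  # blocks until the broker accepts clients
            try:
                c.session()  # no real work needed; the connection itself is the check
            finally:
                c.close()
            return
        except Exception:
            # Right after 'service qpidd start' the broker may not be listening yet.
            if time.time() > deadline:
                raise
            time.sleep(interval)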
Hi Alan,
let me clarify the testing scenario in more detail, so that we can judge precisely whether the behaviour I am seeing is correct.
My last run on RHEL5.7 i386 & x86_64, without any load on the cluster, failed in the same way.
The test sub-procedure was:
(cluster of two nodes: rhel57i ~ dhcp-26-112 ~ A, rhel57x ~ dhcp-26-114 ~ B)
0. initial cluster config, no node up
1. all nodes up (first A, then B)
2. test actions ... (not important at the moment)
3. reconfigure cluster to: A is up and running, B is down
4. test attempts to reconfigure the cluster to the configuration:
- A is down, B is up
At point 4 the test detects that bringing down A would result in a problem because B's store is dirty, and therefore performs the following operations:
4-1) at B: service qpidd start
after 4-1 finishes:
4-2) at A: service qpidd stop
After 4-2 is done the cluster state is checked, and the test ends with an error because node B did not start; thus the cluster configuration at the end of 4-2 is: A and B both down.
Now to the question of whether qpidd or the test is at fault.
The test expects that when the service qpidd operation (4-1) returns, the broker is settled and synchronized with the cluster, and thus that the next operation (4-2) may safely shut down all other cluster nodes.
Alan,
Could you possibly comment on the above scenario, please?
Raising NEEDINFO.
Test transcript from 2-node cluster (rhel5.7 i/x):
[14:32:36] DEBUG:TESTSET 'testset_cluster_toggle' loop 10/32 (10800.0) finished. [tst_cnt:759, err_cnt:0]
[14:32:36] DEBUG:TESTSET 'testset_cluster_toggle' loop 11/36 (10800.0) started. [tst_cnt:759, err_cnt:0]
[14:32:37] DEBUG:Cluster state: 1 of 2 up and running
[14:32:37] DEBUG:TEST 'test_cluster_toggle_no_load' loop 1/1 started. [tst_cnt:759, err_cnt:0]
[14:32:38] DEBUG:Cluster state: 1 of 2 up and running
[14:32:38] DEBUG:Go to cluster state: 3 [True, True]
DEBUG:ccfg:[False, True] -> tcfg:[True, True]
DEBUG:start(A ~ 0)
[14:32:48] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1
[14:32:49] INFO:localhost:'Cluster state: 2 (exp. 2) of 2 up and running', result:True, dur.:-1
[14:32:51] INFO:dhcp-26-112:'qpid-cluster', result:0, dur.:1.46567
[14:32:51] INFO:localhost:'qpid-cluster cluster size check (exp. 2)', result:True, dur.:-1
[14:32:53] INFO:dhcp-26-114:'qpid-cluster', result:0, dur.:1.64677
[14:32:53] INFO:localhost:'qpid-cluster cluster size check (exp. 2)', result:True, dur.:-1
[14:32:53] DEBUG:Go to cluster state: 2 [True, False]
DEBUG:ccfg:[True, True] -> tcfg:[True, False]
DEBUG:stop(B ~ 1)
[14:32:58] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1
[14:32:59] INFO:localhost:'Cluster state: 1 (exp. 1) of 2 up and running', result:True, dur.:-1
[14:33:01] INFO:dhcp-26-112:'qpid-cluster', result:0, dur.:1.4745
[14:33:01] INFO:localhost:'qpid-cluster cluster size check (exp. 1)', result:True, dur.:-1
[14:33:01] DEBUG:Go to cluster state: 1 [False, True]
DEBUG:ccfg:[True, False] -> tcfg:[False, True]
DEBUG:start(B ~ 1)
DEBUG:stop(A ~ 0)
[14:33:11] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1
[14:33:12] INFO:localhost:'Cluster state: 1 (exp. 1) of 2 up and running', result:True, dur.:-1
[14:33:14] INFO:dhcp-26-114:'qpid-cluster', result:0, dur.:1.38322
[14:33:14] INFO:localhost:'qpid-cluster cluster size check (exp. 1)', result:True, dur.:-1
[14:33:15] DEBUG:Cluster state: 1 of 2 up and running
[14:33:15] DEBUG:TEST 'test_cluster_toggle_no_load' loop 1/1 finished. [tst_cnt:773, err_cnt:0]
[14:33:15] DEBUG:TEST 'test_cluster_toggle_no_load_w_sesame' loop 1/1 started. [tst_cnt:773, err_cnt:0]
[14:33:16] DEBUG:Cluster state: 1 of 2 up and running
[14:33:16] DEBUG:Go to cluster state: 3 [True, True]
DEBUG:ccfg:[False, True] -> tcfg:[True, True]
DEBUG:start(A ~ 0)
[14:33:26] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1
[14:33:27] INFO:localhost:'Cluster state: 2 (exp. 2) of 2 up and running', result:True, dur.:-1
[14:33:29] INFO:dhcp-26-112:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:0.98156
[14:33:33] INFO:dhcp-26-114:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:3.58063
[14:33:35] INFO:dhcp-26-112:'qpid-cluster', result:0, dur.:1.53743
[14:33:35] INFO:localhost:'qpid-cluster cluster size check (exp. 2)', result:True, dur.:-1
[14:33:37] INFO:dhcp-26-114:'qpid-cluster', result:0, dur.:1.46806
[14:33:37] INFO:localhost:'qpid-cluster cluster size check (exp. 2)', result:True, dur.:-1
[14:33:37] DEBUG:Go to cluster state: 2 [True, False]
DEBUG:ccfg:[True, True] -> tcfg:[True, False]
DEBUG:stop(B ~ 1)
[14:33:42] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1
[14:33:43] INFO:localhost:'Cluster state: 1 (exp. 1) of 2 up and running', result:True, dur.:-1
[14:33:47] INFO:dhcp-26-112:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:3.75164
[14:33:48] INFO:dhcp-26-114:'service sesame status && service sesame stop || service sesame start', result:0, dur.:0.475331
[14:33:50] INFO:dhcp-26-112:'qpid-cluster', result:0, dur.:1.52838
[14:33:50] INFO:localhost:'qpid-cluster cluster size check (exp. 1)', result:True, dur.:-1
[14:33:50] DEBUG:Go to cluster state: 1 [False, True]
DEBUG:ccfg:[True, False] -> tcfg:[False, True]
DEBUG:start(B ~ 1)
DEBUG:stop(A ~ 0)
[14:34:04] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1
[14:34:05] ERROR:localhost:'Cluster state: 0 (exp. 1) of 2 up and running', result:False, dur.:-1
Traceback (most recent call last):
File "./ctests.py", line 239, in main
i_method[1]();
File "./ctests.py", line 406, in testset_cluster_toggle
self.test_cluster_toggle_no_load_w_sesame();
File "./ctests.py", line 893, in test_cluster_toggle_no_load_w_sesame
self.report();
File "/home/freznice/prjs/rh/rhts_tests/tests/distribution/_others/cluster_test/testlib.py", line 1430, in report
raise test_error("An error ocurred!");
test_error: 'An error ocurred!'
[14:34:05] INFO:
[14:34:05] INFO:Execution summary: TST_CNT:789, ERR_CNT:1, WRN_CNT:0, dur.:3418.27
[14:34:05] INFO:
dhcp-26-112:
2011-08-01 14:33:23 notice cluster(192.168.1.5:26282 READY) caught up.
2011-08-01 14:33:26 info Queue "qmfagent-20e864fc-5a8b-4f9e-bf7d-61fe2cd0ecda": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:30 info Queue "qmfagent-4acadabb-d7e9-4544-be78-266a7ff80ce2": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:31 info Queue "reply-dhcp-26-112.brq.redhat.com.26476.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:31 info Queue "topic-dhcp-26-112.brq.redhat.com.26476.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:31 info Queue "qmfc-v2-dhcp-26-112.brq.redhat.com.26476.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:31 info Queue "qmfc-v2-ui-dhcp-26-112.brq.redhat.com.26476.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:31 info Queue "qmfc-v2-hb-dhcp-26-112.brq.redhat.com.26476.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:33 info Queue "reply-dhcp-26-114.brq.redhat.com.28112.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:33 info Queue "topic-dhcp-26-114.brq.redhat.com.28112.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:33 info Queue "qmfc-v2-dhcp-26-114.brq.redhat.com.28112.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:33 info Queue "qmfc-v2-ui-dhcp-26-114.brq.redhat.com.28112.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:33 info Queue "qmfc-v2-hb-dhcp-26-114.brq.redhat.com.28112.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:35 notice cluster(192.168.1.5:26282 READY) configuration change: 192.168.1.5:26282
2011-08-01 14:33:35 notice cluster(192.168.1.5:26282 READY) Members left: 192.168.1.6:27764
2011-08-01 14:33:35 info cluster(192.168.1.5:26282 READY) member update: 192.168.1.5:26282(member) frameSeq=21988358
2011-08-01 14:33:35 notice cluster(192.168.1.5:26282 READY)Sole member of cluster, marking store clean.
2011-08-01 14:33:35 notice cluster(192.168.1.5:26282 READY) last broker standing, update queue policies
2011-08-01 14:33:35 info cluster(192.168.1.5:26282 READY) became the elder, active for links.
2011-08-01 14:33:36 info Activating links
2011-08-01 14:33:45 info Queue "qmfagent-89a21650-abc9-461f-9f9f-ef61a27691cf": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:47 info Queue "reply-dhcp-26-112.brq.redhat.com.26665.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:47 info Queue "topic-dhcp-26-112.brq.redhat.com.26665.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:47 info Queue "qmfc-v2-dhcp-26-112.brq.redhat.com.26665.1": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:47 info Queue "qmfc-v2-ui-dhcp-26-112.brq.redhat.com.26665.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:47 info Queue "qmfc-v2-hb-dhcp-26-112.brq.redhat.com.26665.1": Policy created: type=ring; maxCount=0; maxSize=104857600
2011-08-01 14:33:51 notice cluster(192.168.1.5:26282 READY) configuration change: 192.168.1.5:26282 192.168.1.6:28490
2011-08-01 14:33:51 notice cluster(192.168.1.5:26282 READY) Members joined: 192.168.1.6:28490
2011-08-01 14:33:52 info cluster(192.168.1.5:26282 OFFER) send update-offer to 192.168.1.6:28490
2011-08-01 14:33:52 info cluster(192.168.1.5:26282 OFFER) stall for update offer from 192.168.1.5:26282 to 192.168.1.6:28490
2011-08-01 14:33:52 notice cluster(192.168.1.5:26282 UPDATER) sending update to 192.168.1.6:28490 at amqp:tcp:10.34.26.114:5672,tcp:192.168.1.6:5672
2011-08-01 14:33:52 info SSL connector not enabled, you must set QPID_SSL_CERT_DB to enable it.
2011-08-01 14:33:52 info Loaded Module: /usr/lib/qpid/client/sslconnector.so
2011-08-01 14:33:52 info Connection connected to tcp:10.34.26.114:5672
dhcp-26-114:
2011-08-01 14:33:35 notice Shut down
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/msgstore.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/replicating_listener.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/cluster.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/ssl.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/xml.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/replication_exchange.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/acl.so
2011-08-01 14:33:51 info Loaded Module: /usr/lib64/qpid/daemon/watchdog.so
2011-08-01 14:33:51 info No message store configured, persistence is disabled.
2011-08-01 14:33:51 info Management enabled
2011-08-01 14:33:51 info ManagementAgent restored broker ID: 96cf1637-7a1b-45f2-93a0-5b41bfa2ebee
2011-08-01 14:33:51 notice Journal "TplStore": Created
2011-08-01 14:33:51 notice Store module initialized; store-dir=/var/lib/qpidd
2011-08-01 14:33:51 info > Default files per journal: 8
2011-08-01 14:33:51 info > Default journal file size: 24 (wpgs)
2011-08-01 14:33:51 info > Default write cache page size: 32 (Kib)
2011-08-01 14:33:51 info > Default number of write cache pages: 32
2011-08-01 14:33:51 info > TPL files per journal: 8
2011-08-01 14:33:51 info > TPL journal file size: 24 (wpgs)
2011-08-01 14:33:51 info > TPL write cache page size: 4 (Kib)
2011-08-01 14:33:51 info > TPL number of write cache pages: 64
2011-08-01 14:33:51 info Registered xml exchange
2011-08-01 14:33:51 info Registered replication exchange
2011-08-01 14:33:51 notice Initializing CPG
2011-08-01 14:33:51 notice Cluster store state: dirty cluster-id=a114fb06-15d8-42dd-897f-cf2b8cd2d125
2011-08-01 14:33:51 notice cluster(192.168.1.6:28490 PRE_INIT) configuration change: 192.168.1.5:26282 192.168.1.6:28490
2011-08-01 14:33:51 notice cluster(192.168.1.6:28490 PRE_INIT) Members joined: 192.168.1.6:28490
2011-08-01 14:33:51 notice Cluster recovery: recovered journal data discarded and journal files pushed down
2011-08-01 14:33:51 notice Store directory /var/lib/qpidd/rhm was pushed down (saved) into directory /var/lib/qpidd/_cluster.bak.0020.
2011-08-01 14:33:51 notice Journal "TplStore": Created
2011-08-01 14:33:51 notice SASL disabled: No Authentication Performed
2011-08-01 14:33:51 notice Listening on TCP port 5672
2011-08-01 14:33:51 info SSL plugin not enabled, you must set --ssl-cert-db to enable it.
2011-08-01 14:33:51 info Policy file not specified. ACL Disabled, no ACL checking being done!
2011-08-01 14:33:51 info cluster(192.168.1.6:28490 INIT) not active for links.
2011-08-01 14:33:51 notice cluster(192.168.1.6:28490 INIT) cluster-uuid = a114fb06-15d8-42dd-897f-cf2b8cd2d125
2011-08-01 14:33:51 notice cluster(192.168.1.6:28490 JOINER) joining cluster mycluster
2011-08-01 14:33:51 notice Broker running
2011-08-01 14:33:51 info cluster(192.168.1.6:28490 JOINER) stall for update offer from 192.168.1.5:26282 to 192.168.1.6:28490
2011-08-01 14:33:51 notice cluster(192.168.1.6:28490 UPDATEE) receiving update from 192.168.1.5:26282
2011-08-01 14:33:51 notice Journal "test_cluster_toggle_load_w_sesame-00": Created
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/msgstore.so
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/replicating_listener.so
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/cluster.so
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/ssl.so
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-0_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-0_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-0_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-0_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 notice Journal "test_cluster_toggle_load_w_sesame-10": Created
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/xml.so
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/replication_exchange.so
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/acl.so
2011-08-01 14:33:52 info Loaded Module: /usr/lib64/qpid/daemon/watchdog.so
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-1_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-1_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-1_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-1_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-2_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-2_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-2_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-2_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-3_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-3_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-3_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 info Queue "test_cluster_toggle_load_w_sesame-3_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:52 notice Journal "test_cluster_toggle_load_w_sesame-40": Created
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-4_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-4_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-4_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-4_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-5_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-5_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-5_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 info Queue "test_cluster_toggle_load_w_sesame-5_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:53 notice Journal "test_cluster_toggle_load_w_sesame-60": Created
2011-08-01 14:33:53 info Passivating links
2011-08-01 14:33:54 info Queue "test_cluster_toggle_load_w_sesame-6_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:54 info Queue "test_cluster_toggle_load_w_sesame-6_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:54 info Queue "test_cluster_toggle_load_w_sesame-6_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:54 info Queue "test_cluster_toggle_load_w_sesame-6_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:54 notice Journal "test_cluster_toggle_load_w_sesame-70": Created
2011-08-01 14:33:55 info Queue "test_cluster_toggle_load_w_sesame-7_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:55 info Queue "test_cluster_toggle_load_w_sesame-7_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:55 info Queue "test_cluster_toggle_load_w_sesame-7_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:55 info Queue "test_cluster_toggle_load_w_sesame-7_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:55 notice Journal "test_cluster_toggle_load_w_sesame-80": Created
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-8_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-8_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-8_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-8_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 notice Journal "test_cluster_toggle_load_w_sesame-90": Created
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-9_pub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-9_pub_start": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-9_sub_done": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "test_cluster_toggle_load_w_sesame-9_sub_ready": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 info Queue "qpid.cluster-update": Policy created: type=reject; maxCount=0; maxSize=104857600
2011-08-01 14:33:59 critical cluster(192.168.1.6:28490 UPDATEE) catch-up connection closed prematurely 10.34.26.114:5672-10.34.26.112:36610(192.168.1.6:28490-1 local,catchup)
2011-08-01 14:33:59 notice cluster(192.168.1.6:28490 LEFT) leaving cluster mycluster
2011-08-01 14:33:59 notice Shut down
This assumption in comment 6 is incorrect: "The test expects that when the service qpidd operation (4-1) returns, the broker is settled and synchronized with the cluster, and thus that the next operation (4-2) may safely shut down all other cluster nodes." When qpidd -d returns, the broker has done an initial handshake with the cluster but has not yet received the full state update. To wait for the full update to complete, connect a client to the newly started broker. The connection will not complete until the broker is fully ready. Any client will do; it doesn't have to do anything other than establish a connection.
Thanks for the clarification, Alan. The observed issue is expected behaviour, and the test code has to allow for the fact that synchronizing a newly joined broker takes additional time. -> CLOSED / NOTABUG
A note for testing: you can ensure the newbie is ready, without sleeping, by making a connection with any client you like, e.g. qpid-perftest --count 1. The connection will complete when the newbie is up to date, at which point it is safe to proceed with the test.
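As a sketch only (assuming the test drives the nodes over ssh; wait_until_broker_ready is the illustrative helper sketched earlier in this report, not an existing ctests.py function), steps 4-1 and 4-2 could be hardened like this:

import subprocess

def toggle_nodes(start_host, stop_host):
    # 4-1: start the new cluster member first ...
    subprocess.check_call(["ssh", start_host, "service qpidd start"])
    # ... then block until it has received the full cluster state update,
    # using the client-connection check sketched earlier.
    wait_until_broker_ready("%s:5672" % start_host)
    # 4-2: only now is it safe to stop the broker that provided the update.
    subprocess.check_call(["ssh", stop_host, "service qpidd stop"])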