corosync version 1.3.0:

When I run "corosync-pload" it prints:

# corosync-pload
Init result 1

The process never exits on its own (I can stop it with Ctrl-C), but it seems to work anyway:

Dec 22 09:32:46 maui corosync[2409]: [PLOAD ] 1500000 Writes 300 bytes per write 2.495 seconds runtime, 601307.250 TP/S, 172.035 MB/S.
Dec 22 09:32:53 maui corosync[2409]: [PLOAD ] 1500000 Writes 300 bytes per write 3.062 seconds runtime, 489821.674 TP/S, 140.139 MB/S.
Dec 22 09:33:01 maui corosync[2409]: [PLOAD ] 1500000 Writes 300 bytes per write 4.372 seconds runtime, 343112.460 TP/S, 98.165 MB/S.
Dec 22 09:33:09 maui corosync[2409]: [PLOAD ] 1500000 Writes 300 bytes per write 4.369 seconds runtime, 343358.870 TP/S, 98.236 MB/S.
Dec 22 09:33:53 maui corosync[2409]: [PLOAD ] 1500000 Writes 300 bytes per write 3.475 seconds runtime, 431594.847 TP/S, 123.480 MB/S.

If I now start cpgbench I get:

/corosync-1.3.0/test# ./cpgbench
463802 messages received 1000 bytes per write 10.000 Seconds runtime 46380.121 TP/s 46.380 MB/s.
470350 messages received 2000 bytes per write 10.000 Seconds runtime 47034.864 TP/s 94.070 MB/s.
460633 messages received 3000 bytes per write 10.000 Seconds runtime 46063.231 TP/s 138.190 MB/s.
443571 messages received 4000 bytes per write 10.000 Seconds runtime 44357.016 TP/s 177.428 MB/s.

Everything is OK so far, but if I also start corosync-pload, corosync crashes:

/corosync-1.3.0/test# ./cpgbench
…
cpg dispatch returned error 2

and the syslog shows:

Dec 22 09:39:45 maui corosync[2409]: [PLOAD ] 1500000 Writes 300 bytes per write 2.184 seconds runtime, 686771.055 TP/S, 196.487 MB/S.
Dec 22 09:40:03 maui dlm_controld[2479]: cluster is down, exiting
Dec 22 09:40:03 maui fenced[2464]: cluster is down, exiting
Dec 22 09:40:05 maui kernel: dlm: closing connection to node 3
corosync-pload is a developer-only test tool, and I believe we had this discussion on the ML some time ago, so priority is low.

Honza, can you look into removing the segfault that occurs in this test case?

Thanks
-steve
Created attachment 515732
Patch for cpg sent to ML

The totem_mcast function can return -1 if corosync is overloaded. Unfortunately, many callers of this function either did not handle the error code at all or handled it with an assert. This commit changes that behaviour to either return CS_ERR_TRY_AGAIN or pass the error code on to later layers to handle.
Created attachment 515733
Patch for cfg sent to ML

The totem_mcast function can return -1 if corosync is overloaded. Unfortunately, many callers of this function either did not handle the error code at all or handled it with an assert. This commit changes that behaviour to either return CS_ERR_TRY_AGAIN or pass the error code on to later layers to handle.
Created attachment 515838
Patch for cpg sent to ML

The totem_mcast function can return -1 if corosync is overloaded. Unfortunately, many callers of this function either did not handle the error code at all or handled it with an assert. This commit changes that behaviour to either return CS_ERR_TRY_AGAIN or pass the error code on to later layers to handle.

This patch differs from the previous version in that it stores group_name and pid so they can be restored in message_handler_req_lib_cpg_join.
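For readers following the thread, the change described in the patch notes above can be illustrated with a small standalone sketch. It does not reproduce the attached patches or any corosync code. Every name prefixed with sketch_ is hypothetical, the result codes are not the values of corosync's cs_error_t, and the multicast function is a stand-in with a made-up signature. The sketch only shows the general pattern: a send that fails under overload is reported as "try again" instead of tripping an assert, and the caller retries a bounded number of times.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch only; they are not the
 * values of corosync's cs_error_t. */
typedef enum {
    SKETCH_OK,
    SKETCH_ERR_TRY_AGAIN
} sketch_error_t;

/* Simulated send queue: how many more messages it can accept. */
int sketch_free_slots = 0;

/* Stand-in for a multicast call that returns -1 when overloaded.
 * The signature is an assumption, not the real totem API. */
int sketch_mcast(const void *msg, size_t len)
{
    (void)msg;
    (void)len;
    if (sketch_free_slots == 0) {
        return -1;
    }
    sketch_free_slots--;
    return 0;
}

/* Service side: report overload to the caller instead of asserting. */
sketch_error_t sketch_handle_request(const void *msg, size_t len)
{
    if (sketch_mcast(msg, len) != 0) {
        return SKETCH_ERR_TRY_AGAIN;
    }
    return SKETCH_OK;
}

/* Simulates the queue draining by one message between attempts. */
void sketch_drain_one(void)
{
    sketch_free_slots++;
}

/* Client side: retry a bounded number of times while the service
 * answers "try again". Stores the number of attempts made. */
sketch_error_t sketch_send_with_retry(const void *msg, size_t len,
                                      int max_attempts, int *attempts_used)
{
    sketch_error_t res = SKETCH_ERR_TRY_AGAIN;
    int attempt;

    *attempts_used = 0;
    for (attempt = 1; attempt <= max_attempts; attempt++) {
        res = sketch_handle_request(msg, len);
        *attempts_used = attempt;
        if (res != SKETCH_ERR_TRY_AGAIN) {
            break;
        }
        /* A real client would sleep or poll here; the sketch simply
         * frees one slot to model the queue draining. */
        sketch_drain_one();
    }
    return res;
}
```

With sketch_free_slots set to 0, a call such as sketch_send_with_retry("x", 1, 3, &n) fails on the first attempt, succeeds on the second, and leaves n at 2. Bounding the retries keeps a client from spinning forever if the overload never clears.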
The patch is now included in the flatiron branch, so it will be part of the next release.