Description of problem:
The hash in generate_cluster_id (cnxman.c) is not good enough: in several cases, different cluster_name values produce the same cluster_id, and cman then fails to start.

Version-Release number of selected component (if applicable):
All versions (generate_cluster_id has not changed from Update 2 to Update 4).

How reproducible:
For example, set cluster_name to iocell13 for one HA pair and to iocell21 for a second one; cluster_id will be 26773 for both.

As a simulation: with more than about 20 HA cluster pairs on the same network, there is a high probability of getting several identical cluster_ids. For example, I wrote a program around the generate_cluster_id function that checks all generated cluster IDs. It takes a name prefix as its first parameter and the number of HA pairs as its second. The result is:

-----------------------------------
(1st column is cluster_name, 2nd is cluster_id)
CS4[bas4v3] checkclusterid iocell 30
iocell0     13360
iocell1     13361
iocell2     13362
iocell3     13363
iocell4     13364
iocell5     13365
iocell6     13366
iocell7     13367
iocell8     13368
iocell9     13369
iocell10    26770
iocell11    26771
iocell12    26772
iocell13    26773
iocell14    26774
iocell15    26775
iocell16    26776
iocell17    26777
iocell18    26778
iocell19    26779
iocell20    26772
iocell21    26773
iocell22    26774
iocell23    26775
iocell24    26776
iocell25    26777
Aiocell26   43418
Aiocell27   43419
Aiocell28   43420
Aiocell29   43421
SORRY / Some cluster ids are identical
--------------------------------------

Whatever prefix you use for the cluster_name, you always get several identical cluster_ids.

Steps to Reproduce:
See above.

Expected results:
After discussion on the RH CS4 mailing list, a good solution would be to add a cluster_id field to cluster.conf so that the cluster_id can be forced at configuration time. If the field is absent, the current generate_cluster_id function would be applied as usual.
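The collision can be reproduced in user space. The following is a minimal sketch, assuming the hash is a shift-and-add over the name truncated to 16 bits. That form is an assumption, not a quotation from cnxman.c, although it does reproduce IDs from the listing above (13360 for iocell0, 26773 for both iocell13 and iocell21). The helper count_collisions is hypothetical and is not the reporter's checkclusterid program.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed form of the hash: shift left by one, add the next character,
 * keep the low 16 bits. Not copied from cnxman.c. */
static uint16_t generate_cluster_id(const char *name)
{
    uint32_t value = 0;
    size_t i, len = strlen(name);

    for (i = 0; i < len; i++) {
        value <<= 1;
        value += (unsigned char)name[i];
    }
    return (uint16_t)(value & 0xFFFF);
}

/* Hypothetical helper: counts the names "<prefix>0" .. "<prefix><count-1>"
 * whose ID was already produced by an earlier name in the sequence. */
int count_collisions(const char *prefix, int count)
{
    static unsigned char seen[65536];
    char name[64];
    int n, collisions = 0;

    memset(seen, 0, sizeof(seen));
    for (n = 0; n < count; n++) {
        uint16_t id;

        snprintf(name, sizeof(name), "%s%d", prefix, n);
        id = generate_cluster_id(name);
        if (seen[id])
            collisions++;
        seen[id] = 1;
    }
    return collisions;
}
```

Under this assumption the last character is added unshifted and the one before it is only doubled, so iocell1N and iocell2N overlap whenever the final digits differ by two: count_collisions("iocell", 20) returns 0, while count_collisions("iocell", 30) returns 8.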
Created attachment 143636
Barely tested patch

Here's a quick patch. Add <cman cluster_id="12345"/> to your cluster.conf file.
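The intended behaviour (use the configured ID if present, otherwise fall back to the hash) might look roughly like the sketch below. This is not the attached patch: resolve_cluster_id is a hypothetical name, treating 0 or an out-of-range value as "not configured" is an assumption, and the hash body is an assumed shift-and-add form rather than code copied from cnxman.c.

```c
#include <stdint.h>
#include <string.h>

/* Assumed shift-and-add hash, truncated to 16 bits (not from cnxman.c). */
static uint16_t generate_cluster_id(const char *name)
{
    uint32_t value = 0;
    size_t i, len = strlen(name);

    for (i = 0; i < len; i++)
        value = (value << 1) + (unsigned char)name[i];
    return (uint16_t)(value & 0xFFFF);
}

/* Hypothetical: prefer an explicitly configured cluster_id and only
 * hash the name when none was given. A value of 0, or one that does
 * not fit in 16 bits, is treated as "not configured" (an assumption). */
uint16_t resolve_cluster_id(const char *name, int configured_id)
{
    if (configured_id > 0 && configured_id <= 0xFFFF)
        return (uint16_t)configured_id;
    return generate_cluster_id(name);
}
```

With such a fallback, two clusters whose names hash to the same value can still be separated by giving at least one of them an explicit cluster_id, while existing configurations without the attribute keep their current ID.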
Checked in for RHEL4:

Checking in cman-kernel/src/cnxman-socket.h;
/cvs/cluster/cluster/cman-kernel/src/Attic/cnxman-socket.h,v  <--  cnxman-socket.h
new revision: 1.7.2.3; previous revision: 1.7.2.2
done
Checking in cman-kernel/src/cnxman.c;
/cvs/cluster/cluster/cman-kernel/src/Attic/cnxman.c,v  <--  cnxman.c
new revision: 1.42.2.26; previous revision: 1.42.2.25
done
Checking in cman/cman_tool/cman_tool.h;
/cvs/cluster/cluster/cman/cman_tool/cman_tool.h,v  <--  cman_tool.h
new revision: 1.3.2.5; previous revision: 1.3.2.4
done
Checking in cman/cman_tool/join.c;
/cvs/cluster/cluster/cman/cman_tool/join.c,v  <--  join.c
new revision: 1.12.2.11; previous revision: 1.12.2.10
done
Checking in cman/cman_tool/join_ccs.c;
/cvs/cluster/cluster/cman/cman_tool/join_ccs.c,v  <--  join_ccs.c
new revision: 1.7.2.8; previous revision: 1.7.2.7
done
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0134.html