Description of problem:
When running the /etc/init.d/cman startup script, it fails while executing the set_networking_params() function on both members of my Fedora 11 based cluster.

Version-Release number of selected component (if applicable):
cman-3.0.3-1.fc11.x86_64
openaislib-1.1.0-1.fc11.x86_64
openais-1.1.0-1.fc11.x86_64
rgmanager-3.0.3-1.fc11.x86_64
gfs2-utils-3.0.3-1.fc11.x86_64
lvm2-cluster-2.02.48-2.fc11.x86_64
corosynclib-1.1.0-1.fc11.x86_64
corosync-1.1.0-1.fc11.x86_64
kernel-2.6.30.8-64.fc11.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Boot cluster node with /etc/init.d/cman enabled
Or
1. service cman start

Actual results:
"Setting network parameters... [FAILED]" and the cman script stops executing, so the cluster member does not join the cluster.

Expected results:
"Setting network parameters... [OK]" and the cman script completes with the node having joined the cluster.

Additional info:
Attaching a log file showing that, because the default (existing) /proc/sys/net/core/rmem_max value is _greater_ than the expected value, setting the value to what the cluster needs fails.

I would expect the script to check whether rmem_max (and rmem_default) are already greater than or equal to what the cluster stack needs. If they are, startup should proceed; if not, the values should be raised. Other (third-party) applications may require a larger default network read buffer than the cluster software stack needs on its own.
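A minimal sketch of the check suggested above, for illustration only. It is not the attached patch and not the code in /etc/init.d/cman. The helper name ensure_min_value and its arguments are hypothetical; the required value is passed in by the caller rather than taken from the cman script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cman init script):
# raise a tunable only if its current value is below the required minimum.
#   $1 = path to the tunable, e.g. /proc/sys/net/core/rmem_max
#   $2 = minimum value the caller needs (illustrative, supplied by the caller)
ensure_min_value() {
    local path="$1"
    local required="$2"
    local current

    # Read the current value; give up if the file cannot be read.
    current=$(cat "$path") || return 1

    # Already large enough: leave it alone so that other applications
    # that asked for a bigger buffer are not affected.
    if [ "$current" -ge "$required" ]; then
        return 0
    fi

    # Too small: raise it to the required minimum.
    echo "$required" > "$path"
}
```

In an init script this could be called once for /proc/sys/net/core/rmem_max and once for /proc/sys/net/core/rmem_default, with the function's return status deciding whether the step reports [OK] or [FAILED].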
Created attachment 365130 [details] /etc/sysconfig/cman file
Created attachment 365131 [details] Log from "bash -x /etc/init.d/cman start" run
Log file showing failed /etc/init.d/cman start.
Created attachment 365133 [details] proposed fix
Please patch /etc/init.d/cman and test. The patch should address the issue.
Thanks
Tested the patch. The cman service now starts with set_networking_params enabled as part of the start action.
Fix is now upstream.
git commit 1ece3abed41a6debf4175201c4061108e9034e68
Fabio
Works for me as well. I had the same problem after updating from version 3.0.2-1.fc11.x86_64 to 3.0.3-1.fc11.x86_64.
Without the proposed patch I get:
[root@r]# service cman start
Starting cluster:
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Setting network parameters... FATAL: Module lock_dlm not found. [FAILED]
Now with the proposed patch all is ok.
Thanks,
Gianluca
Update packages for F11 are available in koji and bodhi. They should be available "soonish" (it's a manual process) in the f10 and f11 updates channels.
Fabio