Bug 529498 - /etc/init.d/cman fails in set_networking_params with 3.0.2 and 3.0.3
Summary: /etc/init.d/cman fails in set_networking_params with 3.0.2 and 3.0.3
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: cluster
Version: 11
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Fabio Massimo Di Nitto
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-10-17 17:40 UTC by Thomas Sjolshagen
Modified: 2009-10-21 14:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-10-19 04:48:31 UTC
Type: ---
Embargoed:


Attachments
/etc/sysconfig/cman file (181 bytes, text/plain)
2009-10-17 17:41 UTC, Thomas Sjolshagen
Log from "bash -x /etc/init.d/cman start" run (6.69 KB, text/plain)
2009-10-17 17:46 UTC, Thomas Sjolshagen
proposed fix (561 bytes, patch)
2009-10-17 17:57 UTC, Fabio Massimo Di Nitto

Description Thomas Sjolshagen 2009-10-17 17:40:09 UTC
Description of problem:

When running the /etc/init.d/cman startup script, it fails while executing the set_networking_params() function on both members of my Fedora 11 based cluster.

Version-Release number of selected component (if applicable):

cman-3.0.3-1.fc11.x86_64
openaislib-1.1.0-1.fc11.x86_64
openais-1.1.0-1.fc11.x86_64
rgmanager-3.0.3-1.fc11.x86_64
gfs2-utils-3.0.3-1.fc11.x86_64
lvm2-cluster-2.02.48-2.fc11.x86_64
corosynclib-1.1.0-1.fc11.x86_64
corosync-1.1.0-1.fc11.x86_64
kernel-2.6.30.8-64.fc11.x86_64

How reproducible:

Every time

Steps to Reproduce:
1. Boot cluster node with /etc/init.d/cman enabled

Or
1. service cman start

  
Actual results:

"Setting network parameters...        [FAILED]"

and the cman script stops executing, so the cluster member does not join the cluster.

Expected results:

"Setting network parameters...        [OK]"

and cman script completing with the node having joined the cluster.

Additional info:

Attaching a log file showing that, because the existing (default) /proc/sys/net/core/rmem_max value is _greater_ than the expected value, setting it to the value the cluster needs/wants fails.

I would think the test should validate that rmem_max (and rmem_default) are set to a value greater than or equal to what the cluster stack needs. If they are, startup proceeds; if not, the values get raised. Other (3rd party) applications may require a higher default network read buffer value than the cluster software stack needs on its own.
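
The check suggested above could look roughly like the following sketch. This is not the actual cman init script code, nor the patch attached to this bug. The helper name ensure_min_value and its arguments are hypothetical, and the minimum value the cluster stack needs is left as a parameter rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper (not from the cman init script): make sure the number
# stored in a sysctl-style file is at least the wanted minimum.
#   $1 = path to the file, e.g. /proc/sys/net/core/rmem_max
#   $2 = minimum value the caller requires
ensure_min_value() {
    emv_file=$1
    emv_wanted=$2

    # Read the value currently in effect; fail if the file is unreadable.
    emv_current=$(cat "$emv_file") || return 1

    if [ "$emv_current" -ge "$emv_wanted" ]; then
        # Already large enough (possibly raised for another application):
        # leave it untouched and report success.
        return 0
    fi

    # Too small for the caller: raise it to the required minimum.
    echo "$emv_wanted" > "$emv_file"
}
```

An init script might call it once per file, for example `ensure_min_value /proc/sys/net/core/rmem_max "$wanted"` and again for rmem_default, where `$wanted` stands for whatever minimum the cluster stack requires. A larger existing value is then never treated as a failure.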

Comment 1 Thomas Sjolshagen 2009-10-17 17:41:23 UTC
Created attachment 365130
/etc/sysconfig/cman file

Comment 2 Thomas Sjolshagen 2009-10-17 17:46:17 UTC
Created attachment 365131
Log from "bash -x /etc/init.d/cman start" run

Log file showing failed /etc/init.d/cman start.

Comment 3 Fabio Massimo Di Nitto 2009-10-17 17:57:05 UTC
Created attachment 365133
proposed fix

Please patch /etc/init.d/cman and test.

The patch should address the issue.

Thanks

Comment 4 Thomas Sjolshagen 2009-10-18 16:49:17 UTC
Tested the patch. The cman service now starts with set_networking_params enabled as part of the start action.

Comment 5 Fabio Massimo Di Nitto 2009-10-19 04:48:31 UTC
Fix is now upstream.

git commit 1ece3abed41a6debf4175201c4061108e9034e68

Fabio

Comment 6 Gianluca Cecchi 2009-10-21 13:58:10 UTC
Works for me as well.
I had the same problem after updating from version 3.0.2-1.fc11.x86_64 to 3.0.3-1.fc11.x86_64.
Without the proposed patch I get:
[root@r]# service cman start
Starting cluster:
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Setting network parameters... FATAL: Module lock_dlm not found.
                                                           [FAILED]

Now with the proposed patch all is ok.
Thanks,
Gianluca

Comment 7 Fabio Massimo Di Nitto 2009-10-21 14:11:03 UTC
Update packages for F11 are available in koji and bodhi.

They should be available "soonish" (it's a manual process) in the f10 and f11 updates channels.

Fabio

