Bug 245893

Summary: Can not start SCTP association on FC6
Product: Red Hat Enterprise Linux 5
Reporter: Frederic Hornain <fhornain>
Component: kernel
Assignee: David Teigland <teigland>
Status: CLOSED WONTFIX
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: low
Priority: low
Version: 5.0
CC: ccaulfie, cluster-maint
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-06-27 09:16:33 UTC

Description Frederic Hornain 2007-06-27 09:03:01 UTC
Description of problem:
Hi,

I have tried to run the cluster suite on FC6, but I still have difficulties 
getting it to work. I use it as an IP failover cluster.
- I just want to ping the logical IP of these 2 cluster nodes. -

To summarize:
I have two virtual nodes based on Xen.

Both run FC6 with the following packages installed:
cman-2.0.18-2.fc6
rgmanager-2.0.8-1.fc6
gfs2-utils-0.1.7-1.fc6
lvm2-cluster-2.02.06-1.5

I set it up with manual fencing, for test purposes only, since I cannot afford a 
fence device at the moment. :)
BTW, I know this configuration does not need clvm and gfs2, but I installed 
them because I thought they would fix some earlier problems. Since they were 
already installed, I left them in place.

I can start cman, clvmd, gfs2 and rgmanager on both nodes without problems. 
However, as soon as I tell rgmanager to start a resource or a service, it 
hangs for a while.

This is what I managed to retrieve from the log files on both nodes:

dlm : Error sending to node X -32
dlm : Can't start SCTP association - retrying
dlm : Initiating association with node X

and so on and so forth...
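
For reference, the -32 in the first message reads like a negated Linux errno, where 32 is EPIPE (broken pipe). A minimal Python sketch for decoding such lines follows. The regular expression is an assumption based only on the three messages quoted above, and decode_send_error is a hypothetical helper name, not part of any cluster tool.

```python
import errno
import os
import re

# Pattern for the "Error sending" line quoted above. The format is assumed
# from the log lines in this report, not taken from the DLM source.
SEND_ERROR = re.compile(r"dlm\s*:\s*Error sending to node (\S+) (-?\d+)")


def decode_send_error(line):
    """Return (node, errno_name, message) for a send-error line, else None."""
    match = SEND_ERROR.search(line)
    if match is None:
        return None
    node = match.group(1)
    # Kernel code usually reports errors as negative errno values.
    code = abs(int(match.group(2)))
    name = errno.errorcode.get(code, "UNKNOWN")
    return node, name, os.strerror(code)


if __name__ == "__main__":
    print(decode_send_error("dlm : Error sending to node X -32"))
```

A broken pipe on send would be consistent with the peer never accepting the association, but that reading is a guess from the error code alone.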

Oh yes, just after launching rgmanager I also saw the following message:
Module sctp cannot be unloaded due to the unsafe usage in the 
net/sctp/protocol.c:1189

Obviously the service does not work, but the cluster seems active because I get 
correct information when I run
cman_tool status
cman_tool services
cman_tool nodes
clustat
ccs_tool
group_tool
...
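
To attach the state of both nodes to a report like this one, the status commands can be captured in one pass. The sketch below is only an illustration: it uses the commands from the list above that are given with their arguments, and collect is a hypothetical helper name. It assumes Python 3.7 or later on the node, which is an assumption about the environment, not something FC6 provides.

```python
import subprocess

# Commands from the list above that are given with complete arguments.
STATUS_COMMANDS = [
    ["cman_tool", "status"],
    ["cman_tool", "services"],
    ["cman_tool", "nodes"],
    ["clustat"],
]


def collect(commands=STATUS_COMMANDS):
    """Run each command and return {command line: output or error text}."""
    report = {}
    for cmd in commands:
        key = " ".join(cmd)
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=30
            )
            report[key] = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Missing binary or a hung tool: record it and keep going.
            report[key] = "failed to run: %s" % exc
    return report


if __name__ == "__main__":
    for name, output in collect().items():
        print("== %s ==\n%s" % (name, output))
```

Running it on each node and comparing the two outputs shows whether both nodes agree on membership while the service is hanging.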

Version-Release number of selected component (if applicable):
cman-2.0.18-2.fc6
rgmanager-2.0.8-1.fc6
gfs2-utils-0.1.7-1.fc6
lvm2-cluster-2.02.06-1.5
kernel-xen-2.6.18-1.2798.fc6


How reproducible:
It is easily reproducible, I think.


Steps to Reproduce:
1. Install a basic Fedora Core 6 without updating it.
2. Use the Xen virtual console.
3. Create two virtual machines.
4. Give each one a hostname and update /etc/hosts.
5. Install the previously listed packages on both nodes.
6. Create the same cluster.conf on both nodes, with manual fencing and IP 
failover.
7. Start cman
8. Start gfs2
9. Start clvmd
10. Start rgmanager
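
The four start steps above can be scripted so that both nodes follow the same order. This is a sketch under assumptions: it supposes each package installs an init script with the same name as the daemon and that it is started through the service command. start_commands and start_all are hypothetical helper names.

```python
import subprocess

# Start order exactly as listed in the steps above.
START_ORDER = ["cman", "gfs2", "clvmd", "rgmanager"]


def start_commands(services=START_ORDER):
    """Build one 'service <name> start' command per daemon, in order."""
    return [["service", name, "start"] for name in services]


def start_all(dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for cmd in start_commands():
        if dry_run:
            print(" ".join(cmd))
            continue
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    start_all(dry_run=True)
```

With dry_run=True it only prints the commands, which is a safe way to check the order before running it as root on a node.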
  
Actual results:
On the qemu console (text mode) you should see:
dlm : Error sending to node X -32
dlm : Can't start SCTP association - retrying
dlm : Initiating association with node X

Expected results:

The cluster should work: rgmanager starts the service and the logical IP 
answers pings.

Additional info:
First, you can contact me via email.
Second, if you run the same start sequence on only one node, you get a 
single-node cluster that works perfectly.

Comment 1 Christine Caulfield 2007-06-27 09:16:33 UTC
This was fixed for RHEL5, but I very much doubt the patches got into FC6.

You'll need to get the RHEL5 kernel or the latest upstream kernel if you want to
use SCTP with the DLM.

Comment 2 Nate Straz 2007-12-13 17:40:49 UTC
Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5 which never existed.