Bug 613880 - cluster.conf fails to validate when <totem ... > is set.
Status: CLOSED DUPLICATE of bug 614697
Product: Fedora
Classification: Fedora
Component: cluster
Version: 13
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-07-13 00:56 EDT by digimer
Modified: 2010-07-15 10:29 EDT (History)
Doc Type: Bug Fix
Last Closed: 2010-07-15 10:29:28 EDT
Attachments
the cluster.conf file that fails to validate. (3.46 KB, text/plain)
2010-07-13 00:56 EDT, digimer
The cluster.rng file used to validate. (115.45 KB, application/octet-stream)
2010-07-13 00:57 EDT, digimer
Fixed cluster.conf which validates. (3.42 KB, text/plain)
2010-07-13 14:34 EDT, Lon Hohberger

Description digimer 2010-07-13 00:56:30 EDT
Created attachment 431346 [details]
the cluster.conf file that fails to validate.

Description of problem:

I added the following to '/etc/cluster/cluster.conf':
-------------------------------------------------------
<cluster name="an-cluster" config_version="2">
        <totem rppmode="passive" version="2" secauth="off" threads="off">
                <interface ringnumber="0" bindnetaddr="10.0.1.0" mcastaddr="226.94.1.1" mcastport="5405" />
                <interface ringnumber="1" bindnetaddr="10.0.0.0" mcastaddr="226.94.1.2" mcastport="5405" />
        </totem>
        ...
</cluster>
-------------------------------------------------------

When I then tried to validate it with 'ccs_config_validate' I got:
-------------------------------------------------------
Relax-NG validity error : Extra element totem in interleave
tempfile:3: element totem: Relax-NG validity error : Element cluster failed to validate content
Configuration fails to validate
-------------------------------------------------------


Version-Release number of selected component (if applicable):

See attached cluster.conf and cluster.rng files. Please note that the cluster.rng in use contains a modification from source to add support for a custom fence device. The cluster.conf failed validation against a stock cluster.rng as well.

How reproducible:

Seems to be 100%

Steps to Reproduce:
1. Add the above <totem ...> syntax
2. Try to validate.
Actual results:

Validation failed.

Expected results:

Validation passed.

Additional info:
Comment 1 digimer 2010-07-13 00:57:22 EDT
Created attachment 431347 [details]
The cluster.rng file used to validate.
Comment 2 Lon Hohberger 2010-07-13 14:28:19 EDT
The following keyword is incorrect:

   rppmode="active"

Should be:

   rrp_mode="passive"

The following two keywords are not supported at this time by cman-preconfig; hence they are not valid as part of cluster.conf at this time:

   version="2"
   threads="off"

This should work:

<cluster name="an-cluster" config_version="2">
        <totem rrp_mode="passive" secauth="off">
                <interface ringnumber="0" bindnetaddr="10.0.1.0" mcastaddr="226.94.1.1" mcastport="5405" />
                <interface ringnumber="1" bindnetaddr="10.0.0.0" mcastaddr="226.94.1.2" mcastport="5405" />
        </totem>
        ...
</cluster>
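
The difference between the reported and the corrected <totem> element can be checked mechanically. The following is a minimal Python sketch, not part of any cluster tooling. It only compares the attribute names of the two <totem> start tags quoted in this bug, and the function name totem_attr_diff is made up for illustration.

```python
import xml.etree.ElementTree as ET

# <totem> element as originally reported (attributes only; children omitted).
ORIGINAL = '<totem rppmode="passive" version="2" secauth="off" threads="off" />'
# <totem> element from the corrected configuration in this comment.
FIXED = '<totem rrp_mode="passive" secauth="off" />'


def totem_attr_diff(before_xml, after_xml):
    """Return (removed, added) attribute names between two XML elements."""
    before = set(ET.fromstring(before_xml).attrib)
    after = set(ET.fromstring(after_xml).attrib)
    return sorted(before - after), sorted(after - before)


removed, added = totem_attr_diff(ORIGINAL, FIXED)
print("removed:", removed)
print("added:", added)
```

This says nothing about what the schema accepts; it only lists which attribute names changed between the two snippets.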
Comment 3 Lon Hohberger 2010-07-13 14:29:04 EDT
(In reply to comment #2)
> The following keyword is incorrect:
> 
>    rppmode="active"
> 

Oops, I meant:

     rppmode="passive"
Comment 4 Lon Hohberger 2010-07-13 14:33:35 EDT
There is also no handling of the quiet="1" parameter in the <fencedevice> tags at this point.
Comment 5 Lon Hohberger 2010-07-13 14:34:50 EDT
Created attachment 431548 [details]
Fixed cluster.conf which validates.
Comment 6 digimer 2010-07-13 14:39:14 EDT
(In reply to comment #4)
> There is also no handling of the quiet="1" parameter in the <fencedevice> tags
> at this point.    

I built a new fence device, and that argument is used by the fence agent I added. I also added a matching entry to cluster.rng so that the configuration validates against it. I am working on adding support upstream.
Comment 7 digimer 2010-07-13 14:40:28 EDT
(In reply to comment #5)
> Created an attachment (id=431548) [details]
> Fixed cluster.conf which validates.    

I will test this tonight. I suspect it will work so this bug is probably safe to close.

Is there a comprehensive list of what openais/corosync arguments are and are not currently supported by cman's cluster.conf?
Comment 8 digimer 2010-07-14 00:06:58 EDT
I made the changes (used the cluster.conf attached here and marked as fixed) and it validates. However, on starting cman, I get this:

Jul 14 00:04:20 an-node01 kernel: DLM (built Jul  6 2010 22:33:59) installed
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Corosync built-in features: nss rdma
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Successfully parsed cman config
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Successfully configured openais services to load
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] parse error in config: No multicast address specified
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1430.
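
For anyone scanning a longer log for the same failure, here is a small Python sketch that pulls out the corosync parse error and exit status. It assumes the message wording matches the lines quoted above; the function name summarize and the two regular expressions are illustrative, not part of corosync or cman.

```python
import re

# Log lines copied from this comment (shortened to the relevant ones).
LOG = """\
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Successfully parsed cman config
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] parse error in config: No multicast address specified
Jul 14 00:04:20 an-node01 corosync[2364]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1430.
"""

# Patterns assume the exact wording seen in the log above.
PARSE_ERROR = re.compile(r"parse error in config: (.+)$")
EXIT_STATUS = re.compile(r"exiting with status (\d+)")


def summarize(log_text):
    """Collect config parse errors and the last reported exit status."""
    errors, status = [], None
    for line in log_text.splitlines():
        match = PARSE_ERROR.search(line)
        if match:
            errors.append(match.group(1).strip())
        match = EXIT_STATUS.search(line)
        if match:
            status = int(match.group(1))
    return errors, status


errors, status = summarize(LOG)
print(errors, status)
```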
Comment 9 Fabio Massimo Di Nitto 2010-07-14 00:20:55 EDT
(In reply to comment #2)
> The following keyword is incorrect:
> 
>    rppmode="active"
> 
> Should be:
> 
>    rrp_mode="passive"
> 
> The following two keywords are not supported at this time by cman-preconfig;
> hence they are not valid as part of cluster.conf at this time:
> 
>    version="2"
>    threads="off"
> 

This shouldn't be a problem at all. cman-preconfig copies the corosync config bits unmodified from within <cluster> to the top level of the objdb, where corosync can access them.

So, theoretically, any corosync config option can be changed from within cluster.conf. Whether it makes sense to do so is a separate question that still stands.
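
As a rough illustration of the "copy to the top level" idea, here is a toy Python model. It is not cman-preconfig's code and does not use the real objdb API. The plain dict standing in for the objdb, the function names, and the choice of "totem" as the only copied section are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

CLUSTER_CONF = """\
<cluster name="an-cluster" config_version="2">
    <totem rrp_mode="passive" secauth="off">
        <interface ringnumber="0" bindnetaddr="10.0.1.0" mcastaddr="226.94.1.1" mcastport="5405" />
    </totem>
</cluster>
"""

# Hypothetical: which children of <cluster> count as "corosync config bits".
COROSYNC_SECTIONS = {"totem"}


def element_to_dict(elem):
    """Convert an element to a plain dict, keeping attributes unchanged."""
    return {
        "attrs": dict(elem.attrib),
        "children": [(child.tag, element_to_dict(child)) for child in elem],
    }


def hoist_corosync_sections(conf_xml):
    """Copy selected children of <cluster> to the top level of a dict."""
    root = ET.fromstring(conf_xml)
    return {
        child.tag: element_to_dict(child)
        for child in root
        if child.tag in COROSYNC_SECTIONS
    }


# The dict plays the role of the objdb top level in this toy model.
objdb = hoist_corosync_sections(CLUSTER_CONF)
print(objdb["totem"]["attrs"])
```

The point of the sketch is only that the attributes are passed through untouched, so nothing in this step would reject an option such as version or threads.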
Comment 10 Lon Hohberger 2010-07-14 13:37:49 EDT
It's trivial to add corosync bits to cluster.conf schema; whatever we decide is fine.

I didn't realize cman-preconfig would just pass things up, so it's my error.
Comment 11 digimer 2010-07-14 13:57:43 EDT
(In reply to comment #10)
> It's trivial to add corosync bits to cluster.conf schema; whatever we decide is
> fine.
> 
> I didn't realize cman-preconfig would just pass things up, so it's my error.    

Will this lead to an updated cluster.rng?
Comment 12 Lon Hohberger 2010-07-15 10:29:28 EDT
As it turns out, according to bug 614697, you can't use cluster.conf to configure RRP mode using corosync directives when using a CMAN cluster:

https://bugzilla.redhat.com/show_bug.cgi?id=614697

See here for the correct way to configure RRP mode with CMAN-managed clusters:

http://sources.redhat.com/cluster/wiki/MultiHome

*** This bug has been marked as a duplicate of bug 614697 ***
