Bug 506776

Summary: token timeout 1000 doesn't work
Product: [Fedora] Fedora
Component: corosync
Reporter: David Teigland <teigland>
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: agk, ccaulfie, cfeist, fdinitto, mbroz, sdake, swhiteho
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-07-24 07:34:07 UTC

Description David Teigland 2009-06-18 16:16:54 UTC
Description of problem:

cluster.conf:  <totem token="1000"/>

# cman_tool join
corosync died: Could not read cluster configuration

Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Corosync Executive Service RELEASE 'trunk'
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Copyright (C) 2006-2008 Red Hat, Inc.
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Corosync Executive Service: started and ready to provide service.
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Successfully parsed cman config
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] Successfully configured openais services to load
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] parse error in config: The token hold timeout parameter (29 ms) may not be less then (30 ms).
Jun 18 11:11:22 bull-01 corosync[26149]:   [MAIN  ] AIS Executive exiting with status -9 at main.c:823.

Comment 1 Steven Dake 2009-06-18 16:28:28 UTC
This is a cman bug. Chrissie, we can talk on IRC.

Regards
-steve

Comment 2 Christine Caulfield 2009-06-19 07:50:59 UTC
This also happens if I add the line to corosync.conf, with no cman loaded ;-)

Comment 3 Steven Dake 2009-07-16 19:34:54 UTC
What line did you add?

I added token: 1000 to corosync.conf and it seems to work fine standalone. The problem is that retrans_before_loss_const is too high in the cman configuration loader (it's something like 20) for a token timeout of 1000. That value is used to derive token_hold_time automatically, and in this case the result falls below the minimum allowed value (30 msec is the lowest timer value available).

Regards
-steve
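
To make the arithmetic in this comment concrete, here is a minimal Python sketch of how the hold timeout falls out of token and the retransmit constant. The two formulas and HZ = 100 are recalled from corosync's totemconfig.c and should be treated as assumptions, not as the authoritative implementation; the function and constant names are invented for illustration. With token=1000 and a constant of 20, the sketch yields 29 ms, which matches the value in the log above.

```python
# Sketch only: formulas and HZ are assumptions recalled from corosync's
# totemconfig.c; names below are hypothetical, not corosync identifiers.
HZ = 100                                  # assumed timer frequency
MINIMUM_TIMEOUT_MS = int(1000 / HZ) * 3   # 30 ms, the floor named in the log


def derive_token_hold_ms(token_ms, retransmits_before_loss_const):
    """Derive the token hold timeout (ms) when it is not set explicitly."""
    # Retransmit timeout: token timeout spread over the retransmit count.
    token_retransmit_ms = int(token_ms / (retransmits_before_loss_const + 0.2))
    # Hold timeout: 80% of the retransmit timeout, minus one timer tick.
    return int(token_retransmit_ms * 0.8 - 1000 / HZ)


def hold_timeout_is_valid(token_ms, retransmits_before_loss_const):
    """True if the derived hold timeout meets the minimum timer value."""
    hold_ms = derive_token_hold_ms(token_ms, retransmits_before_loss_const)
    return hold_ms >= MINIMUM_TIMEOUT_MS


if __name__ == "__main__":
    for const in (20, 4):
        hold = derive_token_hold_ms(1000, const)
        print(f"token=1000 const={const} -> hold={hold} ms, "
              f"valid={hold_timeout_is_valid(1000, const)}")
```

With the same token and a smaller constant such as 4, the sketch gives 180 ms, comfortably above the 30 ms floor.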

Comment 4 Christine Caulfield 2009-07-17 07:32:30 UTC
I'd guessed it would be something like that. In which case we simply need to make sure that people change the parameters properly - as they would have to when changing corosync.conf.

So, can we automatically adjust retrans_before_loss_const when token changes? If not, then I think we should just close this NOTABUG and let people adjust the parameters themselves. You ARE supposed to know what you're doing when changing these things anyway, not blindly adjusting them until something visible changes.

If it is possible to adjust the parameters then it ought to happen in corosync, not cman anyway...
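
As a rough illustration of what such an automatic adjustment could look like, the sketch below searches for the largest retransmit constant whose derived hold timeout still meets the 30 ms minimum. It is hypothetical: the derivation formula and HZ = 100 are assumptions recalled from corosync's totemconfig.c, the helper names are made up, and it ignores every other totem sanity check.

```python
# Hypothetical sketch of auto-adjusting the retransmit constant.
# Formula and HZ are assumptions; this is not corosync or cman code.
HZ = 100
MINIMUM_TIMEOUT_MS = int(1000 / HZ) * 3   # 30 ms


def derived_hold_ms(token_ms, const):
    """Assumed derivation of the token hold timeout in milliseconds."""
    token_retransmit_ms = int(token_ms / (const + 0.2))
    return int(token_retransmit_ms * 0.8 - 1000 / HZ)


def max_retransmit_const(token_ms, upper_bound=100):
    """Largest constant in 1..upper_bound with a valid hold timeout.

    Returns None when even a constant of 1 gives a hold timeout
    below the minimum, i.e. the token timeout itself is too small.
    """
    best = None
    for const in range(1, upper_bound + 1):
        if derived_hold_ms(token_ms, const) < MINIMUM_TIMEOUT_MS:
            break  # the hold timeout only shrinks as the constant grows
        best = const
    return best


if __name__ == "__main__":
    print(max_retransmit_const(1000))
```

Under these assumptions, a token of 1000 would allow a constant of at most 19, one below the value of roughly 20 mentioned in comment 3.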