Bug 556804 - cman sets token_retransmits_before_loss_const wrongly
Summary: cman sets token_retransmits_before_loss_const wrongly
Keywords:
Status: CLOSED DUPLICATE of bug 623176
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-01-19 13:34 UTC by Christine Caulfield
Modified: 2011-03-21 23:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-03-21 23:26:20 UTC
Target Upstream Version:
Embargoed:



Description Christine Caulfield 2010-01-19 13:34:36 UTC
Description of problem:

cman currently sets the corosync parameter totem.token_retransmits_before_loss_const to 20. The reasons for this seem to be lost in the mists of time, but recent testing of openais timeouts suggests that it is unhelpful at best, and may actually impede operation of larger clusters.

With the normal cman defaults I could get a 32-node cluster up and running with no trouble, but reducing the token timeout much below that would cause failures. By removing this rogue value from cman and letting corosync calculate it, the token value could be reduced quite significantly while keeping the cluster stable.

That might not sound very useful in itself (though it is always handy to reduce node failure detection times), but it could have implications for running normally configured clusters on busy LANs or systems. So I recommend we remove this configuration value.
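
For illustration only, here is a rough sketch of how the retransmit interval shrinks when the constant is pinned at 20 instead of a smaller value. It assumes that corosync derives the retransmit interval as token / (token_retransmits_before_loss_const + 0.2), and that corosync's own default constant is 4; both are assumptions taken from the corosync.conf documentation as I understand it, not from this report. The token values in the loop are hypothetical, chosen only to show the trend.

```python
def token_retransmit_interval_ms(token_ms: int, retransmits_before_loss: int) -> int:
    """Approximate interval between token retransmits, in milliseconds.

    Assumption: interval = token / (const + 0.2), truncated to an integer.
    This mirrors the documented relationship, not verified cman behaviour.
    """
    return int(token_ms / (retransmits_before_loss + 0.2))


CMAN_CONST = 20              # value the report says cman sets
ASSUMED_COROSYNC_CONST = 4   # assumed corosync default, not stated in the report

# Hypothetical token timeouts (ms), from large to small.
for token_ms in (10000, 5000, 2000, 1000):
    with_cman = token_retransmit_interval_ms(token_ms, CMAN_CONST)
    with_default = token_retransmit_interval_ms(token_ms, ASSUMED_COROSYNC_CONST)
    print(f"token={token_ms:>5} ms  const=20 -> {with_cman:>4} ms  "
          f"const=4 -> {with_default:>4} ms")
```

Under these assumptions, a constant of 20 leaves well under 100 ms between retransmits once the token timeout drops to around 2000 ms or less, which might be one reason a large or busy cluster struggles at lower token values. The numbers are a sketch, not a measurement.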

Comment 1 Lon Hohberger 2011-02-15 21:16:58 UTC
So, Steve thinks this may be related to bug 623176.

What did you mean by "reducing token timeout much below that" -- below what, 10,000 ms?

Comment 2 Lon Hohberger 2011-03-21 23:26:20 UTC

*** This bug has been marked as a duplicate of bug 623176 ***

