Bug 166652 - Barriers are broken
Summary: Barriers are broken
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cman
Version: 4
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
Depends On:
Reported: 2005-08-24 13:14 UTC by Christine Caulfield
Modified: 2009-04-16 20:00 UTC (History)

Fixed In Version: RHBA-2006-0166
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2006-01-06 20:28:30 UTC

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2006:0166 normal SHIPPED_LIVE cman-kernel bug fix update 2006-01-06 05:00:00 UTC

Description Christine Caulfield 2005-08-24 13:14:13 UTC
Description of problem:
Barriers in RHEL4 cman are completely broken. The only reason the thing works at all is that:

1. Most of the time, the two clients that use them (membership & sm) are already largely synchronised, and
2. The upper layers tolerate timeouts and will retry or ignore the failure.

The observable symptoms are few, but they usually show up as very slow transition times (membership retrying until every node is synchronised) or slow joining, with lots of "CMAN sending membership request" messages on the new node.
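
The effect described above, where retries hide a failing barrier at the cost of time, can be illustrated with a small sketch. This is not cman code: `wait_with_retry`, `make_flaky_barrier`, and the timeout and attempt figures are all hypothetical names and values chosen for illustration. It only assumes that a barrier wait either completes or times out, and that the caller retries on timeout.

```python
from typing import Callable, Tuple


def wait_with_retry(
    barrier_wait: Callable[[], bool],
    timeout_s: float,
    max_attempts: int,
) -> Tuple[bool, int, float]:
    """Retry a barrier wait until it succeeds or attempts run out.

    barrier_wait returns True when the barrier completed and False when
    it timed out. Returns (succeeded, attempts_made, seconds_lost), where
    seconds_lost counts one full timeout per failed attempt.
    """
    seconds_lost = 0.0
    for attempt in range(1, max_attempts + 1):
        if barrier_wait():
            return True, attempt, seconds_lost
        # A failed attempt costs a whole timeout period before the retry.
        seconds_lost += timeout_s
    return False, max_attempts, seconds_lost


def make_flaky_barrier(failures_before_success: int) -> Callable[[], bool]:
    """Hypothetical stand-in for a barrier that times out a few times
    (nodes not yet synchronised) before finally completing."""
    remaining = failures_before_success

    def wait() -> bool:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return False  # timed out
        return True  # every node reached the barrier

    return wait


if __name__ == "__main__":
    ok, attempts, lost = wait_with_retry(
        make_flaky_barrier(3), timeout_s=5.0, max_attempts=10
    )
    print(ok, attempts, lost)  # True 4 15.0
```

In this sketch the caller always ends up with a successful result, so nothing looks wrong from above. The only visible sign is the time spent in timeouts, which matches the slow transitions and repeated join requests noted in the description.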

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Bring up a lot of nodes into a cluster at the same time OR
2. Bring a new node into a large(ish) cluster

Actual Results:  These operations can take a long while to settle.

Expected Results:  These things should happen fairly quickly, but not instantly.

Additional info:

I'll check a fix into the STABLE branch, but hold off on RHEL4 until we get evidence that this is annoying customers.

Nobody outside of the RHCS code is using barriers anyway - if they were we'd have noticed this sooner!

Comment 1 Christine Caulfield 2005-09-02 12:31:49 UTC
After further thought & testing: do NOT apply this to U2. It's faulty in
several ways.

I'll do a proper fix when I return from holiday.

Comment 2 Christine Caulfield 2005-09-14 08:17:32 UTC
Looks like I needed that holiday.

Barriers were broken, but not nearly as badly as I'd thought. The fix is simple
and now applied to STABLE & RHEL4 branches. It's obviously too late for U2 but
the problem is nowhere near serious enough to warrant any panic.

Comment 3 Red Hat Bugzilla 2006-01-06 20:28:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

