Bug 444751 - CMAN: Initiating transition, generation 18
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: cman-kernel
Version: 4
Hardware: All Linux
Priority: urgent, Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Whiteboard: GSSApproved
Keywords: ZStream
Duplicates: 449961
Depends On:
Blocks: 447955
Reported: 2008-04-30 10:13 EDT by Shane Bradley
Modified: 2010-10-22 20:37 EDT (History)
Fixed In Version: RHBA-2008-0800
Doc Type: Bug Fix
Last Closed: 2008-07-25 15:09:55 EDT


Attachments
Patch (2.45 KB, patch)
2008-05-12 11:56 EDT, Christine Caulfield
Program to send a message to all nodes (702 bytes, text/plain)
2008-05-13 06:38 EDT, Christine Caulfield

Description Shane Bradley 2008-04-30 10:13:37 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.13) Gecko/20080325 Fedora/2.0.0.13-1.fc8 Firefox/2.0.0.13

Description of problem:
A RHEL4u6 Cluster Suite installation running clvmd/gfs will randomly log
messages like the one below, over and over:
Mar 29 13:04:37 su010033 kernel: CMAN: Initiating transition, generation 18
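
To check whether a node is showing this symptom, the log can be scanned for repeats of the message above. The following is only a minimal sketch, not part of the original report: it assumes the repetition appears as the same generation number being logged many times, and it relies solely on the single sample line quoted here. The function names and the threshold value are hypothetical.

```python
import re
from collections import Counter

# Pattern derived only from the one sample line quoted in this report.
TRANSITION_RE = re.compile(r"CMAN: Initiating transition, generation (\d+)")


def count_transitions(lines):
    """Return a Counter mapping generation number -> times it was logged."""
    counts = Counter()
    for line in lines:
        match = TRANSITION_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def repeated_generations(lines, threshold=2):
    """Return generations logged at least `threshold` times.

    The threshold is an arbitrary, hypothetical cutoff chosen for illustration.
    """
    return {
        gen: n
        for gen, n in count_transitions(lines).items()
        if n >= threshold
    }
```

It could be run against a syslog file by passing the open file object, for example `repeated_generations(open("/var/log/messages"))`; the path is an assumption about where kernel messages are logged on the affected node.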


I have seen this issue in two cases. In both cases they say the network is fine.
They have verified their network infrastructure.

It appears that CMAN cannot talk with any of the other nodes.
The issue is random and I have not found reproducer steps.

I will attach sosreports from a case that is in a dev environment.

Version-Release number of selected component (if applicable):
cman cman-kernel

How reproducible:
Couldn't Reproduce


Steps to Reproduce:
The issue is random; I have not found reproducer steps.

Actual Results:
The cluster hangs until it is rebooted/restarted.

Expected Results:
There should be at most one "transition restart" message.

Additional info:
Comment 7 RHEL Product and Program Management 2008-05-09 11:40:44 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 9 Christine Caulfield 2008-05-12 11:56:33 EDT
Created attachment 305142
Patch

Here's the patch I'm testing. Do NOT apply this to the RHEL4.6 code as it will
break things even more. It should be applied to the RHEL4 or RHEL47 branches.
Comment 10 Christine Caulfield 2008-05-13 06:38:54 EDT
Created attachment 305228
Program to send a message to all nodes

It occurs to me that there is an obvious workaround for those people who don't
want to wait for a patched kernel or a reboot, and that is to initiate some
CMAN message activity every so often. GFS mount/umount requests do this, as do
several clvmd requests. Here is a small program that could be run from crontab
that will send a single message to all nodes in a cluster to keep the ack
numbers up-to-date.

I do strongly recommend upgrading to 4.7 along with this though.
Comment 11 Christine Caulfield 2008-05-13 11:39:09 EDT
Added to the RHEL4 branch:

commit 59ba19aa6b41cad189af153e27469056206e782d
Author: Christine Caulfield <ccaulfie@redhat.com>
Date:   Tue May 13 16:37:11 2008 +0100
Comment 12 Chris Feist 2008-05-22 12:18:45 EDT
Requesting 4.6.z stream.
Comment 16 Christine Caulfield 2008-06-05 10:29:22 EDT
*** Bug 449961 has been marked as a duplicate of this bug. ***
Comment 20 errata-xmlrpc 2008-07-25 15:09:55 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0800.html
