Bug 443358

Summary: merge of openais partitions and disallowed cman nodes
Product: Red Hat Enterprise Linux 5
Reporter: Andrew Ryan <aryan>
Component: cman
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED NOTABUG
QA Contact: GFS Bugs <gfs-bugs>
Severity: high
Priority: urgent
Version: 5.2
CC: ccaulfie, cluster-maint, edamato, jplans, kanderso, nstraz, sdake, tao, teigland
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-08-01 09:10:40 UTC
Bug Depends On: 251966, 460190
Bug Blocks: 391501

Attachments:

Description	Flags
allow a clean node to merge with a dirty node	none

Description David Robinson 2008-04-21 05:18:11 UTC

+++ This bug was initially created as a clone of Bug #251966 +++

Description of problem:

Customer reports a possible split brain condition caused by a fencing
malfunction. Under some conditions the fence cluster group (the one displayed by
the "group_tool -v" command) can go into the JOIN_START_WAIT state and stay there
forever. This means that when a fence action is required it is silently discarded
and the other cluster groups are allowed to perform their recovery steps. This can
easily lead to a split brain condition in a two node cluster, where the fence
action is not performed and the two nodes may recover the same GFS journal or
mount the same ext3 fs.

How reproducible:
Always

Steps to Reproduce:
1) Configure post_join_delay="60". This is not required, but it makes the
problem easier to reproduce.
2) Start both nodes at the same time, but keep the network interface for the
heartbeat channel disconnected.
3) When both nodes are waiting at the fencing startup, wait a few seconds, then
connect the network interface.

(This is simple to reproduce with 2 xen guests. Configure 2 xen guests as a
cluster. Shut down the network bridge on the host, then boot both guests. While
fenced is waiting, bring the bridge back up.)

Actual results:
4) The nodes have fencing stuck in JOIN_START_WAIT with two distinct ids. From
now on every fence action will be silently discarded, and the other clustered
services will perform their recovery actions as if the fence action had been
performed.

The output of "group_tool -v" shows the fence service stuck in JOIN_START_WAIT
with a different group id on each node:

Node 1:
type             level name     id       state node id local_done
fence            0     default  00010001 JOIN_START_WAIT 2 200020001 1
[1 2]
dlm              1     clvmd    00020001 none
[1 2]

Node 2:
type             level name     id       state node id local_done
fence            0     default  00010002 JOIN_START_WAIT 1 100020001 1
[1 2]
dlm              1     clvmd    00020001 none
[1 2]
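
As a rough illustration of what to look for in the two listings, the sketch below parses the leading fields of a "group_tool -v" row and reports when the same named group carries a different id on each node. The struct, the function names, and the buffer sizes are hypothetical helpers written for this note; they are not part of group_tool or cman, and the row layout is assumed from the sample output only.

```c
#include <stdio.h>
#include <string.h>

/* One parsed row of the "group_tool -v" listing.
 * Buffer sizes are illustrative assumptions, not taken from group_tool. */
struct group_row {
    char type[32];
    int  level;
    char name[64];
    char id[16];
    char state[32];
};

/* Parse the first five fields of a row such as
 *   "fence 0 default 00010001 JOIN_START_WAIT 2 200020001 1".
 * Returns 1 on success, 0 if the line does not start with those five fields
 * (for example the "[1 2]" member lines). */
static int parse_group_row(const char *line, struct group_row *row)
{
    return sscanf(line, "%31s %d %63s %15s %31s",
                  row->type, &row->level, row->name,
                  row->id, row->state) == 5;
}

/* Returns 1 when two rows have the same type and name but different ids,
 * which is the pattern the fence rows show in this report. */
static int is_split_group(const struct group_row *a, const struct group_row *b)
{
    return strcmp(a->type, b->type) == 0 &&
           strcmp(a->name, b->name) == 0 &&
           strcmp(a->id, b->id) != 0;
}
```

Fed the two fence rows from the listings, is_split_group() reports a mismatch (00010001 versus 00010002), while the two clvmd rows compare as equal.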

Expected results:
4) either the cluster should not form, or the two clusters should be merged
successfully.

Additional info:
When the nodes start up, they each form a 1-node openais cluster independent of
the other. fence_tool join is run on each node which creates group state in both
clusters. In the situation this bug describes, the dirty flag will not prevent
the clusters from merging because NODE_FLAGS_BEENDOWN is not set:

if (msg->flags & NODE_FLAGS_DIRTY && node->flags & NODE_FLAGS_BEENDOWN)

The attached patch modifies the dirty flag test so that it's possible for a
"clean" node (one without state) to join a dirty node regardless of whether it's
BEENDOWN.
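
To make the behaviour of the quoted condition concrete, here is a minimal standalone sketch of that test. The bit values and the helper name dirty_test_matches() are assumptions made for illustration (the report names the flags but gives neither their values nor the surrounding code), and the sketch does not reproduce the attached patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values: the report names these flags but not their values. */
#define NODE_FLAGS_BEENDOWN 0x01u
#define NODE_FLAGS_DIRTY    0x02u

/* Mirrors the condition quoted in the report. In C, '&' binds tighter than
 * '&&', so the expression reads as (msg & DIRTY) && (node & BEENDOWN): it is
 * true only when the message carries the dirty flag AND the node record has
 * the been-down flag set. */
static bool dirty_test_matches(uint32_t msg_flags, uint32_t node_flags)
{
    return (msg_flags & NODE_FLAGS_DIRTY) && (node_flags & NODE_FLAGS_BEENDOWN);
}
```

With these assumed values, dirty_test_matches(NODE_FLAGS_DIRTY, 0) is false: a dirty message about a node that was never marked BEENDOWN does not trip the test, which is the startup case described here.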

Comment 1 David Robinson 2008-04-21 05:18:11 UTC

Created attachment 303098 [details]
allow a clean node to merge with a dirty node

Comment 3 Christine Caulfield 2008-04-21 10:42:21 UTC

That patch looks good to me. I've committed it to the master and STABLE branches.

Comment 5 Christine Caulfield 2008-04-28 15:08:29 UTC

Committed to RHEL5 branch

commit 4cd89a0d7eef3c0a8f02517957b393a5be736f46
Author: Christine Caulfield <ccaulfie>
Date:   Mon Apr 28 16:07:08 2008 +0100

Comment 8 Christine Caulfield 2008-08-01 09:10:40 UTC

I'm going to close this NOTABUG and revert the commit as it causes a serious bug
(see 457107). 

There's a misunderstanding of how TRANSITION messages work for a start (which I
should have spotted before I applied the change). And if the DLM can start
without fencing then that's a different problem (if it IS a problem at all)
which isn't related.

Comment 9 Kiersten (Kerri) Anderson 2008-08-29 18:41:01 UTC

Further update: although this bug is now closed, the story continues in bug 460190. We now believe some network switches delay initial connections by 60 seconds or more. The result is split fence domains during initial cluster startup.