Bug 162014 - gfs mounts allowed before fencing completes
Summary: gfs mounts allowed before fencing completes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cman
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: David Teigland
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-06-29 04:11 UTC by David Teigland
Modified: 2009-04-16 20:30 UTC (History)

Fixed In Version: RHBA-2005-734
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-10-07 16:46:44 UTC
Embargoed:




Links
System ID: Red Hat Product Errata RHBA-2005:734
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: cman-kernel bug fix update
Last Updated: 2005-10-07 04:00:00 UTC

Description David Teigland 2005-06-29 04:11:54 UTC
Description of problem:
SM does not handle a corner case in which new gfs mounts
can proceed before fencing has completed for a failed
node that had the fs mounted.  This can lead to fs
corruption.

- Nodes A, B, C are cluster members and all have joined
  the fence domain.

- A has gfs mounted, B and C do not.

- A fails.

- B and C begin fencing A.  This can take some time,
  which is especially noticeable when using fence_manual.

- B and/or C mount gfs.

- The new mount by B/C is allowed to go ahead before
  the fencing for A has completed.

- If A is still writing to gfs and fencing has not
  completed before B/C do initial gfs recovery, the
  fs can be corrupted.
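
The sequence above can be sketched as a toy model. This is only an
illustration of the ordering problem, not the SM code: the Cluster class
and its method names are hypothetical, and "unsafe" here simply means that
a failed node which has not yet been fenced still has the fs mounted when
a new mount is admitted.

```python
# Toy model of the race; all names are hypothetical, not taken from cman/SM.
from dataclasses import dataclass, field


@dataclass
class Cluster:
    mounted: set = field(default_factory=set)        # nodes with gfs mounted
    pending_fence: set = field(default_factory=set)  # failed nodes not yet fenced

    def fail(self, node):
        # A failed node keeps its mount (and may still be writing) until fenced.
        self.pending_fence.add(node)

    def fence_done(self, node):
        # Once fenced, the node can no longer touch the fs.
        self.pending_fence.discard(node)
        self.mounted.discard(node)

    def mount_unchecked(self, node):
        # Behaviour described in the report: the mount does not wait
        # for outstanding fencing.  Returns True only if the mount was safe.
        self.mounted.add(node)
        return not (self.pending_fence & self.mounted)


cluster = Cluster(mounted={"A"})
cluster.fail("A")                    # B and C begin fencing A
safe = cluster.mount_unchecked("B")  # B mounts before fencing completes
```

In this model the mount by B is admitted while A is still in
pending_fence, which is the window in which corruption can occur.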


Comment 1 David Teigland 2005-06-30 05:30:00 UTC
Fixed in STABLE and RHEL4 branches:
SM should wait for all recoveries to complete before it processes
any group joins/leaves.  Fixes bz 162014.
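
A minimal sketch of the ordering rule described in this comment, assuming
a simple queue of pending join/leave requests.  The ServiceManager class
and its methods are hypothetical stand-ins and do not reflect the actual
cman-kernel code; the point is only that requests are held until no
recovery is outstanding.

```python
from collections import deque


class ServiceManager:
    """Hypothetical stand-in for SM; not the real cman-kernel code."""

    def __init__(self):
        self.recoveries = set()  # failed nodes whose recovery is in progress
        self.queued = deque()    # join/leave requests held back
        self.processed = []      # requests processed, in arrival order

    def node_failed(self, node):
        self.recoveries.add(node)

    def request(self, kind, node, group):
        # Joins and leaves are always queued first, then processed if allowed.
        self.queued.append((kind, node, group))
        self._process()

    def recovery_done(self, node):
        self.recoveries.discard(node)
        self._process()

    def _process(self):
        # Hold every join/leave while any recovery is still outstanding.
        if self.recoveries:
            return
        while self.queued:
            self.processed.append(self.queued.popleft())


sm = ServiceManager()
sm.node_failed("A")             # A fails; recovery (including fencing) starts
sm.request("join", "B", "gfs")  # B tries to mount, i.e. join the gfs group
held = list(sm.processed)       # snapshot taken while recovery is pending
sm.recovery_done("A")           # recovery for A finishes; the join proceeds
```

With this rule the join from B is not processed until recovery for A has
finished, so the mount cannot run ahead of fencing.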


Comment 3 Red Hat Bugzilla 2005-10-07 16:46:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-734.html


