Red Hat Bugzilla – Bug 162014
gfs mounts allowed before fencing completes
Last modified: 2009-04-16 16:30:32 EDT
Description of problem:
SM does not handle a corner case in which new gfs mounts
can proceed before fencing has completed for a failed node
that had the fs mounted. This can lead to fs corruption.
- Nodes A, B, C are cluster members and all have joined
the fence domain.
- A has gfs mounted, B and C do not.
- A fails.
- B and C begin fencing A. This can take some time, which is
especially noticeable when using fence_manual.
- B and/or C mount gfs.
- The new mount by B/C is allowed to go ahead before
the fencing for A has completed.
- If A is still writing to gfs and fencing has not
completed before B/C do initial gfs recovery, the
fs can be corrupted.
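
The sequence above can be sketched as a toy model. This is an illustration only, not code from SM or gfs: the class, its fields, and its method names are all hypothetical. It captures only the condition that makes a new mount unsafe, namely that a failed node which had the fs mounted has not yet been fenced.

```python
# Toy model of the scenario; every name here is hypothetical and is not
# taken from the SM or gfs sources.
from dataclasses import dataclass, field


@dataclass
class ClusterModel:
    mounted: set = field(default_factory=set)          # nodes with gfs mounted
    failed_unfenced: set = field(default_factory=set)  # failed, not yet fenced

    def fail(self, node):
        # Until it is fenced, a failed node may still be writing to the fs.
        self.failed_unfenced.add(node)

    def fence_done(self, node):
        # Fencing guarantees the node can no longer write to the fs.
        self.failed_unfenced.discard(node)
        self.mounted.discard(node)

    def mount_is_safe(self):
        # A new mount is unsafe while any failed, unfenced node had the fs mounted.
        return not (self.failed_unfenced & self.mounted)


cluster = ClusterModel(mounted={"A"})
cluster.fail("A")
unsafe_window = not cluster.mount_is_safe()  # B/C mounting here is the bug
cluster.fence_done("A")
safe_after_fence = cluster.mount_is_safe()
```

In this model the bug corresponds to B or C being allowed to mount while mount_is_safe() is still false.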
Fixed in STABLE and RHEL4 branches:
SM should wait for all recoveries to complete before it processes
any group joins/leaves. Fixes bz 162014.
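
The ordering rule in that commit message might look roughly like the sketch below. It is a simplified model under assumptions, not the actual SM change: the class name, the queue, and the method names are invented for illustration. Join and leave requests that arrive while any recovery is outstanding are deferred, then processed in arrival order once the last recovery completes.

```python
# Simplified sketch of "wait for all recoveries before processing group
# joins/leaves". Hypothetical names; not the real SM implementation.
from collections import deque


class ServiceManagerSketch:
    def __init__(self):
        self.pending_recoveries = set()  # failed nodes still being recovered
        self.deferred = deque()          # join/leave requests held back
        self.processed = []              # requests acted on, in order

    def start_recovery(self, node):
        self.pending_recoveries.add(node)

    def request(self, kind, node):
        # kind is "join" or "leave"; defer it while any recovery is outstanding.
        if self.pending_recoveries:
            self.deferred.append((kind, node))
        else:
            self.processed.append((kind, node))

    def recovery_done(self, node):
        self.pending_recoveries.discard(node)
        # Release deferred requests only after the last recovery has finished.
        if not self.pending_recoveries:
            while self.deferred:
                self.processed.append(self.deferred.popleft())


sm = ServiceManagerSketch()
sm.start_recovery("A")       # A failed; fencing/recovery in progress
sm.request("join", "B")      # B tries to mount gfs: held back
held_back = list(sm.deferred)
sm.recovery_done("A")        # recovery for A completes
released = list(sm.processed)
```

With this gating, the mount by B in the scenario above would not proceed until the recovery for A, including fencing, had finished.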
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.