Description of problem:
fence_scsi needs to check that a node is actually registered with a
device before it attempts to unregister from it. The problem is that a
node could have multiple clustered volumes that reside on different SCSI
devices. If one of the SCSI devices supports persistent reservations and the
other does not, the fence agent (fence_scsi) will attempt to unregister from
both and, of course, will fail when attempting to unregister from the device
that does not support persistent reservations. As a result, fencing appears to
have failed. This is easily fixed by first checking that the key we are
trying to remove is actually registered with the given device. If it is not,
there is nothing to do.
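The proposed check could be sketched roughly as follows. This is only an
illustration in Python, not the fence_scsi code: the function name and the
read_keys / unregister callables are hypothetical stand-ins for whatever
actually queries and modifies the registrations on a device.

```python
from typing import Callable, Iterable


def unregister_if_registered(
    device: str,
    key: str,
    read_keys: Callable[[str], Iterable[str]],
    unregister: Callable[[str, str], None],
) -> bool:
    """Remove `key` from `device` only if it is currently registered.

    `read_keys` and `unregister` are hypothetical callables supplied by the
    caller. Returns True if an unregister was issued, False if there was
    nothing to do.
    """
    registered = set(read_keys(device))
    if key not in registered:
        # The node was never registered with this device: nothing to do.
        return False
    unregister(device, key)
    return True
```

With this shape, a device that reports no registrations is simply skipped
rather than producing an unregister error.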
With that said, it seems unwise to run this configuration. If a node needs to
be fenced and it is configured to use fence_scsi, the fencing would succeed but
the node would not be rebooted, as it would be with traditional fence agents.
If said node had other filesystems on devices that did not support SCSI
persistent reservations, the node would continue to use those filesystems as if
nothing had happened. That would be bad.
How reproducible:
Always, so long as the node is configured to use fence_scsi and has two
SCSI devices: one that supports persistent reservations and another that does not.
After some discussion, we are not sure we will change this behavior. The reason
is that if at fence time we find a cluster filesystem that exists on SCSI
devices that do not support SCSI persistent reservations, this needs to appear
as a fence failure. If said device(s) don't support SCSI reservations, then the
node was never registered with them to begin with. If we were to check that a
node is registered with a device before we unregister, this would prevent the
fence agent from attempting to unregister from those devices, and fencing would
appear to be successful. Of course, it is not successful, because there would be
filesystems still accessible. This needs to be reported.
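To make the concern concrete, here is a small Python model of the two
behaviors. It is a sketch under simplifying assumptions, not the agent's
logic: devices are represented as a dict from device name to the set of
registered keys, with None standing for a device that does not support
persistent reservations, and the fence() function and its check_first flag
are invented for the example.

```python
from typing import Dict, List, Optional, Set, Tuple


def fence(
    devices: Dict[str, Optional[Set[str]]],
    key: str,
    check_first: bool,
) -> Tuple[bool, List[str]]:
    """Model fencing `key` off every device.

    Returns (fence_reported_success, devices_left_unprotected).
    """
    succeeded = True
    unprotected: List[str] = []
    for name, keys in sorted(devices.items()):
        if keys is not None and key in keys:
            # Registered: the unregister is issued and succeeds in this model.
            continue
        if keys is None:
            # No persistent reservation support: the node keeps its access.
            unprotected.append(name)
        if not check_first:
            # Unregister is attempted anyway and fails.
            succeeded = False
    return succeeded, unprotected
```

In this model, with one capable device and one incapable device, checking
first reports success while still leaving the incapable device unprotected.
Attempting the unregister unconditionally reports the failure.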
To help prevent this from happening, a script (fence_scsi_test) was written.
This should be run prior to configuring a cluster to use fence_scsi. It will
report the devices that support SCSI reservations, as well as those that do not.
Any devices that do not support SCSI persistent reservations should not be used
in a cluster filesystem (if fence_scsi is to be used for fencing). If those
unsupported devices must be used for a cluster filesystem, fence_scsi should not
be used.
Using the script mentioned above should help prevent misuse of fence_scsi. With
that said, if at fence time we find a cluster filesystem on devices that do not
support SCSI persistent reservations, fence_scsi will report a fence failure.
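The kind of pre-flight report described here could look something like the
following Python sketch. It is not the fence_scsi_test script. The
supports_pr callable is a hypothetical probe that the caller would have to
provide, and the function names are made up for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def classify_devices(
    devices: Iterable[str],
    supports_pr: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split devices into (supported, unsupported) using the given probe."""
    supported: List[str] = []
    unsupported: List[str] = []
    for dev in devices:
        (supported if supports_pr(dev) else unsupported).append(dev)
    return supported, unsupported


def fence_scsi_usable(
    devices: Iterable[str],
    supports_pr: Callable[[str], bool],
) -> bool:
    """fence_scsi is only appropriate if every cluster device supports
    persistent reservations."""
    _, unsupported = classify_devices(devices, supports_pr)
    return not unsupported
```

Running a check like this before configuring the cluster surfaces the
unsupported devices up front, instead of at fence time.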
Fixing Product Name. Cluster Suite was merged into Red Hat Enterprise Linux for
5.0. In addition, dlm, fence and ccs were merged into the cman package, so
Bugzilla should reflect the name of the package where those utilities are located.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
release.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.