Bug 613064 - Method to cause one node to delay fencing in a two node cluster
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Marek Grac
QA Contact: Cluster QE
Keywords: FutureFeature
Depends On:
Blocks: 614046
Reported: 2010-07-09 11:54 EDT by Lon Hohberger
Modified: 2016-04-26 10:15 EDT
CC List: 10 users

See Also:
Fixed In Version: cman-2.0.115-47.el5
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Clones: 614046
Environment:
Last Closed: 2011-01-13 17:35:25 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
proposed patch (1.86 KB, patch)
2010-07-13 07:20 EDT, Fabio Massimo Di Nitto
fdinitto: review? (teigland)

Description Lon Hohberger 2010-07-09 11:54:03 EDT
Description of problem:

Currently, there is no easy way to enable one host to "delay" fencing.  Users often simply craft a "fence_sleep", which works just fine, but requires a whole lot more work than should be necessary.  (Obviously, the agent "fence_sleep" itself is not suitable for inclusion in linux-cluster because it doesn't actually take any fencing actions; it simply sleeps and returns 0...)

Background:

Some fencing devices, such as HP iLO, take a long time to process requests.  On a cluster which partitions but still has access to the iLO devices, this can be problematic.  Because it takes a long time for iLO to process the requests, there is a window where two in-flight 'power-off' requests can cause the cluster to turn itself off - but not back on.

While this is great from a power-saving perspective, it is not very good from an availability perspective.

Ordinarily, this can be resolved using a quorum disk; however, a quorum disk adds a fair bit of additional complexity and is wholly unnecessary (or even undesirable) in many instances - for example, in clusters which serve data via NFS instead of a SAN, a quorum disk may not even be an option.

Proposal:

The proposal here is to add a method to make one node delay fencing for a period of time in order to allow the other node to "win" in the case of a network partition of the cluster interconnect.  In the event that the "primary" node goes down, the "backup" node will, indeed, take longer to fence - but with the benefit of reduced complexity and highly deterministic behavior (which can't currently be achieved using qdiskd).

Fortunately, all of the core code required exists in the cman package today.  All we have to do is enable it on a per-host basis.

The specific proposal here, after talking with others, is to simply expose post_fail_delay via /etc/sysconfig/cman, and add it to the list of options when we start fenced.

For example, adding the following to /etc/sysconfig/cman on one host:

   POST_FAIL_DELAY=30

... and then, in the cman initscript, calling fenced with the corresponding -f option:

   fenced -f $POST_FAIL_DELAY

... should have the desired effect.
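
As a rough sketch, assuming POST_FAIL_DELAY is simply sourced from /etc/sysconfig/cman, the relevant pieces might look like this (the initscript fragment below is illustrative only, not the actual initscript code):

   # /etc/sysconfig/cman on the node that should "lose" a fencing race
   POST_FAIL_DELAY=30

   # illustrative initscript fragment: source the sysconfig file and only
   # pass -f to fenced when the variable is actually set
   [ -f /etc/sysconfig/cman ] && . /etc/sysconfig/cman
   fenced_opts=""
   [ -n "$POST_FAIL_DELAY" ] && fenced_opts="-f $POST_FAIL_DELAY"
   fenced $fenced_opts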
Comment 2 Fabio Massimo Di Nitto 2010-07-10 00:20:29 EDT
The only problem I see with this suggestion is that the delay is not immediately visible in cluster.conf.

My suggestion would be to have a generic/reserved keyword that fenced would process and consider as a sleep($time) directly.

We only need to make sure the keyword is not currently used by any fence agents.

fenced already does some parsing of fence agents options, so adding one keyword should be fairly simple and non-intrusive.
Comment 6 David Teigland 2010-07-12 14:09:18 EDT
Making post_fail_delay configurable in /etc/sysconfig doesn't preclude also adding delay args to agents where they are useful, like iLO.  Both seem fine to me.

The /etc/sysconfig settings are obvious when you run ps, so they are not hidden.
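
For instance, on a node where POST_FAIL_DELAY=30 has been set, a quick check along these lines (the output shown is hypothetical) would reveal the delay on the fenced command line:

   ps ax | grep '[f]enced'
   # ... fenced -f 30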
Comment 7 Fabio Massimo Di Nitto 2010-07-13 07:20:34 EDT
Created attachment 431421 [details]
proposed patch

Proposed patch is in the attachment.

    <clusternode name="rhel6-node1" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="virsh_fence" port="rhel6-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rhel6-node2" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="virsh_fence" port="rhel6-node2" delay="20"/>
        </method>
      </fence>
    </clusternode>


[root@rhel6-node2 libfence]# fence_node rhel6-node1
fence rhel6-node1 success
[root@rhel6-node2 libfence]# fence_node rhel6-node2
Delay execution by 20 seconds
fence rhel6-node2 success
[root@rhel6-node2 libfence]# 

The keyword "delay" is currently unused and I briefly spoke to Marek on IRC that agrees it can be used as reserved word (since it won´t hit any agent).

David, I commented out the test code I used; I don't plan to commit it in the final patch (assuming the patch is OK with you). This is mostly to prove that it works as we expect.
Comment 8 David Teigland 2010-07-13 09:35:43 EDT
Oh, dear, sorry, I completely misunderstood.  I thought you were talking about adding "delay" as a fence agent arg.  That would be OK with me.  I don't like hijacking one of the node args at all, the way comment 7 does.

So the two options that are both OK with me are:
1. using post_fail_delay, with local config in /etc/sysconfig
2. adding delay args to fence agents where it's useful, like ilo
Comment 9 Marek Grac 2010-07-13 09:40:39 EDT
I agree with "delay" as a reserved word
Comment 10 Perry Myers 2010-07-13 11:04:48 EDT
OK, reassigning to Marek since we'll do this as a delay option to the core Python fencing library, and then extend it on an as-needed basis to other fence agents that are outside of the core fencing library.
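
As a sketch of how a library-level delay option might be exercised by hand, assuming the agents keep accepting key=value options on stdin and grow a "delay" option (the agent, address, and credentials below are placeholders):

   # hypothetical invocation: the delay option would make the agent wait
   # 20 seconds before carrying out the reboot
   printf 'action=reboot\nipaddr=ilo-node1.example.com\nlogin=admin\npasswd=secret\ndelay=20\n' | fence_ilo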
Comment 17 errata-xmlrpc 2011-01-13 17:35:25 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0036.html
