Bug 418541 - RFE: Make fence_ack_manual in RHEL5 branch talk to manual override socket
Summary: RFE: Make fence_ack_manual in RHEL5 branch talk to manual override socket
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.0
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: GFS Bugs
Depends On:
Reported: 2007-12-10 18:02 UTC by Lon Hohberger
Modified: 2011-06-13 21:58 UTC

Clone Of:
Last Closed: 2008-05-21 15:58:34 UTC

Attachments
Makes fence_ack_manual work as override (needs -e flag) (4.87 KB, text/plain)
2007-12-14 16:57 UTC, Lon Hohberger

External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2008:0347 | normal | SHIPPED_LIVE | cman bug fix and enhancement update | 2008-05-20 12:39:41 UTC

Description Lon Hohberger 2007-12-10 18:02:31 UTC
Description of problem:

In RHEL5, we introduced a socket whereby administrators could issue commands to
unstick cluster nodes where fencing has failed.  It looks like this:

   echo "nodename.mydomain.com" > /var/run/cluster/fenced_override

The problem is that this is highly timing-dependent: an administrator must
issue the command within the 5-second fence retry window.

In the head branch of CVS, fence_ack_manual is a script which waits for
/var/run/cluster/fenced_override to exist before issuing the command.

fence_ack_manual in the RHEL5 branch should also be able to do this.  This will
enable administrators to fix broken clusters with less difficulty.
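
For illustration only, the wait-then-write behaviour described above could be
sketched as a small shell function.  This is not the actual fence_ack_manual
script from CVS head; the function name wait_and_override, its optional path
argument, and the timeout argument are all hypothetical, and it assumes that
fenced creates the override path only while a failed fence operation is
pending.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real fence_ack_manual implementation.
# Polls until the override path exists, then writes the node name to it.
# Usage: wait_and_override NODENAME [OVERRIDE_PATH] [TIMEOUT_SECONDS]
wait_and_override() {
    node="$1"
    # Default path is the one named in this report; the path argument
    # and the timeout are assumptions added for illustration.
    override="${2:-/var/run/cluster/fenced_override}"
    timeout="${3:-300}"
    waited=0

    if [ -z "$node" ]; then
        echo "usage: wait_and_override NODENAME [PATH] [TIMEOUT]" >&2
        return 2
    fi

    # Wait for the override path to appear instead of racing the
    # fence retry window by hand.
    while [ ! -e "$override" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $override" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Same write as the manual "echo" method, issued once the path exists.
    echo "$node" > "$override"
}
```

Called as `wait_and_override nodename.mydomain.com`, it would block until the
override path appears (or the assumed timeout expires) and then perform the
same write as the manual echo.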

Comment 1 RHEL Product and Program Management 2007-12-10 18:54:24 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 Lon Hohberger 2007-12-14 16:57:14 UTC
Created attachment 289271
Makes fence_ack_manual work as override (needs -e flag)

Comment 3 Lon Hohberger 2007-12-14 16:57:41 UTC
Worked for me.

Dec 14 11:53:53 molly fenced[434]: frederick not a cluster member after 6 sec
Dec 14 11:53:53 molly fenced[434]: fencing node "frederick"
Dec 14 11:53:53 molly fenced[434]: fence "frederick" failed
Dec 14 11:53:54 molly fenced[434]: fence "frederick" overridden by administrator

Comment 4 Lon Hohberger 2007-12-17 20:05:06 UTC
Patch in CVS

Checking in agents/manual/Makefile;
/cvs/cluster/cluster/fence/agents/manual/Makefile,v  <--  Makefile
new revision:; previous revision: 1.7
Checking in agents/manual/ack.c;
/cvs/cluster/cluster/fence/agents/manual/Attic/ack.c,v  <--  ack.c
new revision:; previous revision: 1.3

Comment 6 Lon Hohberger 2008-03-27 19:43:26 UTC
I have tested this using the following command with cman-2.0.81:

  fence_ack_manual -e -n frederick

It works as expected: fence_ack_manual waits for fencing to fail (to trigger
this, I simply did a 'reboot -fn' while disabling the fencing device) and then
issues the override for us.  It's far easier to use than timing the "echo" method.

Comment 8 errata-xmlrpc 2008-05-21 15:58:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

