Bug 418541 - RFE: Make fence_ack_manual in RHEL5 branch talk to manual override socket
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.0
Hardware: All Linux
Priority: low
Severity: low
Assigned To: Lon Hohberger
QA Contact: GFS Bugs
Depends On:
Blocks:
Reported: 2007-12-10 13:02 EST by Lon Hohberger
Modified: 2011-06-13 17:58 EDT

Fixed In Version: RHBA-2008-0347
Doc Type: Bug Fix
Last Closed: 2008-05-21 11:58:34 EDT

Attachments
Makes fence_ack_manual work as override (needs -e flag) (4.87 KB, text/plain)
2007-12-14 11:57 EST, Lon Hohberger

Description Lon Hohberger 2007-12-10 13:02:31 EST
Description of problem:

In RHEL5, we introduced a socket whereby administrators could issue commands to
unstick cluster nodes where fencing has failed.  It looks like this:

   echo "nodename.mydomain.com" > /var/run/cluster/fenced_override

The problem is that this is highly timing-dependent: an administrator must
issue the command within the 5-second fence retry window.

In the head branch of CVS, fence_ack_manual is a script which waits for
/var/run/cluster/fenced_override to exist before issuing the command.

fence_ack_manual in the RHEL5 branch should also be able to do this.  This will
enable administrators to fix broken clusters with less difficulty.
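
For illustration, the wait-then-write behaviour described above could be
sketched in shell roughly as follows.  This is not the actual
fence_ack_manual script from CVS head; the function name ack_override, the
timeout argument, the one-second poll interval, and the OVERRIDE_PATH
variable are all assumptions made for the example.  Only the path and the
"echo" action come from this report.

```shell
#!/bin/sh
# Sketch only: wait for the fenced override path to appear, then write the
# node name to it.  Names, timeout and poll interval are illustrative
# assumptions, not taken from the real fence_ack_manual.

OVERRIDE_PATH="${OVERRIDE_PATH:-/var/run/cluster/fenced_override}"

# ack_override NODE [TIMEOUT_SECONDS]
# Returns 0 once the node name has been written, 1 on timeout.
ack_override() {
    node="$1"
    timeout="${2:-300}"
    waited=0

    # Poll until the override path exists, so the administrator does not
    # have to time the write by hand.
    while [ ! -e "$OVERRIDE_PATH" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $OVERRIDE_PATH" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Same action as the manual "echo" shown in the description.
    echo "$node" > "$OVERRIDE_PATH"
}
```

With this sketch, something like "ack_override nodename.mydomain.com" would
block until the path shows up and then issue the override, instead of
relying on the administrator to hit the retry window.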
Comment 1 RHEL Product and Program Management 2007-12-10 13:54:24 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 2 Lon Hohberger 2007-12-14 11:57:14 EST
Created attachment 289271
Makes fence_ack_manual work as override (needs -e flag)
Comment 3 Lon Hohberger 2007-12-14 11:57:41 EST
Worked for me.

Dec 14 11:53:53 molly fenced[434]: frederick not a cluster member after 6 sec post_join_delay
Dec 14 11:53:53 molly fenced[434]: fencing node "frederick"
Dec 14 11:53:53 molly fenced[434]: fence "frederick" failed
Dec 14 11:53:54 molly fenced[434]: fence "frederick" overridden by administrator intervention
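
A quick way to confirm from the logs that an override took effect might look
like the sketch below.  It only matches the message format shown in the
excerpt above; the function name override_logged and the default log
location /var/log/messages are assumptions, not something this report
specifies.

```shell
#!/bin/sh
# Sketch only: succeed if the log contains a fenced message saying that
# fencing of NODE was overridden by the administrator.  The default log
# path is an assumption; pass the real one as the second argument.

# override_logged NODE [LOGFILE]
override_logged() {
    node="$1"
    logfile="${2:-/var/log/messages}"

    # Restrict to fenced lines first, then look for the override message
    # for this node, matching both as fixed strings.
    grep -F "fenced[" "$logfile" |
        grep -qF "fence \"$node\" overridden by administrator"
}
```

For the excerpt above, "override_logged frederick" would succeed once the
last line has been written to the log.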
Comment 4 Lon Hohberger 2007-12-17 15:05:06 EST
Patch in CVS

Checking in agents/manual/Makefile;
/cvs/cluster/cluster/fence/agents/manual/Makefile,v  <--  Makefile
new revision: 1.7.2.1; previous revision: 1.7
done
Checking in agents/manual/ack.c;
/cvs/cluster/cluster/fence/agents/manual/Attic/ack.c,v  <--  ack.c
new revision: 1.3.16.1; previous revision: 1.3
done
Comment 6 Lon Hohberger 2008-03-27 15:43:26 EDT
I have tested this using the following command with cman-2.0.81:

  fence_ack_manual -e -n frederick

It works as expected: fence_ack_manual waits for fencing to fail (which is
expected here; I simply did a 'reboot -fn' with the fencing device disabled)
and then issues the override for us.  It's far easier to use than timing the
"echo" method.
Comment 8 errata-xmlrpc 2008-05-21 11:58:34 EDT
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0347.html
