Bug 241216

Summary: fence_apc 1.32.45 doesn't work
Product: [Retired] Red Hat Cluster Suite
Component: fence
Version: 4
Hardware: x86_64
OS: Linux
Status: CLOSED INSUFFICIENT_DATA
Severity: high
Priority: medium
Reporter: Jonny <jschulz>
Assignee: Jim Parsons <jparsons>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2008-02-27 13:57:32 UTC

Description Jonny 2007-05-24 13:33:31 UTC
Description of problem:

The cluster manager try to fence a cluster node and it fails with the following
error (from /var/log/messages):

May 24 14:26:26 hostname fenced[5139]: fencing node "cl-node-74"
May 24 14:26:41 hostname fenced[5139]: agent "fence_apc" reports:

Traceback (most recent call last):
  File "/sbin/fence_apc", line 798, in ?
    main()
  File "/sbin/fence_apc", line 345, in main
    do_power_off(sock)
  File "/sbin/fence_apc", line 782, in do_power_off
    x = do_power_switch(sock, "off")
  File "/sbin/fence_apc", line 590, in do_power_switch
    result_code, response = power_off(txt + ndbuf)
  File "/sbin/fence_apc", line 786, in power_off
    x = power_switch(buffer, False, "2", "3");
  File "/sbin/fence_apc", line 779, in power_switch
    raise "unknown screen encountered in \n" + str(lines) + "\n"
unknown screen encountered in
['3', '', '', '------- Power Supply Status ---------------------------------------------------', '', '          Primary Power Supply Status: OK', '        Secondary Power Supply Status: OK', '', '', '     <ESC>- Back, <ENTER>- Refresh', '> ']
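The traceback shows the agent walking the APC telnet menus and raising as soon as it reads a screen it has no pattern for (here, the Power Supply Status page shown by newer firmware). A minimal sketch of that kind of screen dispatch; all names below (KNOWN_SCREENS, classify_screen) are hypothetical illustrations, not the actual fence_apc source:

```python
# Hypothetical sketch of menu-screen matching in a fence_apc-style agent.
# The screen markers and function names are assumptions for illustration.

KNOWN_SCREENS = {
    "Outlet Control": "outlet_control",
    "Control Console": "main_menu",
}

def classify_screen(lines):
    """Match the buffered telnet menu lines against the screens we know."""
    text = "\n".join(lines)
    for marker, name in KNOWN_SCREENS.items():
        if marker in text:
            return name
    # This is the failure in the log above: firmware that shows a screen
    # (e.g. "Power Supply Status") the agent has no pattern for.
    raise ValueError("unknown screen encountered in\n" + text + "\n")
```

An agent built this way breaks whenever a firmware update adds or reorders menu screens, which matches the report: the 1.32.25 script predates the new screen handling and happens to work, while 1.32.45 hits the unrecognized screen.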

Then I copied an older version (1.32.25) from another cluster into /sbin, and it
works fine, but that cannot be the solution.

Version-Release number of selected component (if applicable):


How reproducible:

Just run "ifdown bond0" and watch /var/log/messages. The cluster manager will
try to fence the node because it has missed too many heartbeats from it.

Additional info:

uname -a
Linux hostname 2.6.9-55.ELsmp #1 SMP Fri Apr 20 16:36:54 EDT 2007 x86_64 x86_64 
x86_64 GNU/Linux

Comment 1 Jonny 2007-05-24 14:25:51 UTC
double post... delete it please

Comment 2 Jonny 2007-05-24 14:26:16 UTC
double post ... delete it please

Comment 3 Jim Parsons 2007-07-11 00:48:29 UTC
So, this agent has been worked on, as there were a couple of tickets against it,
but I am uncertain whether the fix will cover your issue - is it possible to test
the new agent on your cluster?

Comment 4 Jim Parsons 2007-09-12 19:40:27 UTC
Does this work for you in the current release?

Comment 5 Jim Parsons 2007-12-12 16:33:28 UTC
Jonny - are there still issues with this? What is the exact model of the apc
device? I'm visiting apc soon and will check scripts against your exact
model...Should this ticket be closed?

Comment 6 Jim Parsons 2008-02-27 13:57:32 UTC
Closing due to no response from ticket creator.

In addition, a trip to the APC plant in February of this year revealed that
certain versions of the 3.x APC firmware were buggy, and APC users should be
running the very latest. Our current APC script was tested with the latest
firmware and worked without fault.