Bug 245675
| Summary: | fence_apc script does not work on 3.x firmware. | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Norm Murray <nmurray> | ||||||||||||
| Component: | fence | Assignee: | Jim Parsons <jparsons> | ||||||||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> | ||||||||||||
| Severity: | high | Docs Contact: | |||||||||||||
| Priority: | urgent | ||||||||||||||
| Version: | 4 | CC: | cfeist, cluster-maint, jko, jplans, mkearey, tao | ||||||||||||
| Target Milestone: | --- | ||||||||||||||
| Target Release: | --- | ||||||||||||||
| Hardware: | All | ||||||||||||||
| OS: | Linux | ||||||||||||||
| Whiteboard: | |||||||||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||||||||
| Doc Text: | Story Points: | --- | |||||||||||||
| Clone Of: | Environment: | ||||||||||||||
| Last Closed: | 2009-02-05 00:22:51 UTC | Type: | --- | ||||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||||
| Documentation: | --- | CRM: | |||||||||||||
| Verified Versions: | Category: | --- | |||||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||
| Embargoed: | |||||||||||||||
| Bug Depends On: | |||||||||||||||
| Bug Blocks: | 248566 | ||||||||||||||
| Attachments: |
Created attachment 157841 [details]
Modified fence_apc.py to fence 3.3
I believe this applies to the RHEL 5 cluster suite also.
I think you're the fencing person, Jim?
Requesting z-stream release for this bug. This is a dire problem, and needs a fix pushed at once.
New agent is checked in and awaiting errata. Changes:
--- cluster/fence/agents/apc/fence_apc.py
+++ cluster/fence/agents/apc/fence_apc.py
@@ -59,6 +59,7 @@
CONTROL_CONSOLE = "Control Console -----"
DEVICE_MANAGER = "Device Manager -----"
OUTLET_CONTROL = "- Outlet Control/Configuration -----"
+OUTLET_MANAGE = "- Outlet Management -----"
CONTROL_OUTLET = "- Control Outlet -----"
CONTROL_OUTLET_2 = "- Outlet Control "
COMMAND_SUCCESS = "Command successfully issued."
@@ -70,9 +71,12 @@
USERNAME = "User Name :"
PASSWORD = "Password :"
MASTER = "------- MasterSwitch"
+FIRMWARE_STR = "Rack PDU APP"
CONTINUE_INDEX = 0
+FIRMWARE_REV = 2
+
regex_list = list()
regex_list.append(CONTINUE)
regex_list.append(SCREEN_END)
@@ -468,6 +472,7 @@
sys.exit(1)
def log_in(buffer):
+ global FIRMWARE_REV
lines = buffer.splitlines()
for i in lines:
@@ -480,6 +485,17 @@
logit("Sending password: %s\n" % passwd)
return (NOT_COMPLETE, passwd + "\r")
elif i.find(CONTROL_CONSOLE) != (-1):
+ #while we are here, grab the firmware revision
+ rev_search_lines = buffer.splitlines()
+ for rev_search_line in rev_search_lines: #search screen again
+ rev_dex = rev_search_line.find(FIRMWARE_STR)
+ if rev_dex != (-1): #found revision line
+ scratch_rev = rev_search_line[rev_dex:]
+ v_dex = scratch_rev.find("v")
+ if v_dex != (-1):
+ if scratch_rev[v_dex + 1] == "3": #format is v3.3.4
+ FIRMWARE_REV = 3
+ break
return (COMPLETE, "1\r")
def do_status_check(sock):
@@ -537,7 +553,12 @@
if switchnum != "":
res = switchnum + "\r"
else:
- res = "3\r"
+ if FIRMWARE_REV == 2:
+ res = "3\r"
+ elif FIRMWARE_REV == 3:
+ res = "2\r1\r"
+ else: #placeholder for future revisions
+ res = "3\r"
return (NOT_COMPLETE, res, "Status Unknown")
elif i.find(OUTLET_CONTROL) != (-1):
ls = buffer.splitlines()
@@ -639,7 +660,12 @@
if switchnum != "":
res = switchnum + "\r"
else:
- res = "3\r"
+ if FIRMWARE_REV == 2:
+ res = "3\r"
+ elif FIRMWARE_REV == 3:
+ res = "2\r1\r"
+ else: #placeholder for future revisions - sheesh
+ res = "3\r"
return (NOT_COMPLETE, res)
elif (i.find(master_search_str1) != (-1)):
@@ -660,6 +686,11 @@
elif i == outlet_search_str5:
return (NOT_COMPLETE, "1\r")
+ elif i.find(OUTLET_MANAGE) != (-1):
+ #return (NOT_COMPLETE, "1\r")
+ return (NOT_COMPLETE, "\r")
+
+ #elif i.find(OUTLET_CONTROL) != (-1) or i.find(OUTLET_MANAGE) != (-1):
elif i.find(OUTLET_CONTROL) != (-1):
ls = buffer.splitlines()
portval = port.strip()
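For anyone reading the diff without the full agent at hand, here is a minimal standalone sketch of the two ideas in the patch: detect the firmware major revision from the Control Console screen, then pick the menu keystrokes based on it. The function names `detect_firmware_rev` and `menu_keys_for_revision` and the sample banner text are hypothetical and not part of fence_apc.py; only the `FIRMWARE_STR` marker and the keystroke strings are taken from the patch. The extra bounds check before indexing is an addition in this sketch, not something the patch does.

```python
FIRMWARE_STR = "Rack PDU APP"  # marker string taken from the patch


def detect_firmware_rev(screen, default=2):
    """Return 3 if the screen reports a v3.x firmware, else the default.

    Hypothetical helper: mirrors the search added to log_in() in the patch,
    but returns the value instead of setting a global.
    """
    for line in screen.splitlines():
        rev_dex = line.find(FIRMWARE_STR)
        if rev_dex == -1:
            continue  # not the revision line
        scratch_rev = line[rev_dex:]
        v_dex = scratch_rev.find("v")
        # Bounds check (assumption of this sketch) so a trailing "v"
        # cannot raise IndexError.
        if v_dex != -1 and v_dex + 1 < len(scratch_rev):
            if scratch_rev[v_dex + 1] == "3":  # format is v3.3.4
                return 3
    return default


def menu_keys_for_revision(firmware_rev, switchnum=""):
    """Pick the keystrokes to send, following the branches in the patch."""
    if switchnum != "":
        return switchnum + "\r"
    if firmware_rev == 3:
        return "2\r1\r"
    return "3\r"  # 2.x behaviour, also used as the fallback


if __name__ == "__main__":
    # The banner layout below is an assumed example, not captured output.
    sample = "Some header\nRack PDU APP : v3.3.4\n"
    rev = detect_firmware_rev(sample)
    print(rev, repr(menu_keys_for_revision(rev)))
```

Returning the revision rather than mutating a global makes the logic easy to exercise on saved screen text without a live device.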
Created attachment 159397 [details]
corrected agent
This is the complete, fixed agent. To use it, rename the file to fence_apc and, as root, place it in /sbin with permissions set to rwxr-xr-x (0755).
Hot fix requested, so adding the z-stream flag to start the process of creating the z-stream errata request.
Created attachment 272281 [details]
Allow the menu to fall through to the correct place.
I believe that this may also need to be applied to RHEL 4 as well.
User wmealing's account has been closed
Created attachment 304382 [details]
fence_apc for rhel4
Created attachment 304383 [details]
fence_apc for rhel5
I have verified that the patch works. I had a customer test the fence_apc scripts I uploaded above, and they work. Will this be included in the next release of fence_apc? --sbradley
Description of problem:
The fence_apc script works with firmware versions 2.7 and 2.8 on APC fencing devices, but fails on 3.x and later because of a menu change in the telnet interface.
Version-Release number of selected component (if applicable):
Tested on cman-2.0.64-1.el5 (but this bug is for a customer on 4)
How reproducible:
Every time
Steps to Reproduce:
0. Install cman, configure an APC to fence a single machine.
1. Upgrade an APC power supply from firmware version 2.x to 3.x.
2. fence_apc fails because of unexpected (new) menu options.
3. Command: /sbin/fence_apc.py -v -a 10.64.69.44 -l username -n 7 -p password -o reboot
Actual results:
fence_apc fails with a traceback (cut for sanity):
agent "fence_apc" reports: known screen encountered in \n" + str(lines) + "\n"
unknown screen encountered in ['', '> 1', '', '', '------- Phase Management ------------------------------------------------------', '', ' Phase Load : 2.4', ' Phase State: Normal Load ',
agent "fence_apc" reports: '', ' 1- Overload Alarm Threshold(amps) : 16', ' 2- Near Overload Warning Threshold(amps): 12', ' 3- Low LoadWarning Threshold(amps) : 0', ' 4- Accept Changes : ', '', ' ?- Help, <ESC>- Back, <ENTER>-
Expected results:
Power supply to be dropped to the node, then rebooted.
Additional info:
I made some changes to accommodate this menu, file to come.
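The failure mode described here can be illustrated with a small sketch of marker-based screen matching. This is not the agent's real code: `KNOWN_SCREENS`, `UnknownScreen`, and `next_keys` are hypothetical names, and only the marker strings, the keystrokes, and the error message wording are taken from this report. It shows why a screen the agent has no marker for, such as the Phase Management screen in the traceback, ends in an "unknown screen encountered" error.

```python
# Marker -> keystrokes, using strings that appear in the patch on this bug.
KNOWN_SCREENS = {
    "Control Console -----": "1\r",
    "- Outlet Management -----": "\r",
}


class UnknownScreen(Exception):
    """Raised when no known marker appears on the screen (hypothetical)."""


def next_keys(screen):
    """Return the keystrokes for the first known marker found on the screen."""
    lines = screen.splitlines()
    for line in lines:
        for marker, keys in KNOWN_SCREENS.items():
            if line.find(marker) != -1:
                return keys
    raise UnknownScreen("unknown screen encountered in \n" + str(lines) + "\n")


if __name__ == "__main__":
    # Header line copied from the traceback in this report.
    phase = "> 1\n\n------- Phase Management ------\n Phase Load : 2.4\n"
    try:
        next_keys(phase)
    except UnknownScreen as err:
        print(err)
```

Under this model, a firmware upgrade that renames or reorders menus leaves the agent on a screen with no matching marker, which is consistent with the traceback above.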