Description of problem:
The fence_apc script works with firmware versions 2.7 and 2.8 on APC fencing devices, but fails on 3.x and later because of a menu change in the telnet interface.

Version-Release number of selected component (if applicable):
Tested on cman-2.0.64-1.el5 (but this bug is for a customer on RHEL 4)

How reproducible:
Every time

Steps to Reproduce:
0. Install cman and configure an APC device to fence a single machine.
1. Upgrade the APC power supply from firmware 2.x to 3.x.
2. fence_apc fails because of unexpected (new) menu options.
3. Command used: /sbin/fence_apc.py -v -a 10.64.69.44 -l username -n 7 -p password -o reboot

Actual results:
fence_apc fails with a traceback (trimmed for brevity):

agent "fence_apc" reports: known screen encountered in \n" + str(lines) + "\n"
unknown screen encountered in ['', '> 1', '', '', '------- Phase Management ------------------------------------------------------', '', ' Phase Load : 2.4', ' Phase State: Normal Load ',
agent "fence_apc" reports: '', ' 1- Overload Alarm Threshold(amps) : 16', ' 2- Near Overload Warning Threshold(amps): 12', ' 3- Low LoadWarning Threshold(amps) : 0', ' 4- Accept Changes : ', '', ' ?- Help, <ESC>- Back, <ENTER>-

Expected results:
Power to the node is cut and then restored, rebooting the node.

Additional info:
I made some changes to accommodate this menu; file to come.
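For readers unfamiliar with how the agent walks the telnet menus, the failure above can be illustrated with a small sketch. It is written for Python 3 and is not the agent's actual code: classify_screen and KNOWN_SCREENS are hypothetical names, the marker strings are the pre-fix constants quoted in the patch later in this thread, and the sample screen is an abbreviated copy of the traceback. The assumption is that the agent recognises a screen by finding one of its marker strings in the received lines, so the 3.x "Phase Management" screen matches nothing.

```python
# Marker strings as quoted in the agent's constants (pre-fix set).
KNOWN_SCREENS = {
    "control_console": "Control Console -----",
    "device_manager": "Device Manager -----",
    "outlet_control": "- Outlet Control/Configuration -----",
    "control_outlet": "- Control Outlet -----",
}


def classify_screen(lines, known=KNOWN_SCREENS):
    """Return the name of the first known screen whose marker occurs in lines.

    Hypothetical helper: raises ValueError when no marker matches.
    """
    for line in lines:
        for name, marker in known.items():
            if marker in line:
                return name
    raise ValueError("unknown screen encountered in\n" + str(lines) + "\n")


# Abbreviated copy of the 3.x screen from the traceback.
phase_screen = [
    "",
    "> 1",
    "------- Phase Management ------------------------------------------------------",
    " Phase Load : 2.4",
    " Phase State: Normal Load ",
]

try:
    classify_screen(phase_screen)
except ValueError as exc:
    # No pre-fix marker matches, so the walk through the menus stops here.
    print(exc)
```

A 2.x-style screen such as ["------- Device Manager -----"] is classified as "device_manager" by the same function, which is why the older firmware is unaffected.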
Created attachment 157841
Modified fence_apc.py to fence 3.3
I believe this applies to rhel 5 cluster suite also.
I think you're the fencing person, Jim?
Requesting z-stream release for this bug. This is a dire problem, and needs a fix pushed at once. New agent is checked in and awaiting errata.
Changes:

--- cluster/fence/agents/apc/fence_apc.py
+++ cluster/fence/agents/apc/fence_apc.py
@@ -59,6 +59,7 @@
 CONTROL_CONSOLE = "Control Console -----"
 DEVICE_MANAGER = "Device Manager -----"
 OUTLET_CONTROL = "- Outlet Control/Configuration -----"
+OUTLET_MANAGE = "- Outlet Management -----"
 CONTROL_OUTLET = "- Control Outlet -----"
 CONTROL_OUTLET_2 = "- Outlet Control "
 COMMAND_SUCCESS = "Command successfully issued."
@@ -70,9 +71,12 @@
 USERNAME = "User Name :"
 PASSWORD = "Password :"
 MASTER = "------- MasterSwitch"
+FIRMWARE_STR = "Rack PDU APP"

 CONTINUE_INDEX = 0

+FIRMWARE_REV = 2
+
 regex_list = list()
 regex_list.append(CONTINUE)
 regex_list.append(SCREEN_END)
@@ -468,6 +472,7 @@
     sys.exit(1)

 def log_in(buffer):
+  global FIRMWARE_REV
   lines = buffer.splitlines()

   for i in lines:
@@ -480,6 +485,17 @@
       logit("Sending password: %s\n" % passwd)
       return (NOT_COMPLETE, passwd + "\r")
     elif i.find(CONTROL_CONSOLE) != (-1):
+      #while we are here, grab the firmware revision
+      rev_search_lines = buffer.splitlines()
+      for rev_search_line in rev_search_lines: #search screen again
+        rev_dex = rev_search_line.find(FIRMWARE_STR)
+        if rev_dex != (-1): #found revision line
+          scratch_rev = rev_search_line[rev_dex:]
+          v_dex = scratch_rev.find("v")
+          if v_dex != (-1):
+            if scratch_rev[v_dex + 1] == "3": #format is v3.3.4
+              FIRMWARE_REV = 3
+              break
       return (COMPLETE, "1\r")

 def do_status_check(sock):
@@ -537,7 +553,12 @@
       if switchnum != "":
         res = switchnum + "\r"
       else:
-        res = "3\r"
+        if FIRMWARE_REV == 2:
+          res = "3\r"
+        elif FIRMWARE_REV == 3:
+          res = "2\r1\r"
+        else: #placeholder for future revisions
+          res = "3\r"
       return (NOT_COMPLETE, res, "Status Unknown")
     elif i.find(OUTLET_CONTROL) != (-1):
       ls = buffer.splitlines()
@@ -639,7 +660,12 @@
       if switchnum != "":
         res = switchnum + "\r"
       else:
-        res = "3\r"
+        if FIRMWARE_REV == 2:
+          res = "3\r"
+        elif FIRMWARE_REV == 3:
+          res = "2\r1\r"
+        else: #placeholder for future revisions - sheesh
+          res = "3\r"
       return (NOT_COMPLETE, res)
     elif (i.find(master_search_str1) != (-1)):
@@ -660,6 +686,11 @@
     elif i == outlet_search_str5:
       return (NOT_COMPLETE, "1\r")

+    elif i.find(OUTLET_MANAGE) != (-1):
+      #return (NOT_COMPLETE, "1\r")
+      return (NOT_COMPLETE, "\r")
+
+    #elif i.find(OUTLET_CONTROL) != (-1) or i.find(OUTLET_MANAGE) != (-1):
     elif i.find(OUTLET_CONTROL) != (-1):
       ls = buffer.splitlines()
       portval = port.strip()
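The two ideas in this patch, reading the firmware revision off the login screen and choosing the Device Manager keystrokes from it, can be sketched in isolation as follows. This is a Python 3 illustration under stated assumptions, not the checked-in code: firmware_major and device_manager_keys are hypothetical helpers, and the sample banner line is invented for the example (only the "Rack PDU APP" marker and the "v3.3.4" version format come from the patch). Unlike the patch, the sketch reads all leading digits after the "v" rather than a single character.

```python
FIRMWARE_STR = "Rack PDU APP"  # marker string taken from the patch


def firmware_major(screen, default=2):
    """Return the major firmware revision found on a login screen.

    Finds FIRMWARE_STR, then the first 'v' after it, and reads the digits
    that follow. Falls back to `default` when nothing can be parsed.
    """
    for line in screen.splitlines():
        pos = line.find(FIRMWARE_STR)
        if pos == -1:
            continue
        tail = line[pos + len(FIRMWARE_STR):]
        v_pos = tail.find("v")
        if v_pos == -1:
            break
        digits = ""
        for ch in tail[v_pos + 1:]:
            if not ch.isdigit():
                break
            digits += ch
        if digits:
            return int(digits)
        break
    return default


def device_manager_keys(major):
    """Keystrokes sent at the Device Manager menu, mirroring the patch."""
    if major == 3:
        return "2\r1\r"
    return "3\r"  # 2.x behaviour, also the fallback for other revisions


# Hypothetical banner text; the real screen layout may differ.
banner = "American Power Conversion\n Rack PDU APP v3.3.4\n"
print(firmware_major(banner), repr(device_manager_keys(firmware_major(banner))))
```

Defaulting to revision 2 keeps the pre-patch behaviour whenever the banner is missing or has an unexpected format, which matches the patch's choice of initialising FIRMWARE_REV to 2.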
Created attachment 159397
corrected agent

This is the complete, fixed agent. To use it, rename the file to fence_apc and, as root, place it in /sbin with permissions set to rwxr-xr-x (0755).
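The install step described above could be scripted roughly like this. It is a minimal sketch, assuming Python 3 and a Linux host: install_agent is a hypothetical helper, the name of the downloaded attachment is not given in this thread so the source path in the usage line is a placeholder, and writing to /sbin requires root.

```python
import os
import shutil


def install_agent(src, dest_dir="/sbin", name="fence_apc"):
    """Copy src to dest_dir/name and set its mode to rwxr-xr-x (0755)."""
    dest = os.path.join(dest_dir, name)
    shutil.copyfile(src, dest)  # overwrites an existing agent of that name
    os.chmod(dest, 0o755)
    return dest
```

Usage, with a placeholder path for the downloaded file: install_agent("/tmp/downloaded_agent"). Backing up the existing /sbin/fence_apc first is a sensible precaution, since copyfile replaces it.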
Hot fix requested, so adding the z-stream flag so the process gets started to create the z-stream errata request.
Created attachment 272281
Allow the menu to fall through to the correct place.

I believe this may need to be applied to RHEL 4 as well.
User wmealing's account has been closed
Created attachment 304382
fence_apc for rhel4
Created attachment 304383
fence_apc for rhel5
I have verified that the patch works. I had a customer test the fence_apc scripts I uploaded above, and they work. Will this be included in the next release of fence_apc? --sbradley