Description of problem:
When I tried fence_apc_snmp on our 64-node cluster I found that it gets confused
by symbolic names for states. Where fence_apc_snmp was expecting "2" it got
instead "outletStatusOff." This can be fixed by adding "-Oe" to the snmpget
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. /sbin/fence_apc_snmp -a east-apc -n 10
Apr 22 13:38:27 north-01 fenced: fencing node "east-08"
Apr 22 13:38:27 north-01 fenced: agent "fence_apc_snmp" reports: invalid
Apr 22 13:38:27 north-01 fenced: fence "east-08" failed
Apr 22 13:38:32 north-01 fenced: fencing node "east-08"
Apr 22 13:38:33 north-01 fenced: fence "east-08" success
Created attachment 311239
Patch fence_apc_snmp to accept symbolic and numeric values
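For illustration only, here is a rough sketch of what "accept symbolic and numeric values" could look like. This is not the attached patch. The function name and lookup table are hypothetical; outletStatusOff = 2 comes from the description above, while outletStatusOn = 1 is an assumption that should be checked against the APC MIB.

```python
import re

# Hypothetical lookup table. outletStatusOff == 2 is taken from the report;
# outletStatusOn == 1 is an assumption, verify it against the APC MIB.
SYMBOLIC_STATES = {
    "outletStatusOn": 1,
    "outletStatusOff": 2,
}


def normalize_outlet_state(raw):
    """Return the integer state for a numeric or symbolic SNMP value."""
    value = raw.strip().strip('"')

    # Plain numeric form, as printed when snmpget is run with -Oe.
    if value.isdigit():
        return int(value)

    # net-snmp can also print enums as label(number), e.g. outletStatusOff(2).
    match = re.fullmatch(r"([A-Za-z]\w*)\((\d+)\)", value)
    if match:
        return int(match.group(2))

    # Bare symbolic label, the case that confused the agent in this report.
    if value in SYMBOLIC_STATES:
        return SYMBOLIC_STATES[value]

    raise ValueError("invalid status %s" % value)
```

With something like this, a caller can compare against the integer state regardless of how snmpget formatted the value. Labels missing from the table still raise an error rather than being guessed.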
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
Done, with 456058
I don't see the patch in the latest cman package and it is still failing.
[root@dash-01 ~]# fence_node dash-03
agent "fence_apc_snmp" reports: invalid status outletStatusMSPOn
Please update status on this bug for 5.3.
This is not ready yet - testing the very latest MIB from APC today.
Marking this as a regression and flagging an exception to get it into rhel 5.3 prior to RC release.
This bugzilla has Keywords: Regression.
Since no regressions are allowed between releases,
it is also being marked as a blocker for this release.
Please resolve ASAP.
I've run with the latest pkgs and this still fails. I talked with Chris; he cannot find the commit which would have included a fix.
Tested on AP7941 (v3.5.6), AP7951 (v2.7), AP7901 (v3.3.4), AP7901 (v3.5.7) with fence_apc_snmp from the RHEL53 branch. The fence agent was executed directly from the command line.
* SNMP on APC 7901 (v3.3.4/3.3.3) doesn't work correctly: it points to an OID which does not exist. Upgrading to the latest firmware, v3.5.7, helps.
* On every other configuration, power on/off, reboot and status work correctly.
The fix for this issue is to upgrade to APC AOS v3.5.7 and APC rpdu v3.5.6 firmware. These firmware versions should be used on any AP79XX series switch in order to use fence_apc_snmp.
NOTE: The above should be release noted, methinks.
Marking as MODIFIED as this is a regression in APC code.
This bug is now documented in the RHEL5.3 release notes. Please refer to the following link within the next 24 hours or so to view the most current build:
Release note added. If any revisions are required, please set the
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.
The Cluster Manager utility (cman) has been updated to version 2.0.97. This applies several bug fixes and enhancements, most notably:
* cman now uses the following firmware versions: APC AOS v3.5.7 and APC rpdu v3.5.6. This fixes a bug that prevented the APC 7901 from using simple network management protocol (SNMP) properly.
* fence_drac, fence_ilo, fence_egenera, and fence_bladecenter agents now support ssh.
* fence_xvmd key files can now be reloaded without restarting.
* A single fence method can now support up to 8 fence devices.
This issue still exists in RH5.3 Beta U3.
The previously attached patch no longer works as new changes have been incorporated. New patch attached.
(In reply to comment #19)
> This issue still exists in RH5.3 Beta U3.
> The previously attached patch no longer works as new changes have been
> incorporated. New patch attached.
Previously attached patch does work, and should be included in RH5.3
Attached patch is missing :)
The version of fence_apc_snmp in cman-2.0.98-1.el5 (5.3 GA) still has the originally reported problem. I can reproduce it on every one of the 30-some APC switches in our lab. These are various models (7911, 7931, 7941, 7952) running various releases from 3.3.3/3.3.4 to 3.5.8/3.5.9.
In every case I've found the 5.3 version fails as shipped, but works after applying the patch from comment #1.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
Bug cannot be reopened as errata was published. Bug was cloned instead (bug #484095)