Red Hat Bugzilla – Bug 655764
[RFE] Add "diag" option to fence_ipmilan to support ipmi chassis power diag option
Last modified: 2017-01-04 06:55:39 EST
Description of problem:
To enhance the fence_ipmilan agent, it would be useful to add a new operation to the '-o' option for diagnostic purposes. The operations available in the current release are:

  -o <op>  Operation to perform. Valid operations: on, off, reboot, status, list or monitor

A new operation 'diag' would allow fence_ipmilan to forward the request "ipmitool chassis power diag" to the remote host. This request forces the node's kernel into dump mode. If the node is already in the dump process, the DIAG signal is ignored.

Additional info:
This feature request will be very helpful in our large cluster environment.
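To illustrate the request, here is a minimal Python sketch of how an operation name could be translated into an ipmitool invocation. It is not the patch attached to this bug. The helper name `build_ipmitool_command`, the mapping table, and the example host and credentials are hypothetical; only the `chassis power` subcommands and the `-I`, `-H`, `-U`, `-P` options come from ipmitool itself. The reboot, list and monitor operations are left out because the thread does not describe how they map to IPMI requests.

```python
# Hypothetical mapping from fence operation names to
# "ipmitool chassis power <action>" subcommands (illustration only).
IPMI_POWER_ACTIONS = {
    "on": "on",
    "off": "off",
    "status": "status",
    "diag": "diag",  # the operation requested in this bug
}


def build_ipmitool_command(host, user, password, operation, interface="lan"):
    """Return the argument list for an ipmitool chassis power request.

    The command is only built here, never executed, so the sketch is
    safe to run on a machine without ipmitool or a BMC.
    """
    try:
        action = IPMI_POWER_ACTIONS[operation]
    except KeyError:
        raise ValueError(f"unsupported operation: {operation!r}")
    return [
        "ipmitool",
        "-I", interface,   # IPMI interface, e.g. "lan" or "lanplus"
        "-H", host,        # address of the remote management controller
        "-U", user,
        "-P", password,
        "chassis", "power", action,
    ]


if __name__ == "__main__":
    # Example values are placeholders, not taken from the bug report.
    print(" ".join(build_ipmitool_command("192.0.2.10", "admin", "secret", "diag")))
```

Passing the resulting list to `subprocess.run` would send the request; keeping command construction separate makes the mapping easy to check without touching real hardware.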
Created attachment 473316
Proposed patch

Add option "diag" as new operation. On my machine I got:

Uhhuh. NMI received for unknown reason 31.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

but the machine is still up and running. I believe the signal was sent correctly, but my machine is not configured to support it.

@Gary: Does this patch do what you expect?
(In reply to comment #2)
> Uhhuh. NMI received for unknown reason 31.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> but machine is still up and running. I believe that signal was sent correctly
> but my machine is not configured to support it.
>
> @Gary: Does this patch do what you expect?

I'm still waiting for an explanation from them of exactly how they've configured their hardware and the OS to make this work as they expect. However, they have confirmed that they tested this functionality from the command line with fence_ipmilan and that it works as expected.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
It is now possible to issue a diagnostic pulse using the IPMI interface using the fence_ipmilan agent. This is not a substitute for the 'off' operation in a production cluster, but may be used to force a kernel dump of a host if that host is configured to perform dumps. This feature is considered a Technology Preview.
Created attachment 490284
RHEL6 merged/tested patch
Verified in fence-agents-3.0.12-23.el6.x86_64
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-It is now possible to issue a diagnostic pulse using the IPMI interface using the fence_ipmilan agent. This is not a substitute for the 'off' operation in a production cluster, but may be used to force a kernel dump of a host if that host is configured to perform dumps. This feature is considered a Technology Preview.
+A diagnostic pulse can now be issued on the IPMI interface using the fence_ipmilan agent. This new Technology Preview is used to force a kernel dump of a host if the host is configured to do so. Note that this feature is not a substitute for the 'off' operation in a production cluster.
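The note's condition "if the host is configured to do so" matters: a host that is not set up for it just logs the unknown-NMI message and keeps running. The thread does not say how the reporter's hosts were configured. As an assumption, one common Linux setup is to enable the `kernel.unknown_nmi_panic` sysctl together with kdump, so that an unknown NMI panics the kernel and triggers a dump. The Python sketch below only checks that sysctl; the function name is hypothetical, and it does not verify that kdump is actually armed.

```python
from pathlib import Path

# Linux exposes the kernel.unknown_nmi_panic sysctl at this path.
DEFAULT_SYSCTL_PATH = "/proc/sys/kernel/unknown_nmi_panic"


def unknown_nmi_panic_enabled(path=DEFAULT_SYSCTL_PATH):
    """Return True if the sysctl file at `path` contains "1".

    Returns False when the file is missing or unreadable, for example
    on a non-Linux system. This checks only one assumed precondition
    for a diag-triggered dump, not the kdump service itself.
    """
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return False


if __name__ == "__main__":
    state = "enabled" if unknown_nmi_panic_enabled() else "disabled or unavailable"
    print(f"kernel.unknown_nmi_panic is {state}")
```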
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0745.html