Bug 655764 - [RFE] Add "diag" option to fence_ipmilan to support ipmi chassis power diag option
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: fence-agents
Version: 6.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 6.1
Assignee: Marek Grac
QA Contact: Cluster QE
Docs Contact: Jana Heves
URL:
Whiteboard:
Keywords: FutureFeature, TechPreview
Depends On:
Blocks: 676286 678061 679847 702988
 
Reported: 2010-11-22 11:51 UTC by Gary Smith
Modified: 2018-11-14 18:43 UTC (History)

Diagnostic pulse can now be issued

A diagnostic pulse can now be issued on the IPMI interface using the fence_ipmilan agent. This new Technology Preview is used to force a kernel dump of a host if the host is configured to do so. Note that this feature is not a substitute for the `off` operation in a production cluster.
Clone Of:
: 678061 (view as bug list)
Last Closed: 2011-05-19 14:21:53 UTC


Attachments
Proposed patch (4.04 KB, patch)
2011-01-13 13:21 UTC, Marek Grac
RHEL6 merged/tested patch (1.20 KB, patch)
2011-04-06 13:48 UTC, Lon Hohberger


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0745 normal SHIPPED_LIVE fence-agents bug fix and enhancement update 2011-05-19 09:37:09 UTC

Description Gary Smith 2010-11-22 11:51:15 UTC
Description of problem:

To enhance the fence_ipmilan agent it could be useful to add a new operation to the '-o' option for diagnostic purposes.
  
Available operations in the current release are:

-o <op> Operation to perform.
Valid operations: on, off, reboot, status, list or monitor

A new operation 'diag' would be very helpful to allow fence_ipmilan to forward the request "ipmitool chassis power diag" to the remote host.

This request will force the node's kernel to go into dump mode. If the node is already in the dump process the DIAG signal will be ignored.
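
The requested forwarding can be sketched roughly as follows. This is only an illustration of the idea, not the agent's actual implementation: the operation-to-subcommand table and the function name build_ipmitool_command are hypothetical, and the choice of "cycle" for reboot is an assumption. The password is not placed on the command line; the sketch assumes ipmitool's -E option, which reads it from the IPMI_PASSWORD environment variable.

```python
# Hypothetical mapping from fence operations to `ipmitool chassis power`
# subcommands. The real agent's internal mapping may differ.
CHASSIS_POWER_SUBCOMMAND = {
    "on": "on",
    "off": "off",
    "reboot": "cycle",   # assumption: reboot expressed as a power cycle
    "status": "status",
    "diag": "diag",      # the operation requested in this report
}


def build_ipmitool_command(host, user, operation, interface="lan"):
    """Return the argv list for the ipmitool call matching `operation`."""
    try:
        subcommand = CHASSIS_POWER_SUBCOMMAND[operation]
    except KeyError:
        raise ValueError("unsupported operation: %r" % (operation,))
    # -E makes ipmitool take the password from IPMI_PASSWORD, so it does
    # not show up in the process list.
    return [
        "ipmitool", "-I", interface, "-H", host, "-U", user, "-E",
        "chassis", "power", subcommand,
    ]


if __name__ == "__main__":
    # Host and user names here are placeholders.
    print(" ".join(build_ipmitool_command("node1-bmc", "admin", "diag")))
```

The returned list could be handed to subprocess.run() unchanged; list and monitor are left out because they do not correspond to a chassis power subcommand.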

Additional info:

This feature request will be very helpful in our large cluster environment.

Comment 2 Marek Grac 2011-01-13 13:21:19 UTC
Created attachment 473316
Proposed patch

Add option "diag" as new operation. On my machine I got:

Uhhuh. NMI received for unknown reason 31.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

but the machine is still up and running. I believe the signal was sent correctly, but my machine is not configured to support it.
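
One plausible explanation, offered as an assumption and not confirmed anywhere in this report, is that the kernel on the test machine was not set to panic on an unknown NMI. The sketch below reads the two NMI-related sysctls, kernel.unknown_nmi_panic and kernel.panic_on_unrecovered_nmi, from /proc/sys/kernel. The helper names are hypothetical, and the check says nothing about whether kdump is set up to capture a dump after the panic.

```python
import os

# Kernel sysctls that decide whether an NMI leads to a panic.
NMI_SYSCTLS = ("unknown_nmi_panic", "panic_on_unrecovered_nmi")


def read_nmi_sysctls(sysctl_dir="/proc/sys/kernel"):
    """Return {name: True/False/None}; None means the file was unreadable."""
    settings = {}
    for name in NMI_SYSCTLS:
        path = os.path.join(sysctl_dir, name)
        try:
            with open(path) as handle:
                settings[name] = handle.read().strip() == "1"
        except OSError:
            settings[name] = None
    return settings


def panics_on_unknown_nmi(settings):
    """True only if unknown_nmi_panic is explicitly enabled."""
    return settings.get("unknown_nmi_panic") is True


if __name__ == "__main__":
    current = read_nmi_sysctls()
    for name, value in sorted(current.items()):
        print("kernel.%s = %s" % (name, value))
    if not panics_on_unknown_nmi(current):
        print("An unknown NMI would probably be logged and then ignored.")
```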

@Gary: Does this patch do what you expect?

Comment 7 Gary Smith 2011-03-07 09:21:34 UTC
(In reply to comment #2)

> Uhhuh. NMI received for unknown reason 31.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
> 
> but the machine is still up and running. I believe the signal was sent
> correctly, but my machine is not configured to support it.
> 
> @Gary: Does this patch do what you expect?

I'm still waiting for an explanation from them as to how exactly they've configured their hardware and the OS to make this function as they expect. However, they have confirmed that they've tested this functionality from the command line with fence_ipmilan and it works for them as expected.

Comment 24 Lon Hohberger 2011-04-05 18:45:30 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
It is now possible to issue a diagnostic pulse using the IPMI interface using the fence_ipmilan agent.  This is not a substitute for the 'off' operation in a production cluster, but may be used to force a kernel dump of a host if that host is configured to perform dumps.  This feature is considered a Technology Preview.

Comment 26 Lon Hohberger 2011-04-06 13:48:18 UTC
Created attachment 490284
RHEL6 merged/tested patch

Comment 27 Dean Jansa 2011-04-19 15:55:44 UTC
Verified in fence-agents-3.0.12-23.el6.x86_64

Comment 30 Ryan Lerch 2011-05-10 03:43:57 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-It is now possible to issue a diagnostic pulse using the IPMI interface using the fence_ipmilan agent.  This is not a substitute for the 'off' operation in a production cluster, but may be used to force a kernel dump of a host if that host is configured to perform dumps.  This feature is considered a Technology Preview.
+A diagnostic pulse can now be issued on the IPMI interface using the fence_ipmilan agent. This new Technology Preview is used to force a kernel dump of a host if the host is configured to do so. Note that this feature is not a substitute for the 'off' operation in a production cluster.

Comment 31 errata-xmlrpc 2011-05-19 14:21:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0745.html

