Bug 585266 - Add coordination between Kdump and Cluster Fencing for long kernel panic dumps
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kexec-tools
Version: 5.8
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Ryan O'Hara
QA Contact: Red Hat Kernel QE team
Depends On: 309991 461948
Blocks: 585332
Reported: 2010-04-23 15:08 UTC by Lon Hohberger
Modified: 2016-04-26 16:28 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 309991
Last Closed: 2011-07-28 16:25:15 UTC
Target Upstream Version:

Description Lon Hohberger 2010-04-23 15:08:13 UTC
+++ This bug was initially created as a clone of Bug #309991 +++

With large memory configurations, some machines take a long time to dump state
when a panic occurs.  The cluster software may well force a reboot as a fence
operation before the dump completes.  This causes the loss of data that is
important for diagnosing the root problem.

Cluster fencing needs a mechanism to hold off fencing until the dump completes,
or until it has assurance from the failed node that the node will not re-awaken
and corrupt shared data.
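
To illustrate the hold-off described above, here is a minimal sketch of what
the fencing side could check before rebooting a failed node.  Everything in it
is an assumption made for illustration: the UDP transport, the message text
KDUMP_IN_PROGRESS, and the function name node_is_dumping are hypothetical and
are not part of any existing fence agent.

```python
import socket
import time

# Hypothetical payload; the real wire format would be defined by the cluster suite.
KDUMP_MESSAGE = b"KDUMP_IN_PROGRESS"


def node_is_dumping(sock, failed_node_ip, wait_seconds):
    """Return True if failed_node_ip announces a kdump within wait_seconds.

    sock is an already-bound UDP socket.  Datagrams from other hosts, or with
    a different payload, are ignored and do not extend the deadline.
    """
    deadline = time.monotonic() + wait_seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            data, (src_ip, _src_port) = sock.recvfrom(64)
        except socket.timeout:
            return False
        if src_ip == failed_node_ip and data == KDUMP_MESSAGE:
            return True
```

A fence agent built this way would treat a True result as the assurance that
the node is inside the kdump kernel and defer the reboot, and would fall back
to normal fencing on False.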

--- Additional comment from nhorman@redhat.com on 2007-09-28 07:56:30 EDT ---

I've added, as part of bz 269761, the ability to run an arbitrary script from
the kdump initrd prior to capturing a vmcore.  My thought was that we could use
this ability to fork a process that spoke to the cluster suite peer daemons in
such a way as to stall the fencing process.  This obviously requires that the
fencing suite contain some utility to drive the communication appropriately,
which can then be added to kdump via /etc/kdump.conf.  Thoughts, Jim?
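
As a rough sketch of the kind of utility proposed above, the following shows a
helper that announces to the cluster peers that the node has entered kdump.
The UDP transport, the message text, the function name announce_kdump, and its
parameters are all assumptions for illustration.  A helper that actually runs
from the kdump initrd would more likely be a small compiled binary; Python is
used here only to keep the idea readable.

```python
import socket
import time

# Hypothetical payload; the real wire format would be defined by the cluster suite.
KDUMP_MESSAGE = b"KDUMP_IN_PROGRESS"


def announce_kdump(peers, message=KDUMP_MESSAGE, count=3, interval_s=1.0):
    """Send message to every (host, port) in peers, count times over UDP.

    Repeating the datagram guards against packet loss.  Returns the number of
    datagrams handed to the network stack.
    """
    sent = 0
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for attempt in range(count):
            for peer in peers:
                sock.sendto(message, peer)
                sent += 1
            # Pause between rounds, but not after the last one.
            if attempt + 1 < count:
                time.sleep(interval_s)
    finally:
        sock.close()
    return sent
```

Whether the peers should stall fencing for a fixed period after one
announcement, or only for as long as announcements keep arriving, is a design
question this sketch leaves open.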

This is a clone to address the bits in kexec-tools which need to be modified or
amended in order to provide the required functionality.

Comment 2 RHEL Program Management 2011-07-28 16:25:15 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.
