+++ This bug was initially created as a clone of Bug #309991 +++

With large memory configurations, some machines take a long time to dump state when a panic occurs. The cluster software may well force a reboot as a fence operation before the dump completes. This causes the loss of data needed to diagnose the root problem. Cluster fencing needs a mechanism to hold off fencing either until the dump completes or until the failed node gives assurance that it will not re-awaken and corrupt shared data.

--- Additional comment from nhorman on 2007-09-28 07:56:30 EDT ---

I've added, as part of bz 269761, the ability to run an arbitrary script from the kdump initrd prior to capturing a vmcore. My thought was that we could use this ability to fork a process that speaks to the cluster suite peer daemons in such a way as to stall the fencing process. This obviously requires that the fencing suite contain some utility to drive the communication appropriately, which can then be added to kdump via /etc/kdump.conf. Thoughts, Jim?

This is a clone to address the bits in kexec-tools which need to be modified or amended in order to provide the required functionality.
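
For illustration only, the stall-the-fence idea above could look roughly like the sketch below: the panicked node repeatedly tells its peers that a dump is in progress, and each peer holds off fencing while those notices keep arriving. Everything here is an assumption made for the example, not something specified in this bug: the message tag, the function names, the use of UDP, and the grace period are all hypothetical, and a real utility would have to be small enough to run from the kdump initrd.

```python
import socket
import time

# Hypothetical message tag; this is not an existing cluster-suite protocol.
MAGIC = b"KDUMP-IN-PROGRESS"


def build_notice(node_name):
    """Encode a 'still dumping' notice for the given node."""
    return MAGIC + b" " + node_name.encode("utf-8")


def parse_notice(payload):
    """Return the node name from a notice, or None if the payload is not one."""
    prefix = MAGIC + b" "
    if not payload.startswith(prefix):
        return None
    return payload[len(prefix):].decode("utf-8", errors="replace")


def send_notices(node_name, peers, count, interval=1.0):
    """Failed-node side: send `count` rounds of notices to each (host, port) peer."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for i in range(count):
            for peer in peers:
                sock.sendto(build_notice(node_name), peer)
            if i + 1 < count:
                time.sleep(interval)


def should_delay_fencing(last_notice_time, now, grace=10.0):
    """Peer side: hold off fencing only while the last notice is recent enough."""
    return last_notice_time is not None and (now - last_notice_time) <= grace
```

In this sketch a peer that stops hearing notices for longer than the grace period goes ahead and fences as usual, so a node that hangs partway through the dump does not block recovery indefinitely.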
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.