Red Hat Bugzilla – Bug 1467469
Coordinate fence test documentation and add a fence test section to fence configuration procedure
Last modified: 2017-12-12 11:36:06 EST
Section Number and Name:
1.3. Fencing Configuration
Describe the issue:
This section describes how to configure fencing, but it does not describe how to test it.
Some customers restart the network to confirm that a network restart does not affect the cluster, on the assumption that the restart will not exceed the timeout in corosync.conf.
However, corosync monitors the network interfaces that it uses to monitor the other nodes.
Once those network devices go down, corosync starts its recovery process (node restart).
So the network should not be restarted while corosync is running.
A network restart is also not a good way to test fencing.
Suggestions for improvement:
We should add the following statements to the guide.
1. A network restart triggers fencing of the node on which the network is restarted, even though the timeout is not exceeded.
2. Blocking incoming and outgoing packets is one of the proper ways to test fencing.
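As a rough illustration of suggestion 2, the sketch below builds (but does not run) iptables commands that drop corosync traffic on the node under test, along with the matching commands that remove the rules afterwards. It assumes that corosync uses UDP port 5405, its default mcastport; check corosync.conf for the port actually in use. The helper names block_commands and unblock_commands are hypothetical and are not part of any cluster tooling.

```python
# Hypothetical helpers: they only build iptables command lines and never
# execute them. Run the resulting commands as root on the node to be fenced.
COROSYNC_PORT = 5405  # assumption: corosync's default mcastport


def block_commands(port=COROSYNC_PORT):
    """Return iptables commands that drop corosync UDP traffic in both directions."""
    return [
        ["iptables", "-A", "INPUT", "-p", "udp", "--dport", str(port), "-j", "DROP"],
        ["iptables", "-A", "OUTPUT", "-p", "udp", "--dport", str(port), "-j", "DROP"],
    ]


def unblock_commands(port=COROSYNC_PORT):
    """Return the matching commands that delete those rules again."""
    return [["iptables", "-D"] + cmd[2:] for cmd in block_commands(port)]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before anyone runs them.
    for cmd in block_commands() + unblock_commands():
        print(" ".join(cmd))
```

With the traffic blocked, the other nodes should lose contact with this node and fence it, which exercises the fence device without restarting the network service.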
The original issue -- noting that network restart causes fencing -- has been addressed but I'm moving this to 7.5 and changing the title to note that the focus of this BZ is now better fence test documentation.
I will modify the note about network restart as part of the general update to this BZ -- coordinating the fence test documentation. This is now noted as 7.5, but it can be updated on the Portal whenever we complete it.
The current note about network restart says this:
Once fencing is configured and a cluster has been started, a network restart will trigger fencing for the node which restarts the network even when the timeout is not exceeded. For this reason, testing your fence device by disabling the network interface will not properly test fencing. For information on testing a fence device, see Fencing in a Red Hat High Availability Cluster and How to test fence devices and fencing configuration in a RHEL 5, 6, or 7 High Availability cluster?.
Would this work as a rewrite, with two bullets phrased as instructions for a user?
Once fencing is configured and a cluster has been started, a network restart will trigger fencing for the node which restarts the network even when the timeout is not exceeded. For this reason, you should keep the following in mind:
* Do not restart the network service while the cluster service is running, because this will trigger unintentional fencing of the node.
* Do not test your fence device by disabling the network interface, as this will not properly test fencing. For information on testing a fence device, see Fencing in a Red Hat High Availability Cluster and How to test fence devices and fencing configuration in a RHEL 5, 6, or 7 High Availability cluster?.
The updated note is on the Portal here:
Not closing this yet, though, because we still might move the testing info from the Portal to this document, although we currently do point to it and that might remain the best place for it.
New section on testing a fence device is on the Portal:
Updated fence testing section now on Portal: