Description of problem:
In the event of a full host power outage (including fence devices), a user must wait 19 minutes (3 x 3-minute timeouts + 10 minutes for the transaction reaper) before they can manually fence the host to relocate guests.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Remove all power to an active host, including any fence agents that are configured.
2. Attempt to manually fence the host to relocate guests.
The guests are only relocated once the host has moved to a state of 'non-responsive'. This can take 19 minutes if the fencing is configured but not available.
The guests are relocated if the user confirms the host is down.
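For reference, the 19-minute figure in the description can be reproduced with a short calculation. This is only a sketch of the arithmetic; the constant and function names are illustrative and do not correspond to actual engine configuration keys.

```python
# Values taken from the problem description; names are hypothetical.
TIMEOUT_COUNT = 3            # number of timeouts that must expire
TIMEOUT_MINUTES = 3          # length of each timeout
REAPER_MINUTES = 10          # transaction reaper delay


def worst_case_wait_minutes(count=TIMEOUT_COUNT,
                            timeout=TIMEOUT_MINUTES,
                            reaper=REAPER_MINUTES):
    """Minutes a user waits before manual fencing relocates guests."""
    return count * timeout + reaper


print(worst_case_wait_minutes())  # 3 * 3 + 10 = 19
```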
lee - dup of bug 1044089?
(In reply to Itamar Heim from comment #3)
> lee - dup of bug 1044089?
Bug 1044089 is about allowing acknowledgment that a host has been rebooted to allow VMs in it to failover to remaining hosts while host is in "Connecting" status.
This bug 1044091 is about allowing acknowledgement that host has been rebooted to allow VMs in it to failover to remaining hosts while host is in "Reboot" status.
Lee, do you agree?
Marek, is there a special status returned from the fence-agents package when the agent power has been switched off as described in this BZ?
How can we distinguish that the PM agent card has no power so we can stop retrying the operation?
If the agent cannot perform a 'monitor' action, then you can consider it dead. We do not distinguish whether the problem is login/password, firmware, or a power outage.
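A minimal sketch of that rule, assuming the caller supplies a function that runs the agent's 'monitor' action and returns its exit status, with 0 meaning success. The names `agent_is_dead` and `run_monitor` are hypothetical and not part of fence-agents or the engine; how the monitor action is actually invoked is left to the caller.

```python
from typing import Callable


def agent_is_dead(run_monitor: Callable[[], int], attempts: int = 1) -> bool:
    """Treat the agent as dead if every 'monitor' attempt fails.

    The cause of the failure (credentials, firmware, power outage) is
    deliberately not distinguished, as described in the comment above.
    """
    for _ in range(attempts):
        try:
            if run_monitor() == 0:  # assumption: 0 means monitor succeeded
                return False
        except OSError:
            # The agent could not be run at all; count it as a failure.
            continue
    return True
```

With this check, retries could stop as soon as `agent_is_dead(...)` returns True instead of waiting for every timeout to expire.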
(In reply to Julio Entrena Perez from comment #4)
> (In reply to Itamar Heim from comment #3)
> > lee - dup of bug 1044089?
No, I created BZ#1044089 because manual fencing while the host is 'connecting' fails to failover the SPM role. This bug, BZ#1044091, was created because manual fencing fails to refresh/relocate guests while the host is 'connecting' or 'rebooting'.
> Bug 1044089 is about allowing acknowledgment that a host has been rebooted
> to allow VMs in it to failover to remaining hosts while host is in
> "Connecting" status.
Nope, BZ#1044089 covers the failure to failover the SPM with a manual fence while the host is connecting.
ovirt 3.4.0 alpha has been released
Verified on ovirt-engine-3.4.0-0.7.beta2.el6.noarch
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.