+++ This bug was initially created as a clone of Bug #1316358 +++

Description of problem:
While testing a hyperconverged setup, I set VMs to highly available and defined the fencing agent (idrac7) on the hosts. When a host running VMs is powered off through the DRAC, the VMs do not restart on one of the other nodes. When the host-down condition is detected, the event is shown in the UI and log as "User shutdown from within the guest", which is clearly wrong.

Version-Release number of selected component (if applicable):
RHEV 3.6.3.4-0.1

How reproducible:
This is observed each time.

Steps to Reproduce:
1. Hyperconverged setup with RHEV 3.6 and Gluster 3.7
2. Set VMs to be highly available
3. Power off a host running a VM that is tagged as highly available
4. Confirm that
   a) the message shows that the engine believes the VM to have shut itself down
   b) the VM is not restarted

Actual results:
VMs marked as highly available do NOT get restarted on the other nodes in the cluster.

Expected results:
VMs with the highly available attribute should be restarted.

Additional info:
This issue was reported informally to Doron Fediuck and Roy Golan a couple of weeks ago. Attaching engine.log and vdsm.log from the node shutdown for analysis.

--- Additional comment from Paul Cuzner on 2016-03-09 22:29 EST ---

--- Additional comment from Yaniv Dary on 2016-03-10 04:09:45 EST ---

If this reproduces on non HC setup, please move it to SLA.
Created attachment 1135041 [details] engine log
Created attachment 1135042 [details] vdsm log from the host that was powered off during the test
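For anyone triaging the attached logs, a small helper along these lines could pull out the misleading shutdown events. This is only a sketch: the message text is taken from the description above, while the function name, the case-insensitive match, and the sample lines are assumptions, not the real engine.log format.

```python
from typing import Iterable, List

# Message text quoted from the bug description. Whether engine.log uses
# exactly this wording is an assumption.
MISLEADING_REASON = "User shutdown from within the guest"


def find_misreported_shutdowns(
    lines: Iterable[str], needle: str = MISLEADING_REASON
) -> List[str]:
    """Return the log lines that contain the misleading shutdown reason."""
    wanted = needle.lower()
    return [line.rstrip("\n") for line in lines if wanted in line.lower()]


if __name__ == "__main__":
    # Invented sample lines for illustration only, not copied from the attachment.
    sample = [
        "VM vm1 is down. Exit message: User shutdown from within the guest\n",
        "Host host1 is non responsive.\n",
    ]
    for hit in find_misreported_shutdowns(sample):
        print(hit)
```

To run it against the real attachment, pass an open file handle instead of the sample list, for example `find_misreported_shutdowns(open("engine.log"))`.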
Paul, can you explain why this was cloned from Bug #1316358? Is there an additional action item here?
I initially raised the BZ against RHEV, since I was testing against downstream - but Yaniv D requested that the BZ should be opened against ovirt not rhev. HTH
Need changes to fencing logic, to consider gluster running on the nodes.
Fencing policies related to gluster hosts have been merged. HA for VMs can now be tested by enabling power management on HC nodes.
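To illustrate what a gluster-aware fencing check might look like, here is a minimal sketch. It is not the engine's actual code: the two rules (do not fence while the host's bricks still report online, and do not fence if that would leave a replica volume without a majority of bricks), the majority-based quorum, and all names (`Brick`, `Volume`, `should_fence`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Brick:
    volume: str   # name of the gluster volume the brick belongs to
    host: str     # host serving the brick
    online: bool  # last known brick status


@dataclass
class Volume:
    name: str
    replica_count: int


def should_fence(host: str, bricks: List[Brick], volumes: List[Volume]) -> bool:
    """Hypothetical policy check: True if fencing `host` looks safe for gluster."""
    # Assumed rule 1: a brick on the host is still online, so the host is
    # probably alive and only unreachable for management. Skip fencing.
    if any(b.online for b in bricks if b.host == host):
        return False

    # Assumed rule 2: skip fencing if any volume with a brick on this host
    # would be left without a majority of online bricks on other hosts.
    for vol in volumes:
        vol_bricks = [b for b in bricks if b.volume == vol.name]
        if not any(b.host == host for b in vol_bricks):
            continue
        online_elsewhere = sum(1 for b in vol_bricks if b.online and b.host != host)
        if online_elsewhere < vol.replica_count // 2 + 1:
            return False

    return True


if __name__ == "__main__":
    vols = [Volume("data", replica_count=3)]
    bricks = [
        Brick("data", "host1", online=False),
        Brick("data", "host2", online=True),
        Brick("data", "host3", online=True),
    ]
    print(should_fence("host1", bricks, vols))
```

In the scenario from this bug (one host of a three-node replica powered off, the other two healthy), a check of this shape would allow fencing, which is the precondition for the HA VMs being restarted elsewhere.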
The fix for this issue should be included in oVirt 4.1.0 beta 1 released on December 1st. If not included please move back to modified.
Verified and works fine with build ovirt-engine-4.1.1.2-0.1.el7.noarch.

While testing a hyperconverged setup, I set VMs to highly available and defined the fencing agent on the hosts. When a host running VMs is powered off, the VMs get restarted on one of the other nodes, and the host gets fenced and comes up after a while.
Should be included in 4.1 GA d/s.