Fencing is disabled for the first 5 minutes after engine startup (the interval can be changed using the engine-config option DisableFenceAtStartupInSec). If a host becomes NonResponsive during that interval, it is not fenced automatically: either an administrator has to fence it manually (an error message in the audit log asks for this), or the host has to become responsive again by itself. The DisableFenceAtStartupInSec option has existed since 3.1 to prevent fencing storms after a whole data center outage: hosts usually take much longer to boot than the engine, so we need to give them time to recover and must not fence them while they are booting. Unfortunately this option doesn't work well with hosted engine, especially in the scenario described in [1]. To solve this issue we will schedule a job that starts once the DisableFenceAtStartupInSec interval is over and executes fencing on all hosts that are NonResponsive at that point.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1506217#c4
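To illustrate the proposed behavior, here is a minimal Python sketch of a one-shot job that fires after the grace period and fences whatever is still NonResponsive. It is not the engine's actual implementation: the Host class, the fence callback, and both function names are hypothetical, and only the DisableFenceAtStartupInSec name and its 5 minute default come from the description above.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Grace period from the description above (5 minutes by default).
DISABLE_FENCE_AT_STARTUP_IN_SEC = 300


@dataclass
class Host:
    """Hypothetical stand-in for an engine host record."""
    name: str
    status: str  # e.g. "Up" or "NonResponsive"


def fence_non_responsive(hosts: Iterable[Host],
                         fence: Callable[[Host], None]) -> List[str]:
    """Fence every host that is NonResponsive right now; return their names."""
    fenced = []
    for host in hosts:
        if host.status == "NonResponsive":
            fence(host)
            fenced.append(host.name)
    return fenced


def schedule_startup_fencing(hosts: Iterable[Host],
                             fence: Callable[[Host], None],
                             delay_sec: float = DISABLE_FENCE_AT_STARTUP_IN_SEC
                             ) -> threading.Timer:
    """Run fence_non_responsive once, delay_sec seconds after engine startup.

    Host status is read when the timer fires, not when it is scheduled, so
    hosts that recovered during the grace period are left alone.
    """
    timer = threading.Timer(delay_sec, fence_non_responsive, args=(hosts, fence))
    timer.daemon = True
    timer.start()
    return timer
```

The point of the sketch is the timing: the status check happens only after the interval has elapsed, so a host that finishes booting inside the grace period is never fenced, while one that stays NonResponsive no longer waits for manual intervention.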
We could easily change the default on hosted engine?
(In reply to Yaniv Kaul from comment #1)
> We could easily change the default on hosted engine?

What do you mean by that? Enable that feature only on hosted engine? If so, then yes, we could introduce an option to enable/disable that feature, so that HE setup can change the default if needed.
Verified on ovirt-engine-4.2.2-0.1.el7.noarch. NonResponsive hosts are fenced once the grace period following the engine startup sequence has elapsed.
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.