Created attachment 970654
Description of problem:
Faulty behaviour when hosts work with JSON-RPC.
Rebooting or shutting down a host in 'up' state resulted in several wrong behaviours:
a. Rebooting/shutting down a host which is SPM: the host becomes "connecting", I can't 'confirm host has been rebooted', and it stays stuck that way until the engine is restarted.
b. Rebooting/shutting down a host which is HSM: the host stays up even though it should become non responsive. This changes only after the engine is restarted.
This behaviour doesn't occur after switching the hosts to XML-RPC and re-installing them.
Version-Release number of selected component (if applicable):
vdsm-jsonrpc-java-1.0.12-1.el6ev.noarch (this is the version from the next release, vt13.4, which was patched for testing for bz: https://bugzilla.redhat.com/show_bug.cgi?id=1149832)
vdsm-184.108.40.206-4.el7ev.x86_64 (all the hosts are rh7)
Steps to Reproduce:
1. Set up rh7 hosts on a 3.5 engine, configured to work with JSON-RPC.
2. (In my setup I have two hosts in one DC and a third in another; this doesn't seem crucial.)
3. Choose the SPM host and shut it down manually (make sure no power management is configured): scenario (a).
4. After all hosts are back up again, choose an HSM host and do the same: scenario (b).
Actual results:
As in my description above.
In both cases the only resolution is 'service ovirt-engine restart'.
Expected results:
The host becomes non responsive upon shutdown.
Switching the host back on and choosing the 'confirm host has been rebooted' option in the UI, to invoke the manual fencing flow, brings the host back to 'up' state.
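For reference, the expected and observed status changes above can be written down as a small table. This is only an illustrative model of the behaviour described in this report, not engine code: the names `HostStatus`, `EXPECTED`, `next_status` and the event strings are made up for the sketch.

```python
from enum import Enum


class HostStatus(Enum):
    UP = "up"
    CONNECTING = "connecting"
    NON_RESPONSIVE = "non responsive"


# Expected transitions as described in this report.
# The event names are hypothetical labels, not engine identifiers.
EXPECTED = {
    (HostStatus.UP, "shutdown"): HostStatus.NON_RESPONSIVE,
    (HostStatus.NON_RESPONSIVE, "confirm_rebooted"): HostStatus.UP,
}


def next_status(status, event):
    """Return the expected next status, or the unchanged status if no rule applies."""
    return EXPECTED.get((status, event), status)


# Status observed after shutdown with JSON-RPC, per scenarios (a) and (b).
OBSERVED_AFTER_SHUTDOWN = {
    "SPM": HostStatus.CONNECTING,  # stuck until the engine is restarted
    "HSM": HostStatus.UP,          # never marked non responsive
}

if __name__ == "__main__":
    expected = next_status(HostStatus.UP, "shutdown")
    for role, observed in OBSERVED_AFTER_SHUTDOWN.items():
        print(role, "expected:", expected.value, "observed:", observed.value)
```

In both scenarios the observed status differs from the expected 'non responsive', which is the mismatch this bug is about.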
Can you please retest with newer build?
Verified with vt13.4 + the http://gerrit.ovirt.org/#/c/36332/ patch, so in practice this will work in the next release.
Verified according to the same two scenarios as in the description.
Waiting for you guys to move the bz to POST and then ON_QA, and then I will verify.
*** Bug 1176527 has been marked as a duplicate of this bug. ***
Verified with rhevm-3.5.0-0.27.el6ev.noarch according to the scenarios in the description.
RHEV 3.5.0 was released. Closing.