Bug 1175824 - [JSON RPC] shutdown/reboot a host on state 'up' result in fault behaviour which is resolved only by engine restart
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm-jsonrpc-java
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Piotr Kliczewski
QA Contact: sefi litmanovich
URL:
Whiteboard: infra
Duplicates: 1176527
Depends On:
Blocks: rhev35rcblocker rhev35gablocker
Reported: 2014-12-18 16:37 UTC by sefi litmanovich
Modified: 2016-02-10 19:10 UTC
CC List: 18 users

Fixed In Version: vt13.5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 17:07:05 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
engine log (7.28 MB, text/plain), 2014-12-18 16:37 UTC, sefi litmanovich


Links
System       | ID    | Private | Priority         | Status | Summary                                 | Last Updated
oVirt gerrit | 36332 | 0       | master           | MERGED | NPE when removing empty tracked request | Never
oVirt gerrit | 36335 | 0       | ovirt-3.5        | MERGED | NPE when removing empty tracked request | Never
oVirt gerrit | 36340 | 0       | ovirt-engine-3.5 | MERGED | jsonrpc version bump                    | Never

Description sefi litmanovich 2014-12-18 16:37:00 UTC
Created attachment 970654
engine log

Description of problem:

Faulty behaviour when hosts work with JSON-RPC.

Rebooting or shutting down a host in 'up' state results in the following wrong behaviours:

a. Rebooting/shutting down a host which is SPM: the host becomes "connecting", I can't 'confirm host has been rebooted', and it stays stuck that way until the engine is restarted.

b. Rebooting/shutting down a host which is HSM: the host stays 'up' even though it should become non responsive. This changes only after an engine restart.

This behaviour doesn't occur after updating the hosts to use XML-RPC and re-installing them.

Version-Release number of selected component (if applicable):

rhevm-3.5.0-0.25.el6ev.noarch
vdsm-jsonrpc-java-1.0.12-1.el6ev.noarch (this is the version from the next release, vt13.4, which was patched for testing for bz: https://bugzilla.redhat.com/show_bug.cgi?id=1149832)

vdsm-4.16.8.1-4.el7ev.x86_64 (all the hosts are rh7)


Steps to Reproduce:
1. Set up rh7 hosts on a 3.5 engine, configured to work with JSON-RPC.
2. (In my setup I have two hosts on one DC and a third on another; this doesn't seem crucial.)
3. Choose the SPM host and shut it down manually (make sure no power management is configured) - scenario (a).
4. After all hosts are back up again, choose an HSM host and do the same - scenario (b).

Actual results:

As described above.
In both cases the only resolution is service ovirt-engine restart.


Expected results:

The host becomes non responsive upon shutdown.
Switching the host on and choosing the 'confirm host has been rebooted' option in the UI to invoke the manual fencing flow brings the host back to 'up' state.
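
The expected results can be sketched as a small status model. This is only an illustration of the transitions described in this report, not engine or vdsm-jsonrpc-java code; the HostStatus and Host names and all methods are hypothetical.

```python
from enum import Enum


class HostStatus(Enum):
    UP = "up"
    CONNECTING = "connecting"
    NON_RESPONSIVE = "non responsive"


class Host:
    """Toy model of the expected status transitions (hypothetical names)."""

    def __init__(self, name):
        self.name = name
        self.status = HostStatus.UP
        self.powered_on = True

    def shut_down(self):
        # Manual shutdown, no power management configured.
        self.powered_on = False
        # Expected: the engine notices the lost connection and marks
        # the host non responsive (it must not stay 'up' or hang in
        # 'connecting').
        self.status = HostStatus.NON_RESPONSIVE

    def power_on(self):
        # Powering on alone does not change the status in this model.
        self.powered_on = True

    def confirm_rebooted(self):
        # Models the 'confirm host has been rebooted' manual fencing flow.
        if self.status is not HostStatus.NON_RESPONSIVE:
            raise RuntimeError("host is not non responsive")
        if self.powered_on:
            self.status = HostStatus.UP


# Same flow for an SPM host (scenario a) and an HSM host (scenario b).
host = Host("host-1")
host.shut_down()
host.power_on()
host.confirm_rebooted()
print(host.status.value)
```

In the faulty behaviour reported here, the SPM host stays in 'connecting' and the HSM host stays in 'up' after shut_down(), so the confirm step is never reachable.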


Additional info:

Comment 1 Piotr Kliczewski 2014-12-19 07:51:09 UTC
Can you please retest with a newer build?

Comment 3 sefi litmanovich 2014-12-22 10:29:10 UTC
Verified with vt13.4 + the http://gerrit.ovirt.org/#/c/36332/ patch, so in practice this will work in the next release.

Verified according to the same two scenarios as in the description.
Waiting for you guys to move the bz to Post and then on_qa, and I will verify.

Comment 4 Piotr Kliczewski 2014-12-22 12:17:31 UTC
*** Bug 1176527 has been marked as a duplicate of this bug. ***

Comment 5 sefi litmanovich 2014-12-31 12:28:49 UTC
Verified with rhevm-3.5.0-0.27.el6ev.noarch according to the scenarios in the description.

On hosts:
vdsm-4.16.8.1-4.el7ev.x86_64

Comment 6 Eyal Edri 2015-02-17 17:07:05 UTC
rhev 3.5.0 was released. closing.

