Bug 1360986

Summary: [z-stream clone - 3.6.9] VMs are not reported as non-responding even though the qemu process does not respond.
Product: Red Hat Enterprise Virtualization Manager
Reporter: rhev-integ
Component: vdsm
Assignee: Francesco Romani <fromani>
Status: CLOSED ERRATA
QA Contact: sefi litmanovich <slitmano>
Severity: high
Docs Contact:
Priority: medium
Version: 3.6.7
CC: bazulay, bgraveno, fromani, gklein, lsurette, mgoldboi, michal.skrivanek, mkalinin, rhodain, srevivo, ycui, ykaul
Target Milestone: ovirt-3.6.9
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
This update fixes an error in the monitoring code that caused VDSM to incorrectly report that a QEMU process had recovered and was responsive after being unavailable for a short amount of time, while it was actually still unresponsive.
Story Points: ---
Clone Of: 1357798
Environment:
Last Closed: 2016-09-21 18:07:15 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1357798
Bug Blocks:

Comment 2 sefi litmanovich 2016-08-28 11:35:18 UTC
Verified with rhevm-3.6.8.1-0.1.el6.noarch, host with vdsm-4.17.34-1.el7ev.noarch.

Verified according to steps in description.
Result after kill -19 <qemu-pid> :
vdsClient reports the VM in state 'UP', but in the engine the VM is reported as not responding, which is the expected result.
I am not sure I understood the implementation of this fix; let me know if I missed something that should also be checked.
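
For reference, the "kill -19" step can be sketched in Python. This is only an illustration under stated assumptions: signal 19 is SIGSTOP on common Linux architectures such as x86, and the sketch stops a throwaway child process (a stand-in, not a real qemu process) because waitpid() can only observe the caller's own children. The helper names stop_and_check and demo are hypothetical.

```python
import os
import signal
import subprocess
import sys


def stop_and_check(pid):
    """Send SIGSTOP to a child process and report whether it actually stopped."""
    os.kill(pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid() return for stopped children, not only exited ones.
    _, status = os.waitpid(pid, os.WUNTRACED)
    return os.WIFSTOPPED(status)


def demo():
    # Stand-in for a qemu process: a child that only sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        return stop_and_check(child.pid)
    finally:
        # SIGKILL is delivered even to a stopped process, so cleanup is safe.
        child.kill()
        child.wait()


if __name__ == "__main__":
    print("child stopped:", demo())
```

A stopped process stays alive but executes nothing, which is why a monitor that only checks for process existence would still see the VM as up.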

Comment 4 Francesco Romani 2016-08-29 08:43:47 UTC
(In reply to sefi litmanovich from comment #2)
> Verified with rhevm-3.6.8.1-0.1.el6.noarch, host with
> vdsm-4.17.34-1.el7ev.noarch.
> 
> Verified according to steps in description.
> Result after kill -19 <qemu-pid> :
> vdsClient is reporting the vm in state 'UP'

In vdsClient:
1. If you "list" VMs, you should see them marked as 'UP*' (please note the asterisk).

2. If you use 'getAllVmStats', you should see 'monitorResponse' = -1.

This is how Vdsm reports unresponsive VMs.
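
A minimal sketch of checking for this condition in a script, assuming the stats have already been fetched and parsed into a list of dictionaries with 'vmId', 'status' and 'monitorResponse' keys. The function names, the sample data, and the assumption that the value may arrive either as a number or as a string are illustrative, not taken from the Vdsm code.

```python
def is_unresponsive(stats):
    """Return True if a VM stats dict carries monitorResponse == -1."""
    try:
        # Assumption: the value may be an int or a numeric string.
        return int(stats.get("monitorResponse", 0)) == -1
    except (TypeError, ValueError):
        return False


def display_status(stats):
    """Mimic the 'UP*' style marker: append '*' for unresponsive VMs."""
    status = stats.get("status", "Unknown")
    return status + "*" if is_unresponsive(stats) else status


def unresponsive_vms(all_vm_stats):
    """Collect the IDs of all VMs flagged as unresponsive."""
    return [s.get("vmId") for s in all_vm_stats if is_unresponsive(s)]


# Hypothetical sample data, shaped like a parsed stats listing.
sample = [
    {"vmId": "vm-a", "status": "Up", "monitorResponse": "0"},
    {"vmId": "vm-b", "status": "Up", "monitorResponse": "-1"},
]

if __name__ == "__main__":
    for vm in sample:
        print(vm["vmId"], display_status(vm))
    print("unresponsive:", unresponsive_vms(sample))
```

The point of the sketch is that the 'status' field alone stays 'Up' for a stopped qemu process, so a verification script needs to look at 'monitorResponse' as well.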

Comment 6 errata-xmlrpc 2016-09-21 18:07:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1925.html