Bug 1360986 - [z-stream clone - 3.6.9] VMs are not reported as non-responding even though qemu process does not respond.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.6.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-3.6.9
Target Release: ---
Assignee: Francesco Romani
QA Contact: sefi litmanovich
URL:
Whiteboard:
Depends On: 1357798
Blocks:
Reported: 2016-07-28 06:13 UTC by rhev-integ
Modified: 2020-07-16 08:53 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
This update fixes an error in the monitoring code that caused VDSM to incorrectly report that a QEMU process had recovered and was responsive after being unavailable for a short time, while it was actually still unresponsive.
Clone Of: 1357798
Environment:
Last Closed: 2016-09-21 18:07:15 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:1925 | 0 | normal | SHIPPED_LIVE | vdsm 3.6.9 bug fix and enhancement update | 2016-09-21 21:58:32 UTC
oVirt gerrit | 61309 | 0 | master | MERGED | tests: sampling: add FakeClock helper | 2016-07-28 06:14:18 UTC
oVirt gerrit | 61310 | 0 | master | MERGED | vm: periodic: fix stats age reporting | 2016-08-02 10:11:05 UTC
oVirt gerrit | 61420 | 0 | master | MERGED | virt: sampling: add is_empty() method to StatsSample | 2016-08-02 10:06:49 UTC
oVirt gerrit | 62416 | 0 | ovirt-3.6 | MERGED | tests: sampling: add FakeClock helper | 2016-08-19 12:07:51 UTC
oVirt gerrit | 62417 | 0 | ovirt-3.6 | MERGED | virt: sampling: add is_empty() method to StatsSample | 2016-08-19 12:08:00 UTC
oVirt gerrit | 62418 | 0 | ovirt-3.6 | MERGED | vm: periodic: fix stats age reporting | 2016-08-19 12:08:52 UTC

Comment 2 sefi litmanovich 2016-08-28 11:35:18 UTC
Verified with rhevm-3.6.8.1-0.1.el6.noarch, host with vdsm-4.17.34-1.el7ev.noarch.

Verified according to steps in description.
Result after kill -19 <qemu-pid> :
vdsClient reports the VM in state 'UP', but in the engine the VM is reported as not responding, which is the expected result.
I am not sure I understood the implementation of this fix; let me know if I missed something that should be checked as well.
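
For reference, `kill -19 <qemu-pid>` sends signal 19, which is SIGSTOP on Linux x86: the process is frozen but not terminated, so it stops answering the monitor. A minimal Python sketch of the same step follows. The function names are hypothetical, and the resume helper is an addition not mentioned in this comment (it is only there to undo the freeze after the check).

```python
import os
import signal


def freeze_process(pid):
    """Stop a process without killing it (what `kill -19 <pid>` does on Linux x86)."""
    os.kill(pid, signal.SIGSTOP)


def thaw_process(pid):
    """Let a stopped process continue (equivalent to `kill -CONT <pid>`)."""
    os.kill(pid, signal.SIGCONT)
```

Using `signal.SIGSTOP` instead of the literal 19 avoids depending on the platform's signal numbering.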

Comment 4 Francesco Romani 2016-08-29 08:43:47 UTC
(In reply to sefi litmanovich from comment #2)
> Verified with rhevm-3.6.8.1-0.1.el6.noarch, host with
> vdsm-4.17.34-1.el7ev.noarch.
> 
> Verified according to steps in description.
> Result after kill -19 <qemu-pid> :
> vdsClient is reporting the vm in state 'UP'

In vdsClient:
1. If you "list" VMs, you should see them marked as 'UP*' (please note the asterisk).

2. If you use 'getAllVmStats', you should see 'monitorResponse' = -1.

This is how Vdsm reports unresponsive VMs.
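
The check described above can be sketched in Python as follows. This is only an illustration, not Vdsm or vdsClient code: it assumes the 'getAllVmStats' result has already been parsed into a list of dicts, one per VM, and the keys 'vmId' and 'status' as well as both function names are assumptions made for the example. Only 'monitorResponse' = -1 and the 'UP*' marker come from this comment.

```python
def is_unresponsive(stats):
    """Return True if this VM's stats carry monitorResponse == -1."""
    response = stats.get("monitorResponse")
    if response is None:
        return False
    # Assumption: the value may arrive as a string or as an integer.
    return int(response) == -1


def unresponsive_vms(all_vm_stats):
    """Collect the IDs (hypothetical 'vmId' key) of VMs flagged as unresponsive."""
    return [stats.get("vmId") for stats in all_vm_stats if is_unresponsive(stats)]


def display_status(stats):
    """Mimic the asterisk marker: append '*' to the status of an unresponsive VM."""
    status = stats.get("status", "")
    return status + "*" if is_unresponsive(stats) else status


if __name__ == "__main__":
    # Made-up sample data, not real Vdsm output.
    sample = [
        {"vmId": "vm-a", "status": "UP", "monitorResponse": "-1"},
        {"vmId": "vm-b", "status": "UP", "monitorResponse": "0"},
    ]
    for vm in sample:
        print(vm["vmId"], display_status(vm))
```

A verification script could assert that the frozen VM appears in `unresponsive_vms(...)` rather than relying on the plain 'UP' status alone.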

Comment 6 errata-xmlrpc 2016-09-21 18:07:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1925.html

