Bug 1389090

Summary: VMs transition to not-responding state then back to UP state.
Product: Red Hat Enterprise Virtualization Manager Reporter: Bimal Chollera <bcholler>
Component: vdsm Assignee: Dan Kenigsberg <danken>
Status: CLOSED NOTABUG QA Contact: meital avital <mavital>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.6.9 CC: bazulay, fromani, gklein, lsurette, michal.skrivanek, srevivo, tjelinek, ycui, ykaul
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-02 08:28:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Bimal Chollera 2016-10-26 20:25:25 UTC
Description of problem:

VMs transition to the not-responding state and then back to the Up state.
This problem is seen on an end-user system when taking VM snapshots.
The VMs themselves have no problems and continue to operate.

The VDSM logs contain "monitor become unresponsive (command timeout, age=89.0200000005)" messages for the VMs.
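
To gauge how long each monitor was reported unresponsive, the age values can be pulled out of the log. The following is a minimal sketch, not a VDSM tool: it relies only on the message text quoted above, the function name and sample line prefix are hypothetical, and the unit of "age" is presumed to be seconds.

```python
import re

# Matches only the message text quoted in this report; the layout of the
# rest of the log line is not assumed.
PATTERN = re.compile(
    r"monitor become unresponsive \(command timeout, age=([0-9.]+)\)"
)


def unresponsive_ages(lines):
    """Return the 'age' values (as floats) found in matching log lines."""
    ages = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            ages.append(float(match.group(1)))
    return ages


# Hypothetical sample input; the prefix before the message is made up.
sample = [
    "vmId=`hypothetical-id`::monitor become unresponsive "
    "(command timeout, age=89.0200000005)",
    "some unrelated log line",
]
print(unresponsive_ages(sample))
```

Any iterable of lines works, so an open log file object can be passed directly in place of the sample list (the log is typically /var/log/vdsm/vdsm.log, though that path is an assumption here).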

Version-Release number of selected component (if applicable):

rhevm-3.6.9.2-0.1.el6.noarch 
vdsm-4.17.35-1.el7ev.noarch
libvirt-daemon-1.2.17-13.el7_2.5.x86_64 

How reproducible:
100% reproducible on the end-user system.

Steps to Reproduce:

1.  Take a snapshot of the VM and observe the VM momentarily transition to the not-responding state, then back to the Up state.

Actual results:

VMs momentarily transition to the not-responding state, then back to the Up state.

Expected results:

VMs shouldn't transition to the not-responding state.

Additional info:

Comment 4 Michal Skrivanek 2016-10-27 05:58:34 UTC
Francesco, is it the same as the race on shutdown?

Comment 5 Yaniv Kaul 2016-10-27 07:16:33 UTC
(In reply to Michal Skrivanek from comment #4)
> Francesco, is it the same as the race on shutdown?

I doubt that. Just by looking at the logs: if it takes you >1m to take a snapshot, something's wrong (storage-side performance?). I'd look at the VDSM log for storage issues.