Bug 1163073

Summary: VM fails to launch or gets stuck in launching state (VDSM does not receive VM start due to timeout on VM start request).
Product: Red Hat Enterprise Virtualization Manager Reporter: Nisim Simsolo <nsimsolo>
Component: ovirt-engine Assignee: Nobody <nobody>
Status: CLOSED DUPLICATE QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.5.0 CC: ecohen, gklein, iheim, lpeer, lsurette, nsimsolo, ofrenkel, oourfali, pkliczew, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: virt
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-11-17 13:17:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine log none
vdsm.log none
libvirtd.log none
sanlock.log none
earlier vdsm log none
vdsm.log.13.xz none

Description Nisim Simsolo 2014-11-12 10:39:29 UTC
Description of problem:
VM gets stuck in launching state or does not start at all.
VDSM does not receive VM start due to timeout on VM start request.


Version-Release number of selected component (if applicable):
engine: rhevm-3.4.4-2.2.el6ev.noarch
Host: libvirt-0.10.2-46.el6_6.1.x86_64
sanlock-2.8-1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
vdsm-4.16.7.3-1.el6ev.x86_64


How reproducible:
Inconsistently.

Steps to Reproduce:
1. Add a few VMs.
2. Start VM.

Actual results:
From time to time, a VM fails to start or gets stuck in launching state.

Expected results:
VM should start properly.

Additional info:
engine and host logs attached.

Comment 1 Nisim Simsolo 2014-11-12 12:18:17 UTC
Created attachment 956712 [details]
engine log

Comment 2 Nisim Simsolo 2014-11-12 12:19:58 UTC
Created attachment 956713 [details]
vdsm.log

Comment 3 Nisim Simsolo 2014-11-12 12:20:46 UTC
Created attachment 956714 [details]
libvirtd.log

Comment 4 Nisim Simsolo 2014-11-12 12:21:06 UTC
Created attachment 956715 [details]
sanlock.log

Comment 5 Omer Frenkel 2014-11-12 14:22:42 UTC
looks like a temporary network error, in the log it seems this happened only once.
does this happen all the time?
does this happen to all vms or just one?

also, vdsm log does not correspond to the engine log,
the failure is around 
2014-11-12 09:28:53,037 (engine log)

but vdsm log only starts 2 hours later at 
Dummy-70::DEBUG::2014-11-12 11:01:02,785..

please attach the vdsm.log for the same time of the error.
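
A minimal sketch of the coverage check described above: does a given log contain entries around the time of the engine-side failure? The helper names `parse_ts` and `log_covers` are hypothetical, and the sketch assumes both logs use the `YYYY-MM-DD HH:MM:SS,mmm` timestamp format seen in the two lines quoted above and that the engine and host clocks are in the same time zone.

```python
import re
from datetime import datetime

# Matches timestamps such as "2014-11-12 09:28:53,037" anywhere in a line.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
TS_FMT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(text):
    """Return the first timestamp found in text, or None if there is none."""
    match = TS_RE.search(text)
    return datetime.strptime(match.group(0), TS_FMT) if match else None


def log_covers(lines, when):
    """True if `when` lies between the earliest and latest timestamps in lines."""
    stamps = [ts for ts in map(parse_ts, lines) if ts is not None]
    return bool(stamps) and min(stamps) <= when <= max(stamps)


# Time of the failure in the engine log, and the first line of the attached vdsm log.
error_time = parse_ts("2014-11-12 09:28:53,037")
vdsm_lines = ["Dummy-70::DEBUG::2014-11-12 11:01:02,785.."]

print(log_covers(vdsm_lines, error_time))  # False: the log starts after the error
```

In practice `lines` would be read from the (possibly rotated and compressed) vdsm.log files on the host, checking each one until the failure time falls inside its range.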

Comment 6 Nisim Simsolo 2014-11-12 14:40:46 UTC
Created attachment 956758 [details]
earlier vdsm log

Comment 7 Omer Frenkel 2014-11-13 08:53:03 UTC
still not the right one...
we need the log that contains the time of the error, which is 
2014-11-12 09:28:53


anyway, this might be a duplicate of Bug 1143968

Comment 8 Nisim Simsolo 2014-11-13 10:11:54 UTC
Created attachment 957086 [details]
vdsm.log.13.xz

Comment 9 Omer Frenkel 2014-11-17 12:02:45 UTC
thanks, i don't see anything on vdsm.log during this time, also if the vm doesn't start on vdsm at all, i'm not sure this fits the scenario of Bug 1143968

not sure what can cause this:
2014-11-12 09:31:53,536 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) Connecting to /10.35.4.65

if this doesn't happen consistently, not sure how interesting it is.
on the other hand, i don't see any other communication errors with this host during this time..

Oved, can someone take a look? it looks more around engine-vdsm communication

Comment 10 Oved Ourfali 2014-11-17 12:52:51 UTC
(In reply to Omer Frenkel from comment #9)
> thanks, i don't see anything on vdsm.log during this time, also if the vm
> doesn't start on vdsm at all, i'm not sure this fits the scenario of Bug
> 1143968
> 
> not sure what can cause this:
> 2014-11-12 09:31:53,536 INFO 
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> Connecting to /10.35.4.65
> 

Doesn't seem problematic, but perhaps Piotr can take a look.
Piotr?

Comment 11 Piotr Kliczewski 2014-11-17 13:17:13 UTC
This bug is a duplicate of bug 1148583

*** This bug has been marked as a duplicate of bug 1148583 ***