Bug 1171824

Summary: VM stuck in "Waiting for launch" in vt13.1 on HE with CPU SLA policy configured on 10%.
Product: Red Hat Enterprise Virtualization Manager Reporter: Nikolai Sednev <nsednev>
Component: ovirt-engine Assignee: Nobody <nobody>
Status: CLOSED DUPLICATE QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.5.0 CC: dfediuck, ecohen, gklein, iheim, lpeer, lsurette, mavital, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: --- Keywords: Regression, Reopened, Triaged
Target Release: 3.5.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: sla
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-12-15 13:59:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: SLA RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1164308, 1164311    
Attachments:
Description Flags
logs none

Description Nikolai Sednev 2014-12-08 16:55:50 UTC
Created attachment 965918
logs

Description of problem:
A VM gets stuck in "Waiting for launch" on vt13.1 in an HE environment with the CPU SLA policy set to 10%.

I am simply trying to start one guest VM, and it gets stuck in "Waiting for launch" in the web UI while the host shows it as running.

# vdsClient -s 0 list table
6ffc24c4-40bd-46e9-a655-34e94d76cd1b   7089  RHEL6_5VM1           Up                   10.35.102.125


On the engine side I see this:
2014-11-13 18:48:33,227 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-7) [2246090c] VM RHEL6_5VM1 (6ffc24c4-40bd-46e9-a655-34e94d76cd1b) is running in db and not running in VDS hosted_engine_1
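
The mismatch between the two outputs quoted above can be checked mechanically. The following is only a minimal sketch: it assumes each engine log entry sits on a single line, and it assumes the column order of the `vdsClient -s 0 list table` row (VM id, PID, name, status, IP) from the one sample row in this report. The helper names `parse_vds_row` and `find_stale_vms` are hypothetical and not part of vdsm or the engine.

```python
import re

# Column order is an assumption inferred from the single sample row in this
# report: VM id, PID, VM name, status, guest IP. Rows with a missing IP or a
# VM name containing spaces would not fit this simple split.
def parse_vds_row(row):
    vm_id, pid, name, status, ip = row.split()
    return {"id": vm_id, "pid": int(pid), "name": name, "status": status, "ip": ip}

# Matches the engine message quoted in this report.
MISMATCH_RE = re.compile(
    r"VM (?P<name>\S+) \((?P<id>[0-9a-f-]+)\) "
    r"is running in db and not running in VDS (?P<host>\S+)"
)

def find_stale_vms(vds_rows, engine_lines):
    """Return (name, id, host) for VMs the host reports as Up while the
    engine logs them as 'not running in VDS'."""
    up_ids = {r["id"] for r in map(parse_vds_row, vds_rows) if r["status"] == "Up"}
    stale = []
    for line in engine_lines:
        m = MISMATCH_RE.search(line)
        if m and m.group("id") in up_ids:
            stale.append((m.group("name"), m.group("id"), m.group("host")))
    return stale
```

Feeding it the host row and the engine line from this description would flag RHEL6_5VM1 on hosted_engine_1, which is the inconsistency reported here.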


The VDSM and libvirt services are running fine on both hosts.


Version-Release number of selected component (if applicable):
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
ovirt-hosted-engine-setup-1.2.1-7.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-2.el6ev.noarch
sanlock-2.8-1.el6.x86_64
vdsm-4.16.8.1-2.el6ev.x86_64
ovirt-host-deploy-1.3.0-1.el6ev.noarch
rhevm-3.5.0-0.23.beta.el6ev.noarch

How reproducible:
50%

Steps to Reproduce:
1. In an HE environment, remove JSON-RPC from both hosts (leaving it in place does not help either).
2. Create a CPU profile of 10% at the DC and apply it to a newly created RHEL 6.5 VM.
3. Try to start the VM.

Actual results:
The VM gets stuck in "Waiting for launch" in the web UI while it is running on the host.

Expected results:
The VM should run and be shown correctly both in the web UI and on the host.

Additional info:
Logs attached.

Comment 1 Nikolai Sednev 2014-12-09 08:12:18 UTC
engine=# select vds_name,pending_vmem_size,pending_vcpus_count from vds;
    vds_name     | pending_vmem_size | pending_vcpus_count 
-----------------+-------------------+---------------------
 hosted_engine_2 |              4096 |                   4
 hosted_engine_1 |              7168 |                   7
(2 rows)
engine=# 

It looks like some resources were not freed?
I also tried rebooting both hosts that run the HE; it did not help.

I also saw some errors in libvirt and vdsm.
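
For repeated checks, the query output above can be scanned with a short script. This is a rough sketch only: it assumes psql's default aligned text output as pasted in this comment, and the helper names `parse_psql_table` and `hosts_with_pending` are hypothetical. Nonzero pending values are not proof of a leak on their own, since they are expected while a VM is actually being scheduled.

```python
def parse_psql_table(text):
    """Parse psql's aligned output into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            # Skips the prompt/query line, the ---+--- separator and "(2 rows)".
            continue
        cells = [c.strip() for c in line.split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows

def hosts_with_pending(rows):
    """Map host name -> (pending_vmem_size, pending_vcpus_count) where either is nonzero."""
    result = {}
    for r in rows:
        mem = int(r["pending_vmem_size"])
        cpus = int(r["pending_vcpus_count"])
        if mem or cpus:
            result[r["vds_name"]] = (mem, cpus)
    return result
```

Running `hosts_with_pending(parse_psql_table(output))` on the output pasted in this comment would list both hosted_engine_1 and hosted_engine_2.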

Comment 2 Doron Fediuck 2014-12-09 08:16:47 UTC
Looking at the logs, you hit Bug 1171491 and your engine is not refreshing.

Re-open if you're able to reproduce without Bug 1171491 effects on a clean
setup.

Comment 3 Nikolai Sednev 2014-12-09 08:20:20 UTC
Please review these bugs; they might have the same root cause:
1157211
1169854
1163142

Comment 4 Nikolai Sednev 2014-12-09 13:16:45 UTC
The issue occurs when the ballooning functionality is enabled on the host cluster; 1157211,
1169854 and 1163142 are not related. When ballooning is enabled, VMs won't start and get stuck in "Waiting for launch".
To work around the issue, ballooning has to be disabled.

The issue first appeared after upgrading to vt13.1; before that, everything worked just fine.

Comment 5 Doron Fediuck 2014-12-09 15:41:44 UTC
Nikolai,
this is not a bug.
Please add any other comments to Bug 1171491, which prevents your engine
from refreshing the VM status.

Comment 6 Gil Klein 2014-12-10 10:49:32 UTC
Doron, since this BZ was marked as an RC blocker, I suggest we keep it open for verification purposes until we get the fix for #1171491.

If it is not reproducible, we will close it as a duplicate.

Comment 7 Nikolai Sednev 2014-12-14 18:24:27 UTC
Works for me with these components, with ballooning enabled:
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
vdsm-4.16.8.1-3.el6ev.x86_64
ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
sanlock-2.8-1.el6.x86_64
ovirt-host-deploy-1.3.0-2.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-3.el6ev.noarch
rhevm-3.5.0-0.25.el6ev.noarch
ovirt-host-deploy-java-1.3.0-2.el6ev.noarch
mom-0.4.1-4.el6ev.noarch

Please close as a duplicate of 1171491.

Comment 8 Doron Fediuck 2014-12-15 13:59:11 UTC

*** This bug has been marked as a duplicate of bug 1171491 ***