Created attachment 965918 [details]
logs

Description of problem:
VM stuck in "Waiting for launch" in vt13.1 on HE with a CPU SLA policy configured at 10%.

I'm simply trying to start one guest VM, and it gets stuck in "Waiting for launch" in the WebUI while the host shows it as running:

# vdsClient -s 0 list table
6ffc24c4-40bd-46e9-a655-34e94d76cd1b   7089  RHEL6_5VM1   Up  10.35.102.125

From the engine log I see this:
2014-11-13 18:48:33,227 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-7) [2246090c] VM RHEL6_5VM1 (6ffc24c4-40bd-46e9-a655-34e94d76cd1b) is running in db and not running in VDS hosted_engine_1

The VDSM and libvirt services are running fine on both hosts.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
ovirt-hosted-engine-setup-1.2.1-7.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-2.el6ev.noarch
sanlock-2.8-1.el6.x86_64
vdsm-4.16.8.1-2.el6ev.x86_64
ovirt-host-deploy-1.3.0-1.el6ev.noarch
rhevm-3.5.0-0.23.beta.el6ev.noarch

How reproducible:
50%

Steps to Reproduce:
1. On the HE environment, remove JSON-RPC from both hosts (leaving it enabled makes no difference; it fails either way).
2. Create a 10% CPU profile at the DC level and apply it to a VM running RHEL 6.5.
3. Try starting the VM.

Actual results:
VM stuck in "Waiting for launch" in the WebUI while running on the host.

Expected results:
VM should run and be shown correctly in both the WebUI and on the host.

Additional info:
logs attached.
engine=# select vds_name,pending_vmem_size,pending_vcpus_count from vds;
    vds_name     | pending_vmem_size | pending_vcpus_count
-----------------+-------------------+---------------------
 hosted_engine_2 |              4096 |                   4
 hosted_engine_1 |              7168 |                   7
(2 rows)

It looks like some pending resources are not being freed. I also tried rebooting both hosts with HE on them; that didn't help. I also saw some errors in the libvirt and vdsm logs.
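For context, the pending_* columns track resources the engine reserves for a VM while it is launching and releases once the VM is confirmed up; the symptom above suggests the release step never runs. Here is a minimal illustrative model of that bookkeeping (a sketch under my own assumptions, not actual ovirt-engine code; only the column names are taken from the query above):

```python
# Simplified sketch of pending-resource bookkeeping on a host record.
# Field names mirror the vds table columns shown above; the reserve/release
# logic is an assumption for illustration, not the real engine implementation.

class Host:
    def __init__(self, name):
        self.name = name
        self.pending_vmem_size = 0    # MB reserved for VMs still launching
        self.pending_vcpus_count = 0  # vCPUs reserved for VMs still launching

    def reserve(self, vmem_mb, vcpus):
        # Presumably done when the engine sends a run-VM command.
        self.pending_vmem_size += vmem_mb
        self.pending_vcpus_count += vcpus

    def release(self, vmem_mb, vcpus):
        # Presumably done when the VM is reported Up. If the engine never
        # refreshes the VM status (as in Bug 1171491), this step never runs
        # and the counters stay non-zero, matching the query output above.
        self.pending_vmem_size -= vmem_mb
        self.pending_vcpus_count -= vcpus

host = Host("hosted_engine_1")
host.reserve(1024, 1)
print(host.pending_vmem_size, host.pending_vcpus_count)  # 1024 1
host.release(1024, 1)
print(host.pending_vmem_size, host.pending_vcpus_count)  # 0 0
```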
Looking at the logs, you hit Bug 1171491 and your engine is not refreshing. Please re-open if you can reproduce this on a clean setup without the effects of Bug 1171491.
Please review these bugs; they might have the same root cause: 1157211, 1169854, 1163142.
The issue occurs when the ballooning functionality is enabled on the host cluster; bugs 1157211, 1169854, and 1163142 are not related. When ballooning is enabled, VMs won't start and get stuck in "Waiting for launch". To work around the issue, ballooning has to be disabled. The issue first appeared after upgrading to vt13.1; before that everything worked just fine.
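Ballooning is normally toggled per cluster in the WebUI; it can also be set through the engine REST API. The sketch below only builds the request body for such a call (the `ballooning_enabled` element and the `/api/clusters/<id>` endpoint are my assumptions about the 3.5 API; verify against your engine's API documentation before using):

```python
# Hypothetical sketch: build the XML payload that would disable ballooning
# on a cluster via the oVirt/RHEV REST API. The element name and endpoint
# are assumptions, not confirmed against this exact rhevm version.
import xml.etree.ElementTree as ET

def ballooning_payload(enabled):
    cluster = ET.Element("cluster")
    ET.SubElement(cluster, "ballooning_enabled").text = "true" if enabled else "false"
    return ET.tostring(cluster, encoding="unicode")

# This body would be PUT to https://<engine-fqdn>/api/clusters/<cluster-id>
print(ballooning_payload(False))
```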
Nikolai, this is not a bug. Please add any further comments to Bug 1171491, which is what prevents your engine from refreshing the VM status.
Doron, since this BZ was marked as an RC blocker, I suggest we keep it open for verification purposes until we get the fix for Bug 1171491. If it is no longer reproducible, we will close it as a duplicate.
Works for me with these components, with ballooning enabled:

qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
vdsm-4.16.8.1-3.el6ev.x86_64
ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
sanlock-2.8-1.el6.x86_64
ovirt-host-deploy-1.3.0-2.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-3.el6ev.noarch
rhevm-3.5.0-0.25.el6ev.noarch
ovirt-host-deploy-java-1.3.0-2.el6ev.noarch
mom-0.4.1-4.el6ev.noarch

Please close as a duplicate of Bug 1171491.
*** This bug has been marked as a duplicate of bug 1171491 ***