Bug 1157211

Summary: Engine does not free pending_vmem_size and pending_vcpus_count on migrate host, in case of VM migration failure.
Product: Red Hat Enterprise Virtualization Manager Reporter: Nisim Simsolo <nsimsolo>
Component: ovirt-engine Assignee: Arik <ahadas>
Status: CLOSED ERRATA QA Contact: Nisim Simsolo <nsimsolo>
Severity: urgent Docs Contact:
Priority: high    
Version: unspecified CC: ahadas, dfediuck, ecohen, iheim, juwu, lpeer, lsurette, mavital, michal.skrivanek, mkalinin, ofrenkel, rbalakri, Rhev-m-bugs, sherold, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: sla
Fixed In Version: org.ovirt.engine-root-3.5.0-19 Doc Type: Bug Fix
Doc Text:
Previously, memory and CPU resources that were reserved for a migrated virtual machine on the destination host were not cleared when a migration failed. With this update, the reserved memory and CPU resources are now cleared properly upon migration failure.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-02-11 18:10:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: SLA RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                                  Flags
Scenario started at 2014-10-26 10:15:25,570  none

Description Nisim Simsolo 2014-10-26 08:54:37 UTC
Description of problem:
When a VM migration fails, pending_vmem_size and pending_vcpus_count on the destination host are not cleared, even though the VM never migrates to that host.

Version-Release number of selected component (if applicable):
Engine: rhevm-3.5.0-0.17.beta.el6ev.noarch
Host: libvirt-0.10.2-46.el6.x86_64
      vdsm-4.16.7.1-1.el6ev.x86_64
      sanlock-2.8-1.el6.x86_64
      qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64

How reproducible:
Constantly

Steps to Reproduce:
1. Add 2 hosts and 1 VM to the engine.
2. Migrate the VM from host 1 to host 2.
3. While the migration is in progress, switch host 2 to maintenance mode (in order to
   simulate a migration failure).
4. In the engine DB, observe pending_vmem_size and pending_vcpus_count of
   host 2:
   # su postgres
   bash-4.1$ psql engine
   psql (8.4.20)
   Type "help" for help.
   engine=# select vds_name,pending_vmem_size,pending_vcpus_count from vds;
5. Activate host 2 and repeat steps 2-4 a few times.
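
When repeating the steps, it can help to check the query output from step 4 programmatically. The following is a minimal Python sketch, not part of the engine: the function name and the sample rows are hypothetical, and it assumes the rows are read while no migration is in progress, so any non-zero pending value is a leftover reservation.

```python
def hosts_with_stale_pending(rows):
    """Return names of hosts whose pending counters are non-zero.

    rows: iterable of (vds_name, pending_vmem_size, pending_vcpus_count)
    tuples, i.e. the columns selected in step 4. NULL values (None) are
    treated as zero. Assumes no migration is running when the rows are read.
    """
    return [
        name
        for name, pending_mem, pending_vcpus in rows
        if (pending_mem or 0) > 0 or (pending_vcpus or 0) > 0
    ]


# Made-up sample data for illustration only, not taken from the attached log.
sample_rows = [("host1", 0, 0), ("host2", 2048, 4)]
stale = hosts_with_stale_pending(sample_rows)
```

With the sample rows above, only host2 is flagged.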

Actual results:
pending_vmem_size and pending_vcpus_count increase on each migration failure; eventually migration cannot be performed due to insufficient memory on the host.

Expected results:
Engine should clear pending_vmem_size and pending_vcpus_count after migration failure.
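
The difference between the actual and the expected behavior can be illustrated with a small Python model. This is only a sketch of the bookkeeping described in this report, not the engine's code: the class, the function, the release_on_failure switch, and the assumption that memory is counted in MB are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HostPending:
    """Hypothetical stand-in for the pending columns of one vds row."""
    vds_name: str
    pending_vmem_size: int = 0    # assumed to be MB reserved for incoming VMs
    pending_vcpus_count: int = 0  # vCPUs reserved for incoming VMs

    def reserve(self, mem_mb: int, vcpus: int) -> None:
        self.pending_vmem_size += mem_mb
        self.pending_vcpus_count += vcpus

    def release(self, mem_mb: int, vcpus: int) -> None:
        # Never drop below zero, even if release is called twice.
        self.pending_vmem_size = max(0, self.pending_vmem_size - mem_mb)
        self.pending_vcpus_count = max(0, self.pending_vcpus_count - vcpus)


def migrate(dest: HostPending, mem_mb: int, vcpus: int,
            succeeds: bool, release_on_failure: bool) -> bool:
    """Reserve resources on dest, then simulate the migration outcome."""
    dest.reserve(mem_mb, vcpus)
    if succeeds:
        # The VM now runs on dest, so the reservation is no longer pending.
        dest.release(mem_mb, vcpus)
        return True
    if release_on_failure:
        # Expected behavior: a failed migration gives the reservation back.
        dest.release(mem_mb, vcpus)
    return False


# Three failed migrations of a 1024 MB, 2 vCPU VM to each host.
leaky = HostPending("host2")
fixed = HostPending("host2")
for _ in range(3):
    migrate(leaky, 1024, 2, succeeds=False, release_on_failure=False)
    migrate(fixed, 1024, 2, succeeds=False, release_on_failure=True)
```

In this model the leaky host ends with 3072 MB and 6 vCPUs still pending, while the fixed host returns to zero for both counters.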

Additional info:
Engine log attached (scenario started at 2014-10-26 10:15:25)

Comment 1 Nisim Simsolo 2014-10-26 09:08:07 UTC
Created attachment 950750
Scenario started at 2014-10-26 10:15:25,570

Comment 3 Nisim Simsolo 2014-11-09 12:55:37 UTC
Verified using:
rhevm-3.5.0-0.19.beta.el6ev.noarch

Comment 5 errata-xmlrpc 2015-02-11 18:10:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0158.html