Bug 1302530 - Live migration should use the same memory over subscription logic as instance boot
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 8.0 (Liberty)
Assigned To: Vladik Romanovsky
QA Contact: nlevinki
Keywords: ZStream
Depends On:
Blocks:
Reported: 2016-01-28 00:11 EST by Vladik Romanovsky
Modified: 2017-06-22 17:01 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-06-22 17:01:12 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker ID Priority Status Summary Last Updated
Launchpad 1214943 None None None 2016-01-28 00:11 EST

Description Vladik Romanovsky 2016-01-28 00:11:27 EST
Cloned from launchpad bug 1214943

Description of problem:

I encountered an issue when live-migrating an instance to a specified target host. I expected the operation to succeed, but it failed for the following reason:

MigrationPreCheckError: Migration pre-check error: Unable to migrate a34f9b88-1e07-4798-af46-ca3b3dbaceda to hchenos2: Lack of memory(host:336 <= instance:512)

1. My OpenStack cluster information:

1). There are two compute nodes in my cluster, and I created 4 instances (1 vCPU / 512 MB memory each) on these hosts:

-----------
mysql> select hypervisor_hostname,vcpus,vcpus_used,running_vms,memory_mb,memory_mb_used,free_ram_mb,deleted from compute_nodes where deleted=0;
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hypervisor_hostname | vcpus | vcpus_used | running_vms | memory_mb | memory_mb_used | free_ram_mb | deleted |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hchenos1.eng.platformlab.ibm.com | 2 | 2 | 2 | 1872 | 1536 | 336 | 0 |
| hchenos2.eng.platformlab.ibm.com | 2 | 2 | 2 | 1872 | 1536 | 336 | 0 |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
2 rows in set (0.00 sec)

mysql>
------------------------
[root@hchenos ~]# nova list
+--------------------------------------+------+--------+----------+
| ID | Name | Status | Networks |
+--------------------------------------+------+--------+----------+
| a34f9b88-1e07-4798-af46-ca3b3dbaceda | vm1 | ACTIVE | | >>> on host 'hchenos1'
| f6aaeff9-2220-4693-8e5a-710f4c52b774 | vm2 | ACTIVE | | >>>> on host 'hchenos2'
| bbee57a2-81cd-4933-a943-1c2272f7f550 | vm4 | ACTIVE | | >>>> on host 'hchenos1'
| 74fe26ec-919c-4fa7-890f-f59abe09ef4f | vm5 | ACTIVE | | >>>> on host 'hchenos2'
+--------------------------------------+------+--------+----------+
[root@hchenos ~]#

2). I also enabled the ComputeFilter, RamFilter and CoreFilter in nova.conf, but did not configure an overcommit ratio for either vCPU or memory, so the default ratios are used.

2. Under the above conditions, live migration of instance vm1 to hchenos2 failed:

[root@hchenos ~]# nova live-migration vm1 hchenos2
ERROR: Live migration of instance a34f9b88-1e07-4798-af46-ca3b3dbaceda to host hchenos2 failed (HTTP 400) (Request-ID: req-68244b99-e438-4000-8bdb-cc43b275c018)

 conductor log:
...
ckages/nova/conductor/tasks/live_migrate.py", line 87, in _check_requested_destination\n self._check_destination_has_enough_memory()\n\n File "/usr/lib/python2.6/site-packages/nova/conductor/tasks/live_migrate.py", line 108, in _check_destination_has_enough_memory\n mem_inst=mem_inst))\n\nMigrationPreCheckError: Migration pre-check error: Unable to migrate a34f9b88-1e07-4798-af46-ca3b3dbaceda to hchenos2: Lack of memory(host:336 <= instance:512)\n\n']

I think the reason is as follows:

free_ram_mb for 'hchenos2' is 336 MB and the requested memory is 512 MB, so the operation fails.

free_ram_mb = memory_mb (1872) - 512(reserved_host_memory_mb) - 2*512(instance consume) = 336
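
The arithmetic above can be reproduced with a short Python sketch. The numbers come from the compute_nodes row for hchenos2. The helper name precheck_passes and the exact comparison are assumptions inferred from the error text (host:336 <= instance:512), not a copy of the Nova code.

```python
# Figures for hchenos2, taken from the compute_nodes query above.
MEMORY_MB = 1872
RESERVED_HOST_MEMORY_MB = 512   # reserve assumed in the formula above
INSTANCE_MEMORY_MB = 512        # memory of each running instance
RUNNING_INSTANCES = 2

memory_mb_used = RESERVED_HOST_MEMORY_MB + RUNNING_INSTANCES * INSTANCE_MEMORY_MB
free_ram_mb = MEMORY_MB - memory_mb_used


def precheck_passes(host_free_mb, instance_mb):
    """Hypothetical helper: pass only when the host has strictly more free RAM
    than the instance needs (the error text reports failure on host <= instance)."""
    return host_free_mb > instance_mb


print(memory_mb_used, free_ram_mb, precheck_passes(free_ram_mb, INSTANCE_MEMORY_MB))
```

This gives 1536 MB used and 336 MB free, so a 512 MB instance does not pass a check of this form.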

3. However, booting an instance on 'hchenos2' succeeds:

[root@hchenos ~]# nova boot --image cirros-0.3.0-x86_64 --flavor 1 --availability-zone nova:hchenos2 xhu

[root@hchenos ~]# nova list
+--------------------------------------+------+--------+----------+
| ID | Name | Status | Networks |
+--------------------------------------+------+--------+----------+
| a34f9b88-1e07-4798-af46-ca3b3dbaceda | vm1 | ACTIVE | |
| f6aaeff9-2220-4693-8e5a-710f4c52b774 | vm2 | ACTIVE | |
| bbee57a2-81cd-4933-a943-1c2272f7f550 | vm4 | ACTIVE | |
| 74fe26ec-919c-4fa7-890f-f59abe09ef4f | vm5 | ACTIVE | |
| 364d1a01-67ed-4966-bbfd-d21b6bc3067c | xhu | ACTIVE | | >>>> is active
+--------------------------------------+------+--------+----------+
[root@hchenos ~]#

mysql> select hypervisor_hostname,vcpus,vcpus_used,running_vms,memory_mb,memory_mb_used,free_ram_mb,deleted from compute_nodes where deleted=0;
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hypervisor_hostname | vcpus | vcpus_used | running_vms | memory_mb | memory_mb_used | free_ram_mb | deleted |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hchenos1.eng.platformlab.ibm.com | 2 | 2 | 2 | 1872 | 1536 | 336 | 0 |
| hchenos2.eng.platformlab.ibm.com | 2 | 3 | 3 | 1872 | 2048 | -176 | 0 |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
2 rows in set (0.00 sec)

mysql>

So I am confused by the above test results: why does booting an instance on 'hchenos2' succeed, while live-migrating an instance to the same host fails for lack of memory?

After carefully going through the Nova source code (live_migrate.py: execute()), I think the following causes this issue:

1). The function '_check_destination_has_enough_memory' does not take the RAM allocation ratio (default value 1.5) into account when calculating the host's free memory ('free_ram_mb'). This is inconsistent with the memory check that 'RamFilter' performs when booting an instance.

I think the free memory of host 'hchenos2' should be:

free_ram_mb = memory_mb (1872) * ram_allocation_ratio (1.5) - memory_mb_used (1536) = 1272
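
To make the inconsistency concrete, the following Python sketch compares the two calculations side by side. The function names are hypothetical, and the overcommit variant only approximates the kind of check this report attributes to RamFilter; it is not taken from the Nova source.

```python
def free_ram_plain(memory_mb, memory_mb_used):
    """Free RAM without overcommit, as the migration pre-check appears to compute it."""
    return memory_mb - memory_mb_used


def free_ram_overcommit(memory_mb, memory_mb_used, ram_allocation_ratio=1.5):
    """Free RAM when the host's capacity is scaled by the allocation ratio."""
    return memory_mb * ram_allocation_ratio - memory_mb_used


requested_mb = 512  # memory of the instance being migrated
plain = free_ram_plain(1872, 1536)
overcommit = free_ram_overcommit(1872, 1536)
print(plain, plain >= requested_mb)
print(overcommit, overcommit >= requested_mb)
```

With the figures for hchenos2, the plain calculation leaves 336 MB (too little for 512 MB), while the ratio-aware one leaves 1272 MB, which matches the observation that booting succeeds but migration fails.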

2) Why is vCPU capacity not checked for the live migration target host? Is checking memory alone enough?

live_migrate.py: execute

        self._check_instance_is_running()
        self._check_host_is_up(self.source)

        if not self.destination:
            self.destination = self._find_destination()
        else:
            self._check_requested_destination()  # >>>>

    def _check_requested_destination(self):
        self._check_destination_is_not_source()
        self._check_host_is_up(self.destination)
        self._check_destination_has_enough_memory()  # >>>> Only memory is checked; why not check vCPUs as well?
        self._check_compatible_with_source_hypervisor(self.destination)
        self._call_livem_checks_on_host(self.destination)
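
For illustration only, a vCPU check analogous to the memory check might look like the Python sketch below. The function name, its parameters, and the ratio of 16.0 are assumptions made for this example; none of them come from the code quoted above.

```python
def destination_has_enough_vcpus(vcpus, vcpus_used, requested_vcpus,
                                 cpu_allocation_ratio=16.0):
    """Hypothetical check: would the instance fit under the host's vCPU limit?"""
    vcpu_limit = vcpus * cpu_allocation_ratio
    return vcpus_used + requested_vcpus <= vcpu_limit


# hchenos2 reports vcpus=2 and vcpus_used=2; vm1 needs 1 vCPU.
print(destination_has_enough_vcpus(2, 2, 1))
# With no overcommit (ratio 1.0) the same request would not fit.
print(destination_has_enough_vcpus(2, 2, 1, cpu_allocation_ratio=1.0))
```

As with memory, the outcome depends entirely on whether the allocation ratio is applied, so any such check would need to use the same ratio as the scheduler to stay consistent with instance boot.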

3) The VM state needs to be considered as well. For example, if the instance is powered off, it no longer consumes compute node resources on the KVM platform (unlike IBM PowerVM), but in resource_tracker.py:_update_usage_from_instances() only the instance's 'deleted' flag is taken into account when calculating resource usage.
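
The accounting suggested in point 3 can be sketched in Python as follows. The instance records are plain dictionaries with hypothetical keys (memory_mb, deleted, vm_state), and the state strings are illustrative; this is not how resource_tracker.py represents instances.

```python
def memory_used_mb(instances, count_stopped=True):
    """Sum the memory of instances that should count against the host."""
    total = 0
    for inst in instances:
        if inst["deleted"]:
            continue  # deleted instances never count
        if not count_stopped and inst["vm_state"] == "stopped":
            continue  # optionally ignore powered-off instances
        total += inst["memory_mb"]
    return total


instances = [
    {"memory_mb": 512, "deleted": False, "vm_state": "active"},
    {"memory_mb": 512, "deleted": False, "vm_state": "stopped"},
    {"memory_mb": 512, "deleted": True, "vm_state": "active"},
]
print(memory_used_mb(instances))                       # only the deleted flag is considered
print(memory_used_mb(instances, count_stopped=False))  # powered-off instances skipped too
```

The first call reflects the behaviour described above, where only the 'deleted' flag matters; the second shows how usage would drop if powered-off instances were excluded.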
