Bug 1657391
| Summary: | Host Evacuate fails for CPU Weigher filter due to moving smallest instance first | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | awaugama |
| Component: | openstack-nova | Assignee: | OSP DFG:Compute <osp-dfg-compute> |
| Status: | CLOSED NOTABUG | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 14.0 (Rocky) | CC: | dasmith, eglynn, jhakimra, kchamart, sbauza, sgordon, stephenfin, vromanso |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-12-10 15:03:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1469073 | | |
Description
awaugama
2018-12-07 21:18:01 UTC
Sorry, I hit submit by accident before updating the bug.

When trying to run a host evacuate with 2 instances of different sizes, the larger instance fails to evacuate. This seems to be because the smaller instance is evacuated first and takes up the "slot" the larger instance could use.

Layout before Host Evacuate:
Compute-0: 2 instances, 1 with 2 vCPU, 1 with 1 vCPU
Compute-1: 1 vCPU free (4 vCPU total, 1 instance using 3 on the node)
Compute-2: 2 vCPU free (4 vCPU total, 1 instance using 2 on the node)

Expected after Host Evacuate:
Compute-0: 0 instances
Compute-1: The instance from Compute-0 using 1 vCPU, the instance using 3 that was there before
Compute-2: The instance from Compute-0 using 2 vCPU, the instance using 2 that was there before

Actual Results:
Compute-0: 2 vCPU instance failed to evacuate, listed as error
Compute-1: The instance that was using 3 vCPU only
Compute-2: The instance from Compute-0 using 1 vCPU, the instance using 2 that was there before

Logs below:

()[root@compute-0 /]# yum info openstack-nova-common.noarch
Version : 18.0.3
Release : 0.20181011032838.d1243fe.el7ost
()[root@compute-0 /]# yum info openstack-nova-compute.noarch
Version : 18.0.3
Release : 0.20181011032838.d1243fe.el7ost
()[root@compute-0 /]# yum info openstack-nova-migration.noarch
Version : 18.0.3
Release : 0.20181011032838.d1243fe.el7ost

(overcloud) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+--------------------------+--------+------------+-------------+-------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| ID | Name | Status | Task State | Power State | Networks | Image Name | Image ID | Flavor Name | Flavor ID | Availability Zone | Host | Properties |
+--------------------------------------+--------------------------+--------+------------+-------------+-------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| 2fe86736-4073-427e-84da-074dc78705e6 | compute_0_large_instance | ACTIVE | None | Running | public=10.0.0.216 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_2_vcpu | ee083ec8-f154-44a6-a0e1-4b8d32f02561 | nova | compute-0.localdomain | |
| 7f741183-b831-4331-a1e2-fa5a630f92a9 | compute_0_small_instance | ACTIVE | None | Running | public=10.0.0.213 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_1_vcpu | 52854d2b-613f-43e8-ab19-93b9b6c1abe0 | nova | compute-0.localdomain | |
| 10527e30-7fe4-4ed3-bc18-c0a3bdaa260e | compute_1_use_3_vcpu | ACTIVE | None | Running | public=10.0.0.231 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_3_vcpu | eb8d60d6-8787-436c-a0a5-eb2f5d1930ec | nova | compute-1.localdomain | |
| 2ff31a01-b233-4688-83e6-7ad1e5b8b330 | compute_2_use_2_vcpu | ACTIVE | None | Running | public=10.0.0.218 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_2_vcpu | ee083ec8-f154-44a6-a0e1-4b8d32f02561 | nova | compute-2.localdomain | |
+--------------------------------------+--------------------------+--------+------------+-------------+-------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+

(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor show 1 | grep vcpus
| vcpus | 4 |
| vcpus_used | 3 |
(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor show 2 | grep vcpus
| vcpus | 4 |
| vcpus_used | 3 |
(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor show 3 | grep vcpus
| vcpus | 4 |
| vcpus_used | 2 |

(overcloud) [stack@undercloud-0 ~]$ nova service-force-down 4e52c4fb-7bc3-4638-98d5-58a64e85be97
+--------------------------------------+-----------------------+--------------+-------------+
| ID | Host | Binary | Forced down |
+--------------------------------------+-----------------------+--------------+-------------+
| 4e52c4fb-7bc3-4638-98d5-58a64e85be97 | compute-0.localdomain | nova-compute | True |
+--------------------------------------+-----------------------+--------------+-------------+

(overcloud) [stack@undercloud-0 ~]$ nova host-evacuate compute-0.localdomain
+--------------------------------------+-------------------+---------------+
| Server UUID | Evacuate Accepted | Error Message |
+--------------------------------------+-------------------+---------------+
| 7f741183-b831-4331-a1e2-fa5a630f92a9 | True | |
| 2fe86736-4073-427e-84da-074dc78705e6 | True | |
+--------------------------------------+-------------------+---------------+

(overcloud) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+--------------------------+--------+------------+-------------+-------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| ID | Name | Status | Task State | Power State | Networks | Image Name | Image ID | Flavor Name | Flavor ID | Availability Zone | Host | Properties |
+--------------------------------------+--------------------------+--------+------------+-------------+-------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| 2fe86736-4073-427e-84da-074dc78705e6 | compute_0_large_instance | ERROR | None | Running | public=10.0.0.216 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_2_vcpu | ee083ec8-f154-44a6-a0e1-4b8d32f02561 | nova | compute-0.localdomain | |
| 7f741183-b831-4331-a1e2-fa5a630f92a9 | compute_0_small_instance | ACTIVE | None | Running | public=10.0.0.213 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_1_vcpu | 52854d2b-613f-43e8-ab19-93b9b6c1abe0 | nova | compute-2.localdomain | |
| 10527e30-7fe4-4ed3-bc18-c0a3bdaa260e | compute_1_use_3_vcpu | ACTIVE | None | Running | public=10.0.0.231 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_3_vcpu | eb8d60d6-8787-436c-a0a5-eb2f5d1930ec | nova | compute-1.localdomain | |
| 2ff31a01-b233-4688-83e6-7ad1e5b8b330 | compute_2_use_2_vcpu | ACTIVE | None | Running | public=10.0.0.218 | cirros | 0a411db3-dd1a-41ea-b348-bf15077c749b | use_2_vcpu | ee083ec8-f154-44a6-a0e1-4b8d32f02561 | nova | compute-2.localdomain | |
+--------------------------------------+--------------------------+--------+------------+-------------+-------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+

(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor show 1 | grep vcpus
Compute service of compute-0.localdomain is unavailable at this time. (HTTP 400) (Request-ID: req-be965b15-b043-429b-8cb5-b0e6ca109782)
(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor show 2 | grep vcpus
| vcpus | 4 |
| vcpus_used | 3 |
(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor show 3 | grep vcpus
| vcpus | 4 |
| vcpus_used | 3 |

This is behaving as expected. Host evacuate makes no guarantees about the order in which instances are evacuated. You could use the CPUWeigher with a negative value to cause "stacking" of the instances (so each instance goes on the host with the fewest free CPUs that can still fit it), but this will affect all scheduling operations.
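
To make the ordering effect concrete, here is a minimal Python sketch of the scenario above. It is not Nova's scheduler code: `place` is a hypothetical helper, the weight is simplified to `multiplier * free_vcpus` (no normalization, no other weighers), and capacity is checked with no overcommit, which is what the logs above suggest for this deployment. In Nova the sign of the weight would presumably be controlled through the `cpu_weight_multiplier` option in the `[filter_scheduler]` section of nova.conf; treat that mapping as an assumption to verify against the configuration reference for your release.

```python
from typing import Dict, List, Optional, Tuple


def place(
    instances: List[Tuple[str, int]],
    free_vcpus: Dict[str, int],
    multiplier: float,
) -> Dict[str, Optional[str]]:
    """Greedily place instances in the given order (hypothetical helper).

    Each instance goes to the host with the highest weight among hosts
    that still have enough free vCPUs. The weight is simplified to
    multiplier * free_vcpus: positive spreads, negative stacks.
    Returns a mapping of instance name to host, or None on failure.
    """
    free = dict(free_vcpus)  # copy so the caller's data is not mutated
    result: Dict[str, Optional[str]] = {}
    for name, vcpus in instances:
        candidates = [host for host, f in free.items() if f >= vcpus]
        if not candidates:
            result[name] = None  # no valid host: the evacuation fails
            continue
        best = max(candidates, key=lambda host: multiplier * free[host])
        free[best] -= vcpus
        result[name] = best
    return result


# Free capacity on the surviving hosts, taken from the report above.
hosts = {"compute-1": 1, "compute-2": 2}
small_first = [("small", 1), ("large", 2)]
large_first = [("large", 2), ("small", 1)]

# Spreading, small instance first: reproduces the reported failure.
print(place(small_first, hosts, 1.0))
# {'small': 'compute-2', 'large': None}

# Stacking (negative multiplier), same order: both instances fit.
print(place(small_first, hosts, -1.0))
# {'small': 'compute-1', 'large': 'compute-2'}

# Spreading, large instance first: both fit, but the order is not guaranteed.
print(place(large_first, hosts, 1.0))
# {'large': 'compute-2', 'small': 'compute-1'}
```

With spreading, the outcome depends entirely on which instance happens to be evacuated first, which matches the behavior described in the comment above. A negative multiplier removes that dependence in this particular layout, at the cost of changing placement for every other scheduling request as well.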