+++ This bug was initially created as a clone of Bug #1664698 +++

Description of problem:

As described in [1], the fix to [2] appears to have inadvertently broken oversubscription of memory for instances with a NUMA topology but no hugepages.

Version-Release number of selected component (if applicable):

N/A

How reproducible:

Always.

Steps to Reproduce:

1. Create a flavor that will consume > 50% available memory for your host(s) and specify an explicit NUMA topology. For example, on my all-in-one deployment where the host has 32GB RAM, we will request a 20GB instance:

   $ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
   $ openstack flavor set test.numa --property hw:numa_nodes=2

2. Boot an instance using this flavor:

   $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test

3. Boot another instance using this flavor:

   $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2

Actual results:

The second instance fails to boot. We see the following error message in the logs.

   nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
   nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}

If we revert the patch that addressed the bug [3] then we revert to the correct behaviour and the instance boots. With this though, we obviously lose whatever benefits that change gave us.

Expected results:

The second instance should boot.

Additional info:

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
[2] https://bugs.launchpad.net/nova/+bug/1734204
[3] https://review.openstack.org/#/c/532168
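
To make the numbers in the scheduler log concrete, here is a minimal sketch of the two ways the per-cell memory check can be done. It is not nova's actual code: the function names are hypothetical, and the allocation ratio of 1.5 is an assumption, since the report does not state the configured ram_allocation_ratio. Only the three figures from the log line (required, available, total) come from the report.

```python
# Figures taken from the "Not enough available memory" log line (MB, per NUMA cell).
TOTAL_MB = 15916
AVAILABLE_MB = 5676
REQUIRED_MB = 10240

# ASSUMPTION: the report does not give the configured ratio; 1.5 is illustrative.
ASSUMED_RAM_ALLOCATION_RATIO = 1.5


def fits_strict(required_mb, available_mb):
    """Hypothetical strict check: the request must fit in currently free memory."""
    return required_mb <= available_mb


def fits_with_oversubscription(required_mb, used_mb, total_mb, ratio):
    """Hypothetical oversubscribed check: usage may grow up to total * ratio."""
    return used_mb + required_mb <= total_mb * ratio


# Memory already consumed on the cell, i.e. the first instance's share.
used_mb = TOTAL_MB - AVAILABLE_MB

strict_fit = fits_strict(REQUIRED_MB, AVAILABLE_MB)
oversub_fit = fits_with_oversubscription(
    REQUIRED_MB, used_mb, TOTAL_MB, ASSUMED_RAM_ALLOCATION_RATIO
)

print(f"used on cell:          {used_mb} MB")
print(f"strict check passes:   {strict_fit}")
print(f"oversub check passes:  {oversub_fit}")
```

Under these assumptions the strict comparison rejects the second instance (10240 > 5676), which matches the log, while a check that honours the ratio would accept it (20480 <= 23874).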
* Puddle version

[stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed
13 -p 2019-04-18.2

* Compute nodes

(overcloud) [stack@undercloud-0 ~]$ openstack resource provider inventory show bb4c605c-2fa6-4cf6-bdf4-b05e6b157d33 MEMORY_MB
+------------------+-------+
| Field            | Value |
+------------------+-------+
| allocation_ratio | 3.0   |
| max_unit         | 6143  |
| reserved         | 4096  |
| step_size        | 1     |
| min_unit         | 1     |
| total            | 6143  |
+------------------+-------+

(overcloud) [stack@undercloud-0 ~]$ openstack resource provider inventory show 5b51b165-82ce-42d9-80e6-9874a7f4b0ab MEMORY_MB
+------------------+-------+
| Field            | Value |
+------------------+-------+
| allocation_ratio | 1.0   |
| max_unit         | 6143  |
| reserved         | 4096  |
| step_size        | 1     |
| min_unit         | 1     |
| total            | 6143  |
+------------------+-------+

* Instance flavor

(overcloud) [stack@undercloud-0 ~]$ openstack flavor create --vcpu 2 --disk 0 --ram 3000 half_node_flavor
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 0                                    |
| id                         | 64918380-b183-43f5-9896-8b59020d1a3a |
| name                       | half_node_flavor                     |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 3000                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack flavor set half_node_flavor --property hw:numa_nodes=1

* Boot 3 instances

(overcloud) [stack@undercloud-0 os-smoke]$ for i in {1..3}; do openstack server create --flavor half_node_flavor --image cirros-0.3.5-x86_64-disk.img --nic net-id=fd62fe37-9669-4720-b8da-5574c09d0fc2 $i --wait; done
+-------------------------------------+---------------------------------------------------------------------+
| Field                               | Value                                                               |
+-------------------------------------+---------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                              |
| OS-EXT-AZ:availability_zone         | nova                                                                |
| OS-EXT-SRV-ATTR:host                | compute-0.localdomain                                               |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0.localdomain                                               |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000000b                                                   |
| OS-EXT-STS:power_state              | Running                                                             |
| OS-EXT-STS:task_state               | None                                                                |
| OS-EXT-STS:vm_state                 | active                                                              |
| OS-SRV-USG:launched_at              | 2019-04-29T15:59:21.000000                                          |
| OS-SRV-USG:terminated_at            | None                                                                |
| accessIPv4                          |                                                                     |
| accessIPv6                          |                                                                     |
| addresses                           | devstack=192.168.100.11                                             |
| adminPass                           | oMPhrkvN9iCm                                                        |
| config_drive                        |                                                                     |
| created                             | 2019-04-29T15:59:10Z                                                |
| flavor                              | half_node_flavor (64918380-b183-43f5-9896-8b59020d1a3a)             |
| hostId                              | 6d6b6aa9727938b4d53fe771464bd22e688b32326d21866e0130240a            |
| id                                  | 94d8c935-131b-4a39-b899-a076c1ce273a                                |
| image                               | cirros-0.3.5-x86_64-disk.img (020c49be-3cb5-40ec-acad-8d6f67663785) |
| key_name                            | None                                                                |
| name                                | 1                                                                   |
| progress                            | 0                                                                   |
| project_id                          | 19a12ec527c649b2928fadb009f84196                                    |
| properties                          |                                                                     |
| security_groups                     | name='default'                                                      |
| status                              | ACTIVE                                                              |
| updated                             | 2019-04-29T15:59:21Z                                                |
| user_id                             | 47becba725b741b8800ca5a15591924b                                    |
| volumes_attached                    |                                                                     |
+-------------------------------------+---------------------------------------------------------------------+
+-------------------------------------+---------------------------------------------------------------------+
| Field                               | Value                                                               |
+-------------------------------------+---------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                              |
| OS-EXT-AZ:availability_zone         | nova                                                                |
| OS-EXT-SRV-ATTR:host                | compute-0.localdomain                                               |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0.localdomain                                               |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000000e                                                   |
| OS-EXT-STS:power_state              | Running                                                             |
| OS-EXT-STS:task_state               | None                                                                |
| OS-EXT-STS:vm_state                 | active                                                              |
| OS-SRV-USG:launched_at              | 2019-04-29T15:59:43.000000                                          |
| OS-SRV-USG:terminated_at            | None                                                                |
| accessIPv4                          |                                                                     |
| accessIPv6                          |                                                                     |
| addresses                           | devstack=192.168.100.6                                              |
| adminPass                           | yJrMvrSKndL9                                                        |
| config_drive                        |                                                                     |
| created                             | 2019-04-29T15:59:33Z                                                |
| flavor                              | half_node_flavor (64918380-b183-43f5-9896-8b59020d1a3a)             |
| hostId                              | 6d6b6aa9727938b4d53fe771464bd22e688b32326d21866e0130240a            |
| id                                  | 86217109-9003-45d5-ac4a-59952378bbe7                                |
| image                               | cirros-0.3.5-x86_64-disk.img (020c49be-3cb5-40ec-acad-8d6f67663785) |
| key_name                            | None                                                                |
| name                                | 2                                                                   |
| progress                            | 0                                                                   |
| project_id                          | 19a12ec527c649b2928fadb009f84196                                    |
| properties                          |                                                                     |
| security_groups                     | name='default'                                                      |
| status                              | ACTIVE                                                              |
| updated                             | 2019-04-29T15:59:43Z                                                |
| user_id                             | 47becba725b741b8800ca5a15591924b                                    |
| volumes_attached                    |                                                                     |
+-------------------------------------+---------------------------------------------------------------------+
+-------------------------------------+---------------------------------------------------------------------+
| Field                               | Value                                                               |
+-------------------------------------+---------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                              |
| OS-EXT-AZ:availability_zone         | nova                                                                |
| OS-EXT-SRV-ATTR:host                | compute-0.localdomain                                               |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0.localdomain                                               |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000000e                                                   |
| OS-EXT-STS:power_state              | Running                                                             |
| OS-EXT-STS:task_state               | None                                                                |
| OS-EXT-STS:vm_state                 | active                                                              |
| OS-SRV-USG:launched_at              | 2019-04-29T15:59:43.000000                                          |
| OS-SRV-USG:terminated_at            | None                                                                |
| accessIPv4                          |                                                                     |
| accessIPv6                          |                                                                     |
| addresses                           | devstack=192.168.100.6                                              |
| adminPass                           | yJrMvrSKndL9                                                        |
| config_drive                        |                                                                     |
| created                             | 2019-04-29T15:59:33Z                                                |
| flavor                              | half_node_flavor (64918380-b183-43f5-9896-8b59020d1a3a)             |
| hostId                              | 6d6b6aa9727938b4d53fe771464bd22e688b32326d21866e0130240a            |
| id                                  | 86217109-9003-45d5-ac4a-59952378bbe7                                |
| image                               | cirros-0.3.5-x86_64-disk.img (020c49be-3cb5-40ec-acad-8d6f67663785) |
| key_name                            | None                                                                |
| name                                | 2                                                                   |
| progress                            | 0                                                                   |
| project_id                          | 19a12ec527c649b2928fadb009f84196                                    |
| properties                          |                                                                     |
| security_groups                     | name='default'                                                      |
| status                              | ACTIVE                                                              |
| updated                             | 2019-04-29T15:59:43Z                                                |
| user_id                             | 47becba725b741b8800ca5a15591924b                                    |
| volumes_attached                    |                                                                     |
+-------------------------------------+---------------------------------------------------------------------+

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0924