Bug 1664702

Summary: [OSP10] Oversubscription broken for instances with NUMA topologies
Product: Red Hat OpenStack Reporter: Stephen Finucane <stephenfin>
Component: openstack-nova Assignee: Stephen Finucane <stephenfin>
Status: CLOSED ERRATA QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high Docs Contact:
Priority: high    
Version: 10.0 (Newton) CC: dasmith, eglynn, jhakimra, kchamart, lyarwood, mbooth, mgeary, sbauza, sgordon, vromanso
Target Milestone: async Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: openstack-nova-14.1.0-43.el7ost Doc Type: Known Issue
Doc Text:
Previously, an update that made memory allocation pagesize-aware meant that memory could not be oversubscribed for instances with NUMA topologies. As a result of that update, memory oversubscription was disabled for all instances with a NUMA topology, including implicit NUMA topologies such as those created by hugepages or CPU pinning.
Story Points: ---
Clone Of: 1664701 Environment:
Last Closed: 2019-04-30 16:59:16 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1519540, 1664698, 1664701    
Bug Blocks:    

Description Stephen Finucane 2019-01-09 13:38:19 UTC
+++ This bug was initially created as a clone of Bug #1664701 +++

Description of problem:

As described in [1], the fix for [2] appears to have inadvertently broken memory oversubscription for instances with a NUMA topology but no hugepages.

Version-Release number of selected component (if applicable):

N/A

How reproducible:

Always.

Steps to Reproduce:

1. Create a flavor that will consume more than 50% of the available memory on your host(s) and specify an explicit NUMA topology. For example, on my all-in-one deployment where the host has 32GB RAM, we will request a 20GB instance:

   $ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
   $ openstack flavor set test.numa --property hw:numa_nodes=2

2. Boot an instance using this flavor:

   $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test

3. Boot another instance using this flavor:

   $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2

Actual results:

The second instance fails to boot. We see the following error messages in the logs:

  nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
  nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}

If we revert the patch that addressed that bug [3], the correct behaviour is restored and the instance boots. However, we then obviously lose whatever benefits that change gave us.
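For context, the scheduler output above appears to reflect per-NUMA-cell accounting: with hw:numa_nodes=2 the 20480 MB flavor is split into 10240 MB per cell, and the ~15916 MB "total" is roughly half of the 32GB host, so the second instance cannot fit a cell without oversubscription. As a rough cross-check of how the host's memory is split across NUMA nodes, something like the following can be run on the compute host (assuming the numactl package is installed):

   # Show per-NUMA-node memory totals and free memory on the compute host
   $ numactl --hardware | grep -E 'available|size|free'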

Expected results:

The second instance should boot.

Additional info:

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
[2] https://bugs.launchpad.net/nova/+bug/1734204
[3] https://review.openstack.org/#/c/532168

Comment 6 Joe H. Rahme 2019-04-09 12:55:49 UTC
Verification steps:

# 2 compute nodes with ~6GB of memory each
 
        [stack@undercloud-0 ~]$ for i in 6 8; do ssh heat-admin.24.$i 'echo $(hostname) $(grep MemTotal /proc/meminfo)'; done
        compute-1 MemTotal: 5944884 kB
        compute-0 MemTotal: 5944892 kB
 
# Create a large flavor with numa_nodes
 
        [stack@undercloud-0 ~]$ openstack flavor create --vcpu 2 --disk 0 --ram 4096 test.numa
        [stack@undercloud-0 ~]$ openstack flavor set test.numa --property hw:numa_nodes=1
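# Optional sanity check (not part of the original steps): confirm the flavor's RAM and hw:numa_nodes property before booting

        [stack@undercloud-0 ~]$ openstack flavor show test.numa -c name -c ram -c vcpus -c properties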
 
 
# Boot 2 instances with this flavor. This works because each instance lands on a separate compute node
 
        [stack@undercloud-0 ~]$ nova boot --poll --image cirros --flavor test.numa test1 --nic net-id=353d787b-7788-40b0-aaff-a0ab2325b64e
        [stack@undercloud-0 ~]$ nova boot --poll --image cirros --flavor test.numa test2 --nic net-id=353d787b-7788-40b0-aaff-a0ab2325b64e
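# Optional check (admin credentials assumed): confirm the two instances landed on separate compute nodes

        [stack@undercloud-0 ~]$ for vm in test1 test2; do nova show $vm | grep hypervisor_hostname; done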
 
 
# Negative test: booting a third instance fails with a 'No valid host' error
 
        [stack@undercloud-0 ~]$ nova boot --poll --image cirros --flavor test.numa test3 --nic net-id=353d787b-7788-40b0-aaff-a0ab2325b64e
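# Optional check: the 'No valid host' fault is recorded on the errored instance

        [stack@undercloud-0 ~]$ nova show test3 | grep -E 'status|fault'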
       
# Modify `ram_allocation_ratio` in nova.conf on the compute node

	[heat-admin@compute-1 ~]$ sudo grep ram_allocation_ratio /etc/nova/nova.conf
	ram_allocation_ratio=2.0
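# For reference, one way to set and apply the value on the compute node (a sketch; assumes crudini
# is installed, and note that puppet-managed configs may be overwritten on a later overcloud deploy)

	[heat-admin@compute-1 ~]$ sudo crudini --set /etc/nova/nova.conf DEFAULT ram_allocation_ratio 2.0
	[heat-admin@compute-1 ~]$ sudo systemctl restart openstack-nova-compute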

# Boot a 4th instance; it boots successfully

	[stack@undercloud-0 ~]$ nova boot --poll --image cirros --flavor test.numa test4 --nic net-id=353d787b-7788-40b0-aaff-a0ab2325b64e
	[stack@undercloud-0 ~]$ nova list
	+--------------------------------------+-------+--------+------------+-------------+------------------------+
	| ID                                   | Name  | Status | Task State | Power State | Networks               |
	+--------------------------------------+-------+--------+------------+-------------+------------------------+
	| 4baccd63-0a8e-4288-97a0-b2b449d45a39 | test1 | ACTIVE | -          | Running     | private=192.168.100.9  |
	| ff0a5dd2-a1b8-4937-a3e9-c8a45f5253dd | test2 | ACTIVE | -          | Running     | private=192.168.100.6  |
	| 5bb3597c-a193-479a-9292-6d652b799a66 | test3 | ERROR  | -          | NOSTATE     |                        |
	| 81ce205a-1a15-48f6-8055-3c1a39334602 | test4 | ACTIVE | -          | Running     | private=192.168.100.16 |
	+--------------------------------------+-------+--------+------------+-------------+------------------------+
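# Optional check: with oversubscription working, memory_mb_used on compute-1 can exceed memory_mb
# (use the hypervisor name reported by `nova hypervisor-list`, which may be an FQDN)

	[stack@undercloud-0 ~]$ nova hypervisor-show compute-1 | grep -E 'memory_mb|free_ram_mb'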


# Package version:
 openstack-nova-common.noarch     1:14.1.0-44.el7ost    @rhos-10.0-signed

Comment 8 errata-xmlrpc 2019-04-30 16:59:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0923