1446311 – [RFE] Optional NUMA affinity for PCI devices

Summary: [RFE] Optional NUMA affinity for PCI devices

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-nova
Sub Component:
Version:	13.0 (Queens)
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	Upstream M2
Target Release:	13.0 (Queens)
Assignee:	Stephen Finucane
QA Contact:	awaugama
Docs Contact:
URL:	https://blueprints.launchpad.net/nova...
Whiteboard:	upstream_milestone_none upstream_defi...
Duplicates (1):	1663653
Depends On:	1366208
Blocks:	1188000 1419231 1419948 1422243 1427361 1442136 1561961 1647536 1650606 1757886 1775575 1775576 1783354 1791991

Reported:	2017-04-27 16:15 UTC by Stephen Finucane
Modified:	2022-03-13 14:16 UTC
CC List:	33 users
Fixed In Version:	openstack-nova-17.0.1-0.20180302144923.9ace6ed.el7ost
Doc Type:	Technology Preview
Doc Text:	This release adds support for PCI device NUMA affinity policies, which are configured as part of the “[pci]alias” configuration option. Three policies are supported:
  “required” (must have)
  “legacy” (default; must have, if available)
  “preferred” (nice to have)
In all cases, strict NUMA affinity is provided, if possible. These policies let you configure, per PCI alias, how strict NUMA affinity should be in order to maximize resource utilization. The key difference between the policies is how much NUMA affinity you are willing to forsake before scheduling fails. When the “preferred” policy is configured for a PCI device, nova can use CPUs on a different NUMA node from that of the PCI device if no CPUs on the device's own node are available. This increases resource utilization, but performance is reduced for these instances.
Clone Of:	1366208
Clones:	1561961 1647536
Environment:
Last Closed:	2019-02-05 10:42:20 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1614882	None	None	None	2017-04-27 16:15:19 UTC
OpenStack gerrit	361140	'None'	MERGED	PCI NUMA Policies	2021-02-01 17:56:38 UTC
Red Hat Product Errata	RHEA-2018:2086	None	None	None	2018-06-27 13:33:17 UTC

Comment 1 Stephen Finucane 2017-04-27 16:17:30 UTC

This RFE focuses on making NUMA affinity for SR-IOV/PCI devices optional. The spec missed the Pike deadline so this has been deferred to Queens.

Comment 2 Stephen Finucane 2017-08-30 16:02:26 UTC

As (hopefully) noted previously, this is being taken care of by a Mirantis guy. I plan to keep an eye on it during this cycle and step in if necessary.

Comment 8 errata-xmlrpc 2018-06-27 13:31:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

Comment 13 Stephen Finucane 2018-11-07 17:20:15 UTC

A mistake was made during implementation of this feature. While this RFE specifically called out support for optional NUMA affinity for SR-IOV devices, what was implemented upstream was support for optional NUMA affinity of standard PCI passthrough devices. These are handled differently. SR-IOV devices are created by neutron and attached as network devices. For example:

  openstack port create ...
  openstack server create --nic port-id=$port_id ...

PCI passthrough devices, by comparison, are requested by specifying PCI aliases in the flavor and are attached by nova at boot time:

  openstack flavor set m1.large --property "pci_passthrough:alias"="a1:2"
  openstack server create --flavor m1.large ...

The feature, as currently implemented, allows PCI policies to be defined in the alias configuration in 'nova.conf' and therefore only supports the latter type of attachment.
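
As a rough illustration of the above, the following Python sketch renders an alias entry that carries a policy and models how the three policies from the Doc Text field treat a candidate device. It assumes the policy is set through a 'numa_policy' key in the alias. The vendor and product IDs, the device type, and the 'device_allowed' helper are placeholders made up for illustration; this is a toy model, not nova's actual code.

```python
import json

# Hypothetical alias; the vendor/product IDs and device type are placeholders.
alias = {
    "name": "a1",
    "vendor_id": "8086",
    "product_id": "154d",
    "device_type": "type-PCI",
    "numa_policy": "preferred",  # assumed key; one of: required, legacy, preferred
}

# Line that would go under the [pci] section of 'nova.conf'.
print("alias = " + json.dumps(alias))


def device_allowed(policy, device_node, instance_nodes):
    """Toy model of the three policies (an assumption, not nova's code).

    device_node is the device's NUMA node, or None if it reports none.
    instance_nodes is the set of host NUMA nodes the instance's CPUs use.
    """
    affine = device_node is not None and device_node in instance_nodes
    if policy == "required":
        # Must have: only devices on one of the instance's nodes qualify.
        return affine
    if policy == "legacy":
        # Must have, if available: enforced only when the device reports a node.
        return affine or device_node is None
    if policy == "preferred":
        # Nice to have: affine devices are better, but any device is acceptable.
        return True
    raise ValueError(f"unknown policy: {policy}")
```

With an instance pinned to node 0, a device on node 1 is rejected under "required" and "legacy" but accepted under "preferred", which is the trade of performance for utilization described in the Doc Text.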

Clearly some additional work is required here. However, given that the feature as implemented has its uses (FPGAs jump to mind), we should build upon what's been done rather than replace it. As a result, I'm going to clone this BZ. The cloned BZ will focus on closing the SR-IOV gap, while this BZ will be renamed to cover the PCI passthrough case that has already been addressed.

Comment 14 Stephen Finucane 2019-01-11 11:50:00 UTC

*** Bug 1663653 has been marked as a duplicate of this bug. ***

Comment 15 Vinayak 2019-01-16 14:02:28 UTC

Can you please review https://access.redhat.com/support/cases/#/case/02255851? This issue was observed in RH OSP13. Has it been fixed now?

Comment 16 Stephen Finucane 2019-02-05 10:42:20 UTC

(In reply to Vinayak from comment #15)
> Can you please review
> https://access.redhat.com/support/cases/#/case/02255851? This issue was
> observed in RH OSP13. Has it been fixed now?

I fail to see what hugepage allocation issues have to do with this feature. Could you elaborate (via a new bug), please?
