Bug 1647536

Summary: [RFE] Optional NUMA affinity for SR-IOV devices
Product: Red Hat OpenStack Reporter: Stephen Finucane <stephenfin>
Component: openstack-nova Assignee: smooney
Status: CLOSED CURRENTRELEASE QA Contact: James Parker <jparker>
Severity: high Docs Contact:
Priority: low    
Version: 13.0 (Queens) CC: brault, broose, cswanson, dasmith, dhill, egallen, eglynn, gkadam, jhakimra, jjoyce, jniu, jparker, jraju, kchamart, lmarsh, lyarwood, marjones, markmc, mburns, mdeng, mschuppe, mvalsecc, oblaut, sbauza, sclewis, sferdjao, sgordon, smooney, spower, sputhenp, srevivo, stephenfin, supadhya, vromanso, weiyongjun, yrachman
Target Milestone: Alpha Keywords: FutureFeature, Patch, Triaged
Target Release: ---   
Hardware: All   
OS: Linux   
URL: https://blueprints.launchpad.net/nova/+spec/share-pci-device-between-numa-nodes
Whiteboard:
Fixed In Version: openstack-nova-21.1.0-0.20200425164546.347d656.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1446311
: 1757886 1775576 Environment:
Last Closed: 2022-10-20 10:20:06 UTC Type: Feature Request
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version: Ussuri
Embargoed:
Bug Depends On: 1366208, 1446311    
Bug Blocks: 1188000, 1419231, 1419948, 1422243, 1427361, 1442136, 1561961, 1650606, 1653846, 1756916, 1757886, 1775575, 1775576, 1783354, 1791991    

Comment 9 Stephen Finucane 2019-08-02 14:13:46 UTC
As noted in [1], this RFE is necessary to close a gap: it is possible to configure a NUMA affinity policy for PCI passthrough devices but not for SR-IOV devices. To restate what is described there, NUMA policies are currently configured as part of the PCI alias configuration in 'nova.conf', and requesting a PCI device through a given alias also applies the NUMA affinity policy associated with that alias. However, SR-IOV devices are typically not attached to an instance using PCI aliases; instead, a neutron port is configured and attached when the instance boots. The PCI alias-based approach is therefore of no use for SR-IOV devices.
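
To make the gap concrete, here is a minimal Python sketch of how an alias-carried policy is found for an alias-based request but not for a port-based one. It is not nova's implementation: the helper names, the vendor/product IDs, and the alias name "a1" are illustrative assumptions, and only the idea that the policy lives in the alias definition comes from the text above.

```python
import json

# Example value of an `alias` entry as it might appear in the [pci] section
# of nova.conf. The IDs and the alias name "a1" are placeholders (assumed).
ALIAS_ENTRY = (
    '{"vendor_id": "8086", "product_id": "154d", '
    '"device_type": "type-PF", "name": "a1", "numa_policy": "preferred"}'
)


def load_aliases(entries):
    """Index alias definitions by their name (hypothetical helper)."""
    aliases = {}
    for raw in entries:
        spec = json.loads(raw)
        aliases[spec["name"]] = spec
    return aliases


def policy_for_request(aliases, alias_name=None):
    """Return the NUMA policy for a device request (hypothetical helper).

    A request made through an alias inherits that alias's policy. A request
    that originates from a neutron port names no alias, so there is nothing
    to look the policy up from.
    """
    if alias_name is None:
        return None
    return aliases[alias_name].get("numa_policy")


aliases = load_aliases([ALIAS_ENTRY])
print(policy_for_request(aliases, "a1"))  # alias-based PCI passthrough request
print(policy_for_request(aliases))        # SR-IOV request made via a neutron port
```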

There are two possible approaches we can pursue to resolve this. The first approach is to use flavor extra specs and image metadata to configure instance-wide PCI policies that would apply to all PCI devices attached to the instance, including SR-IOV devices. This was the approach first proposed in the 'share-pci-between-numa-nodes' blueprint [2], before this was modified to use PCI aliases instead [3]. The other approach is to provide a new QoS policy in neutron that nova could consume. This was the approach that was discussed and essentially approved at the most recent Denver PTG. The flavor/image-based approach has the advantage of being much simpler to implement and mostly backportable, but it is very broad and prevents us from specifying NUMA affinity policies on a per-port basis. The neutron QoS policy approach, by comparison, involves API and object changes in neutron, which make it more difficult to implement and prevent us from backporting it, but it does allow for very fine-grained control over the affinity policy of each device.
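
The difference in granularity between the two approaches can be sketched in Python as follows. This is an illustration only, not nova or neutron code: the key names, the set of policy values, the conflict check between flavor and image, the device names, and the rule that a per-port value overrides the instance-wide one are all assumptions made for the example.

```python
# Key names and policy values are assumptions for this sketch; they are not
# taken from the comment above.
FLAVOR_KEY = "hw:pci_numa_affinity_policy"
IMAGE_KEY = "hw_pci_numa_affinity_policy"
VALID_POLICIES = {"required", "preferred", "legacy"}


def instance_wide_policy(extra_specs, image_props):
    """Resolve one policy for the whole instance (hypothetical helper)."""
    flavor_value = extra_specs.get(FLAVOR_KEY)
    image_value = image_props.get(IMAGE_KEY)
    # Assumed behaviour: refuse to guess when flavor and image disagree.
    if flavor_value and image_value and flavor_value != image_value:
        raise ValueError("flavor and image request different policies")
    policy = flavor_value or image_value
    if policy is not None and policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    return policy


def assign_policies(devices, instance_policy, per_port_policies=None):
    """Map each device to its effective policy (hypothetical helper).

    With only an instance-wide policy, every device gets the same value.
    A per-port mapping, as the neutron-based approach would provide, can
    set a different value for an individual device.
    """
    per_port_policies = per_port_policies or {}
    return {dev: per_port_policies.get(dev, instance_policy) for dev in devices}


policy = instance_wide_policy({FLAVOR_KEY: "preferred"}, {})
broad = assign_policies(["vf0", "vf1"], policy)
fine = assign_policies(["vf0", "vf1"], policy, {"vf1": "required"})
print(broad)
print(fine)
```

In the first call every device shares the single flavor-derived policy, which is the "very broad" behaviour described above; the second call shows the per-device control that only a port-level setting would allow.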

We propose pursuing both approaches in succession. We will first pursue the flavor extra spec/image metadata-based approach for OSP 16, backporting this to OSP 13 once complete. In a later cycle, we will pursue the neutron QoS policy-based approach. This BZ is tracking the first approach.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446311#c13
[2] https://review.opendev.org/#/c/361140/30/
[3] https://review.opendev.org/#/c/555000/3/

Comment 26 spower 2022-06-03 14:30:24 UTC
This RFE was not marked MVP for OSP 17.0 and so will be moved to OSP 17.1 for verification and docs. Contact rhos-trac if a tech preview is needed for OSP 17.0.

Comment 32 Red Hat Bugzilla 2023-12-02 04:25:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.