Bug 2037761 - [NIC Partitioning] PCI Passthrough derivation is not translated to corresponding nova.conf
Summary: [NIC Partitioning] PCI Passthrough derivation is not translated to correspond...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: Vijayalakshmi Candappa
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-06 14:00 UTC by Karthik Sundaravel
Modified: 2022-12-07 19:22 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20221010235131.e0d438c.el8ost
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-07 19:21:46 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 824402 | 0 | None | master: MERGED | tripleo-heat-templates: Check if passthrough user_config is decoded properly from hiera data (Iedb3e2b8c503999aec8fde647... | 2022-10-31 15:36:02 UTC
OpenStack gerrit | 826218 | 0 | None | stable/wallaby: MERGED | tripleo-heat-templates: Check if passthrough user_config is decoded properly from hiera data (Iedb3e2b8c503999aec8fde647... | 2022-10-31 15:36:08 UTC
OpenStack gerrit | 826452 | 0 | None | stable/victoria: MERGED | tripleo-heat-templates: Check if passthrough user_config is decoded properly from hiera data (Iedb3e2b8c503999aec8fde647... | 2022-10-31 15:36:13 UTC
OpenStack gerrit | 826600 | 0 | None | MERGED | Check if passthrough user_config is decoded properly from hiera data | 2022-11-15 06:40:25 UTC
Red Hat Issue Tracker | NFV-2388 | 0 | None | None | None | 2022-01-11 07:45:31 UTC
Red Hat Issue Tracker | OSP-12046 | 0 | None | None | None | 2022-01-06 14:03:43 UTC
Red Hat Product Errata | RHBA-2022:8794 | 0 | None | None | None | 2022-12-07 19:22:06 UTC

Description Karthik Sundaravel 2022-01-06 14:00:43 UTC
For a user configuration like the following:
  NovaPCIPassthrough:
  - devname: "eno3"
    trusted: "true"
    physical_network: "sriov1"
  - devname: "eno4"
    trusted: "true"
    physical_network: "sriov2"

The VFs used for NIC partitioning have the PCI addresses 0000:18:0a.0, 0000:18:0a.1, and 0000:18:0e.1.

The expected passthrough_whitelist should list all the VFs except the ones used by NIC partitioning.

Expected
passthrough_whitelist={"address": "0000:18:0a.2","physical_network":"sriov1","trusted":"true"}
passthrough_whitelist={"address": "0000:18:0a.3","physical_network":"sriov1","trusted":"true"}
passthrough_whitelist={"address": "0000:18:0e.0","physical_network":"sriov2","trusted":"true"}
passthrough_whitelist={"address": "0000:18:0e.2","physical_network":"sriov2","trusted":"true"}
passthrough_whitelist={"address": "0000:18:0e.3","physical_network":"sriov2","trusted":"true"}


Actual 
passthrough_whitelist={"devname":"eno3","physical_network":"sriov1","trusted":"true"}
passthrough_whitelist={"devname":"eno4","physical_network":"sriov2","trusted":"true"}
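
The expected derivation can be sketched roughly as follows. This is only an illustration, not the TripleO script: the `PF_VFS` mapping and the `derive_whitelist` helper are hypothetical names, and the mapping assumes each PF exposes four VFs, as the expected output above implies. On a real host the VF addresses would be discovered rather than hard-coded.

```python
import json

# Hypothetical PF -> VF address mapping (assumption: four VFs per PF).
PF_VFS = {
    "eno3": ["0000:18:0a.0", "0000:18:0a.1", "0000:18:0a.2", "0000:18:0a.3"],
    "eno4": ["0000:18:0e.0", "0000:18:0e.1", "0000:18:0e.2", "0000:18:0e.3"],
}


def derive_whitelist(user_config, pf_vfs, partitioned_vfs):
    """Expand devname entries into per-VF address entries, skipping partitioned VFs."""
    derived = []
    for entry in user_config:
        devname = entry.get("devname")
        if devname not in pf_vfs:
            # Nothing to derive for this entry; keep it unchanged.
            derived.append(dict(entry))
            continue
        # Carry over every attribute except devname (e.g. physical_network, trusted).
        extras = {k: v for k, v in entry.items() if k != "devname"}
        for address in pf_vfs[devname]:
            if address in partitioned_vfs:
                continue  # VF is reserved for NIC partitioning
            derived.append({"address": address, **extras})
    return derived


user_config = [
    {"devname": "eno3", "trusted": "true", "physical_network": "sriov1"},
    {"devname": "eno4", "trusted": "true", "physical_network": "sriov2"},
]
partitioned = {"0000:18:0a.0", "0000:18:0a.1", "0000:18:0e.1"}

for item in derive_whitelist(user_config, PF_VFS, partitioned):
    print("passthrough_whitelist=" + json.dumps(item))
```

With the inputs from this report, the sketch yields one entry per remaining VF, keyed by `address` instead of `devname`.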

Comment 1 Artom Lifshitz 2022-01-10 19:05:28 UTC
I'm not sure I understand the request. Are you asking that, given the name of a PF in the `devname` element, the deployment tooling get the list of PCI addresses belonging to the VFs on that PF, remove the PCI addresses of the VFs assigned to NIC partitioning (where are those defined?), and write the resulting set of addresses to Nova's passthrough_whitelist?

In parallel with that clarification, I'd like to state that using `devname` in this context is strongly discouraged. A particular `devname` like `eno4` is not guaranteed to be stable, and the underlying device can change after a reboot. PCI addresses and/or vendor and product IDs are the preferred way to identify PCI devices here.
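
As a rough illustration of moving from a `devname` to a PCI address, the sketch below resolves an interface name through sysfs. It assumes a Linux host where `/sys/class/net/<devname>/device` is a symlink to the PCI device directory, which is the usual layout for PCI NICs; `pci_address_of` is a hypothetical helper name, not part of Nova or TripleO.

```python
import os


def pci_address_of(devname, sysfs_root="/sys/class/net"):
    """Return the PCI address behind a network interface name.

    Assumes <sysfs_root>/<devname>/device is a symlink to the PCI device
    directory, whose basename is the address (for example 0000:18:00.0).
    """
    device_link = os.path.join(sysfs_root, devname, "device")
    if not os.path.exists(device_link):
        raise FileNotFoundError("no device entry for interface %r" % devname)
    # Follow the symlink and keep only the final path component.
    return os.path.basename(os.path.realpath(device_link))
```

The address returned this way reflects the name-to-device mapping only at the moment of the call, so it is best recorded once and then used in place of the name.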

Comment 2 Karthik Sundaravel 2022-01-11 06:26:43 UTC
In TripleO, we have a script which derives the PCI passthrough entries after excluding the NIC partitioning VFs [1]. The derived results are expected to override the puppet hiera data in service_configs. However, for some reason the derived values do not take precedence, and puppet-nova uses the user configuration when it configures nova.conf.
We have yet to investigate the root cause.

I understand we had issues with devnames changing from RHEL7 to RHEL, but IMHO the Leapp upgrades take care of the devnames during upgrades.
Nevertheless, we will prefer the PCI address / product ID for identifying PCI devices, as recommended by the Red Hat documentation.

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/neutron/neutron-sriov-agent-container-puppet.yaml#L211

Comment 4 Alex Schultz 2022-01-11 14:44:55 UTC
ExtraConfig has precedence over the values generated from that script, per https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/overcloud.j2.yaml#L616. If the values in pci_passthrough_whitelist always need to be used, the file that holds them needs to be moved up in the list, because the hieradata files are tried top down and the lookup stops when a value is found.
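
The first-match behaviour described here can be sketched as below. This is a simplified model, not hiera itself: the `lookup` helper and the layer names `extraconfig` and `derived` are hypothetical, and only the ordering matters.

```python
def lookup(key, hierarchy):
    """Return (layer_name, value) from the first layer that defines key."""
    for name, data in hierarchy:
        if key in data:
            return name, data[key]  # first match wins; later layers are never consulted
    raise KeyError(key)


# Hypothetical layers, tried top down.
hierarchy = [
    ("extraconfig", {"pci_passthrough_whitelist": "user value"}),
    ("derived", {"pci_passthrough_whitelist": "derived value"}),
]

print(lookup("pci_passthrough_whitelist", hierarchy))
# Moving the derived layer up makes its value the one that is found.
print(lookup("pci_passthrough_whitelist", list(reversed(hierarchy))))
```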

Comment 5 Karthik Sundaravel 2022-01-11 16:16:57 UTC
Thanks Alex.

The root cause has been identified. The code at [1] expects a list of user configurations, but a string is returned instead.
As a result, the derivation did not yield any results to override the user configuration.

[1] https://github.com/openstack/tripleo-heat-templates/blob/1c71d0e3e1db56b0533f68aaf2acf4640b146c57/deployment/neutron/derive_pci_passthrough_whitelist.py#L149
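
A defensive check for this kind of type mismatch might look like the sketch below. It is not the merged patch: `normalize_user_config` is a hypothetical helper, and it assumes the string form is JSON-encoded, which this report does not state.

```python
import json


def normalize_user_config(raw):
    """Return the user passthrough configuration as a list of dicts."""
    if isinstance(raw, str):
        # Assumption: the value arrived as a JSON-encoded string.
        raw = json.loads(raw)
    if isinstance(raw, dict):
        # Wrap a single entry so that callers can always iterate.
        raw = [raw]
    if not isinstance(raw, list):
        raise TypeError("expected a list of entries, got %s" % type(raw).__name__)
    return raw
```

Normalizing once at the point where the value is read keeps the rest of the derivation free of type checks.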

Comment 16 errata-xmlrpc 2022-12-07 19:21:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794

