Bug 2101409 - resource_provider_hypervisors not included in the sriov agent configuration
Summary: resource_provider_hypervisors not included in the sriov agent configuration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-neutron
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: beta
Target Release: 17.0
Assignee: Miro Tomaska
QA Contact: Fiorella Yanac
URL:
Whiteboard:
Depends On: 2102466 2103019
Blocks:
Reported: 2022-06-27 12:10 UTC by Eduardo Olivares
Modified: 2022-09-21 12:23 UTC

Fixed In Version: puppet-neutron-18.5.1-0.20220428001500.3bdf311.el9ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-21 12:23:00 UTC
Target Upstream Version:
Embargoed:



Links
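+------------------------+----------------+---------+----------+--------+---------+-------------------------+
| System                 | ID             | Private | Priority | Status | Summary | Last Updated            |
+------------------------+----------------+---------+----------+--------+---------+-------------------------+
| Red Hat Issue Tracker  | OSP-16040      | 0       | None     | None   | None    | 2022-06-27 12:16:33 UTC |
| Red Hat Product Errata | RHEA-2022:6543 | 0       | None     | None   | None    | 2022-09-21 12:23:24 UTC |
+------------------------+----------------+---------+----------+--------+---------+-------------------------+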

Internal Links: 2102466

Description Eduardo Olivares 2022-06-27 12:10:55 UTC
Description of problem:
The following parameter is included in a THT file:
  ExtraConfig:
    neutron::agents::ml2::sriov::resource_provider_hypervisors: "enp7s0f3:%{hiera('fqdn_canonical')},enp5s0f0:%{hiera('fqdn_canonical')}"

Link to the THT file:
https://code.engineering.redhat.com/gerrit/plugins/gitiles/Neutron-QE/+/refs/heads/master/BM_heat_template/ospd-17-vlan-sriov-hybrid-ha-ovn-squad-titan09/network-environment.yaml#48

On OSP16.2, that value is added to the sriov_nic section within the sriov agent config file on the compute nodes:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv4-vlan-sriov/54/computesriov-0/var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/sriov_agent.ini.gz
[sriov_nic]
physical_device_mappings=datacentre:enp7s0f3,datacentre:enp5s0f0
resource_provider_bandwidths=enp7s0f3:10000000:10000000,enp5s0f0:10000000:10000000
resource_provider_hypervisors=enp7s0f3:computesriov-0.localdomain,enp5s0f0:computesriov-0.localdomain

On OSP17, that value is not present in the sriov agent configuration:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-17.0_director-rhel-virthost-3cont_2comp-ipv4-vlan-sriov/3/compute-0/var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/sriov_agent.ini.gz
[sriov_nic]
physical_device_mappings=datacentre:enp7s0f3,datacentre:enp5s0f0
resource_provider_bandwidths=enp7s0f3:10000000:10000000,enp5s0f0:10000000:10000000
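
A quick way to spot the missing option on a node is to parse the generated file and report which expected keys are absent from [sriov_nic]. The following is only a minimal sketch: the helper name missing_sriov_options is hypothetical, and the sample text is the OSP17 section quoted above rather than a file read from disk.

```python
import configparser

# The [sriov_nic] section as generated on the OSP17 compute node (quoted above).
SAMPLE_INI = """\
[sriov_nic]
physical_device_mappings=datacentre:enp7s0f3,datacentre:enp5s0f0
resource_provider_bandwidths=enp7s0f3:10000000:10000000,enp5s0f0:10000000:10000000
"""


def missing_sriov_options(ini_text, required=("resource_provider_hypervisors",)):
    """Return the required option names that are absent from [sriov_nic].

    Hypothetical helper for illustration; it is not part of any OpenStack tool.
    """
    # interpolation=None keeps '%' characters literal; strict=False tolerates
    # repeated keys, which can appear in oslo.config style files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section("sriov_nic"):
        return list(required)
    return [opt for opt in required if not parser.has_option("sriov_nic", opt)]


print(missing_sriov_options(SAMPLE_INI))
```

To check a real node, the text of the generated sriov_agent.ini could be passed in instead of SAMPLE_INI.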


Without that configuration, the sriov tests covering minimum bandwidth placement enforcement fail when they try to obtain the maximum bandwidth values from the placement API:
https://code.engineering.redhat.com/gerrit/plugins/gitiles/rhos-qe-tests/tempest_neutron_plugin/+/master/neutron_plugin/tests/scenario/test_qos.py#1267




Version-Release number of selected component (if applicable):
RHOS-17.0-RHEL-9-20220615.n.2

How reproducible:
100%

Steps to Reproduce:
1. Run "tempest run -r test_minbw_placement_enforcement_sriov_egress" (or run "openstack resource provider list" and check that no entries include "NIC Switch agent")
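
The manual check on the resource provider list can be scripted. This is a sketch only: it assumes the list was exported as JSON (for example with the CLI's generic "-f json" formatter) and that each entry carries "uuid", "name" and "generation" keys matching the table columns. The function name agent_providers and the sample entries are illustrative.

```python
import json


def agent_providers(providers, marker="NIC Switch agent"):
    """Return the names of resource providers created by the SR-IOV agent.

    `providers` is a list of dicts as assumed above. An empty result
    corresponds to the failure described in this bug.
    """
    return [p["name"] for p in providers if marker in p.get("name", "")]


# Illustrative input: only the compute node root providers are present,
# which is the broken state.
broken = json.loads("""
[
  {"uuid": "ed237545-5a39-40ae-8983-9e27cd9afb1a",
   "name": "computesriov-0.localdomain", "generation": 465},
  {"uuid": "d9f94b62-15c7-44fc-9872-345f6a04fae1",
   "name": "computesriov-1.localdomain", "generation": 419}
]
""")

if not agent_providers(broken):
    print("no NIC Switch agent resource providers found")
```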

Comment 3 Takashi Kajinami 2022-06-30 02:27:18 UTC
I checked the hieradata file on that node but could not find the key defined in ExtraConfig.

/etc/puppet/hieradata/extraconfig.json.gz
~~~
{
    "neutron::agents::l3::extensions": "fip_qos"
}
~~~

I think this is not a puppet issue but something caused by a change in how TripleO/Heat
merges template files (it no longer does a deep merge by default).

What you can try is adding

parameter_merge_strategies:
  ExtraConfig: merge

in that template file.
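
To illustrate why the key can disappear, here is a simplified model of the two behaviours. It is a sketch under stated assumptions, not Heat's actual implementation: the function combine is hypothetical, and the order of the two environment files is assumed so that the file carrying only fip_qos is processed last.

```python
def combine(env_files, merge_keys=()):
    """Combine parameter dicts in order.

    Keys listed in merge_keys are merged one level deep when both values are
    dicts; every other key is replaced by the later file (simplified model).
    """
    result = {}
    for params in env_files:
        for key, value in params.items():
            both_dicts = isinstance(value, dict) and isinstance(result.get(key), dict)
            if key in merge_keys and both_dicts:
                result[key] = {**result[key], **value}
            else:
                result[key] = value
    return result


first = {"ExtraConfig": {
    "neutron::agents::ml2::sriov::resource_provider_hypervisors":
        "enp7s0f3:%{hiera('fqdn_canonical')},enp5s0f0:%{hiera('fqdn_canonical')}",
}}
second = {"ExtraConfig": {"neutron::agents::l3::extensions": "fip_qos"}}

# Without a merge strategy the later ExtraConfig replaces the earlier one,
# which matches the extraconfig.json content shown above.
print(sorted(combine([first, second])["ExtraConfig"]))

# With ExtraConfig marked for merging, both keys survive.
print(sorted(combine([first, second], merge_keys=("ExtraConfig",))["ExtraConfig"]))
```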

By the way, [sriov_nic] resource_provider_hypervisors was required as a workaround for bz 1989820,
and my understanding is that it is no longer required in order to use the minimum bandwidth QoS rule.
If it is still required, we would need to look into the problem from Neutron's point of view.

Comment 4 Takashi Kajinami 2022-06-30 02:50:36 UTC
As I mentioned in comment 3, the issue is supposed to be fixed in Neutron and I'm not sure whether overriding the hypervisor is still required,
but you might want to backport
 https://review.opendev.org/c/openstack/tripleo-heat-templates/+/796402
which is merged in master, as a safeguard.

Comment 15 Fiorella Yanac 2022-07-13 09:30:03 UTC
Verified on an OSP17 environment with OVN+SRIOV configured,
using puddle RHOS-17.0-RHEL-9-20220711.n.1

 [stack@undercloud-0 tempest-dir]$ openstack resource provider list
+--------------------------------------+------------------------------------------------------+------------+
| uuid                                 | name                                                 | generation |
+--------------------------------------+------------------------------------------------------+------------+
| ed237545-5a39-40ae-8983-9e27cd9afb1a | computesriov-0.localdomain                           |        465 |
| d9f94b62-15c7-44fc-9872-345f6a04fae1 | computesriov-1.localdomain                           |        419 |
| 376826fd-f904-58b4-a553-ba9001f5c537 | computesriov-1.localdomain:NIC Switch agent          |          0 |
| 35b1343b-08d8-5bcf-8fa6-2fba99cef411 | computesriov-1.localdomain:NIC Switch agent:enp7s0f3 |         31 |
| 57a35465-c0c7-595e-b103-2ba7d769285a | computesriov-1.localdomain:NIC Switch agent:enp5s0f0 |         31 |
| ac24419c-77b2-5951-b8ad-2dc0e5f9ca3e | computesriov-0.localdomain:NIC Switch agent          |          0 |
| 79db8911-b47c-5246-b55e-94da0c3688f5 | computesriov-0.localdomain:NIC Switch agent:enp7s0f3 |         32 |
| 95fc4865-db33-5f68-bc72-051086f7d6be | computesriov-0.localdomain:NIC Switch agent:enp5s0f0 |         32 |
+--------------------------------------+------------------------------------------------------+------------+

On each compute: /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/sriov_agent.ini

compute-0
[sriov_nic]
physical_device_mappings=datacentre:enp7s0f3,datacentre:enp5s0f0
resource_provider_bandwidths=enp7s0f3:10000000:10000000,enp5s0f0:10000000:10000000
resource_provider_hypervisors=enp7s0f3:computesriov-0.localdomain,enp5s0f0:computesriov-0.localdomain


compute-1
[sriov_nic]
physical_device_mappings=datacentre:enp7s0f3,datacentre:enp5s0f0
resource_provider_bandwidths=enp7s0f3:10000000:10000000,enp5s0f0:10000000:10000000
resource_provider_hypervisors=enp7s0f3:computesriov-1.localdomain,enp5s0f0:computesriov-1.localdomain
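
The per-node check above can also be expressed as a small script that confirms every device in physical_device_mappings has a resource_provider_hypervisors entry pointing at the expected host. This is a sketch only; the helper names parse_mapping and check_hypervisors are hypothetical, and the section is passed in as a plain dict rather than read from the file.

```python
def parse_mapping(value):
    """Split 'dev:host,dev:host' into a dict, ignoring empty items."""
    return dict(item.split(":", 1) for item in value.split(",") if item)


def check_hypervisors(section, expected_host):
    """Return a list of problems found in a [sriov_nic] section (dict of options)."""
    # physical_device_mappings entries are 'physnet:device'; keep the device part.
    devices = {m.split(":", 1)[1]
               for m in section["physical_device_mappings"].split(",") if m}
    hypervisors = parse_mapping(section.get("resource_provider_hypervisors", ""))
    problems = []
    for dev in sorted(devices):
        host = hypervisors.get(dev)
        if host is None:
            problems.append(f"{dev}: no hypervisor configured")
        elif host != expected_host:
            problems.append(f"{dev}: expected {expected_host}, found {host}")
    return problems


# Values taken from the compute-1 section shown above.
compute_1 = {
    "physical_device_mappings": "datacentre:enp7s0f3,datacentre:enp5s0f0",
    "resource_provider_hypervisors":
        "enp7s0f3:computesriov-1.localdomain,enp5s0f0:computesriov-1.localdomain",
}

print(check_hypervisors(compute_1, "computesriov-1.localdomain"))
```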


The test test_minbw_placement_enforcement_sriov_egress [1] passed.
[1] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-networking-ovn-17.0_director-rhel-virthost-3cont_2comp-ipv4-vlan-sriov/10/testReport/neutron_plugin.tests.scenario.test_qos/QosTestSriovMinBwPlacementEnforcementTest/test_minbw_placement_enforcement_sriov_egress_id_ad4d9c2a_de45_4a05_a70e_78e953a8463d_/

Comment 20 errata-xmlrpc 2022-09-21 12:23:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543

