Bug 1824248 - Passthrough whitelist not generated properly from templates
Summary: Passthrough whitelist not generated properly from templates
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Karthik Sundaravel
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-04-15 15:54 UTC by Miguel Angel Nieto
Modified: 2023-01-02 15:13 UTC (History)
16 users

Fixed In Version: openstack-tripleo-heat-templates-8.4.1-61.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-02 15:13:04 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 729891 0 None MERGED Consider user configuration during the derivation of passthrough whitelist 2021-02-01 22:14:06 UTC
Red Hat Issue Tracker NFV-1538 0 None None None 2022-04-14 07:52:41 UTC
Red Hat Issue Tracker OSP-14690 0 None None None 2022-04-14 07:52:43 UTC

Internal Links: 1843316

Description Miguel Angel Nieto 2020-04-15 15:54:42 UTC
Description of problem:
Trying to deploy the following templates: 
https://gitlab.cee.redhat.com/mnietoji/deployment_templates/tree/master/ospd-13-vxlan-dpdk-sriov-ctlplane-dataplane-bonding-nic-partitioning-hybrid-panther04


in which pci passthrough is defined in the following way:
  NovaPCIPassthrough:
    - devname: "p7p3"
      trusted: "true"
      physical_network: "sriov-1"
    - devname: "p7p4"
      trusted: "true"
      physical_network: "sriov-2"
    - address: {"domain": ".*", "bus": "05", "slot": "02", "function": "[5-9]"}
      trusted: "true"
      physical_network: "sriov-partitioned-1"
    - address: {"domain": ".*", "bus": "05", "slot": "06", "function": "[5-9]"}
      trusted: "true"
      physical_network: "sriov-partitioned-2"

if I check nova.conf generated:
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"0"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"1"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"2"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"3"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"4"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"5"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"6"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"06","bus":"05","domain":".*","function":"7"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"07","bus":"05","domain":".*","function":"0"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"07","bus":"05","domain":".*","function":"1"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"07","bus":"05","domain":".*","function":"2"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"07","bus":"05","domain":".*","function":"3"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"07","bus":"05","domain":".*","function":"4"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"07","bus":"05","domain":".*","function":"5"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"0"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"1"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"2"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"3"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"4"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"5"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"6"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"02","bus":"05","domain":".*","function":"7"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"03","bus":"05","domain":".*","function":"0"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"03","bus":"05","domain":".*","function":"1"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"03","bus":"05","domain":".*","function":"2"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"03","bus":"05","domain":".*","function":"3"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"03","bus":"05","domain":".*","function":"4"}}
nova.conf:passthrough_whitelist={"vendor_id":"0x8086","product_id":"0x154c","address":{"slot":"03","bus":"05","domain":".*","function":"5"}}

According to {"domain": ".*", "bus": "05", "slot": "02", "function": "[5-9]"}, only functions 5 to 9 should be included, but functions 0 to 7 have been included.
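As a simplified illustration of the expected behavior (hypothetical helper, not the actual derive_pci_passthrough_whitelist.py code), treating each field of the user-supplied address as a regular expression should reject functions outside [5-9]:

```python
import re

def address_matches(spec, address):
    """Return True only if every PCI address field matches the
    corresponding regex in the user-supplied address spec."""
    return all(re.fullmatch(spec[field], address[field]) is not None
               for field in ("domain", "bus", "slot", "function"))

# User-supplied spec from the NovaPCIPassthrough template above.
spec = {"domain": ".*", "bus": "05", "slot": "02", "function": "[5-9]"}

# Function 3 must be rejected; function 5 must be accepted.
vf_3 = {"domain": "0000", "bus": "05", "slot": "02", "function": "3"}
vf_5 = {"domain": "0000", "bus": "05", "slot": "02", "function": "5"}

print(address_matches(spec, vf_3))  # False
print(address_matches(spec, vf_5))  # True
```

The whitelist generated above includes functions 0 through 7 for slot "02", so the derivation is not applying this kind of per-field match against the user configuration.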


Version-Release number of selected component (if applicable):
puddle 2020-04-01.3

How reproducible:
Run this job; most of the test cases will fail:
http://rhos-runner1.qa.lab.tlv.redhat.com:8282/job/DFG-nfv-13-director-3cont-2comp-ipv4-vxlan-dpdk-sriov-ctlplane-dataplane-bonding-nic-partitioning-hybrid-panther04/


Actual results:


Expected results:


Additional info:
I will generate and upload a sos report.

I am getting this error when trying to spawn a VM:
Traceback (most recent call last):
  File "/home/stack/tempest/venv/lib/python2.7/site-packages/nfv_tempest_plugin/tests/scenario/test_nfv_basic.py", line 148, in test_numa0_provider_network
    servers, key_pair = self.create_and_verify_resources(test=test)
  File "/home/stack/tempest/venv/lib/python2.7/site-packages/nfv_tempest_plugin/tests/scenario/base_test.py", line 105, in create_and_verify_resources
    **kwargs)
  File "/home/stack/tempest/venv/lib/python2.7/site-packages/nfv_tempest_plugin/tests/scenario/baremetal_manager.py", line 1633, in create_server_with_resources
    **kwargs)
  File "/home/stack/tempest/venv/lib/python2.7/site-packages/nfv_tempest_plugin/tests/scenario/baremetal_manager.py", line 1503, in create_server_with_fip
    raise_on_error=raise_on_error)
  File "tempest/common/waiters.py", line 76, in wait_for_server_status
    server_id=server_id)
tempest.exceptions.BuildErrorException: Server 69a71dee-516a-4808-a6bd-7c7396c54360 failed to build and is in ERROR status
Details: {u'message': u'No valid host was found. There are not enough hosts available.', u'code': 500, u'details': u'Traceback (most recent call last):\n  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1165, in schedule_and_build_instances\n    instance_uuids, return_alternates=True)\n  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 760, in _schedule_instances\n    return_alternates=return_alternates)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line 793, in wrapped\n    return func(*args, **kwargs)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 53, in select_destinations\n    instance_uuids, return_objects, return_alternates)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method\n    return getattr(self.instance, __name)(*args, **kwargs)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations\n    instance_uuids, return_objects, return_alternates)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py", line 158, in select_destinations\n    return cctxt.call(ctxt, \'select_destinations\', **msg_args)\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 174, in call\n    retry=self.retry)\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 131, in _send\n    timeout=timeout, retry=retry)\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 625, in send\n    retry=retry)\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 616, in _send\n    raise result\nNoValidHost_Remote: No valid host was found. 
There are not enough hosts available.\nTraceback (most recent call last):\n\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 229, in inner\n    return func(*args, **kwargs)\n\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 154, in select_destinations\n    allocation_request_version, return_alternates)\n\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 91, in select_destinations\n    allocation_request_version, return_alternates)\n\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 244, in _schedule\n    claimed_instance_uuids)\n\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 281, in _ensure_sufficient_hosts\n    raise exception.NoValidHost(reason=reason)\n\nNoValidHost: No valid host was found. There are not enough hosts available.\n\n', u'created': u'2020-04-15T15:43:34Z'}


Checking scheduler
2020-04-15 15:43:34.196 8 INFO nova.filters [req-7ca286ea-804e-45af-a452-e0e8e7c15953 0d2f8f99934b42e1b9bada48e79abe07 b13772157fd948f48bbb017f1f0176de - default default] Filtering removed all hosts for the request with instance ID '69a71dee-516a-4808-a6bd-7c7396c54360'. Filter results: ['RetryFilter: (start: 2, end: 2)', 'AvailabilityZoneFilter: (start: 2, end: 2)', 'RamFilter: (start: 2, end: 2)', 'ComputeFilter: (start: 2, end: 2)', 'ComputeCapabilitiesFilter: (start: 2, end: 2)', 'ImagePropertiesFilter: (start: 2, end: 2)', 'ServerGroupAntiAffinityFilter: (start: 2, end: 2)', 'ServerGroupAffinityFilter: (start: 2, end: 2)', 'PciPassthroughFilter: (start: 2, end: 0)']

Checking computes
6: p7p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:31:25:e0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust on
    vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust on
    vf 3 MAC 00:00:00:00:00:00, spoof checking off, link-state auto, trust on
    vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 5 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 6 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 7 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 8 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 9 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 10 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 11 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 12 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 13 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
7: p7p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:31:25:e2 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust on
    vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust on
    vf 3 MAC 00:00:00:00:00:00, spoof checking off, link-state auto, trust on
    vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 5 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 6 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 7 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 8 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 9 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 10 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 11 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 12 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
    vf 13 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off

2020-04-15 15:52:08.006 8 WARNING nova.pci.utils [req-fc3ab5c1-c4d2-442f-a323-bf52e9cb5d83 - - - - -] No net device was found for VF 0000:05:06.3: PciDeviceNotFoundById: PCI device 0000:05:06.3 not found
2020-04-15 15:52:08.284 8 WARNING nova.pci.utils [req-fc3ab5c1-c4d2-442f-a323-bf52e9cb5d83 - - - - -] No net device was found for VF 0000:05:02.3: PciDeviceNotFoundById: PCI device 0000:05:02.3 not found

It seems it is picking up some VFs that are used by OVS, which generates that error.

Comment 1 Karthik Sundaravel 2020-04-16 06:55:14 UTC
The THT configuration is
  NovaPCIPassthrough:
    - devname: "p7p3"
      trusted: "true"
      physical_network: "sriov-1"
    - devname: "p7p4"
      trusted: "true"
      physical_network: "sriov-2"
    - address: {"domain": ".*", "bus": "05", "slot": "02", "function": "[5-9]"}
      trusted: "true"
      physical_network: "sriov-partitioned-1"
    - address: {"domain": ".*", "bus": "05", "slot": "06", "function": "[5-9]"}
      trusted: "true"
      physical_network: "sriov-partitioned-2"

Comment 2 Karthik Sundaravel 2020-04-16 07:00:04 UTC
The hieradata "nova::compute::pci::passthrough" is incorrectly generated by derive_pci_passthrough_whitelist.py in openstack-tripleo-heat-templates.
If user-supplied configuration is available, it should take precedence over parameters derived by the automation.
In this case the automation also removed the PCI passthrough configurations for p7p3 and p7p4.
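A minimal sketch of the precedence rule described above (hypothetical merge helper, not the actual derive_pci_passthrough_whitelist.py logic): user-supplied NovaPCIPassthrough entries win, and derived entries are kept only for devices the user did not configure.

```python
def merge_passthrough(user_entries, derived_entries):
    """Give user-supplied NovaPCIPassthrough entries precedence over
    automation-derived ones; a derived entry survives only if the user
    configured nothing for that device."""
    def key(entry):
        # Identify an entry by its devname, or by its address spec
        # when no devname is given.
        return entry.get("devname") or tuple(sorted(entry.get("address", {}).items()))

    user_keys = {key(e) for e in user_entries}
    return list(user_entries) + [e for e in derived_entries
                                 if key(e) not in user_keys]

user = [{"devname": "p7p3", "trusted": "true", "physical_network": "sriov-1"}]
derived = [
    {"devname": "p7p3", "physical_network": "derived-net"},  # conflicts: dropped
    {"devname": "p7p1", "physical_network": "derived-net"},  # unconfigured: kept
]
merged = merge_passthrough(user, derived)
```

Under this rule the p7p3 and p7p4 entries from the templates would never be removed by the automation.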

Comment 3 Christophe Fontaine 2020-04-16 07:11:32 UTC
In order to supply your own parameters, you should set the following parameter in OSP16:

  <RoleName>Parameters:
    DerivePciWhitelistEnabled: false


Could you check whether you can also set it on OSP13?

Comment 4 Miguel Angel Nieto 2020-04-16 10:46:49 UTC
Using "DerivePciWhitelistEnabled: false", the configuration was OK and all test cases passed. So I think we have to add that parameter to our templates.

Comment 5 Karthik Sundaravel 2020-04-16 11:03:29 UTC
Chris/Jagan,

Since the automatic derivation is supposed to remove the VFs used by NIC partitioning, shouldn't the default "DerivePciWhitelistEnabled: true" have still worked?

Comment 6 Christophe Fontaine 2020-04-16 11:26:49 UTC
We may need to rework the automation to verify that it works (it doesn't seem to), and to disable it automatically when NovaPCIPassthrough is provided.

Comment 7 Artom Lifshitz 2020-04-23 14:02:09 UTC
Looks as though DFG:NFV is handling this. To prevent this BZ from popping up on the DFG:Compute bug triage calls, I've removed the associated Jira ticket. If there's anything for DFG:Compute to do here, please add 'DFG:Compute' in the internal whiteboard and it'll pop up on our radar again.

