Bug 1729485 - "Invalid PCI devices Whitelist config error" configuring passthrough_whitelist with new 40Gb NICs because the domain in the PCI address is greater than FFFF
Summary: "Invalid PCI devices Whitelist config error" configuring passthrough_whiteli...
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: nova-maint
QA Contact: nova-maint
URL:
Whiteboard:
Depends On: 1561658
Blocks:
 
Reported: 2019-07-12 11:54 UTC by Luis Arizmendi
Modified: 2019-07-24 10:47 UTC (History)
9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments

Description Luis Arizmendi 2019-07-12 11:54:44 UTC
Description of problem:
-------------------
Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ NICs have a PCI address with domain 10000, which is greater than the maximum nova allows. You get this error:

PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF)
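For context, the failing check is a simple bounds test that nova applies to each field of a whitelisted PCI address. A minimal sketch of that check (illustrative code, not nova's actual devspec.py) reproduces the error for this domain:

```python
# Sketch of nova's bounds check on the whitelist 'domain' field
# (illustrative; names mirror devspec.py but this is not nova code).
MAX_DOMAIN = 0xFFFF  # 16-bit limit; the XL710 here sits in domain 0x10000


def check_domain(domain_str):
    """Parse a hex domain and reject values above the 16-bit maximum."""
    value = int(domain_str, 16)
    if value > MAX_DOMAIN:
        raise ValueError(
            "property domain (%s) is greater than the maximum "
            "allowable value (%04X)" % (domain_str, MAX_DOMAIN))
    return value


check_domain("0000")       # accepted
try:
    check_domain("10000")  # 0x10000 > 0xFFFF: rejected, as in the traceback
except ValueError as exc:
    print(exc)
```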



Version-Release number of selected component (if applicable): RHOSP13


How reproducible:
-------------------
Always reproducible


Steps to Reproduce:
-------------------
1. Use Intel Corporation Ethernet Controller XL710 interfaces and configure them for SR-IOV.
2. Deployment will fail because of bug https://bugzilla.redhat.com/show_bug.cgi?id=1729439, but if you work around it and continue, you will find that SR-IOV instances cannot be deployed.


Actual results:
-------------------
Nova cannot use these interfaces for SR-IOV.


Expected results:
-------------------
SR-IOV instances work with these interfaces.



Additional info:
-------------------

In the nova-scheduler logs you can see that PciPassthroughFilter returns 0 possible hosts. On the compute host, the nova-compute logs show:

[root@computehci-0 ~]# less /var/log/containers/nova/nova-compute.log
...
...
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager [req-d4e5eb11-d0f0-4ce1-ad63-0f020027fc58 - - - - -] Error updating resources for node computehci-0.rhosp.local.: PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager Traceback (most recent call last):
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7426, in update_available_resource_for_node
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 689, in update_available_resource
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._update_available_resource(context, resources)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     return f(*args, **kwargs)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 713, in _update_available_resource
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._init_compute_node(context, resources)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 571, in _init_compute_node
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._setup_pci_tracker(context, cn, resources)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 600, in _setup_pci_tracker
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     dev_json)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/manager.py", line 120, in update_devices_from_hypervisor_resources
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     if self.dev_filter.device_assignable(dev):
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/whitelist.py", line 91, in device_assignable
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     if spec.match(dev):
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 274, in match
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     address_obj = WhitelistPciAddress(address_str, pf)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 195, in __init__
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._init_address_fields(pci_addr)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 216, in _init_address_fields
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self.pci_address_spec = PhysicalPciAddress(pci_addr)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 87, in __init__
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._set_pci_dev_info('domain', MAX_DOMAIN, '%04x')
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 66, in _set_pci_dev_info
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     {'property': prop, 'attr': a, 'max': maxval})
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager 



In nova.conf this is the configuration:

[root@computehci-0 ~]# vi /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf 

...
passthrough_whitelist={"devname":"enP65536p3s0f0","physical_network":"sriov"}
...


I tried replacing devname with an address, so I obtained the PCI address with this command:


[root@computehci-0 ~]# sudo lshw -c network -businfo | grep enP65536p3s0f0
pci@0000:03:00.0  enP65536p3s0f0  network        Ethernet interface

So I configured nova.conf in this way:

passthrough_whitelist={"address":"0000:03:00.0","physical_network":"sriov"}


That didn't work, so I double-checked the PCI address with another command:


[root@computehci-2 ~]# lspci | grep 710
10000:03:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
10000:03:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)


There you can see the domain 10000. I configured that address in nova.conf:

passthrough_whitelist={"address":"10000:03:00.0","physical_network":"sriov"}
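Note that 65536 decimal is 0x10000, which matches the "P65536" part of the interface name: the kernel's predictable naming embeds the non-zero domain, so the devname does resolve to this same address. A minimal sketch of that resolution via the sysfs "device" symlink (the helper name and the sysfs parameter are illustrative):

```python
import os


def pci_address(ifname, sysfs="/sys/class/net"):
    """Resolve a netdev name to its full PCI address ("domain:bus:slot.func")
    by following the sysfs 'device' symlink. Illustrative helper, not nova code."""
    return os.path.basename(
        os.path.realpath(os.path.join(sysfs, ifname, "device")))


# On the host in this report:
#   pci_address("enP65536p3s0f0")  ->  "10000:03:00.0"
```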


Then I got the same error again, so devname maps the PCI address correctly; the problem seems to be in devspec.py [1]. There you can see the function that raises the error message, along with the MAX_DOMAIN variable set to:

MAX_DOMAIN = 0xFFFF


To work around it, I changed the variable (to MAX_DOMAIN = 0xFFFFF) on my compute nodes, in the file /usr/lib/python2.7/site-packages/nova/pci/devspec.py inside the nova_compute container.

[root@computehci-2 ~]# docker commit -m="Fix max domain in devspec.py" nova_compute new_nova_compute                                                                                                                                  
sha256:270fc8b566113975d355eece8c053889ee1e8d0b38683436fd2eff7a8665aba1

[root@computehci-2 ~]# docker tag new_nova_compute 172.30.0.2:8787/rhosp13/openstack-nova-compute

[root@computehci-2 ~]# docker push 172.30.0.2:8787/rhosp13/openstack-nova-compute
The push refers to a repository [172.30.0.2:8787/rhosp13/openstack-nova-compute]
cefa3bc66d6f: Layer already exists 
ee8c602c858a: Layer already exists 
cf648748c4fe: Layer already exists 
c76ca73178da: Layer already exists 
fb15b60ae932: Layer already exists 
050c734bd286: Layer already exists 
13.0-87.1560797438: digest: sha256:8e8392b25325d9b98d4b06899b25165bd6e636c49994c4e976475f27468c6806 size: 1587
3a4748b9f150: Pushed 
cefa3bc66d6f: Layer already exists 
ee8c602c858a: Layer already exists 
cf648748c4fe: Layer already exists 
c76ca73178da: Layer already exists 
fb15b60ae932: Layer already exists 
050c734bd286: Layer already exists 
latest: digest: sha256:a9c56e2332c140ce1ef19c276342f7168e34ba628863e9073278a1510a57a289 size: 1797

[root@computehci-2 ~]# docker restart nova_compute



After restarting the services the error disappeared, but I am still not able to make SR-IOV work (PciPassthroughFilter still returns 0 valid hosts).



[1] https://github.com/openstack/nova/blob/stable/queens/nova/pci/devspec.py

Comment 1 Luis Arizmendi 2019-07-12 12:27:45 UTC
Since the previous test didn't work, I had to change the passthrough_whitelist to use the vendor_id. Then it started working, but with this method I cannot specify a single interface; it matches every device with the same vendor_id:



[root@computehci-2 ~]# grep -r '.*'  /sys/class/net/*/device/vendor | grep enP65536p3s0f0
/sys/class/net/enP65536p3s0f0/device/vendor:0x8086


[root@computehci-0 ~]# vi /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf                                                                                                                                              
..
..
#passthrough_whitelist={"devname":"enP65536p3s0f0","physical_network":"sriov"}
passthrough_whitelist={"vendor_id":"8086","physical_network":"sriov"}
..


[root@computehci-0 ~]# docker restart nova_compute
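The product_id needed to narrow such a whitelist entry beyond the vendor can be read from sysfs next to the vendor file shown above. A sketch, assuming the standard sysfs layout (the helper and its parameters are illustrative, not part of nova):

```python
import os


def pci_ids(ifname, sysfs="/sys/class/net"):
    """Read vendor and product IDs for a netdev from sysfs, formatted the
    way nova's whitelist expects them (4 hex digits, no '0x' prefix).
    Illustrative helper, not nova code."""
    base = os.path.join(sysfs, ifname, "device")
    ids = {}
    # sysfs exposes the vendor ID as 'vendor' and the product ID as 'device'
    for key, fname in (("vendor_id", "vendor"), ("product_id", "device")):
        with open(os.path.join(base, fname)) as f:
            val = f.read().strip()
        ids[key] = val[2:] if val.startswith("0x") else val
    return ids


# A whitelist entry scoped to one device model rather than the whole vendor
# would then look like (product_id value depends on the host):
#   {"vendor_id": "8086", "product_id": "<from sysfs>",
#    "physical_network": "sriov"}
```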

Comment 2 Luis Arizmendi 2019-07-12 12:45:58 UTC
Actually, only the scheduling works; the VM cannot start, probably because nova is choosing the wrong interface (not the one I want), since I get this error: "Interface type hostdev is currently supported on SR-IOV Virtual Functions only"

Comment 3 Luis Arizmendi 2019-07-12 14:32:39 UTC
Probably related to https://bugzilla.redhat.com/show_bug.cgi?id=1561658

Comment 4 smooney 2019-07-19 11:21:34 UTC
devname should be avoided in general, as it is unreliable; we are currently in the process of deprecating it upstream: https://review.opendev.org/#/c/670585/2
The libvirt issue seems valid, and I suspect the nova limitation that was introduced as part of https://github.com/openstack/nova/commit/eca4286e955861e8e1547a8aabf2c4b5c4aad075 was chosen to be 2 bytes instead of 4 due to libvirt.

I think it is reasonable for nova to allow 32-bit domains; however, if the libvirt limitation still exists, it will just fail later when the VM tries to boot.

passthrough_whitelist={"vendor_id":"8086","physical_network":"sriov"} is dangerous, as it would allow any Intel PCI device on the platform to be used, not just NICs. You should have set both the vendor_id and the product_id. Can you try using the vendor_id and product_id again?

You can find those with lspci -vvn; it will print them as vendor_id:product_id in the output.


Can you confirm two things for me?
First, if you manually try to boot a VM with the desired NIC using libvirt, can libvirt process the request and create the VM?
Second, can you check whether the NIC is attached to the second (or 3rd/4th) socket on the host?

If you have a multi-socket host, the domain will be non-zero on all sockets other than the first. If the host has more than 2 sockets, the domain could end up outside the 16-bit range. As another workaround, you could try moving the NIC to a different slot.

Comment 5 Luis Arizmendi 2019-07-19 11:30:30 UTC
(In reply to smooney from comment #4)

Libvirt is not able to create the VM. Regarding the slots, we asked Intel about this, because lspci shows that not all devices have a domain starting with 1, so it is probably as you said: changing the NIC to another slot "solves" the issue here. But I was wondering whether that libvirt limitation could eventually be solved, or whether that is not on the roadmap.

Comment 6 Matthew Booth 2019-07-19 14:49:00 UTC
This is currently blocked because libvirt/kvm doesn't support 32-bit PCI domains, but if that is fixed we shouldn't block this config in Nova. We probably also shouldn't block it for other hypervisors.

Comment 7 Luis Arizmendi 2019-07-24 10:47:04 UTC
Just as a comment: while this bug is being solved, if you can do it in your environment, you can turn off the Intel Volume Management Device in the BIOS, which removes the 10000 domain ID. In that case, all PCI addresses start with 0000, and you don't hit the issue [1] with the interface name including capital letters either.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1729439

