Description of problem:
-------------------
The Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ has a PCI address with domain 10000, which is greater than the configured maximum in nova. You get this error:

PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF)

Version-Release number of selected component (if applicable):
-------------------
RHOSP13

How reproducible:
-------------------
Always reproducible

Steps to Reproduce:
-------------------
1. Use Intel Corporation Ethernet Controller XL710 interfaces and configure them for SR-IOV.
2. Deployment will fail because of bug https://bugzilla.redhat.com/show_bug.cgi?id=1729439, but if you work around it and continue, you will find that SR-IOV instances cannot be deployed.

Actual results:
-------------------
Nova cannot use these interfaces for SR-IOV.

Expected results:
-------------------
SR-IOV instances working with those interfaces.

Additional info:
-------------------
In nova-scheduler you can see that PciPassthroughFilter returns 0 possible hosts. Then, checking the nova-compute logs on the compute host:

[root@computehci-0 ~]# less /var/log/containers/nova/nova-compute.log
...
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager [req-d4e5eb11-d0f0-4ce1-ad63-0f020027fc58 - - - - -] Error updating resources for node computehci-0.rhosp.local.: PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager Traceback (most recent call last):
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7426, in update_available_resource_for_node
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 689, in update_available_resource
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._update_available_resource(context, resources)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     return f(*args, **kwargs)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 713, in _update_available_resource
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._init_compute_node(context, resources)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 571, in _init_compute_node
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._setup_pci_tracker(context, cn, resources)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 600, in _setup_pci_tracker
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     dev_json)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/manager.py", line 120, in update_devices_from_hypervisor_resources
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     if self.dev_filter.device_assignable(dev):
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/whitelist.py", line 91, in device_assignable
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     if spec.match(dev):
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 274, in match
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     address_obj = WhitelistPciAddress(address_str, pf)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 195, in __init__
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._init_address_fields(pci_addr)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 216, in _init_address_fields
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self.pci_address_spec = PhysicalPciAddress(pci_addr)
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 87, in __init__
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     self._set_pci_dev_info('domain', MAX_DOMAIN, '%04x')
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/pci/devspec.py", line 66, in _set_pci_dev_info
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager     {'property': prop, 'attr': a, 'max': maxval})
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).
2019-07-12 12:42:25.009 1 ERROR nova.compute.manager

In nova.conf this is the configuration:

[root@computehci-0 ~]# vi /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
...
passthrough_whitelist={"devname":"enP65536p3s0f0","physical_network":"sriov"}
...
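The traceback ends in nova's PCI address field validation. The following is a minimal, self-contained sketch (not nova's actual code) of the range check that produces this error, assuming the same 16-bit MAX_DOMAIN constant shown in devspec.py:

```python
MAX_DOMAIN = 0xFFFF  # 16-bit limit, as in nova/pci/devspec.py (stable/queens)

def check_domain(pci_address):
    """Parse the domain of a 'dddd:bb:ss.f' PCI address and range-check it.

    Illustrative helper only; nova performs this per-field inside
    PhysicalPciAddress._set_pci_dev_info().
    """
    domain = int(pci_address.split(":")[0], 16)
    if domain > MAX_DOMAIN:
        raise ValueError(
            "Invalid PCI devices Whitelist config: property domain (%X) "
            "is greater than the maximum allowable value (%X)"
            % (domain, MAX_DOMAIN))
    return domain

print(check_domain("0000:03:00.0"))   # accepted: domain fits in 16 bits
try:
    check_domain("10000:03:00.0")     # the XL710 address from this report
except ValueError as exc:
    print(exc)                        # same message as in the log above
```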
I tried to replace devname with an address, so I got the PCI address with this command:

[root@computehci-0 ~]# sudo lshw -c network -businfo | grep enP65536p3s0f0
pci@0000:03:00.0  enP65536p3s0f0  network  Ethernet interface

So I configured nova.conf this way:

passthrough_whitelist={"address":"0000:03:00.0","physical_network":"sriov"}

It didn't work, so I double-checked the PCI address with another command:

[root@computehci-2 ~]# lspci | grep 710
10000:03:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
10000:03:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)

There you can see the domain 10000. I configured that address in nova.conf:

passthrough_whitelist={"address":"10000:03:00.0","physical_network":"sriov"}

Then I got the same error, so devname maps the PCI address correctly; the problem seems to be in devspec.py [1]. There you can see the function that raises the error message, along with the MAX_DOMAIN variable set to:

MAX_DOMAIN = 0xFFFF

To work around it, I changed the variable (to MAX_DOMAIN = 0xFFFFF) on my compute nodes, in the file /usr/lib/python2.7/site-packages/nova/pci/devspec.py inside the nova_compute container.
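devname resolved to the 10000: domain even though lshw printed 0000:, because the kernel exposes the authoritative interface-to-PCI mapping as a sysfs symlink. A hypothetical sketch of that resolution (the sysfs parameter exists only to make the example testable; on a real host the default path is used):

```python
import os

def pci_address_for_devname(devname, sysfs="/sys/class/net"):
    """Resolve an interface name to its full PCI address, domain included.

    /sys/class/net/<dev>/device is a symlink whose last path component is
    the canonical 'dddd:bb:ss.f' address -- this is roughly how a devname
    whitelist entry ends up pointing at domain 10000.
    """
    link = os.readlink(os.path.join(sysfs, devname, "device"))
    return os.path.basename(link)
```

On the host in this report, `pci_address_for_devname("enP65536p3s0f0")` would return "10000:03:00.0", which then fails the 16-bit domain check.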
[root@computehci-2 ~]# docker commit -m="Fix max domain in devspec.py" nova_compute new_nova_compute
sha256:270fc8b566113975d355eece8c053889ee1e8d0b38683436fd2eff7a8665aba1
[root@computehci-2 ~]# docker tag new_nova_compute 172.30.0.2:8787/rhosp13/openstack-nova-compute
[root@computehci-2 ~]# docker push 172.30.0.2:8787/rhosp13/openstack-nova-compute
The push refers to a repository [172.30.0.2:8787/rhosp13/openstack-nova-compute]
cefa3bc66d6f: Layer already exists
ee8c602c858a: Layer already exists
cf648748c4fe: Layer already exists
c76ca73178da: Layer already exists
fb15b60ae932: Layer already exists
050c734bd286: Layer already exists
13.0-87.1560797438: digest: sha256:8e8392b25325d9b98d4b06899b25165bd6e636c49994c4e976475f27468c6806 size: 1587
3a4748b9f150: Pushed
cefa3bc66d6f: Layer already exists
ee8c602c858a: Layer already exists
cf648748c4fe: Layer already exists
c76ca73178da: Layer already exists
fb15b60ae932: Layer already exists
050c734bd286: Layer already exists
latest: digest: sha256:a9c56e2332c140ce1ef19c276342f7168e34ba628863e9073278a1510a57a289 size: 1797
[root@computehci-2 ~]# docker restart nova_compute

After restarting the services the error disappeared, but I'm still not able to make SR-IOV work (PciPassthroughFilter still returns 0 valid hosts).

[1] https://github.com/openstack/nova/blob/stable/queens/nova/pci/devspec.py
Since the previous test didn't work, I changed the passthrough_whitelist to use the vendor_id. Then it started working, but with this method I cannot specify a single interface; it matches all devices with the same vendor_id:

[root@computehci-2 ~]# grep -r '.*' /sys/class/net/*/device/vendor | grep enP65536p3s0f0
/sys/class/net/enP65536p3s0f0/device/vendor:0x8086

[root@computehci-0 ~]# vi /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
..
#passthrough_whitelist={"devname":"enP65536p3s0f0","physical_network":"sriov"}
passthrough_whitelist={"vendor_id":"8086","physical_network":"sriov"}
..
[root@computehci-0 ~]# docker restart nova_compute
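The vendor_id-only entry works because whitelist matching only compares the device properties the entry names, so every Intel device matches. A hypothetical sketch of that behavior (the device inventory is illustrative, and 8086:1583 is assumed here as the XL710's vendor:product pair):

```python
def whitelist_matches(entry, dev):
    """True when every device property named by the entry agrees with the device.

    'physical_network' is a scheduling tag, not a device property, so it is
    skipped -- illustrative logic, not nova's actual implementation.
    """
    tags = ("physical_network",)
    return all(dev.get(k) == v for k, v in entry.items() if k not in tags)

devices = [
    {"address": "10000:03:00.0", "vendor_id": "8086", "product_id": "1583"},
    {"address": "0000:05:00.0",  "vendor_id": "8086", "product_id": "1521"},
]
broad  = {"vendor_id": "8086", "physical_network": "sriov"}
narrow = {"vendor_id": "8086", "product_id": "1583", "physical_network": "sriov"}

# vendor_id alone matches every Intel device -- too coarse
print([d["address"] for d in devices if whitelist_matches(broad, d)])
# adding product_id narrows the match to the XL710 ports only
print([d["address"] for d in devices if whitelist_matches(narrow, d)])
```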
Actually, only the scheduling works; the VM cannot start, probably because a wrong interface is being chosen (not the one I want), since I get this error:

"Interface type hostdev is currently supported on SR-IOV Virtual Functions only"
Probably related to https://bugzilla.redhat.com/show_bug.cgi?id=1561658
devname should be avoided in general as it is unreliable, and we are currently in the process of deprecating it upstream: https://review.opendev.org/#/c/670585/2

the libvirt issue seems valid, and i suspect the nova limitation that was introduced as part of https://github.com/openstack/nova/commit/eca4286e955861e8e1547a8aabf2c4b5c4aad075 was chosen to be 2 bytes instead of 4 due to libvirt.

i think it is reasonable for nova to allow 32-bit domains, however if the libvirt limitation still exists it will just fail later when the vm tries to boot.

passthrough_whitelist={"vendor_id":"8086","physical_network":"sriov"} is dangerous, as it would allow any intel pci device on the platform to be used, not just nics. you should have set the vendor_id and product_id together. can you try using the vendor_id and product_id again?

you can find those with lspci -vvn; it will print as vendor_id:product_id in the output.

can you confirm two things for me? first, if you manually try to boot with the desired nic using libvirt, can libvirt process the request and create a vm? second, can you check if the nic is attached to the second or 3rd/4th socket on the host?

if you have a multi-socket host, the domain will be non-0 on all sockets other than the first. if you have more than 2 sockets on this host, it could result in it being outside the 16-bit range. as another workaround you could try moving the nic to a different slot.
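The vendor_id:product_id pair suggested above can be read from lspci output; with the -nn flag the pair is printed in brackets at the end of each line. A small sketch of extracting it (the sample line is illustrative, built from the device in this report):

```python
import re

def vendor_product(lspci_nn_line):
    """Extract the last bracketed [vendor:product] hex pair from an 'lspci -nn' line."""
    pairs = re.findall(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", lspci_nn_line)
    return pairs[-1]

line = ("10000:03:00.0 Ethernet controller [0200]: Intel Corporation "
        "Ethernet Controller XL710 for 40GbE QSFP+ [8086:1583] (rev 02)")
print(vendor_product(line))  # the vendor_id and product_id for the whitelist
```

Taking the last match skips the bracketed class code ([0200]) that lspci -nn also prints.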
(In reply to smooney from comment #4)
> can you confirm two things for me
> [...]
> as another workaround you could try moving the nic to a different slot.

Libvirt is not able to create the VM.

Regarding the slots, we asked Intel about that, because lspci shows that not all devices have a domain starting with 1, so it's probably what you said: changing the NIC to another slot "solves" the issue here. But I was wondering whether that libvirt limitation could eventually be solved, or whether that's not on the roadmap.
This is currently blocked because libvirt/KVM doesn't support 32-bit PCI domains, but if that is fixed we shouldn't block this config in Nova. We probably also shouldn't block it for other hypervisors.
Just as a comment while this bug is being solved: if you can do it in your environment, you can turn off the Intel Volume Management Device service in the BIOS, which removes the 10000 domain ID. In that case all PCI addresses start with 0000, and you don't hit the issue [1] with the interface name containing capital letters either.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1729439
As noted in comment 6, this is a limitation of libvirt. With the introduction of [1], nova will now ignore PCI devices with 32-bit domains. It was never possible to specify a PCI address with a 32-bit domain, and this limitation persists.

[1] https://github.com/openstack/nova/commit/8c9d6fc8f073cde78b79ae259c9915216f5d59b0
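A sketch of the behavioral change described above: instead of raising PciConfigInvalidWhitelist when it encounters a 32-bit domain, nova now skips such devices (illustrative logic only; the linked commit contains the real implementation):

```python
MAX_DOMAIN = 0xFFFF  # libvirt/KVM limit: PCI domains must fit in 16 bits

def assignable(addresses):
    """Keep only devices whose PCI domain fits in 16 bits; skip the rest
    instead of raising, per the behavior introduced by the linked commit."""
    return [a for a in addresses if int(a.split(":")[0], 16) <= MAX_DOMAIN]

print(assignable(["0000:03:00.0", "10000:03:00.0", "10000:03:00.1"]))
# only the 0000: device remains; the XL710 ports in domain 10000 are ignored
```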
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1001