Description of problem:
The pciAddress of SR-IOV NICs defined in the VM is not respected in the generated libvirt XML, and therefore inside the Guest the NIC properties are randomly mixed.

For example, see these 4 SR-IOV NICs defined:

  - macAddress: 02:9f:a6:00:01:40
    model: virtio
    name: nic-1
    pciAddress: "0000:20:00.0"
    sriov: {}
  - macAddress: 02:9f:a6:00:01:41
    model: virtio
    name: nic-2
    pciAddress: "0000:21:00.0"
    sriov: {}
  - macAddress: 02:9f:a6:00:01:42
    model: virtio
    name: nic-3
    pciAddress: "0000:22:00.0"
    sriov: {}
  - macAddress: 02:9f:a6:00:01:43
    model: virtio
    name: nic-4
    pciAddress: "0000:23:00.0"
    sriov: {}

But inside the Guest 20:00.0 and 22:00.0 are swapped:

  bus info: pci@0000:20:00.0
  serial: 02:9f:a6:00:01:42
  bus info: pci@0000:21:00.0
  serial: 02:9f:a6:00:01:41
  bus info: pci@0000:22:00.0
  serial: 02:9f:a6:00:01:40
  bus info: pci@0000:23:00.0
  serial: 02:9f:a6:00:01:43

Version-Release number of selected component (if applicable):
4.8.4

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with several SR-IOV NICs, each with pciAddress and macAddress defined
2. Look inside the Guest; the MAC and PCI addresses are often mixed up.
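To make the mismatch above easy to spot, here is a minimal sketch that compares the MAC-to-PCI pairs requested in the VM spec with the pairs seen inside the Guest. The values are copied from this report; the function name `find_mismatches` is only illustrative and is not part of any KubeVirt tooling.

```python
# MAC -> guest PCI address as requested in the VM spec (values from this report).
requested = {
    "02:9f:a6:00:01:40": "0000:20:00.0",
    "02:9f:a6:00:01:41": "0000:21:00.0",
    "02:9f:a6:00:01:42": "0000:22:00.0",
    "02:9f:a6:00:01:43": "0000:23:00.0",
}

# MAC -> guest PCI address as observed inside the Guest (values from this report).
observed = {
    "02:9f:a6:00:01:42": "0000:20:00.0",
    "02:9f:a6:00:01:41": "0000:21:00.0",
    "02:9f:a6:00:01:40": "0000:22:00.0",
    "02:9f:a6:00:01:43": "0000:23:00.0",
}


def find_mismatches(requested, observed):
    """Return {mac: (requested_pci, observed_pci)} for NICs on the wrong address."""
    return {
        mac: (pci, observed.get(mac))
        for mac, pci in requested.items()
        if observed.get(mac) != pci
    }


mismatches = find_mismatches(requested, observed)
for mac, (want, got) in sorted(mismatches.items()):
    print(f"{mac}: requested {want}, observed {got}")
```

With the data from this report, only the NICs requested at 20:00.0 and 22:00.0 are reported, which matches the swap described above.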
*** Bug 2061216 has been marked as a duplicate of this bug. ***
This bug is blocked on 4.9 and 4.10 by BZ2070050, which makes the VM fail to start. On 4.8 the VM starts, but the problem in comment #0 occurs.
Can the macAddress of the interface inside the Guest be used to correctly identify the matching SR-IOV resource (VF) connected to the specific network, or is that also untrustworthy and liable to float around?
(In reply to Germano Veit Michel from comment #3)
> Can the macAddress of the interface inside the Guest be used to correctly
> identify the matching SR-IOV resource (VF) connected to the specific
> network, or that is also cannot be trusted and may float around?

Short answer is *yes*, the MAC address is the best identifier.

The current issue is that we are not able to correctly differentiate between VFs of the same resource inside the pod (virt-launcher).

The MAC address is set before the VF is passed to the pod. In the pod we cannot "see" that MAC (the device is bound to the VFIO driver), but once the device is detected in the guest (by a network driver), the MAC is revealed and you can trust it. Unless explicitly requested, the MAC address is immutable and no one can change it in the guest.

Hope this helps.
Hi Germano,

We are currently working on a solution that uses the multus `k8s.v1.cni.cncf.io/network-status` and `k8s.v1.cni.cncf.io/networks` annotations to map the SR-IOV interfaces to the correct PCI addresses. The mapping will use the MAC address of each SR-IOV interface and the NAD attached to it as the best-case identifiers.
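For readers unfamiliar with the annotation, a rough sketch of the idea follows. It is not the KubeVirt implementation. It assumes the `network-status` annotation is a JSON list whose SR-IOV entries carry a `mac` field and a `device-info` object with a `pci` / `pci-address` field. The network names and host VF addresses in the sample are hypothetical.

```python
import json

# Hypothetical annotation value; names and host VF addresses are made up.
SAMPLE_NETWORK_STATUS = """
[
  {"name": "ovn-kubernetes", "interface": "eth0", "default": true},
  {"name": "default/sriov-net-1", "interface": "net1",
   "mac": "02:9F:A6:00:01:40",
   "device-info": {"type": "pci", "version": "1.0.0",
                   "pci": {"pci-address": "0000:3b:02.2"}}},
  {"name": "default/sriov-net-2", "interface": "net2",
   "mac": "02:9f:a6:00:01:41",
   "device-info": {"type": "pci", "version": "1.0.0",
                   "pci": {"pci-address": "0000:3b:02.5"}}}
]
"""


def map_mac_to_host_pci(network_status_json):
    """Map each MAC address to the host PCI address of the VF backing it.

    Entries without a MAC or without PCI device info (for example the
    default pod network) are skipped.
    """
    mapping = {}
    for entry in json.loads(network_status_json):
        mac = entry.get("mac")
        pci = entry.get("device-info", {}).get("pci", {}).get("pci-address")
        if mac and pci:
            mapping[mac.lower()] = pci
    return mapping


mac_to_host_pci = map_mac_to_host_pci(SAMPLE_NETWORK_STATUS)
```

Once each MAC is tied to a specific host VF, the guest pciAddress requested for that MAC in the VM spec can be assigned to the matching host device instead of to an arbitrary VF of the same resource.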
Hi Ram. Any updates on this bug? Are we still on-target for 4.11.1?
Hi Adam, some updates:
- Just so we are aligned: as this is a bug, the relevant date is the "Batch 4.11.1 Code Freeze", which is due on Sep. 12th 2022.
- We published a design PR to the community (https://github.com/kubevirt/community/pull/185), and we are already getting some opposition to the chosen solution.

I would say we are YELLOW for "Batch 4.11.1 Code Freeze", pending how the design decision resolves.
https://github.com/kubevirt/kubevirt/pull/8226
backport PR to CNV-4.11 U/S stable branch: https://github.com/kubevirt/kubevirt/pull/8416
The backport PR got merged U/S.
Once M/S 4.11.1 starts preparing to pull in U/S changes, it will take this change as well.
I will monitor and update this BZ manually when it happens.
Hi Germano,

I have tried to reproduce the scenario (in order to verify the bug), but I am probably missing something, because I can't start the VM.
Can you please supply the exact scenario you ran for reproducing this bug, including:
- SriovNetworkNodePolicy manifest
- SriovNetwork manifests
- VM manifest

As I understand it's been quite a while since you opened this bug, I'll specify the resources that I used, so maybe it will make it easier for you to find what is missing/wrong in my scenario.
My resources are attached.
In the VM manifest - note that I took the PCI addresses from the node:

sh-4.4# ls -l /sys/class/net/ens2f0/device/virtfn2/driver/
total 0
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.0 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.0
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.1 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.1
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.2 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.2
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.3 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.3
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.4 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.4
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.5 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.5
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.6 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.6
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:02.7 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:02.7
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.0 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.0
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.1 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.1
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.2 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.2
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.3 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.3
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.4 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.4
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.5 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.5
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.6 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.6
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:03.7 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:03.7
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.0 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.0
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.1 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.1
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.2 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.2
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.3 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.3
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.4 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.4
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.5 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.5
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.6 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.6
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:04.7 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:04.7
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.0 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.0
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.1 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.1
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.2 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.2
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.3 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.3
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.4 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.4
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.5 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.5
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.6 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.6
lrwxrwxrwx. 1 root root 0 Oct 20 13:43 0000:3b:05.7 -> ../../../../devices/pci0000:3a/0000:3a:00.0/0000:3b:05.7
(In reply to Yossi Segev from comment #15)
> Hi Germano,
>
> I have tried to reproduce the scenario (in order to verify the bug), but I
> am probably missing something, because I can't start the VM.
> Can you please supply the exact scenario you ran for reproducing this bug,
> including:
> - SriovNetworkNodePolicy manifest
> - SriovNetwork manifests
> - VM manifest

Hi Yossi,

I don't have all this handy anymore; if really needed, I'd have to rebuild the test setup for this. Or maybe you can share yours with me?

> In the VM manifest - note that I took the PCI addresses from the node:

I think that's your problem. The PCI addresses in the VM YAML are the PCI addresses inside the VM that one wants to customize; they are not to be taken from the host. They mean that inside the VM the VF will have a specific PCI address. That address is not related to the VF address on the host; it only exists inside the VM.

Try to do as shown in comment #0, with 4 VFs in the VM YAML, each with a different PCI slot:

  pciAddress: "0000:20:00.0"
  pciAddress: "0000:21:00.0"
  pciAddress: "0000:22:00.0"
  pciAddress: "0000:23:00.0"

After starting the VM, the VFs from the host (in your case the host address is 0000:3b:xx) will be on those PCI slots, from 0000:20 to 0000:23, inside the VM.

The bug is that there is no proper mapping from the YAML to the libvirt XML, so it's all random and the properties of each VF NIC are scrambled inside the Guest (you can check that the MAC vs PCI mapping inside the Guest won't match the YAML).

Please let me know if this helps.
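To illustrate what a guest pciAddress means on the libvirt side, the sketch below splits an address such as "0000:20:00.0" into domain, bus, slot and function and renders it as a libvirt-style `<address type='pci' .../>` element. It only demonstrates the address format. How KubeVirt actually builds the domain XML is not shown in this bug, and the helper name `pci_to_libvirt_address` is made up for the example.

```python
import re

# domain:bus:slot.function, e.g. 0000:20:00.0 (all fields hexadecimal).
PCI_ADDRESS_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def pci_to_libvirt_address(pci_address):
    """Render a guest PCI address as a libvirt <address type='pci'/> element."""
    match = PCI_ADDRESS_RE.match(pci_address)
    if match is None:
        raise ValueError(f"not a valid PCI address: {pci_address!r}")
    domain, bus, slot, function = (part.lower() for part in match.groups())
    return (
        f"<address type='pci' domain='0x{domain}' bus='0x{bus}' "
        f"slot='0x{slot}' function='0x{function}'/>"
    )


for guest_address in ("0000:20:00.0", "0000:21:00.0", "0000:22:00.0", "0000:23:00.0"):
    print(pci_to_libvirt_address(guest_address))
```

The host VF address (0000:3b:xx in the setup above) never appears in this element. It identifies the source device on the node, while the guest address only decides where the device shows up inside the VM.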
Thank you Germano.
The problem with my setup was indeed what you said:
> > In the VM manifest - note that I took the PCI addresses from the node:
> I think thats your problem. The PCI addresses in the VM yaml are the PCI addresses inside the VM that one wants to customize, not to be taken from the host.

So the bug is now verified.
OCP 4.11.11
CNV 4.11.1 (HCO bundle v4.11.1-49)
sriov-network-operator.4.11.0-202208301406

Verified with the following scenario:
* All the resources used for verification are attached in the bz-2070772-verification.zip file.
1. Applied the attached SriovNetworkNodePolicy (declaring 32 VFs on a PF interface on one of the cluster nodes, using a node selector).
2. Applied the attached 4 SriovNetworks.
3. Applied the attached VM yaml (sriov-vm.yaml), with 4 secondary NICs backed by the 4 SriovNetworks, each configured with a unique MAC address and PCI address.

    "interfaces": [
      {
        "macAddress": "02:36:14:00:00:01",
        "masquerade": {},
        "name": "default"
      },
      {
        "macAddress": "02:00:b5:b5:b5:02",
        "model": "virtio",
        "name": "sriov-test-network",
        "pciAddress": "0000:20:00.0",
        "sriov": {}
      },
      {
        "macAddress": "02:00:b5:b5:b5:03",
        "model": "virtio",
        "name": "sriov-test-network-1",
        "pciAddress": "0000:21:00.0",
        "sriov": {}
      },
      {
        "macAddress": "02:00:b5:b5:b5:04",
        "model": "virtio",
        "name": "sriov-test-network-2",
        "pciAddress": "0000:22:00.0",
        "sriov": {}
      },
      {
        "macAddress": "02:00:b5:b5:b5:05",
        "model": "virtio",
        "name": "sriov-test-network-3",
        "pciAddress": "0000:23:00.0",
        "sriov": {}

4. Started the VM (virtctl start sriov-vm).
5. When the VM reached the running state, logged in to its console.
$ virtctl console sriov-vm
6. Verified that each of the secondary interfaces has the same MAC and PCI addresses as set in the VM manifest.
[fedora@sriov-vm ~]$ ip link show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:b5:b5:b5:02 brd ff:ff:ff:ff:ff:ff
    altname enp32s0
[fedora@sriov-vm ~]$
[fedora@sriov-vm ~]$ ethtool -i eth1
driver: iavf
version: 5.15.17-200.fc35.x86_64
firmware-version: N/A
expansion-rom-version:
bus-info: 0000:20:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[fedora@sriov-vm ~]$ ip link show dev eth2
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:b5:b5:b5:03 brd ff:ff:ff:ff:ff:ff
    altname enp33s0
[fedora@sriov-vm ~]$
[fedora@sriov-vm ~]$ ethtool -i eth2
driver: iavf
version: 5.15.17-200.fc35.x86_64
firmware-version: N/A
expansion-rom-version:
bus-info: 0000:21:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[fedora@sriov-vm ~]$ ip link show dev eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:b5:b5:b5:04 brd ff:ff:ff:ff:ff:ff
    altname enp34s0
[fedora@sriov-vm ~]$
[fedora@sriov-vm ~]$ ethtool -i eth3
driver: iavf
version: 5.15.17-200.fc35.x86_64
firmware-version: N/A
expansion-rom-version:
bus-info: 0000:22:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[fedora@sriov-vm ~]$ ip link show dev eth4
6: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:b5:b5:b5:05 brd ff:ff:ff:ff:ff:ff
    altname enp35s0
[fedora@sriov-vm ~]$
[fedora@sriov-vm ~]$ ethtool -i eth4
driver: iavf
version: 5.15.17-200.fc35.x86_64
firmware-version: N/A
expansion-rom-version:
bus-info: 0000:23:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[fedora@sriov-vm ~]$
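The manual check in step 6 could be scripted along the following lines. This is only a sketch: it parses captured `ip link show` and `ethtool -i` text (abbreviated here to the eth1 output from this comment) and compares the result with the MAC-to-pciAddress pair from the VM manifest. The helper names are illustrative, and collecting the output from the guest is left out.

```python
import re


def parse_mac(ip_link_output):
    """Extract the MAC address from `ip link show dev <iface>` output."""
    match = re.search(r"link/ether\s+([0-9a-fA-F:]{17})", ip_link_output)
    return match.group(1).lower() if match else None


def parse_bus_info(ethtool_output):
    """Extract the PCI address from the bus-info line of `ethtool -i <iface>`."""
    match = re.search(r"^bus-info:\s*(\S+)", ethtool_output, re.MULTILINE)
    return match.group(1) if match else None


# Output for eth1 as captured in this comment (ethtool output abbreviated).
IP_LINK_ETH1 = """\
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:b5:b5:b5:02 brd ff:ff:ff:ff:ff:ff
    altname enp32s0
"""
ETHTOOL_ETH1 = """\
driver: iavf
version: 5.15.17-200.fc35.x86_64
firmware-version: N/A
expansion-rom-version:
bus-info: 0000:20:00.0
"""

# MAC -> pciAddress as set in the VM manifest (first SR-IOV interface).
expected = {"02:00:b5:b5:b5:02": "0000:20:00.0"}

mac = parse_mac(IP_LINK_ETH1)
bus_info = parse_bus_info(ETHTOOL_ETH1)
matches_manifest = expected.get(mac) == bus_info
print(f"{mac} on {bus_info}: {'OK' if matches_manifest else 'MISMATCH'}")
```

Repeating the comparison for eth2 through eth4 with their manifest entries covers the remaining three interfaces.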
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.11.1 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:8750