Bug 2070050 - [4.10.1] Custom guest PCI address and boot order parameters are not respected in a list of multiple SR-IOV NICs
Summary: [4.10.1] Custom guest PCI address and boot order parameters are not respected...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 4.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.10.1
Assignee: Orel Misan
QA Contact: Yossi Segev
URL:
Whiteboard:
Depends On:
Blocks: 2070136 2070772 2072942
 
Reported: 2022-03-30 11:25 UTC by Orel Misan
Modified: 2022-05-18 20:28 UTC (History)
3 users

Fixed In Version: virt-launcher v4.10.1-7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2070136 (view as bug list)
Environment:
Last Closed: 2022-05-18 20:27:35 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
VMI with two SR-IOV NICs and custom guest PCI addresses (1.35 KB, text/plain)
2022-03-30 11:25 UTC, Orel Misan
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github kubevirt kubevirt pull 7422 0 None Merged virt-launcher, hostdevice: Respect SR-IOV guest pciAddress and bootOrder 2022-03-30 11:28:29 UTC
Github kubevirt kubevirt pull 7478 0 None Merged [release-0.49] virt-launcher, hostdevice: Respect SR-IOV guest pciAddress and bootOrder 2022-04-10 13:34:08 UTC
Red Hat Knowledge Base (Solution) 6870481 0 None None None 2022-03-31 21:46:04 UTC
Red Hat Product Errata RHSA-2022:4668 0 None None None 2022-05-18 20:28:57 UTC

Description Orel Misan 2022-03-30 11:25:23 UTC
Created attachment 1869362 [details]
VMI with two SR-IOV NICs and custom guest PCI addresses

Description of problem:
On a VMI object that has a list of multiple SR-IOV NICs, the pciAddress and bootOrder parameters of the last item in the list are applied to all other items in the list.

The VMI object passes the validation in the virt-api webhooks.

The virt-launcher fails because there are allegedly several host devices (the SR-IOV NICs) with the same required guest PCI addresses:
"message: 'server error. command SyncVMI failed: "LibvirtError(Code=27, Domain=20, Message=''XML error: Attempted double use of PCI Address 0000:82:00.0'')"'"

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a VMI object with at least two SR-IOV NICs, at least one of which has a custom guest PCI address.
2. Apply the VMI manifest.
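For reference, a minimal interfaces fragment for step 1 might look like the following (based on the snippet in comment 2; the interface names, MAC addresses, and PCI addresses are illustrative, and the full VMI manifest is in the attachment):

```yaml
# Fragment of spec.domain.devices in the VMI manifest.
# Each SR-IOV NIC requests its own custom guest PCI address;
# before the fix, the last entry's pciAddress was effectively
# applied to every SR-IOV NIC, causing the libvirt
# "Attempted double use of PCI Address" error.
interfaces:
- name: nic-0
  model: virtio
  macAddress: 02:2d:47:00:00:02
  pciAddress: "0000:81:00.0"
  sriov: {}
- name: nic-1
  model: virtio
  macAddress: 02:2d:47:00:00:03
  pciAddress: "0000:82:00.0"
  sriov: {}
```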

Actual results:
The virt-launcher fails because there are allegedly several host devices (the SR-IOV NICs) with the same required guest PCI addresses.

Expected results:
The VM should start with all of the specified SR-IOV NICs, each assigned its required guest PCI address.

Additional info:

Comment 1 Germano Veit Michel 2022-03-31 21:46:05 UTC
Any workarounds we could document here besides not adding pciAddress to the spec?

Comment 2 Germano Veit Michel 2022-04-04 01:59:47 UTC
Note there is no need to set custom PCI addresses on all SR-IOV NICs.

Setting it on just the last one seems to be enough for the bug to replicate it to all of them:

      interfaces:
      - macAddress: 02:2d:47:00:00:02
        model: virtio
        name: nic-0
        sriov: {}
      - macAddress: 02:2d:47:00:00:03
        model: virtio
        name: nic-1
        pciAddress: 0000:82:00.0
        sriov: {}

server error. command SyncVMI failed: "LibvirtError(Code=27, Domain=20, Message='XML error: Attempted double use of PCI Address 0000:82:00.0')"

Comment 3 Orel Misan 2022-04-10 13:33:54 UTC
The fix was merged on upstream (https://github.com/kubevirt/kubevirt/pull/7478).

Comment 4 Petr Horáček 2022-04-11 10:30:10 UTC
Waiting for M/S patch to pass https://code.engineering.redhat.com/gerrit/c/kubevirt/+/401398/

Comment 5 Yossi Segev 2022-05-09 18:17:03 UTC
Verified on CNV v4.10.1 (HCO v4.10.1-19), virt-launcher v4.10.1-8

1. Apply the attached SriovNetworkNodePolicy and SriovNetworks.
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$ oc apply -f sriov-policy.yaml 
sriovnetworknodepolicy.sriovnetwork.openshift.io/test-sriov-policy created
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$ oc apply -f sriov-network1.yaml 
sriovnetwork.sriovnetwork.openshift.io/sriov-test-network1 created
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$ oc apply -f sriov-network2.yaml 
sriovnetwork.sriovnetwork.openshift.io/sriov-test-network2 created

2. Apply the attached VirtualMachine, which has 2 secondary SR-IOV-based NICs, with a hard-coded PCI address set to the second one:
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$ oc create ns sriov-test-sriov
namespace/sriov-test-sriov created
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$ oc apply -f sriov-vm1.yaml 
virtualmachine.kubevirt.io/sriov-vm1 created

3. Verify that the VM starts and runs successfully:
[cnv-qe-jenkins@cnvqe-01 bz-2070050]$ oc get vmi -w
NAME        AGE   PHASE        IP    NODENAME   READY
sriov-vm1   3s    Scheduling                    False
sriov-vm1   7s    Scheduled          cnvqe-10.lab.eng.tlv2.redhat.com   False
sriov-vm1   11s   Scheduled          cnvqe-10.lab.eng.tlv2.redhat.com   False
sriov-vm1   11s   Running      10.128.3.90   cnvqe-10.lab.eng.tlv2.redhat.com   False
sriov-vm1   11s   Running      10.128.3.90   cnvqe-10.lab.eng.tlv2.redhat.com   True

Comment 15 errata-xmlrpc 2022-05-18 20:27:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.1 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4668

