Bug 1777385

Summary: VMI stuck on "Scheduling" due to not found OVS bridge link
Product: Container Native Virtualization (CNV)
Reporter: Yossi Segev <ysegev>
Component: Networking
Assignee: Petr Horáček <phoracek>
Status: CLOSED ERRATA
QA Contact: Meni Yakove <myakove>
Severity: high
Priority: high
Version: 2.2.0
CC: atragler, cnv-qe-bugs, ellorent, myakove, ncredi, phoracek
Target Milestone: ---   
Target Release: 2.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovs-cni-plugin-container-v2.2.0-3
Last Closed: 2020-01-30 16:27:33 UTC
Type: Bug
Attachments:
Description Flags
VM spec using OVS bridge. none

Description Yossi Segev 2019-11-27 14:46:53 UTC
Created attachment 1640120
VM spec using OVS bridge.

Description of problem:
When an OVS bridge is used to add a secondary NIC to a VM, VM creation fails with a "Link not found" error.

Version-Release number of selected component (if applicable):
CNV-2.2

How reproducible:
Always

Steps to Reproduce:
1. Create a dummy interface on each worker node (skip this step if your nodes have multiple NICs):
$ ip link add ovs22 type dummy
$ ip addr add 192.168.12.22/24 dev ovs22 # different IP on each node
$ ip link set ovs22 up
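
The commands above fail if they are run a second time on the same node. A minimal idempotent sketch follows. It is not part of the original reproduction: the function name ensure_dummy_link and the IP_BIN override are hypothetical, and it uses "ip addr replace" instead of "ip addr add" so that a re-run does not error out.

```bash
# Create a dummy link only if it does not exist yet, then (re)assign the
# address and bring the link up.
# IP_BIN is a hypothetical override used for testing; it defaults to "ip".
ensure_dummy_link() {
    local name=$1 cidr=$2
    local ip_bin=${IP_BIN:-ip}

    # "ip link show NAME" exits non-zero when the link does not exist.
    if ! $ip_bin link show "$name" >/dev/null 2>&1; then
        $ip_bin link add "$name" type dummy
    fi
    # "replace" adds the address or leaves an existing one in place.
    $ip_bin addr replace "$cidr" dev "$name"
    $ip_bin link set "$name" up
}
```

Usage on a worker node (as root), with a different address per node: ensure_dummy_link ovs22 192.168.12.22/24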

2. Deploy an NNCP (NodeNetworkConfigurationPolicy) that creates an OVS bridge on top of the dummy interface:
apiVersion: nmstate.io/v1alpha1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br22-ovs-policy
spec:
  desiredState:
    interfaces:
      - name: ovs-br22
        type: ovs-bridge
        state: up
        bridge:
          options:
            stp: true
          port:
            - name: ovs22  # or the name of an actual secondary NIC that exists on all the nodes

3. Verify that the bridge was created successfully on each worker node:
[core@host-172-16-0-22 ~]$ nmcli c show
NAME                UUID                                  TYPE           DEVICE        
Wired connection 1  ecb5cba3-e6b6-3165-ae2e-9c8123fe804f  ethernet       ens3          
ovs22               212a7355-1f76-4051-addf-892e82e054cf  dummy          ovs22          
ovs-br22            1962d021-9b75-4291-b4ed-fdf65dbabde9  ovs-bridge     ovs-br22      
ovs-port-ovs22      f44d31db-0b62-4010-919e-fd9c9313097c  ovs-port       ovs-port-ovs22 
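
To run this check on many nodes without reading the table by eye, it can be scripted. The sketch below is an illustration only. The helper name has_ovs_bridge is hypothetical, and it assumes nmcli's terse output (-t) with the fields NAME,TYPE,DEVICE, which is colon-separated and therefore safe for connection names that contain spaces.

```bash
# Succeeds (exit 0) when stdin contains a connection with the given name
# and type "ovs-bridge". Expects colon-separated NAME:TYPE:DEVICE lines,
# as printed by: nmcli -t -f NAME,TYPE,DEVICE connection show
has_ovs_bridge() {
    awk -F: -v br="$1" '
        $1 == br && $2 == "ovs-bridge" { found = 1 }
        END { exit !found }
    '
}
```

Usage on a worker node: nmcli -t -f NAME,TYPE,DEVICE connection show | has_ovs_bridge ovs-br22 && echo "bridge present"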

4. Deploy a NAD (NetworkAttachmentDefinition) that is based on that bridge:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovs-conf
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/ovs-br22
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "ovs",
      "bridge": "ovs-br22"
    }'
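
In this NAD the bridge name appears twice: as the suffix of the resourceName annotation and as the "bridge" key of the CNI config. A small consistency check can catch a typo in either place. This is a sketch under the assumption that the two are meant to match, as they do in this report. The helper names cni_bridge and resource_matches_bridge are hypothetical, and the sed expression is a naive extraction, not a JSON parser.

```bash
# Print the value of the "bridge" key from a CNI config read on stdin.
# Naive text extraction; assumes the value contains no escaped quotes.
cni_bridge() {
    tr -d '\n' |
        sed -n 's/.*"bridge"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed when the part of the resource name after the last "/" equals
# the bridge name.
resource_matches_bridge() {
    local resource=$1 bridge=$2
    [ "${resource##*/}" = "$bridge" ]
}
```

Usage, assuming kubectl access to the cluster: bridge=$(kubectl get network-attachment-definitions ovs-conf -o jsonpath='{.spec.config}' | cni_bridge); resource_matches_bridge ovs-cni.network.kubevirt.io/ovs-br22 "$bridge"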

5. Deploy and start a VM that uses this NAD (the spec YAML is attached).


Actual results:
The VMI is stuck on "Scheduling" with this message in its virt-launcher pod:
  Warning  FailedCreatePodSandBox  16s (x16 over 5m32s)  kubelet, host-172-16-0-22  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_virt-launcher-vma-r5dlx_default_8fb9327f-83de-4956-8013-fd7e26f6a71a_0(fa32adb8112deae9d1626238a755c97e62d1b9a932af70d79c14cdd9227875df): Multus: error adding pod to network "ovs-conf": delegateAdd: error invoking DelegateAdd - "ovs": error in getting result from AddNetwork: Link not found


Expected results:
VM is created successfully, with a usable secondary NIC.

Comment 1 Quique Llorente 2019-11-27 14:51:04 UTC
The issue is that with knmstate (kubernetes-nmstate) you have to use a different name for the ovs-interface than for the bridge, but ovs-cni tries to find the interface through netlink using the ovs-bridge name.

We tried to simply remove the need for netlink to get the ovs-interface MAC in ovs-cni (https://github.com/kubevirt/ovs-cni/pull/92), but it's not working.

Let's try removing bridge reporting entirely: https://github.com/kubevirt/ovs-cni/pull/93
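
The mismatch described in this comment can be checked by hand on an affected node: look up the bridge name among the kernel links, which is roughly what a netlink lookup by name does. The sketch below is an illustration only and is not ovs-cni code. The helper name kernel_link_exists is hypothetical, and it assumes the one-line format of "ip -o link show".

```bash
# Succeeds (exit 0) when stdin, in "ip -o link show" format, lists a link
# with exactly the given name. A "@peer" suffix (e.g. veth0@if4) is ignored.
kernel_link_exists() {
    awk -F': ' -v want="$1" '
        { split($2, parts, "@"); if (parts[1] == want) found = 1 }
        END { exit !found }
    '
}
```

Usage: ip -o link show | kernel_link_exists ovs-br22 || echo "no kernel link named ovs-br22". If the bridge has no interface of the same name, the lookup fails, which would be consistent with the "Link not found" error reported above.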

Comment 3 Petr Horáček 2019-11-28 12:17:18 UTC
We will release a new version of OVS CNI to fix this.

Comment 4 Yossi Segev 2019-12-04 13:40:46 UTC
Verified on:
CNV 2.2.0
ovs-cni (plugin and marker): v2.2.0-3
cluster-network-addons-operator: v2.2.0-6

Comment 5 Nelly Credi 2019-12-09 11:08:32 UTC
The "Fixed In Version" field is missing.

Comment 7 errata-xmlrpc 2020-01-30 16:27:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307