Bug 1727810 - Starting VM fails due to bridge device annotation in Network Attachment Definition
Summary: Starting VM fails due to bridge device annotation in Network Attachment Definition
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Dan Kenigsberg
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-08 08:50 UTC by Yossi Segev
Modified: 2019-07-08 12:28 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-08 10:59:42 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
VM spec with secondary network interface (1.18 KB, text/plain)
2019-07-08 08:50 UTC, Yossi Segev
no flags

Description Yossi Segev 2019-07-08 08:50:38 UTC
Created attachment 1588293 [details]
VM spec with secondary network interface

Description of problem:
When creating a VM with a secondary network interface, the VM fails to start if the Network Attachment Definition includes an annotation with the bridge device name.

Version-Release number of selected component (if applicable):
OCP 4.1/CNV 2.0

How reproducible:
Always


Steps to Reproduce:
1. Create a bridge interface called br1 on all the worker (not master) nodes of a cluster:
 $ ip link add br1 type bridge
 $ ip link set br1 up

2. Create the following Network Attachment Definition, which includes a resourceName annotation intended to restrict scheduling to the worker nodes that have the br1 interface:
$ cat << EOF | oc create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: a-bridge-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: bridge-cni.network.kubevirt.io/br1
spec:
  config: '{
    "cniVersion": "0.3.0",
    "name": "a-bridge-network",
    "type": "cnv-bridge",
    "bridge": "br1",
    "isGateway": true,
    "ipam": {}
}'

3. Start a VM using the attached VM spec (vm-cirros-my-secondary-nic.yaml):
 $ oc create -f vm-cirros-my-secondary-nic.yaml
 $ virtctl start vm-cirros
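
(For reference, a minimal sketch of what such a VM spec could look like; the attachment itself is not reproduced here. The VM name, interface name and container-disk image below are illustrative assumptions - only the multus networkName has to match the Network Attachment Definition from step 2.)

apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-cirros
spec:
  running: false
  template:
    spec:
      domain:
        devices:
          disks:
          - name: containerdisk
            disk:
              bus: virtio
          interfaces:
          - name: default
            masquerade: {}
          # secondary interface attached with the bridge binding (name is assumed)
          - name: secondary-nic
            bridge: {}
        resources:
          requests:
            memory: 64M
      networks:
      - name: default
        pod: {}
      # must reference the Network Attachment Definition created in step 2
      - name: secondary-nic
        multus:
          networkName: a-bridge-network
      volumes:
      - name: containerdisk
        containerDisk:
          # assumed demo image, for illustration only
          image: kubevirt/cirros-container-disk-demo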

Actual results:
VMI remains stuck in the "Scheduling" phase, with this error (collected using "oc describe vmi"):
Status:
  Conditions:
    Last Probe Time:       2019-07-08T07:52:44Z
    Last Transition Time:  2019-07-08T07:52:44Z
    Message:               0/6 nodes are available: 3 Insufficient devices.kubevirt.io/kvm, 3 Insufficient devices.kubevirt.io/tun, 3 Insufficient devices.kubevirt.io/vhost-net, 3 node(s) didn't match node selector, 6 Insufficient bridge-cni.network.kubevirt.io/br1.
    Reason:                Unschedulable


Expected results:
VM should start successfully and move to "Running" status.

Additional info:
Workaround:
Remove the annotation line from the Network Attachment Definition spec.

Comment 1 Dan Kenigsberg 2019-07-08 09:48:58 UTC
We have this running on cnv-tests. Did you try running test_bridge_marker on your cluster?
Can you verify the bridge marker is running?

Comment 2 Yossi Segev 2019-07-08 10:05:00 UTC
I have just run test_bridge_marker on that same cluster, and all 3 tests passed (no failures or skips).
As for verifying that the bridge marker is running:
[cnv-qe-jenkins@cnv-executor-ysegev2 cnv-tests]$ oc get all --all-namespaces | grep marker
linux-bridge                                            pod/bridge-marker-2cfk4                                               1/1       Running     0          3d20h
linux-bridge                                            pod/bridge-marker-75bl8                                               1/1       Running     0          3d20h
linux-bridge                                            pod/bridge-marker-8qrnh                                               1/1       Running     1          3d20h
linux-bridge                                            pod/bridge-marker-97nvw                                               1/1       Running     5          3d20h
linux-bridge                                            pod/bridge-marker-hdhmz                                               1/1       Running     0          3d20h
linux-bridge                                            pod/bridge-marker-s6wpl                                               1/1       Running     1          3d20h

linux-bridge                             daemonset.apps/bridge-marker                   6         6         6         6            6           beta.kubernetes.io/arch=amd64     3d20h

This cluster has 3 workers and 3 masters, hence the 6 running pods. Anyway - it looks like bridge marker is functioning.
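
As an extra cross-check (assuming the usual bridge-marker behaviour of advertising each detected bridge as an extended node resource), the nodes themselves can be inspected, e.g.:

$ oc describe node <worker-node-name> | grep bridge

On a worker node where the marker has picked up br1, a bridge resource for it should be listed under the node's Capacity/Allocatable.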

Comment 4 Yossi Segev 2019-07-08 10:59:42 UTC
Turns out that the bug is in the way the annotation is defined.
It is
k8s.v1.cni.cncf.io/resourceName: bridge-cni.network.kubevirt.io/br1
while it should be
k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br1

My erroneous configuration derived from an error in the documentation.
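
For reference, the corrected Network Attachment Definition metadata (identical to the one in the reproduction steps except for the resourceName):

metadata:
  name: a-bridge-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br1

This would also explain the scheduler error above: bridge-marker advertises the bridge as bridge.network.kubevirt.io/br1, so a pod requesting bridge-cni.network.kubevirt.io/br1 can never be satisfied on any node.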

Comment 5 Dan Kenigsberg 2019-07-08 12:04:19 UTC
> My erroneous configuration derived from an error in the documentation.

so this should be made a doc bug, or an upstream issue. Can you file one of the two?

Comment 7 Dan Kenigsberg 2019-07-08 12:28:23 UTC
> Why do you suspect it is an u/s issue?

I did not know which docs had misled you.

