Bug 1685118

Summary: VMIs with delegated IPs are stopped by the kubelet if kubelet is restarted
Product: Container Native Virtualization (CNV)
Component: Networking
Version: 1.4
Status: CLOSED WONTFIX
Severity: high
Priority: medium
Target Release: 2.1.0
Hardware: Unspecified
OS: Unspecified
Type: Bug
Reporter: Roman Mohr <rmohr>
Assignee: Petr Horáček <phoracek>
QA Contact: Meni Yakove <myakove>
CC: cnv-qe-bugs, danken, fdeutsch, ncredi
Last Closed: 2019-06-20 11:57:25 UTC

Description Roman Mohr 2019-03-04 11:39:40 UTC
Description of problem:

When a VMI is connected to the pod network in `bridge` mode with IP delegation, the CNI status check for its pod fails. As a consequence, when the kubelet is restarted it stops all containers of that pod.

I checked it on k8s 1.10.4 and k8s 1.13.3. I *think* that the hint in the logs is this:

```
Mar 04 09:28:43 node01 kubelet[29530]: W0304 09:28:43.167534   29530 docker_sandbox.go:384] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "virt-launcher-vmi-nocloud-v4j2m_default": Unexpected address output
```
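To illustrate why a delegated IP could produce this message, here is a minimal sketch. It assumes the status hook lists the addresses on the pod's `eth0` (something like `ip -o -4 addr show dev eth0 scope global` inside the pod's network namespace) and expects the CIDR in the fourth whitespace-separated field. The function name `parse_pod_ip` and the sample line are hypothetical; this is not the kubelet's actual code. With the IP handed over to the VM, the interface has no address left, the output is empty, and the parse fails:

```python
import ipaddress


def parse_pod_ip(ip_output: str) -> str:
    """Return the first address found in one-line-per-address `ip -o` output.

    Assumption: the caller expects a line such as
    "3: eth0    inet 10.244.0.31/24 brd 10.244.0.255 scope global eth0",
    where the fourth field holds the address in CIDR notation.
    """
    first_line = ip_output.split("\n")[0]
    fields = first_line.split()
    if len(fields) < 4:
        # No address on the interface: nothing to parse.
        raise ValueError(f"Unexpected address output {first_line!r}")
    return str(ipaddress.ip_interface(fields[3]).ip)


# Hypothetical output for a pod whose eth0 still owns its address.
with_ip = "3: eth0    inet 10.244.0.31/24 brd 10.244.0.255 scope global eth0"
print(parse_pod_ip(with_ip))

# Hypothetical output for a pod whose address was delegated to the VM.
try:
    parse_pod_ip("")
except ValueError as err:
    print(err)
```

If this assumption holds, any binding that removes the address from the pod interface would fail the same check, regardless of the CNI plugin in use.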

I only tested with flannel. I did not try it with ovn-cni or crio.


Version-Release number of selected component (if applicable):


How reproducible:

```bash
$ cluster/kubectl.sh create -f cluster/examples/vmi-nocloud.yaml
$ cluster/kubectl.sh create -f cluster/examples/vmi-slirp.yaml
$ cluster/kubectl.sh get vmis
NAME          AGE   PHASE     IP            NODENAME
vmi-nocloud   15s   Running   10.244.0.32   node01
vmi-slirp     1m    Running   10.244.0.31   node01
$ cluster/cli.sh ssh node01
$ sudo systemctl restart kubelet
$ cluster/kubectl.sh get vmis
NAME          AGE   PHASE     IP            NODENAME
vmi-nocloud   34s   Failed    10.244.0.32   node01
vmi-slirp     1m    Running   10.244.0.31   node01
```
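When repeating this check with more VMIs, comparing the two listings by eye gets tedious. The following is a small sketch that diffs two captured `get vmis` outputs and reports the VMIs whose phase changed. The helper names `parse_vmis` and `changed_phases` are made up for illustration, and the parsing assumes that no column is empty:

```python
def parse_vmis(table: str) -> dict:
    """Map VMI name to phase from the text output of `kubectl get vmis`."""
    lines = [line for line in table.strip().splitlines() if line.strip()]
    header = lines[0].split()
    name_col, phase_col = header.index("NAME"), header.index("PHASE")
    phases = {}
    for line in lines[1:]:
        cols = line.split()  # assumes every column has a value
        phases[cols[name_col]] = cols[phase_col]
    return phases


def changed_phases(before: str, after: str) -> dict:
    """Return {name: (old_phase, new_phase)} for VMIs present in both listings."""
    old, new = parse_vmis(before), parse_vmis(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


before = """
NAME          AGE   PHASE     IP            NODENAME
vmi-nocloud   15s   Running   10.244.0.32   node01
vmi-slirp     1m    Running   10.244.0.31   node01
"""
after = """
NAME          AGE   PHASE     IP            NODENAME
vmi-nocloud   34s   Failed    10.244.0.32   node01
vmi-slirp     1m    Running   10.244.0.31   node01
"""
print(changed_phases(before, after))  # {'vmi-nocloud': ('Running', 'Failed')}
```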


Steps to Reproduce:

See the commands listed under "How reproducible" above.

Actual results:

The VMI with the default `bridge` options gets stopped.

Expected results:

No VMI gets stopped or restarted when the kubelet restarts.

Additional info:

https://github.com/kubevirt/kubevirt/issues/2076

Comment 1 Dan Kenigsberg 2019-06-20 11:57:25 UTC
Given our plans to eliminate the "bridge" binding mechanism for the Pod network ( https://jira.coreos.com/browse/KNIP-570 ), there is no reason to track this bug further.