Bug 2218671 - Need clearer message when migration failed due to missing resource
Summary: Need clearer message when migration failed due to missing resource
Keywords:
Status: NEW
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.14.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.15.0
Assignee: sgott
QA Contact: Kedar Bidarkar
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-06-29 19:29 UTC by Yossi Segev
Modified: 2023-07-19 13:00 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CNV-30406 0 None None None 2023-06-29 19:31:01 UTC

Description Yossi Segev 2023-06-29 19:29:23 UTC
Description of problem:
When attempting to hot-plug an interface to a guest VM using the `virtctl addinterface` command without first creating the backing bridge on the node, the command fails as expected, but the message doesn't tell the user what is missing:
```
$ virtctl addinterface vm-fedora --network-attachment-definition-name hp-br-nad --name hp2 
the server could not find the requested resource
```
The VM and the NetworkAttachmentDefinition are there (in the same namespace), and what is missing is the `hp-br` bridge interface on the nodes, which the NAD refers to:

```
$ oc get vm
NAME        AGE   STATUS    READY
vm-fedora   16m   Running   True
$
$ oc get net-attach-def hp-br-nad -o yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/hp-br
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"k8s.cni.cncf.io/v1","kind":"NetworkAttachmentDefinition","metadata":{"annotations":{"k8s.v1.cni.cncf.io/resourceName":"bridge.network.kubevirt.io/hp-br"},"name":"hp-br-nad","namespace":"yoss-ns"},"spec":{"config":"{\"cniVersion\": \"0.3.1\", \"name\": \"hp-br\", \"plugins\": [{\"type\": \"cnv-bridge\", \"bridge\": \"hp-br\"}]}"}}
  creationTimestamp: "2023-06-29T18:37:35Z"
  generation: 1
  name: hp-br-nad
  namespace: yoss-ns
  resourceVersion: "313420"
  uid: 44422089-e2e0-417a-811e-ff4367a1789e
spec:
  config: '{"cniVersion": "0.3.1", "name": "hp-br", "plugins": [{"type": "cnv-bridge",
    "bridge": "hp-br"}]}'
```
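The bridge name that the NAD depends on is embedded in the JSON string under `spec.config`. As a minimal sketch, assuming only the fields visible in the output above, it could be extracted like this. The `cniConfig` type and the `bridgeNames` helper are hypothetical names used for illustration and are not part of KubeVirt or any CNI library:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cniConfig mirrors only the fields of the NAD spec.config shown above.
// It is a hypothetical, illustration-only type.
type cniConfig struct {
	Name    string `json:"name"`
	Plugins []struct {
		Type   string `json:"type"`
		Bridge string `json:"bridge"`
	} `json:"plugins"`
}

// bridgeNames returns the bridge named by each plugin entry that sets one.
func bridgeNames(config string) ([]string, error) {
	var c cniConfig
	if err := json.Unmarshal([]byte(config), &c); err != nil {
		return nil, fmt.Errorf("parsing NAD config: %w", err)
	}
	var names []string
	for _, p := range c.Plugins {
		if p.Bridge != "" {
			names = append(names, p.Bridge)
		}
	}
	return names, nil
}

func main() {
	// The spec.config value from the hp-br-nad definition above.
	config := `{"cniVersion": "0.3.1", "name": "hp-br", "plugins": [{"type": "cnv-bridge", "bridge": "hp-br"}]}`
	names, err := bridgeNames(config)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names) // prints [hp-br]
}
```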


Version-Release number of selected component (if applicable):
CNV 4.14.0
virtctl version:
Client Version: version.Info{GitVersion:"v1.0.0-beta.0-188-g39dc57cad", GitCommit:"39dc57cad5ea6f882b847fd1ab312ee951d8cd9c", GitTreeState:"clean", BuildDate:"2023-06-04T04:31:39Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{GitVersion:"v1.0.0-beta.0-449-g2d9380079", GitCommit:"2d93800792c87fd0f744a13f80ce82260dd6e279", GitTreeState:"clean", BuildDate:"2023-06-28T15:10:09Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}


How reproducible:
Always


Steps to Reproduce:
1. Create and run a basic VM (no secondary NICs).
$ oc create ns yoss-ns
namespace/yoss-ns created
$ oc project yoss-ns
Now using project "yoss-ns" on server "https://api.net-ys-414o.rhos-psi.cnv-qe.rhood.us:6443".
$ oc apply -f vm-fedora.yaml 
virtualmachine.kubevirt.io/vm-fedora created
$ virtctl start vm-fedora
VM vm-fedora was scheduled to start

2. Create a NetworkAttachmentDefinition.
$ oc apply -f bridge-nad.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/hp-br-nad created

3. Run the command to add the new interface to the VM:
virtctl addinterface <vm-name> --network-attachment-definition-name <net-attach-def-name> --name <interface-name>
$ virtctl addinterface vm-fedora --network-attachment-definition-name hp-br-nad --name hp2


Actual results:
The failure message doesn't specify which resource is missing, which can be quite confusing for the user:
```
the server could not find the requested resource
```


Expected results:
A clearer message that indicates which resource is missing.
In this case, for example:
```
the server could not find bridge resource 'hp-br', which is referenced in NetworkAttachmentDefinition `hp-br-nad`.
```
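One way such a message could be produced is with a dedicated error type that carries the missing resource and the object referencing it. The following is only a sketch under that assumption. `MissingResourceError` and its fields are hypothetical and do not exist in KubeVirt:

```go
package main

import (
	"errors"
	"fmt"
)

// MissingResourceError is a hypothetical error type that records which
// resource is missing and which NetworkAttachmentDefinition refers to it.
type MissingResourceError struct {
	Kind         string // kind of the missing resource, e.g. "bridge"
	Name         string // name of the missing resource, e.g. "hp-br"
	ReferencedBy string // name of the NAD that references it
}

// Error renders a message in the shape proposed in the expected results.
func (e *MissingResourceError) Error() string {
	return fmt.Sprintf(
		"the server could not find %s resource '%s', which is referenced in NetworkAttachmentDefinition '%s'",
		e.Kind, e.Name, e.ReferencedBy)
}

func main() {
	// Wrap the detailed error so callers keep both context and details.
	err := fmt.Errorf("migration failed: %w", &MissingResourceError{
		Kind:         "bridge",
		Name:         "hp-br",
		ReferencedBy: "hp-br-nad",
	})
	fmt.Println(err)

	// Callers can still recover the structured details with errors.As.
	var missing *MissingResourceError
	if errors.As(err, &missing) {
		fmt.Println("missing resource:", missing.Name)
	}
}
```

Keeping the details in a typed error, rather than only in a string, would let both the CLI and event messages report the same missing resource.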

Comment 6 Alona Kaplan 2023-07-18 12:33:39 UTC
Yossi, are you sure this behavior can be reproduced and is not caused by another issue in your environment?
If there is no bridge, it should be created by the CNI.

Comment 7 Alona Kaplan 2023-07-18 12:46:27 UTC
My mistake, the bridge won't be created. But anyway, we don't expect to see any error.

Comment 8 Yossi Segev 2023-07-18 13:55:03 UTC
(In reply to Alona Kaplan from comment #7)
> My mistake, the bridge won't be created. But anyway, we don't expect to see
> any error.

The `addinterface` action doesn't fail, but the migration fails (after a timeout) without any clear message explaining that it failed because the bridge interface resource is missing.
This is what is currently seen when describing the migration:

Events:
  Type     Reason                           Age                    From                       Message
  ----     ------                           ----                   ----                       -------
  Normal   SuccessfulCreate                 8m16s                  virtualmachine-controller  Created migration target pod virt-launcher-vm-fedora-9nwxt
  Warning  migrationTargetPodUnschedulable  3m17s (x5 over 8m16s)  virtualmachine-controller  Migration target pod for VMI [yoss-ns/vm-fedora] is currently unschedulable.
  Normal   SuccessfulDelete                 3m17s                  virtualmachine-controller  unschedulable pod timeout period exceeded%!(EXTRA string=virt-launcher-vm-fedora-9nwxt)
  Warning  FailedMigration                  3m17s (x2 over 3m17s)  virtualmachine-controller  Migration target pod was removed during active migration.
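
The `%!(EXTRA string=...)` suffix in the SuccessfulDelete event is what Go's `fmt` package emits when a format call receives an argument with no matching verb. The sketch below reproduces that text under the assumption that this is the cause. The function names are hypothetical and this is not the actual KubeVirt code:

```go
package main

import "fmt"

// withoutVerb passes the pod name to a format string that has no verb,
// so fmt appends a %!(EXTRA ...) marker instead of a readable pod name.
func withoutVerb(pod string) string {
	// The format is held in a variable so that `go vet` does not reject
	// this deliberate mistake.
	format := "unschedulable pod timeout period exceeded"
	return fmt.Sprintf(format, pod)
}

// withVerb includes a %s verb, so the pod name is rendered as intended.
func withVerb(pod string) string {
	return fmt.Sprintf("unschedulable pod timeout period exceeded: %s", pod)
}

func main() {
	pod := "virt-launcher-vm-fedora-9nwxt"
	fmt.Println(withoutVerb(pod))
	fmt.Println(withVerb(pod))
}
```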

Comment 9 Alona Kaplan 2023-07-18 14:11:56 UTC
Can you please share the `oc describe` of the VM?

Comment 12 Alona Kaplan 2023-07-18 15:26:53 UTC
An unclear migration failure message caused by a missing resource is not a network bug, so moving to the compute team. Also changing the title to reflect the real issue.

Comment 13 Kedar Bidarkar 2023-07-19 12:20:17 UTC
Targeting it for CNV-4.15 due to the severity, and also because it is about ensuring a clear message and not about broken functionality.

