This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2214457 - Mixing bridge and sr-iov networks with same name fails and is confusing for the user
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 4.13.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: future
Assignee: Petr Horáček
QA Contact: Nir Rozen
URL:
Whiteboard:
Depends On: 2224990
Blocks:
Reported: 2023-06-13 03:45 UTC by Germano Veit Michel
Modified: 2023-12-14 16:07 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-12-14 16:07:31 UTC
Target Upstream Version:
Embargoed:


Links
System                 ID             Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  CNV-29805      0        None      None    None     2023-12-14 16:07:30 UTC
Red Hat Issue Tracker  OCPBUGS-16683  0        None      None    None     2023-07-24 08:22:38 UTC

Description Germano Veit Michel 2023-06-13 03:45:42 UTC
Description of problem:

I don't think this is a bug in a specific component; it is more about how things work together: SR-IOV + CNV + Console.

For example, a user does the following:

1. Configure a bridge network. My example is the virt-toca network, which is just a bridge using VLAN 2.

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/virt.toca
  name: virt-toca
  namespace: homelab
spec:
  config: >-
    {"name":"virt.toca","type":"cnv-bridge","cniVersion":"0.4.0","bridge":"virt.toca","macspoofchk":false,"ipam":{}}

2. Go to the CNV UI: Virtualization -> Virtual Machines -> example_vm -> Configuration -> Network Interfaces 

3. Edit a NIC

4. Notice that you can select a Network (item 2 in the menu), *plus* a type for that network (item 3 in the menu).

For example, I have a network named virt-toca, for which I just configured a bridge NAD that is working fine. But now I want SR-IOV too, and the dialog above suggests I can have an SR-IOV network with the same name and just select a different type for the VM NIC (i.e. SR-IOV).

5. Now I configure an sr-iov network with the same name:

apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovNetwork
  metadata:
    annotations:
    name: virt-toca
    namespace: openshift-sriov-network-operator
  spec:
    networkNamespace: homelab
    resourceName: virt_toca_sriov_resource
    spoofChk: "off"
    trust: "on"
    vlan: 2
kind: List
metadata:
  resourceVersion: ""

6. Here it gets confusing. Now in the CNV dialog from step 4 above one would expect to use virt-toca in bridge or sr-iov mode, just by changing the type of the network (item 3 in NIC edit menu) right? No, it does not work like that, and now both networks are broken.

7. The SR-IOV network defined at step 5 overwrote the NAD of the bridge. There is still a single NAD named virt-toca, but it's now pointing to the sr-iov resource and not the bridge one.

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/virt_toca_sriov_resource
  creationTimestamp: '2023-06-13T03:26:51Z'
  generation: 2
  managedFields:
    - apiVersion: k8s.cni.cncf.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations': {}
        'f:spec': {}
      manager: kubectl-client-side-apply
      operation: Update
      time: '2023-06-13T03:26:51Z'
    - apiVersion: k8s.cni.cncf.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:k8s.v1.cni.cncf.io/resourceName': {}
        'f:spec':
          'f:config': {}
      manager: sriov-network-operator
      operation: Update
      time: '2023-06-13T03:38:48Z'
  name: virt-toca
  namespace: homelab
  resourceVersion: '966960'
  uid: 9d63f136-10be-45fb-b303-f51949d17efa
spec:
  config: >-
    { "cniVersion":"0.3.1",
    "name":"virt-toca","type":"sriov","vlan":2,"spoofchk":"off","trust":"on","vlanQoS":0,"ipam":{}
    }

So essentially the sr-iov one overwrote the bridge one, and now it's all broken. VMs configured with the bridge network will fail to schedule.
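
A pre-check along these lines could have flagged the clash before the SriovNetwork was applied. This is only a sketch: it assumes the operator generates a NAD that reuses the SriovNetwork name and places it in spec.networkNamespace (falling back to the SriovNetwork's own namespace, which is an assumption not confirmed in this report). The function names are hypothetical and the manifests are trimmed copies of the ones above.

```python
import json


def generated_nad_key(sriov_network):
    """Return the (namespace, name) of the NAD an SriovNetwork would produce.

    Assumption: the NAD reuses the SriovNetwork name and lands in
    spec.networkNamespace, else in the SriovNetwork's own namespace.
    """
    meta = sriov_network["metadata"]
    spec = sriov_network.get("spec", {})
    namespace = spec.get("networkNamespace") or meta["namespace"]
    return namespace, meta["name"]


def find_collision(sriov_network, existing_nads):
    """Return the existing non-SR-IOV NAD that would be overwritten, or None."""
    target = generated_nad_key(sriov_network)
    for nad in existing_nads:
        meta = nad["metadata"]
        if (meta["namespace"], meta["name"]) != target:
            continue
        # spec.config is a JSON string; its "type" names the CNI plugin.
        cni_type = json.loads(nad["spec"]["config"]).get("type")
        if cni_type != "sriov":
            return nad
    return None


bridge_nad = {
    "metadata": {"name": "virt-toca", "namespace": "homelab"},
    "spec": {
        "config": '{"name":"virt.toca","type":"cnv-bridge","cniVersion":"0.4.0",'
                  '"bridge":"virt.toca","macspoofchk":false,"ipam":{}}'
    },
}
sriov_network = {
    "metadata": {
        "name": "virt-toca",
        "namespace": "openshift-sriov-network-operator",
    },
    "spec": {
        "networkNamespace": "homelab",
        "resourceName": "virt_toca_sriov_resource",
        "vlan": 2,
    },
}

clash = find_collision(sriov_network, [bridge_nad])
if clash is not None:
    print("would overwrite NAD", clash["metadata"]["name"])
```

With the manifests from this report, the check reports the bridge NAD virt-toca in homelab as the one that would be replaced.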

For example, take a VM with a bridge network:

          interfaces:
            - bridge: {}
              macAddress: '02:52:59:00:00:02'
              model: virtio
              name: nic-virt-toca

      networks:
        - multus:
            networkName: virt-toca
          name: nic-virt-toca

It fails to schedule because of a missing sr-iov resource on the node it's pinned to (that node does not have an SR-IOV card):

      message: >-
        0/9 nodes are available: 1 Insufficient
        openshift.io/virt_toca_sriov_resource,

Things get very confusing for the user, who only later notices that the NAD for the bridge was overwritten. The CNV UI dialog also doesn't make much sense: each network (NAD) can have only one type, so it's redundant to ask for both, and asking implies that mixing would work.
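
One way to notice the overwrite sooner is to look at the CNI type and the field managers recorded on the NAD. The sketch below assumes the NAD has already been loaded into a Python dict (for example from its JSON or YAML dump); summarize_nad is a hypothetical helper, and the sample data is trimmed from the overwritten NAD shown in step 7.

```python
import json

RESOURCE_ANNOTATION = "k8s.v1.cni.cncf.io/resourceName"


def summarize_nad(nad):
    """Report the CNI type, resource name, and field managers of a NAD."""
    meta = nad["metadata"]
    config = json.loads(nad["spec"]["config"])
    return {
        "cni_type": config.get("type"),
        "resource_name": meta.get("annotations", {}).get(RESOURCE_ANNOTATION),
        # managedFields lists who wrote to the object, in recorded order.
        "managers": [m["manager"] for m in meta.get("managedFields", [])],
    }


overwritten_nad = {
    "metadata": {
        "name": "virt-toca",
        "namespace": "homelab",
        "annotations": {RESOURCE_ANNOTATION: "openshift.io/virt_toca_sriov_resource"},
        "managedFields": [
            {"manager": "kubectl-client-side-apply", "operation": "Update"},
            {"manager": "sriov-network-operator", "operation": "Update"},
        ],
    },
    "spec": {
        "config": '{ "cniVersion":"0.3.1", "name":"virt-toca","type":"sriov",'
                  '"vlan":2,"spoofchk":"off","trust":"on","vlanQoS":0,"ipam":{} }'
    },
}

summary = summarize_nad(overwritten_nad)
print(summary["cni_type"], summary["resource_name"], summary["managers"])
```

A NAD that was applied by hand as a bridge but now reports type sriov and lists sriov-network-operator among its managers is a strong hint that it was replaced.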

It would be nice to actually have this work: I don't want a network named vlan-2-bridge that I use with bridge type and another named vlan-2-sriov that I use with sriov type. Having just vlan-2 and then choosing the type would be the best user experience.

Version-Release number of selected component (if applicable):
4.13

How reproducible:
Always

Steps to Reproduce:
As above

Actual results:
- Configuring sr-iov breaks bridge
- VMs using the bridge network fail to schedule, with no obvious hint that the NAD was overwritten

Expected results:
- Don't overwrite the bridge NAD
- Allow both NADs to co-exist, or merge them instead of overwriting
- If each network can only have a single type, then the UI field for type selection is redundant

Comment 2 Petr Horáček 2023-07-13 08:28:00 UTC
Hey, thanks for reporting this. Very confusing indeed.

This is what I plan to do with this BZ, let me know if it is sensible:

* Open a bug on UI asking them to figure out the binding type (bridge vs SR-IOV) from the selected network. Only if they can't find a match for the selected network (RBAC does not allow them to read it, or it uses a third-party CNI we don't know) should it show the dropdown.
* Open a bug on SR-IOV operator asking them to not touch existing NADs that are not of their own type.
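
The first proposal could look roughly like the sketch below. It is not the console's actual code. The mapping table is an assumption: only the cnv-bridge and sriov plugin types appear in this report, plain bridge is added as a guess, and infer_binding is a hypothetical name. Chained configs with a "plugins" list are not handled and fall through to the dropdown.

```python
import json

# Assumed mapping from CNI plugin type to VM interface binding.
BINDING_BY_CNI_TYPE = {
    "cnv-bridge": "bridge",
    "bridge": "bridge",  # assumption, not taken from this report
    "sriov": "sriov",
}


def infer_binding(nad):
    """Return the binding for a NAD, or None if the UI should show the dropdown.

    nad is None when the NAD could not be read (for example, RBAC denied it).
    """
    if nad is None:
        return None
    try:
        config = json.loads(nad["spec"]["config"])
    except (KeyError, TypeError, ValueError):
        # Missing or malformed config: let the user choose.
        return None
    if not isinstance(config, dict):
        return None
    # Unknown (third-party) CNI types map to None as well.
    return BINDING_BY_CNI_TYPE.get(config.get("type"))


bridge_nad = {"spec": {"config": '{"name":"virt.toca","type":"cnv-bridge","ipam":{}}'}}
sriov_nad = {"spec": {"config": '{"name":"virt-toca","type":"sriov","vlan":2,"ipam":{}}'}}

for nad in (bridge_nad, sriov_nad, None):
    binding = infer_binding(nad)
    print(binding if binding else "show dropdown")
```

Returning None for anything unreadable or unrecognized keeps the dropdown as the fallback, which matches the behavior described in the first bullet.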

About shared names, IIUC you would like to be able to expose multiple different network attachment definitions as one. I could create network "blue" from the UI and it would act as an abstraction over NAD "blue-bridge" and SR-IOV network "blue-sriov". I don't think we can do this without a shared high-level OpenShift object for a "network", which sounds like a pretty ambitious RFE for OpenShift Network. I would ask you to open an RFE, but I don't think it's realistic now. We would need a new abstraction not only for the network definition but also for the network request, since on a Pod you just reference the NAD name directly. Let me know what you think.

Comment 3 Germano Veit Michel 2023-07-13 21:22:31 UTC
Yes, I think those 2 things will fix the experience problem a user may have with this.

IMHO nothing here is really a *bug* when seen from each component's perspective, but things are not fitting well together and those 2 changes should correct this.

Thanks Petr!

Comment 4 Petr Horáček 2023-07-24 08:22:28 UTC
I'm keeping this BZ open to have a central tracker. I will target it to "future" so it's not in the way while the bugs we depend on are getting targeted and solved.

This depends on:
* https://bugzilla.redhat.com/show_bug.cgi?id=2224990 for internally assigning the binding method based on the requested NetworkAttachmentDefinition.
* https://issues.redhat.com/browse/OCPBUGS-16683 for not overwriting NetworkAttachmentDefinitions that are not owned by the SR-IOV operator
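
For the second dependency, the guard on the operator side might be as simple as the sketch below. It is illustrative only and not how the SR-IOV operator is implemented. It assumes ownership can be approximated by the CNI type in the existing NAD's config, as suggested in comment 2; may_overwrite is a hypothetical name.

```python
import json


def may_overwrite(existing_nad, owner_type="sriov"):
    """Decide whether a NAD with the target name may be created or updated.

    Assumption: a NAD counts as "ours" only if its CNI type equals owner_type.
    """
    if existing_nad is None:
        # Nothing there yet, so creating the NAD is safe.
        return True
    try:
        config = json.loads(existing_nad["spec"]["config"])
    except (KeyError, TypeError, ValueError):
        # Unreadable config: be conservative and leave it alone.
        return False
    return isinstance(config, dict) and config.get("type") == owner_type


bridge_nad = {"spec": {"config": '{"name":"virt.toca","type":"cnv-bridge","ipam":{}}'}}
sriov_nad = {"spec": {"config": '{"name":"virt-toca","type":"sriov","ipam":{}}'}}

print(may_overwrite(None), may_overwrite(sriov_nad), may_overwrite(bridge_nad))
```

With such a check, applying the SriovNetwork from step 5 would leave the bridge NAD untouched, and the conflict could be reported instead.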

