Bug 2209293 - RFE - nmstate should tolerate existing patch ports
Summary: RFE - nmstate should tolerate existing patch ports
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: nmstate
Version: 9.2
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: rc
: ---
Assignee: Gris Ge
QA Contact: Mingyu Shi
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-05-23 11:44 UTC by Miguel Duarte Barroso
Modified: 2023-08-14 07:56 UTC (History)
7 users (show)

Fixed In Version: nmstate-2.2.14-1.el9
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker NMT-568 0 None None None 2023-05-23 13:11:13 UTC
Red Hat Issue Tracker RHELPLAN-157938 0 None None None 2023-05-23 13:11:18 UTC

Description Miguel Duarte Barroso 2023-05-23 11:44:13 UTC
Description of problem:
OVN has an API (ovn-bridge-mappings) - [0] - where the network administrator can associate a physical network name to an OVS bridge.

The OVN controller reacts to this by creating patch ports between the OVN integration bridge (configured and owned by OVN Kubernetes) and the OVS bridge (configured using nmstate by cluster admin) we point via the mappings.

This is an example of an OVN bridge mapping from the openvswitch table (ovs-vsctl list open .):
```
external_ids        : {..., ovn-bridge-mappings="physnet:br-ex,tenantblue_br-localnet:brsec", ...}
```

We are using this feature to configure the underlay for secondary OVN-Kubernetes networks; thus, when a network admin attempts to reconfigure the bridge mappings when the OVS bridge is being used will fail with the following error (grabbed from the k8s-nmstate nnce):
```
[2023-05-23T10:39:04Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: brsec.interface.bridge.port desire '[{"name":"ens4","vlan":{"mode":"access","tag":51}}]', current '[{"name":"ens4","vlan":{"mode":"access","tag":51}},{"name":"patch-tenantblue_ovn_localnet_port-to-br-int"}]' 
```

We thus would like to request nmstate to somehow leave these patch ports alone.

Here's an NNCP configuration for your reference:
```
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: dedicated-bridge
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ''
  desiredState:
    interfaces:
      - name: brsec
        description: bridge with a secondary interface as port
        type: ovs-bridge
        state: up
        bridge:
          port:
            - name: ens4
              vlan:
                mode: access
                tag: 51
    ovs-db:
      external_ids:
        ovn-bridge-mappings: "physnet:br-ex,tenantblue_br-localnet:brsec"
```

[0] - https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/networking_guide/bridge-mappings

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Petr Horáček 2023-05-23 11:52:07 UTC
Setting the severity to high. This feature would make configuration of OVN localnet networks in OpenShift much smoother and a little more intuitive.

Comment 3 Gris Ge 2023-05-23 13:10:18 UTC
Hi Miguel,

When creating this OVS bridge, can you add these lines also?

```yml
---
interfaces:
- name: patch-tenantblue_ovn_localnet_port-to-br-int
  state: ignore
```

This means nmstate should ignore `patch-tenantblue_ovn_localnet_port-to-br-int` during verification.

Comment 4 Miguel Duarte Barroso 2023-05-23 13:45:20 UTC
Hi Gris,

Thanks for the prompt reply.

At first glance, this seems to work. Let me check a few more configurations, and I will confirm this.

@phoracek is this work-around acceptable ? It is a bit cumbersome, but it does work ... 

IMO, this at least would lower the priority of the RFE.

Comment 5 Petr Horáček 2023-05-23 13:56:08 UTC
Applying this workaround requires the cluster admin to prepare a list of all the localnet networks defined on the cluster that may be using the given bridge. While that is a viable workaround, I would not document it as something we expect people to go through.

We should either tell users "once you start using the bridge, no changes are allowed", or give them a nicer workflow where they don't have to tiptoe around existing patches.

I agree with Miguel about the lesser severity. Even though this is limiting, we can release support of localnet without this present.

Do I understand it correctly that an attempted change to the bridge will be rejected? i.e. it cannot happen that an attempted change will disconnect existing patches.

Comment 6 Miguel Duarte Barroso 2023-05-24 14:42:26 UTC
(In reply to Petr Horáček from comment #5)
> Applying this workaround requires the cluster admin to prepare a list of all
> the localnet networks defined on the cluster that may be using the given
> bridge. While that is a viable workaround, I would not document it as
> something we expect people to go through.
> 
> We should either tell users "once you start using the bridge, no changes are
> allowed", or give them a nicer workflow where they don't have to tiptoe
> around existing patches.
> 
> I agree with Miguel about the lesser severity. Even though this is limiting,
> we can release support of localnet without this present.
> 
> Do I understand it correctly that an attempted change to the bridge will be
> rejected? i.e. it cannot happen that an attempted change will disconnect
> existing patches.

Yes, the policy goes into degraded state but the patch ports are not deleted, and traffic keeps flowing.

I've checked connections between VMs, and from one VM to something running on the underlay. Both survived.

Comment 7 Petr Horáček 2023-05-24 14:44:30 UTC
Thanks. Medium priority it is then.

Comment 8 Gris Ge 2023-05-28 03:26:04 UTC
Demo patch set to upstream: https://github.com/nmstate/nmstate/pull/2358

Introducing `allow-extra-patch-ports` to OVS bridge section which will
allows extra OVS patch port found when applying or verifying. For
example:

```yml
- name: br0
  type: ovs-bridge
  state: up
  bridge:
    allow-extra-patch-ports: true
    port:
    - name: eth1
```

This YAML will not remove existing patch port of `br0` when applying,
and also ignore extra OVS patch found during verification.

This property will not be persisted, every time you modify
ports of specified OVS bridge, you need to explicitly define this
property if not using default value.

Scratch build could be found at: https://people.redhat.com/fge/bz_2209293

Comment 9 Gris Ge 2023-05-28 03:31:06 UTC
What to discuss are: 

 * Should nmstate persistent this `allow-extra-patch-ports`? It means next modification of ovs bridge will be inherent allowing extra ovs patchs?
 * If nmstate persistent or make it default to true, when user trying to overriding all OVS port list, they need to set `allow-extra-patch-ports: false`.
   But I guess, user seldom do that. Of course, removing the whole bridge will not be impacted by `allow-extra-patch-ports`.

Comment 10 Petr Horáček 2023-05-30 13:58:19 UTC
Thanks a lot Gris!

I would prefer if this flag was not persistent. Repeating this attribute every time we are applying the config is acceptable for our API.

I think it should remain false by default. Outside OVN use-cases, users may use nmstate to create patch ports between OVS bridges. The way I see it, the OVN use-case is a special one, so it should require changing the attributed to a non-default true.


Note You need to log in before you can comment on or make changes to this bug.