Bug 2071985 - Nmstate Verification when creating bond over eth interfaces
Summary: Nmstate Verification when creating bond over eth interfaces
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: NetworkManager
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Thomas Haller
QA Contact: David Jaša
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-05 11:07 UTC by Radim Hrazdil
Modified: 2023-09-18 04:34 UTC
CC List: 18 users

Fixed In Version: NetworkManager-1.39.3-1.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-08 10:07:32 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
Reproducer script (678 bytes, application/x-shellscript), 2022-04-13 13:53 UTC, Gris Ge
NetworkManager trace log (689.89 KB, text/plain), 2022-04-13 13:54 UTC, Gris Ge
rh2071985-repro.sh (759 bytes, text/plain), 2022-04-21 16:17 UTC, Thomas Haller
rh2071985-rhel-8.4.log (891.33 KB, text/plain), 2022-04-21 16:18 UTC, Thomas Haller
rh2071985-rhel-8.5.log (448.32 KB, text/plain), 2022-04-21 16:19 UTC, Thomas Haller


Links
Red Hat Issue Tracker RHELPLAN-117883: last updated 2022-04-05 11:18:11 UTC
Red Hat Product Errata RHBA-2022:7680: last updated 2022-11-08 10:08:14 UTC
freedesktop.org GitLab NetworkManager merge request 1199 ("[th/bond-port-mtu] device: reset MTU after attaching bond port"): opened, last updated 2022-04-26 11:16:17 UTC

Description Radim Hrazdil 2022-04-05 11:07:42 UTC
Description of problem:

Nmstate returns a verification error when creating a bond over two ethernet interfaces.


The current state of the two ethernet interfaces:
[cnv-qe-jenkins@n-yoss-411-ovn-nkjcx-executor ~]$ oc get nns c01-rn411-sdn-pvtbd-worker-0-htfjr -ojson | jq '.status.currentState.interfaces[] | select(."name" == "ens9")'
{
  "ipv4": {
    "address": [],
    "dhcp": false,
    "enabled": false
  },
  "ipv6": {
    "address": [],
    "autoconf": false,
    "dhcp": false,
    "enabled": false
  },
  "lldp": {
    "enabled": false
  },
  "mac-address": "FA:16:3E:CF:D0:93",
  "mtu": 1450,
  "name": "ens9",
  "state": "up",
  "type": "ethernet"
}
[cnv-qe-jenkins@n-yoss-411-ovn-nkjcx-executor ~]$ 
[cnv-qe-jenkins@n-yoss-411-ovn-nkjcx-executor ~]$ oc get nns c01-rn411-sdn-pvtbd-worker-0-htfjr -ojson | jq '.status.currentState.interfaces[] | select(."name" == "ens11")'
{
  "ipv4": {
    "address": [],
    "dhcp": false,
    "enabled": false
  },
  "ipv6": {
    "address": [],
    "autoconf": false,
    "dhcp": false,
    "enabled": false
  },
  "lldp": {
    "enabled": false
  },
  "mac-address": "FA:16:3E:40:50:86",
  "mtu": 1450,
  "name": "ens11",
  "state": "up",
  "type": "ethernet"
}


Applying the following desiredState:

interfaces:
- ipv4:
    auto-dns: true
    dhcp: false
    enabled: false
  ipv6:
    auto-dns: true
    autoconf: false
    dhcp: false
    enabled: false
  link-aggregation:
    mode: balance-xor
    options:
      miimon: '120'
    port:
    - ens11
    - ens9
  mtu: 1450
  name: mtx-bond1
  state: up
  type: bond
- mtu: 1400
  name: ens11
  state: up
  type: ethernet
- mtu: 1400
  name: ens9
  state: up
  type: ethernet


This results in the following difference being reported with a VerificationError:
      difference
      ==========
      --- desired
      +++ current
      @@ -2,4 +2,11 @@
       name: ens11
       type: ethernet
       state: up
      -mtu: 1400
      +ipv4:
      +  enabled: false
      +ipv6:
      +  enabled: false
      +lldp:
      +  enabled: false
      +mac-address: FA:16:3E:40:50:86
      +mtu: 1450
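
For reference, the same interfaces list can also be exercised directly with nmstatectl on a standalone host, outside the NNCP/kubernetes-nmstate wrapper. This is only a hedged sketch based on the desiredState above: the interface names ens9/ens11 come from this report and would need to exist on the host, and the file path is arbitrary.

```
#!/bin/bash
# Hypothetical standalone reproduction; assumes nmstate is installed and the
# host has two ethernet interfaces named ens9 and ens11.
cat > /tmp/bond-mtu.yml <<'EOF'
interfaces:
- name: mtx-bond1
  type: bond
  state: up
  mtu: 1450
  link-aggregation:
    mode: balance-xor
    options:
      miimon: '120'
    port:
    - ens11
    - ens9
- name: ens11
  type: ethernet
  state: up
  mtu: 1400
- name: ens9
  type: ethernet
  state: up
  mtu: 1400
EOF

# Per this report, verification is expected to fail on NetworkManager >= 1.32,
# because the kernel resets the port MTU to the bond's 1450 after enslaving.
nmstatectl apply /tmp/bond-mtu.yml
```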


Version-Release number of selected component (if applicable):
1.0.2

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Comment 1 Yossi Segev 2022-04-05 11:22:50 UTC
This is the policy I tried to apply:


apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: matrix-bond1-nncp
spec:
  desiredState:
    interfaces:
    - ipv4:
        auto-dns: true
        dhcp: false
        enabled: false
      ipv6:
        auto-dns: true
        autoconf: false
        dhcp: false
        enabled: false
      link-aggregation:
        mode: balance-xor
        options:
          miimon: '120'
        port:
        - ens11
        - ens9
      mtu: 1450
      name: mtx-bond1
      state: up
      type: bond
    - mtu: 1400
      name: ens11
      state: up
      type: ethernet
    - mtu: 1400
      name: ens9
      state: up
      type: ethernet
  nodeSelector:
    kubernetes.io/hostname: c01-rn411-sdn-pvtbd-worker-0-htfjr

Comment 2 Gris Ge 2022-04-06 07:55:13 UTC
Hi Radim,

The bond ports (ens11 and ens9) are set to MTU 1400, while the bond itself uses MTU 1450. Is this a configuration error, or is it deliberate for a special use case?

Comment 3 Yossi Segev 2022-04-06 11:26:36 UTC
Hi Gris,

After checking this scenario, I found that setting the same MTU (1400) for the bond interface and its 2 ports succeeds.
But the configuration described in this bug used to work until recently, and all our downstream bond tests passed with this configuration, where the bond is assigned an MTU 50 bytes higher than its ports.
What happened?

Thank you,
Yossi

Comment 4 Yossi Segev 2022-04-06 12:11:18 UTC
And another clarification: to rule out the possibility that the 1450 MTU was too high, I tried the same policy, only this time setting the bond's MTU to 1400 and the ports' MTU to 1350, and the configuration still failed.

Comment 5 Radim Hrazdil 2022-04-07 07:30:15 UTC
Hi Gris, I don't know if there is an actual use-case behind setting the MTU higher on the bond than on the ports,
but it doesn't seem like a conflicting configuration.

@Yossi was there ever any particular reason to add 50B to the bond?


If you decide to force the MTU to be the same across the bond and the bonded interfaces, it would be nice if there were an error message saying that there is a discrepancy in the MTU settings.

Comment 6 Fernando F. Mancera 2022-04-07 07:35:31 UTC
(In reply to Yossi Segev from comment #3)
> Hi Gris,
> 
> After checking this scenario, I found that setting the same MTU (1400) for
> the bond interface and its 2 ports ends successfully.
> But the configuration described in this bug used to work until recently, and
> all our downstream bond tests passed with this configuration, where the bond
> is assigned +50 MTU more than its ports.
> What happened?
> 
> Thank you,
> Yossi

Hello!

We did not introduce changes in Nmstate 1.0.2, so this does not seem to be an Nmstate fault. That said, Nmstate is not reporting this properly, IMO. Do you know if NetworkManager was updated in your environment? Thank you!

Comment 7 Yossi Segev 2022-04-07 09:32:44 UTC
> I must say that Nmstate is not reporting this properly IMO.

I totally agree, and there's already https://bugzilla.redhat.com/show_bug.cgi?id=2044150 (which evolved from https://bugzilla.redhat.com/show_bug.cgi?id=1812559 I opened) to handle it.


> Do you know if NetworkManager was updated in your environment?

I don't know if it was updated, but I can say that the current NetworkManager on my cluster node is
sh-4.4# NetworkManager --version
1.32.10-4.el8

@Radim:
> @Yossi was there ever any particular reason to add 50B to the bond?

Actually, what we do is subtract 50 from the ports, rather than add 50 to the bond.
I tried to recall and also consulted with colleagues, but unfortunately we can't remember why we had to assign different MTUs to the bond and to its ports.
What we do know is that we had to do it, because our tests failed otherwise. And keep in mind that the tests passed until recently, with different MTUs on the bond and on the ports.

Comment 8 Gris Ge 2022-04-13 13:35:40 UTC
Confirmed. The same YAML works well on NetworkManager-1.30.0-10.el8_4.x86_64 (RHEL 8.4.0), but fails since NetworkManager-1.32.10-4.el8 (RHEL 8.5.0).

I will create a simpler reproducer and collect NM logs for the NetworkManager team to investigate this regression.

As for the MTU difference between port and bond: as long as the kernel supports it and it used to work, we should keep supporting it.

Comment 9 Gris Ge 2022-04-13 13:53:42 UTC
Created attachment 1872179 [details]
Reproducer script

This is a simplified reproducer.

Comment 10 Gris Ge 2022-04-13 13:54:13 UTC
Created attachment 1872181 [details]
NetworkManager trace log

Comment 11 Gris Ge 2022-04-13 13:55:26 UTC
Setting the regression keyword, as this used to work well in RHEL 8.4.

Changing component to NetworkManager for follow-up investigation.

Comment 12 Gris Ge 2022-04-13 13:57:56 UTC
Hi Petr,

Which RHEL releases is your team expecting the fix to be backported to?

Comment 13 Petr Horáček 2022-04-14 09:23:48 UTC
I'm not convinced this should be urgent, unless there is a clear reason why a customer would set different MTUs. IIUIC, such a config does not have a networking justification, does it? I'd suggest we lower it to high at most. Please protest if I'm missing something.

For the same reason, I think it is OK to get a fix for this into 8.6 batch 1. With that it would reach OCP 4.11.z.

Comment 14 Thomas Haller 2022-04-21 16:17:06 UTC
Created attachment 1874176 [details]
rh2071985-repro.sh

Very similar to attachment 1872179 [details], with minor modifications; this is what was used for the following logs.

Comment 15 Thomas Haller 2022-04-21 16:18:17 UTC
Created attachment 1874178 [details]
rh2071985-rhel-8.4.log

logfile running rh2071985-repro.sh on rhel-8.4

NetworkManager 1:1.30.0-13.el8_4
kernel 4.18.0-305.25.1.el8_4

Comment 16 Thomas Haller 2022-04-21 16:19:22 UTC
Created attachment 1874179 [details]
rh2071985-rhel-8.5.log

logfile running rh2071985-repro.sh on rhel-8.5

NetworkManager 1:1.32.10-5.el8_5
kernel 4.18.0-348.23.1.el8_5

Comment 17 Thomas Haller 2022-04-21 16:23:24 UTC
(In reply to Gris Ge from comment #10)
> Created attachment 1872181 [details]
> NetworkManager trace log


The kernel changes the MTU when enslaving a port to a bond.


You can also see that with this shell script:
```
#!/bin/bash

set -x
set -e

ip netns del t 2>/dev/null || :
ip netns add t

ip -netns t link add name v type veth peer name w
ip -netns t link add name b type bond
ip -netns t link set dev w up
ip -netns t link set dev b up

ip -netns t link set dev b mtu 1450
ip -netns t link set dev v mtu 1400
ip -netns t link set dev w mtu 1400

ip -netns t link

ip -netns t link set dev v master b

ip -netns t link
```

This happens on all kernels tested (RHEL 8.4, 4.18.0-305.25.1.el8_4.x86_64; RHEL 8.5, 4.18.0-348.23.1.el8_5.x86_64; Fedora 35).
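
For illustration, the relevant part is the port's MTU after the `master` step; the exact listing format depends on the iproute2 version, so the commented output below is only an approximation:

```
# After "ip -netns t link set dev v master b", query the port directly:
ip -netns t link show dev v
#   ... v@w: <BROADCAST,...> mtu 1450 ... master b ...
# i.e. the kernel raised v's MTU from the configured 1400 to the bond's 1450.
```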


The difference between RHEL 8.4 and RHEL 8.5 is that on RHEL 8.4, NM changes the MTU *after* enslaving the port. Still investigating why.



As to whether this configuration is useful: it seems questionable to me. What happens if the bond interface wants to send a packet of 1450 bytes? Should it be truncated by the port?

Comment 18 Thomas Haller 2022-04-22 13:53:22 UTC
This changed in 1.32.0, with [1]. Now we set the MTU before the "ip-config" state.

Interestingly, we enslave ports during the "ip-config" state ([2]), so now we set the MTU first, before enslaving.
This results in the kernel winning when it changes the MTU while attaching the port.

[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/e67ddd826fae1a348a7a253e9c37744c05645112
[2] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/e67ddd826fae1a348a7a253e9c37744c05645112/src/core/devices/nm-device.c#L6364


A simple fix would be to change the MTU again after enslaving.
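
In terms of the netns script from comment 17, that simple fix would amount to re-applying the configured MTU after the port has been attached. This is only a sketch of the ordering, not the actual NetworkManager change:

```
# Continuing the netns example from comment 17:
ip -netns t link set dev v master b   # kernel bumps v's MTU to the bond's 1450 here
ip -netns t link set dev v mtu 1400   # set the configured MTU again, after enslaving
ip -netns t link show dev v           # v should now report mtu 1400 again
```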

A follow-up rework should be that we don't enslave the port in state "ip-config"; it is the wrong state.



The larger question, however, is whether it's correct to fight with the kernel over setting the MTU.
It's not clear that setting a smaller MTU on a bond port results in a working configuration (see [3], [4]).

[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/bonding/bond_main.c?h=v5.17#n3603
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/bonding/bond_main.c?h=v5.17#n4372


@Yossi, could you clarify again why this is a useful setup? Why is this done, and does this even work in a correct bond configuration? Did you actually test that it works, or did you merely test that such a configuration could be created? You already touched on that topic, but the answer is not clear to me. Could you provide more information about this use case? Thank you.

Comment 19 Yossi Segev 2022-04-24 09:30:38 UTC
@Meni
It seems to me that the fix Thomas is suggesting in comment #18 ("change the MTU again after enslaving") is redundant, as it actually means "fighting with the kernel", as Thomas put it.
I think we should change our tests and keep the same MTU for the bond interface and its ports.
But we should keep in mind that if the future rework Thomas is suggesting ("don't enslave the port in state "ip-config"") is implemented, we will probably be forced to change our tests again.
What do you think?

Comment 20 Thomas Haller 2022-04-26 11:16:18 UTC
With https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1199, NetworkManager is going to set the MTU after attaching the port.

This may not result in a working configuration, but it seems better to honor the user's configuration than to silently not apply it. After all, the user must configure a sensible MTU, and if they fail to do so, they have problems anyway.


This should restore the RHEL 8.4 (1.30) behavior.
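
Once a build containing that change is installed (per the Fixed In Version field, NetworkManager-1.39.3-1.el8 or later), a quick sanity check on an affected node could look like the following; these are generic commands and assume the bond/port names from this report:

```
rpm -q NetworkManager          # expect NetworkManager-1.39.3-1.el8 or newer
ip -d link show ens11          # expect "mtu 1400" on the port...
ip -d link show mtx-bond1      # ...while the bond keeps "mtu 1450"
```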

Comment 26 errata-xmlrpc 2022-11-08 10:07:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7680

Comment 27 Red Hat Bugzilla 2023-09-18 04:34:45 UTC
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 120 days.

