Bug 2017304 - connections created using nmcli/nmtui don't work after a node reboot
Summary: connections created using nmcli/nmtui don't work after a node reboot
Keywords:
Status: CLOSED DUPLICATE of bug 1970021
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Ben Nemec
QA Contact: Victor Voronkov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-26 09:04 UTC by Olimp Bockowski
Modified: 2021-11-02 06:12 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-28 14:45:10 UTC
Target Upstream Version:
Embargoed:



Description Olimp Bockowski 2021-10-26 09:04:23 UTC
OCP Version at Install Time: 4.6.34
RHCOS Version at Install Time: according to OCP version
OCP Version after Upgrade (if applicable): 4.7.31
RHCOS Version after Upgrade (if applicable): according to OCP version
Platform: bare metal
Architecture: x86_64


What are you trying to do? What is your use case?
Persistent connections created using nmcli/nmtui

What happened? What went wrong or what did you expect?
They are lost after the node reboots.

What are the steps to reproduce your issue? Please try to reduce these steps to something that can be reproduced with a single RHCOS node.

In OCP 4.6, we can configure bonding on a node using the following commands:
$ nmcli con add type bond con-name test ifname test mode active-backup
$ nmcli con mod test ipv4.method disabled ipv6.method disabled
$ nmcli con add type bond-slave con-name ens161 ifname ens161 master test
$ nmcli con add type bond-slave con-name ens193 ifname ens193 master test

As a result, the relevant files are created in the /etc/NetworkManager/system-connections/ directory.
These files persist across reboots. NetworkManager reads them and sets up the bond.

However, this changed in OCP 4.7.
The relevant files are created in /etc/NetworkManager/systemConnectionsMerged/ instead.
After a reboot, the files are gone.


No further answers should be needed, since this is easily reproducible, but here are my findings:

Customers were using nmcli to configure additional NICs on 4.6 and everything worked; with 4.7 it no longer does, which is a significant difference. I believe the general root cause is that 4.6 is based on RHEL 8.2 and 4.7 is based on RHEL 8.3.
We also know that the NetworkManager packages have been upgraded to upstream version 1.26.0, which provides a number of enhancements and bug fixes over the previous version:
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/1.26.0/NEWS
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/1.24.0/NEWS

I suspect that the plugin and keyfile configuration changed something here. The latest OCP 4.7 version has a new file, /etc/NetworkManager/conf.d/99-keyfiles.conf, containing:

[main]
plugins=keyfile,ifcfg-rh
[keyfile]
path=/etc/NetworkManager/systemConnectionsMerged

(BTW, the default path was /etc/NetworkManager/system-connections, but it is now /etc/NetworkManager/systemConnectionsMerged.)
When we use nmcli, the profiles are created in /etc/NetworkManager/systemConnectionsMerged and they are volatile.
However, putting them in /etc/NetworkManager/system-connections gives the needed result, i.e. it works. I believe that's due to the ifcfg-rh plugin monitoring this directory.

Nonetheless, the real problem is that /etc/NetworkManager/system-connections-merged (4.7) or /etc/NetworkManager/systemConnectionsMerged (4.8+) is an overlay mount on top of /etc/NetworkManager/system-connections, so everything written to the merged directory (the configured keyfile path) is indeed gone after a reboot.
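
To confirm this on a node, one can check whether the configured keyfile directory is listed as an overlay mount. The sketch below assumes the usual /proc/mounts layout (device, mount point, filesystem type, options). The helper name is_overlay_mount is made up for illustration and is not part of any OpenShift or NetworkManager tooling.

```shell
#!/bin/sh
# is_overlay_mount DIR [MOUNTS_FILE]
# Succeeds (exit 0) when DIR appears as a mount point of type "overlay".
# MOUNTS_FILE defaults to /proc/mounts; the parameter exists only so the
# check can be exercised against a sample file.
is_overlay_mount() {
    dir=$1
    mounts=${2:-/proc/mounts}
    # Field 2 is the mount point, field 3 the filesystem type.
    awk -v dir="$dir" '
        $2 == dir && $3 == "overlay" { found = 1 }
        END { exit !found }
    ' "$mounts"
}
```

For example, `is_overlay_mount /etc/NetworkManager/systemConnectionsMerged && echo "volatile"` on a 4.8+ node. Separately, `nmcli -f NAME,FILENAME connection show` lists the file backing each profile, which shows whether a profile landed in the merged directory.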

In general, that's not good because nmstate is still in tech preview. In the case of static IPs, I don't think there is any other approach (I would consider MachineConfigs if we were using DHCP).

Temporarily, we could create a doc/KCS article describing how to make the profiles persistent by copying them to /etc/NetworkManager/system-connections with cp, but that doesn't look like a neat approach IMHO. It would be better to change the NM config file to write into a persistent directory.
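
A minimal sketch of that cp-based stopgap, under stated assumptions: it assumes the profiles written by nmcli carry the .nmconnection suffix, and that it runs as root on the node. The helper name persist_keyfiles is hypothetical. It only defines a function; nothing is copied until it is called with explicit directories.

```shell
#!/bin/sh
# persist_keyfiles SRC DST
# Copy NetworkManager keyfiles from the volatile directory SRC into the
# persistent directory DST.
persist_keyfiles() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*.nmconnection; do
        # Skip the unexpanded glob when SRC contains no keyfiles.
        [ -e "$f" ] || continue
        # Nothing to do if DST already holds an identical copy.
        cmp -s "$f" "$dst/$(basename "$f")" && continue
        # NetworkManager ignores keyfiles that are accessible to other
        # users, so install the copy with mode 0600.
        install -m 0600 "$f" "$dst/" || return 1
    done
}
```

On a 4.8+ node this would be called as `persist_keyfiles /etc/NetworkManager/systemConnectionsMerged /etc/NetworkManager/system-connections`.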

Comment 2 Timothée Ravier 2021-10-26 09:52:42 UTC
This is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1970021, and this behavior has been reverted in https://github.com/openshift/machine-config-operator/pull/2742, but I don't think the revert has been backported.

A workaround is to create a systemd oneshot service unit that runs on every boot and sets up the required connection configuration.
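
As a rough sketch of what such a unit could run, the script below recreates the bond from the description only when the profiles are missing. The nmcli commands are the ones from the report; the function names, the script path, and the unit shown in the comments are hypothetical and untested on RHCOS.

```shell
#!/bin/sh
# Hypothetical unit that would invoke this script on every boot:
#
#   [Unit]
#   After=NetworkManager.service
#   [Service]
#   Type=oneshot
#   RemainAfterExit=yes
#   ExecStart=/usr/local/bin/restore-bond.sh
#   [Install]
#   WantedBy=multi-user.target

# connection_exists NAME: succeeds when a profile called NAME is known.
connection_exists() {
    nmcli -t -f NAME connection show | grep -Fqx "$1"
}

# ensure_bond BOND NIC...: create the bond and its ports if missing.
ensure_bond() {
    bond=$1
    shift
    if ! connection_exists "$bond"; then
        nmcli con add type bond con-name "$bond" ifname "$bond" mode active-backup || return 1
        nmcli con mod "$bond" ipv4.method disabled ipv6.method disabled || return 1
    fi
    for nic in "$@"; do
        # Leave existing port profiles untouched.
        connection_exists "$nic" && continue
        nmcli con add type bond-slave con-name "$nic" ifname "$nic" master "$bond" || return 1
    done
}
```

With the interface names from the description, the script would end with `ensure_bond test ens161 ens193`.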

Comment 3 Olimp Bockowski 2021-10-26 10:56:27 UTC
@Timothée - OK, so I understand that it works as expected in 4.8 and just hasn't been backported to 4.7? Or maybe it is fixed in 4.9 and the rest are still broken? :]

Comment 4 Olimp Bockowski 2021-10-26 11:01:17 UTC
OK, I see that the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1970021 will be shipped in 4.10.
I am not sure about backporting; maybe a KCS article would be enough...

Comment 5 Timothée Ravier 2021-10-27 14:23:59 UTC
Moving to the MCO as I don't know the details for backporting and which version has which fix.

