Bug 1858439

Summary: Write NetworkManager 'keyfile' configs to installed system
Product: [Fedora] Fedora Reporter: Adam Williamson <awilliam>
Component: anaconda Assignee: Radek Vykydal <rvykydal>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 33 CC: anaconda-maint-list, jkonecny, jonathan, kellin, rvykydal, thaller, vanmeeuwen+fedora, vponcova, wwoods
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: openqa
Fixed In Version: anaconda-33.25-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-08-11 18:28:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1857391    

Description Adam Williamson 2020-07-18 01:04:30 UTC
I haven't verified this yet, but I think I know what's going on.

In the Fedora-Rawhide-20200717.n.0 compose, a couple of tests started failing: the server ends of the VNC install tests. The actual install runs fine, but they fail in a post-install phase where we check whether there are any AVCs, and upload the log if there are. This step is failing because the network isn't working at that point in the test. It worked fine up until the previous compose.

My theory as to what's happening is this: we boot the VNC server end - the system that will actually get installed - with a static network config passed in on the kernel command line:

net.ifnames=0 biosdevname=0 ip=10.0.2.114::10.0.2.2:255.255.255.0:vnc001.domain.local:eth0:off

The client box then just connects to it, clicks through the installer, and exits; its job is done (that one passes). So we're relying on the static network config that we passed in on the command line somehow being transferred to the installed system, so that the installed system has working networking.
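For readers unfamiliar with the `ip=` argument above, here is a minimal Python sketch that splits it into named fields. The field names follow my reading of the dracut.cmdline documentation (`ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:<autoconf>`); `StaticIpConfig` and `parse_ip_argument` are hypothetical names for illustration, not anaconda or dracut code, and the sketch only handles the seven-field IPv4 form used in this test.

```python
from typing import NamedTuple


class StaticIpConfig(NamedTuple):
    """Fields of a seven-part static ip= boot argument (illustrative only)."""
    client_ip: str
    peer: str
    gateway: str
    netmask: str
    hostname: str
    interface: str
    autoconf: str


def parse_ip_argument(arg: str) -> StaticIpConfig:
    """Split 'ip=a:b:c:d:e:f:g' into its seven colon-separated fields.

    Assumption: plain IPv4 values only. Bracketed IPv6 addresses and the
    optional trailing MTU/MAC fields are not handled by this sketch.
    """
    value = arg[len("ip="):] if arg.startswith("ip=") else arg
    fields = value.split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {arg!r}")
    return StaticIpConfig(*fields)


if __name__ == "__main__":
    cfg = parse_ip_argument(
        "ip=10.0.2.114::10.0.2.2:255.255.255.0:vnc001.domain.local:eth0:off"
    )
    print(cfg)
```

The empty second field is the unused peer address, and the final `off` means no automatic configuration, so the address is static.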

I think what happened before is that NetworkManager actually wrote the config out as an ifcfg file, and anaconda has code that runs at the end of the install process, finds any relevant ifcfg files, and writes them to the installed system. So that's the mechanism by which the config got from the cmdline to the installed system.

What changed in Fedora-Rawhide-20200717.n.0 is NetworkManager's default for persisting network config to disk:
https://fedoraproject.org/wiki/Changes/NetworkManager_keyfile_instead_of_ifcfg_rh

It now defaults to persisting the config using its native 'keyfile' config format, not the older ifcfg format. And I'm guessing anaconda does *not* have code to discover relevant *keyfile* configs and copy those across to the installed system. So our test fails because the static network config never makes it to the installed system, which instead tries to bring up the interface using DHCP. That won't work in this case (there is no DHCP server it can talk to).

As I said I haven't verified this yet, but if my memory is correct, I think I'm right :) I'll dig into it more next week. Copying the Change owner for information, because this will affect more than just openQA tests, I think.

Comment 1 Thomas Haller 2020-07-18 14:42:56 UTC
During installation, anaconda is also welcome to run NetworkManager with a configuration snippet

  [main]
  plugins=ifcfg-rh

At least as a quick remedy.

Granted, that results in files still being created in ifcfg-rh format. But that is still fine. rh#1857391 is about changing the default to prefer keyfile. But if anaconda wishes to handle ifcfg files, it may do so.

Comment 2 Adam Williamson 2020-07-19 01:22:59 UTC
Indeed, if I'm right about what's going on, that should work too.

Comment 3 Radek Vykydal 2020-08-07 08:49:59 UTC
Thomas, is there a pattern for the config files we should copy to the installed system (e.g. an "nmconnection" suffix)?
Or should we copy the whole content of /etc/NetworkManager/system-connections?

For ifcfg we copy
DEVICE_CONFIG_FILE_PREFIXES = ("ifcfg-", "keys-", "route-")
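To illustrate why prefix matching of this kind cannot pick up keyfiles, here is a minimal sketch. Only the `DEVICE_CONFIG_FILE_PREFIXES` tuple comes from the comment above; `select_ifcfg_files` is a hypothetical helper and is not the actual anaconda code.

```python
# Tuple quoted in the comment above.
DEVICE_CONFIG_FILE_PREFIXES = ("ifcfg-", "keys-", "route-")


def select_ifcfg_files(filenames):
    """Return the names that start with one of the ifcfg-style prefixes.

    Hypothetical helper: str.startswith accepts a tuple of prefixes, so a
    single call checks all three.
    """
    return [name for name in filenames
            if name.startswith(DEVICE_CONFIG_FILE_PREFIXES)]


if __name__ == "__main__":
    names = ["ifcfg-eth0", "keys-eth0", "route-eth0", "eth0.nmconnection"]
    print(select_ifcfg_files(names))
```

A keyfile such as `eth0.nmconnection` has none of these prefixes, so a copy step built on this filter would skip it.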

Comment 4 Thomas Haller 2020-08-07 09:17:03 UTC
keyfiles reside under /{usr/lib,run,etc}/NetworkManager/system-connections.

Under /usr/lib and /run, the files *MUST* have a ".nmconnection" extension. For historical reasons, that is not required under /etc/NetworkManager/system-connections. NetworkManager writes new files with that extension; however, if you update an existing profile, it does not forcibly rename the file.

Also, theoretically there can be certificate files which NetworkManager could have written to /etc/NetworkManager/system-connections.

So, you basically copy all the files there. There really should be no irrelevant files in that directory.
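The "copy everything" advice could be sketched roughly as follows. This is an illustration under stated assumptions, not the change that went into anaconda: `copy_keyfile_dir`, `source_root`, and `target_root` are hypothetical names, and only the /etc location is handled.

```python
import shutil
from pathlib import Path

# Relative path so it can be joined onto either the installer root or the
# root of the installed system.
KEYFILE_DIR = "etc/NetworkManager/system-connections"


def copy_keyfile_dir(source_root, target_root):
    """Copy every regular file from the source keyfile directory.

    No filename filtering is done, since profiles in /etc need not carry
    the ".nmconnection" extension and certificates may live there too.
    Returns the sorted list of copied file names.
    """
    src = Path(source_root) / KEYFILE_DIR
    dst = Path(target_root) / KEYFILE_DIR
    copied = []
    if not src.is_dir():
        return copied
    dst.mkdir(parents=True, exist_ok=True)
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            # copy2 also copies the permission bits, so restrictive modes
            # on the source files carry over to the copies.
            shutil.copy2(entry, dst / entry.name)
            copied.append(entry.name)
    return copied
```

A call such as `copy_keyfile_dir("/", "/mnt/sysroot")` would then mirror the directory into the target root; the `/mnt/sysroot` path is only an example value.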



To be precise, some filenames (such as those with a leading "." or a ".??????" suffix) are never valid keyfiles. However, they might still be important, for example as ".$UUID.nmmeta" files.
The code that checks valid filenames is at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/24c534225faa786a0a4a4cda6744659493354183/libnm-core/nm-keyfile/nm-keyfile.c#L4253 .
See the "require_extension" argument, which is TRUE for /{usr/lib,run}/ but FALSE for /etc.
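As a rough illustration of the two rules spelled out above (a leading "." is never a valid keyfile name, and the extension is mandatory only when require_extension is set), here is a deliberately simplified Python sketch. It does not reproduce the linked C code: the ".??????" suffix rule and any other checks in nm-keyfile.c are left out, and `may_be_keyfile` is a hypothetical name.

```python
KEYFILE_EXTENSION = ".nmconnection"


def may_be_keyfile(filename, require_extension):
    """Simplified filename check based only on the rules quoted above.

    Assumption: anything not rejected here *may* be a keyfile. The real
    NetworkManager check applies further rules that are not modeled.
    """
    if not filename or filename.startswith("."):
        return False
    if require_extension:
        # Needs the extension plus at least one character in front of it.
        return (filename.endswith(KEYFILE_EXTENSION)
                and len(filename) > len(KEYFILE_EXTENSION))
    return True


if __name__ == "__main__":
    # /run and /usr/lib would use require_extension=True, /etc False.
    for name in ("eth0.nmconnection", "eth0", ".uuid.nmmeta"):
        print(name, may_be_keyfile(name, True), may_be_keyfile(name, False))
```

Because files rejected by such a check (for example ".$UUID.nmmeta") can still matter, a check like this is better suited to explaining the naming rules than to deciding what to copy.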

Comment 5 Radek Vykydal 2020-08-07 11:07:27 UTC
https://github.com/rhinstaller/anaconda/pull/2775

Comment 6 Ben Cotton 2020-08-11 15:25:16 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 33 development cycle.
Changing version to 33.

Comment 7 Adam Williamson 2020-08-11 18:28:21 UTC
So this looks to be working: the openQA tests that were failing started passing today. Let's hope it doesn't have any unexpected effects, but looks good for now.