Bug 2093050

Summary: NetworkManager keeps auto-creating "Wired Connection" during initramfs phase even though it should not
Product: [Fedora] Fedora
Reporter: Adam Williamson <awilliam>
Component: NetworkManager
Assignee: Lubomir Rintel <lkundrak>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 36
CC: acabral, bgalvani, dcbw, francesco.giudici, gnome-sig, liangwen12year, lkundrak, mclasen, rstrode, sandmann, thaller
Hardware: aarch64
OS: Linux
Last Closed: 2022-07-15 17:56:05 UTC
Type: Bug

Description Adam Williamson 2022-06-02 19:41:53 UTC
I have a problem with NM on one of the Fedora infra openQA worker hosts, openqa-a64-worker01.

The host has two wired network interfaces that are plugged in. I *do not want* one of them to be brought up. To this end, I have set up the host variables in infra ansible to not bring it up:

network_connections:
  - autoconnect: no
    mac: "{{ enP2p1s0_mac }}"
    name: enP2p1s0
    state: down
    type: ethernet

enP2p1s0_mac is the MAC address of the interface I do not want brought up. The interface I *do* want brought up is enp1s0.

However, on boot, NetworkManager insists on bringing up both interfaces, under a connection called "Wired Connection". This appears to be happening during the initramfs phase of boot, because it comes before the switch root. We see messages like this:

Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7235] policy: auto-activating connection 'Wired Connection' (9c45133e-f816-4429-b056-f73bf9d284bd)
Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7243] device (enP2p1s0): Activation: starting connection 'Wired Connection' (9c45133e-f816-4429-b056-f73bf9d284bd)
Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7244] device (enP2p1s0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7248] manager: NetworkManager state is now CONNECTING
Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7250] device (enP2p1s0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7266] device (enP2p1s0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7283] dhcp4 (enP2p1s0): activation: beginning transaction (timeout in 90 seconds)

then, later:

Jun 02 19:23:46 openqa-a64-worker01.iad2.fedoraproject.org systemd[1]: Switching root.

The 'real' boot then assumes the connection that was created during initramfs phase and brings up the interface again:

Jun 02 19:23:52 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[1122]: <info>  [1654197832.1668] manager: (enP2p1s0): assume: will attempt to assume matching connection 'Wired Connection' (9c45133e-f816-4429-b056-f73bf9d284bd) (indicated)
Jun 02 19:23:52 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[1122]: <info>  [1654197832.1669] device (enP2p1s0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'assume')
Jun 02 19:23:52 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[1122]: <info>  [1654197832.1682] device (enP2p1s0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'assume')
Jun 02 19:23:52 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[1122]: <info>  [1654197832.1696] device (enP2p1s0): Activation: starting connection 'Wired Connection' (9c45133e-f816-4429-b056-f73bf9d284bd)

and it winds up being brought up under this 'Wired Connection' connection. I do not want this, and cannot figure out how to stop it.
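The before/after switch-root reasoning above can be checked mechanically. The following is a minimal sketch, assuming journal output in the short format shown in this report; the function name `split_by_switch_root` and the `SAMPLE` list are illustrative, not part of any tool. It separates NetworkManager messages logged before systemd's "Switching root." line (the initramfs instance) from those logged after it (the real-root instance).

```python
import re

# One journal line in short format: "<Mon> <day> <time> <host> <unit>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \S+ (?P<unit>[\w.-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def split_by_switch_root(lines):
    """Return (before, after): NetworkManager (pid, message) pairs on each side of switch root."""
    before, after = [], []
    switched = False
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a line in the expected format
        if m["unit"] == "systemd" and "Switching root" in m["msg"]:
            switched = True
            continue
        if m["unit"] == "NetworkManager":
            (after if switched else before).append((int(m["pid"]), m["msg"]))
    return before, after


# Three lines copied from the log excerpts in this report.
SAMPLE = [
    "Jun 02 19:23:35 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197815.7235] policy: auto-activating connection 'Wired Connection' (9c45133e-f816-4429-b056-f73bf9d284bd)",
    "Jun 02 19:23:46 openqa-a64-worker01.iad2.fedoraproject.org systemd[1]: Switching root.",
    "Jun 02 19:23:52 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[1122]: <info>  [1654197832.1696] device (enP2p1s0): Activation: starting connection 'Wired Connection' (9c45133e-f816-4429-b056-f73bf9d284bd)",
]

if __name__ == "__main__":
    initramfs_msgs, real_root_msgs = split_by_switch_root(SAMPLE)
    print("initramfs PIDs:", sorted({pid for pid, _ in initramfs_msgs}))
    print("real-root PIDs:", sorted({pid for pid, _ in real_root_msgs}))
```

Two different NetworkManager PIDs on either side of the switch-root line support the reading that the first activation comes from the initramfs instance.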

As far as I can tell, this *should not* be happening, because dracut writes a config file into the initramfs which tells NM to *not* auto-create connections:

[root@openqa-a64-worker01 adamwill][PROD-IAD2]# cat /usr/lib/dracut/modules.d/35network-manager/initrd-no-auto-default.conf
[.config]
enable=env:initrd

[main]
no-auto-default=*

[root@openqa-a64-worker01 adamwill][PROD-IAD2]# lsinitrd /boot/initramfs-5.17.11-300.fc36.aarch64.img | grep no-auto-def
-rw-r--r--   1 root     root           54 Feb 18 11:32 usr/lib/NetworkManager/conf.d/initrd-no-auto-default.conf

and the log does show NM reading this config file, three seconds *before* it auto-creates the connection:

Jun 02 19:23:32 openqa-a64-worker01.iad2.fedoraproject.org NetworkManager[619]: <info>  [1654197812.4755] Read config: /etc/NetworkManager/NetworkManager.conf (lib: initrd-no-auto-default.conf)

but somehow it just goes ahead and auto-creates "Wired Connection" anyway. I really do not want this, I have tried as hard as I can to make it stop, but it keeps happening. Help!

Comment 1 Thomas Haller 2022-06-03 11:04:31 UTC
In the initrd, nm-initrd-generator runs and generates the NetworkManager configuration used there.

This "Wired Connection" is generated by nm-initrd-generator. That is different from the profiles named "Wired connection 1", which are disabled with the `[main].no-auto-default` setting.

nm-initrd-generator parses the kernel command line. You can also call

  /usr/libexec/nm-initrd-generator --stdout -- $CMDLINE

to see what it would do.
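Note that `$CMDLINE` in the command above is a placeholder for the kernel arguments, not a variable the shell sets by default; if it is unset, the generator receives no arguments at all. A minimal sketch of feeding it the booted kernel's arguments follows. The helper names `generator_argv` and `run_generator` are illustrative, and `/proc/cmdline` only reflects what the bootloader passed, so it may not match what the generator saw inside the initrd.

```python
import shlex
import subprocess
from pathlib import Path

# Path as given in the comment above; may differ on other distributions.
GENERATOR = "/usr/libexec/nm-initrd-generator"


def generator_argv(cmdline):
    """Build the argv: each kernel argument becomes its own argument after `--`."""
    return [GENERATOR, "--stdout", "--", *shlex.split(cmdline)]


def run_generator(cmdline_path="/proc/cmdline"):
    """Run the generator against the arguments stored in cmdline_path and return its output."""
    cmdline = Path(cmdline_path).read_text().strip()
    result = subprocess.run(
        generator_argv(cmdline), capture_output=True, text=True, check=True
    )
    return result.stdout


# Usage (on a host that has the generator installed):
#   print(run_generator())
```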

Comment 2 Adam Williamson 2022-06-30 17:58:34 UTC
So, I forgot about this for a bit, but just rebooted the machine and it fricking happened again.

nm-initrd-generator from the booted system isn't saying anything:

[root@openqa-a64-worker01 adamwill][PROD-IAD2]# /usr/libexec/nm-initrd-generator --stdout -- $CMDLINE
[root@openqa-a64-worker01 adamwill][PROD-IAD2]# 

and there's nothing obviously network-related on the cmdline:

[root@openqa-a64-worker01 adamwill][PROD-IAD2]# cat /proc/cmdline 
BOOT_IMAGE=(hd2,gpt2)/vmlinuz-5.18.5-200.fc36.aarch64 root=UUID=e65d047c-ed45-414a-8dcb-911c0459539d ro rootflags=subvol=root net.ifnames=1

so what's going on?

Comment 3 Beniamino Galvani 2022-07-01 07:37:08 UTC
In dracut, different modules can inject additional kernel arguments at runtime. Or, a custom command line can be stored inside the initrd itself (from kernel_cmdline= in dracut.conf).

To check why the "Wired Connection" is generated, the best way is to add "rd.debug" to the command line and check how nm-initrd-generator is invoked.
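As a complement to rd.debug, the dracut configuration on the booted system can be searched for embedded command-line arguments. This is a rough sketch, assuming the usual dracut configuration locations and simple one-line `kernel_cmdline=` or `kernel_cmdline+=` assignments; the function names are illustrative, and arguments injected at runtime by dracut modules will not show up here.

```python
import glob
import re
from pathlib import Path

# Usual dracut configuration locations (assumption: default paths, see dracut.conf(5)).
CONF_PATTERNS = [
    "/etc/dracut.conf",
    "/etc/dracut.conf.d/*.conf",
    "/usr/lib/dracut/dracut.conf.d/*.conf",
]

# Matches kernel_cmdline="..." and kernel_cmdline+="..." with optional quotes.
ASSIGN_RE = re.compile(r"""^\s*kernel_cmdline\+?=\s*(["']?)(.*?)\1\s*$""")


def embedded_cmdline_args(conf_text):
    """Return the kernel arguments assigned to kernel_cmdline in one config file's text."""
    args = []
    for line in conf_text.splitlines():
        m = ASSIGN_RE.match(line)
        if m:
            args.extend(m.group(2).split())
    return args


def scan_dracut_conf(patterns=CONF_PATTERNS):
    """Map each config file that sets kernel_cmdline to the arguments it adds."""
    found = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            args = embedded_cmdline_args(Path(path).read_text())
            if args:
                found[path] = args
    return found


if __name__ == "__main__":
    for path, args in scan_dracut_conf().items():
        print(path, "->", " ".join(args))
```

Any file reported here that adds network-related arguments such as rd.neednet=1 or ip=... is a candidate for what made the initrd bring up the network.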

Comment 4 Adam Williamson 2022-07-14 19:33:38 UTC
So, a bit of news here: it looks like the culprit here is something in Fedora infra. Still unpicking it with nirik ATM, but there's a /etc/dracut.conf.d/nbde_client.conf which says this:

# nbde_client dracut config
kernel_cmdline="rd.neednet=1"
omit_dracutmodules+=" ifcfg "

this is part of this thing:

https://github.com/linux-system-roles/nbde_client

which we're using on the openQA worker systems, apparently, to do boot-time device decryption. That's what's making dracut bring up the network.

nirik says he's shaved this yak before - he's tried a couple of things to prevent the connection bleeding from dracut into the booted system. There's https://pagure.io/fedora-infra/ansible/blob/main/f/files/common/noautodefault.conf , which gets installed as /etc/NetworkManager/conf.d/00-no-auto.conf on some systems using this role; but we've already discussed above that that probably won't help here.

There's also https://pagure.io/fedora-infra/ansible/blob/main/f/files/common/nbde_client-network-flush , which is getting installed to replace /usr/bin/nbde_client-network-flush on some systems using this role, but currently *not* on the openQA workers. Possibly that does the trick, so I'm going to add it to the openQA workers and see.

I do still wonder, though - is there a way to tell NetworkManager *not* to 'migrate in' connections from the initrd environment to the booted system, if we don't actually want that? Or can it just not be disabled?

Comment 5 Thomas Haller 2022-07-15 15:36:15 UTC
> I do still wonder, though - is there a way to tell NetworkManager *not* to 'migrate in' connections from the initrd environment to the booted system, if we don't actually want that? Or can it just not be disabled?


a) you could just delete /run/NetworkManager during switch root. Maybe also `ip addr flush` on all devices is necessary...

b) there is keep-configuration and allowed-connections in `man NetworkManager.conf`. See

 - https://networkmanager.dev/docs/api/latest/NetworkManager.conf.html
 - https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/df2fe157142f92847d4f14103775e3cf704cb3ca
 - https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/bace14fe1f374db26e49e4e7d61d2fbfce4241cc


If not, could you elaborate on the use case and on what the problem is with either of the two solutions above?
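For option b) above, a drop-in could look roughly like the output of the sketch below. This is an assumption-laden illustration, not a tested configuration: the section name and the helper `build_dropin` are made up, and the values `keep-configuration=no` and `allowed-connections=except:origin:nm-initrd-generator` are taken from a reading of the NetworkManager.conf man page and should be verified against the installed NetworkManager version before use.

```python
import configparser
import io


def build_dropin(interface):
    """Render a NetworkManager.conf drop-in for one interface (hypothetical section name)."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[f"device-{interface}-no-initrd"] = {
        # Apply this section only to the named interface.
        "match-device": f"interface-name:{interface}",
        # Assumption: do not keep configuration found on the device at startup.
        "keep-configuration": "no",
        # Assumption: refuse profiles that originate from nm-initrd-generator.
        "allowed-connections": "except:origin:nm-initrd-generator",
    }
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # enP2p1s0 is the interface from this report that should stay down.
    print(build_dropin("enP2p1s0"), end="")
```

The rendered text would be saved under /etc/NetworkManager/conf.d/ with a name ending in .conf; restricting the section to a single interface avoids changing startup behaviour for the interface that should come up.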

Comment 6 Adam Williamson 2022-07-15 17:56:05 UTC
Thomas: thanks. Hard to tell just from the docs, but it sounds like one or both of those might do the trick. For now I've worked around the issue by just disabling this nbde role on the systems that are affected by this, since they're not actually encrypted and don't need it. But if I get time to look at this again I'll see if one of those does the trick.

I'm gonna close this as NOTABUG for now because it seems like things are more or less working as intended and we've figured out why the connection was being brought up in dracut at all; if I come back to it and find I can't find any way to *not* have the dracut connection activated in the regular system environment I might re-open it for that issue.