Bug 1829461 - initramfs systemd-networkd v245 DHCP lease lost results in NetworkManager disabling IPv4 at boot
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Lubomir Rintel
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-29 15:17 UTC by Matthew Krupcale
Modified: 2021-05-25 16:01 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2021-05-25 16:01:52 UTC
Type: Bug
Embargoed:


Attachments
5.6.6-200.fc31.x86_64 (systemd v243 initramfs) truncated journal (7.24 KB, text/plain)
2020-04-29 15:17 UTC, Matthew Krupcale
5.6.6-300.fc32.x86_64 (systemd v245 initramfs) truncated journal (4.37 KB, text/plain)
2020-04-29 15:18 UTC, Matthew Krupcale

Description Matthew Krupcale 2020-04-29 15:17:21 UTC
Created attachment 1682970
5.6.6-200.fc31.x86_64 (systemd v243 initramfs) truncated journal

Description of problem:

I'm not entirely sure this is a bug in systemd-networkd, but there is a change in behavior in how IPv4 networking/DHCP is stopped in the initramfs: after updating F31->F32 (systemd v243->v245), my server stopped getting IPv4 addresses (via DHCP) on boot. What seems to happen is (a and b refer to systemd v243 and v245, respectively):

1. systemd-networkd gets IPv4/IPv6 addresses via DHCP/DHCPv6
2. before switch to primary userspace, systemd stops Network service
3a. drops DHCPv6 lease only
3b. drops both DHCP and DHCPv6 leases
4. after switch to primary userspace, NetworkManager starts and tries to set up devices and connections
5a. seeing that there is a previous IPv4/IPv6 setup, NetworkManager tries to match the guessed connection using DHCP/DHCPv6, which is the existing connection profile for the device
5b. seeing that there is a previous IPv6-only setup, NetworkManager creates a new connection with disabled IPv4 and only sets up IPv6
6b. after logging in via IPv6 or locally and taking down this new, IPv6-only connection (via "nmcli con down"), the previously existing connection profile for the device takes over and properly sets up both IPv4/IPv6 via DHCP/DHCPv6.

Version-Release number of selected component (if applicable):
systemd-245.4-1.fc32.x86_64
NetworkManager-1.22.10-1.fc32.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Set up systemd-networkd with DHCP=yes in initramfs
2. Have a NetworkManager connection profile for the same device set up with DHCP and IPv6
3. Reboot system
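
Step 1 might correspond to a .network file along the following lines. This is only a sketch: the file name and the interface match are hypothetical and do not come from this report, and the file would also have to be included in the initramfs image.

```ini
# /etc/systemd/network/20-wired.network  (hypothetical file name)
[Match]
# Hypothetical interface match; adjust to the actual device.
Name=en*

[Network]
# Enable both the DHCPv4 and the DHCPv6 client on matching links.
DHCP=yes
```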

Actual results:
New NetworkManager connection created with IPv4 networking disabled post-boot.

Expected results:
IPv4 networking set up via DHCP using the existing NetworkManager connection profile.

Additional info:
Taking down the new NetworkManager connection results in the previously existing connection properly obtaining IPv4/IPv6 addresses via DHCP/DHCPv6.

See the attached truncated logs for booting kernel 5.6.6-200.fc31.x86_64 (systemd v243 initramfs) and 5.6.6-300.fc32.x86_64 (systemd v245 initramfs) showing Network stop prior to root switch and NetworkManager setting up the connection after switch.

As far as solutions go, I'm not entirely sure what should be done. It seems correct that systemd will release both DHCP and DHCPv6 leases in stopping Network prior to switch. However, it also seems reasonable that NetworkManager will try to replicate the existing connection settings after the switch.

When systemd-networkd is not used in initramfs and no IPv4/IPv6 networking is set up prior to switch, NetworkManager seems to set up both IPv4/IPv6 networking post-switch just fine using the existing connection profiles as well. That is, it does not try to disable all networking just because none was detected prior to the switch. Is there a way for systemd-networkd to return to this state from the perspective of NetworkManager? Or should there be changes made by NetworkManager?

Comment 1 Matthew Krupcale 2020-04-29 15:18:37 UTC
Created attachment 1682971
5.6.6-300.fc32.x86_64 (systemd v245 initramfs) truncated journal

Comment 2 Fedora Program Management 2021-04-29 16:21:44 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 3 Zbigniew Jędrzejewski-Szmek 2021-05-18 19:12:37 UTC
I don't think we can support systemd-networkd and NetworkManager taking over from one another.
Having one handle some interfaces and the other others is probably manageable with some careful
config. But passing state for the same connection is too perilous.

> What seems to happen is (a and b refer to systemd v243 and v245, respectively):
> 
> 1. systemd-networkd gets IPv4/IPv6 addresses via DHCP/DHCPv6
> 2. before switch to primary userspace, systemd stops Network service
> 3a. drops DHCPv6 lease only
> 3b. drops both DHCP and DHCPv6 leases
> 4. after switch to primary userspace, NetworkManager starts and tries to set up devices and connections
> 5a. seeing that there is a previous IPv4/IPv6 setup, NetworkManager tries to match the guessed connection using DHCP/DHCPv6, which is the existing connection profile for the device
> 5b. seeing that there is a previous IPv6-only setup, NetworkManager creates a new connection with disabled IPv4 and only sets up IPv6
> 6b. after logging in via IPv6 or locally and taking down this new, IPv6-only connection (via "nmcli con down"), the previously existing connection profile for the device takes over and properly sets up both IPv4/IPv6 via DHCP/DHCPv6.

3a seems to be a bug. 3b seems correct.
5b seems like a bug. I think NM should just do whatever its configuration says and not behave differently based on some preexisting state.

Note: you can set KeepConfiguration=dhcp-on-stop to tell networkd to keep addresses. This might help or might not in this particular case.
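
For illustration, a minimal sketch of where that setting would go, assuming a hypothetical .network file name and interface match (neither appears in this report). KeepConfiguration= belongs in the [Network] section. Whether it actually changes what NetworkManager does after the switch is untested here.

```ini
# /etc/systemd/network/20-wired.network  (hypothetical file name; must be part of the initramfs)
[Match]
# Hypothetical interface match; adjust to the actual device.
Name=en*

[Network]
DHCP=yes
# Ask networkd not to drop addresses and routes when the daemon is stopped,
# e.g. when the Network service is stopped before the switch to the real root.
KeepConfiguration=dhcp-on-stop
```

After changing the file, the initramfs would need to be regenerated for the setting to take effect at early boot.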

Dunno, I'd be inclined to close this as CANTFIX, because it's just too fragile.
I'll assign to NM first for comments though.

Comment 4 Ben Cotton 2021-05-25 16:01:52 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

