Bug 1848460

Summary: [NMCI] incorrect match of devices after service restart with /var/run cleanup
Product: Red Hat Enterprise Linux 8
Reporter: Vladimir Benes <vbenes>
Component: NetworkManager
Assignee: Vladimir Benes <vbenes>
Status: CLOSED NOTABUG
QA Contact: Desktop QE <desktop-qa-list>
Severity: unspecified
Docs Contact:
Priority: high
Version: 8.3
CC: acardace, atragler, bgalvani, fge, lrintel, rkhan, sukulkar, thaller, till
Target Milestone: rc
Keywords: TestOnly, Triaged
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-09-30 08:01:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vladimir Benes 2020-06-18 12:09:52 UTC
Description of problem:
NM should match both devices for which it has profiles, even if /var/run/NetworkManager is gone.

https://desktopqe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/beaker-NetworkManager-master-veth-rhel8-upstream/1718/artifact/artifacts/FAIL_report_NetworkManager-ci_Test0548_match_connections_with_infinite_leasetime.html

Version-Release number of selected component (if applicable):
1.26

Comment 1 Thomas Haller 2020-06-29 09:35:04 UTC
First of all, this is all ugly.

When NetworkManager starts and it finds a device already configured, then there should be two cases only:

- After `systemctl restart NetworkManager`, `/run/NetworkManager/devices` records which devices were active and which profiles were activated. So, after restart, we take over that configuration again (gracefully). The result is that the devices are fully managed after restart.

- /run does not indicate that NetworkManager was managing the device. This is the case on the first start after boot, or after a restart when the state in /run is gone. In that case, NM generates a profile (named like "eth0") and does not touch the device at all. It's a pretend-only mode.
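
The two cases above can be summarized in a small sketch. This is only an illustration of the decision described in this comment, not NetworkManager's actual code: the function name `startup_mode`, the returned mode strings, and the assumption that state is kept as one file per device under the state directory are all hypothetical.

```python
import os


def startup_mode(state_dir, device_key):
    """Return how an already-configured device would be handled at startup.

    state_dir stands in for /run/NetworkManager/devices.
    device_key is a hypothetical per-device file name (an assumption
    made for this illustration only).
    """
    if os.path.isfile(os.path.join(state_dir, device_key)):
        # Case 1: state from the previous run is still there, so the
        # previously activated profile is taken over and the device
        # is fully managed again.
        return "take-over"
    # Case 2: no state (first start after boot, or /run was cleaned up):
    # a profile named after the device is generated and the device
    # itself is left untouched.
    return "pretend-only"
```

With a cleaned-up /run, only the second branch can be reached, which is why the device ends up with a generated profile instead of the one that was active before the restart.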


Now, when NetworkManager is not running in the initrd, dracut configures the device, drops ifcfg files, and expects NetworkManager to take over. That violates the two nice cases that we would like to have. It is not a problem when NetworkManager runs in the initrd. There were many issues about this; one of them is bug 1771792.


The test match_connections_with_infinite_leasetime was added for bug 1771792.

It fails with:

 > Connection 'testG' differs from candidate 'con_general' in 802-3-ethernet.mac-address, ipv4.method, ipv4.gateway, ipv4.addresses

The supposed fix for that is https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/53bb23b403b82bce248018deaa409a8f0aa5d1d4/src/nm-manager.c#L2808

This test doesn't succeed because the fix requires finding the dhclient lease file `/var/run/initramfs/net.testG.lease`. That file would then be copied to "/run/NetworkManager/dhclient-$UUID-testG.lease".

In that case, you should find the message "assume: taking over an initramfs-configured connection" in the log.
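
As a rough model of the condition described above (not the code in nm-manager.c), the check and copy could look like the sketch below. The function name `take_over_initramfs_lease` and its parameters are hypothetical; only the two path patterns come from this comment.

```python
import os
import shutil


def take_over_initramfs_lease(iface, uuid,
                              initramfs_dir="/var/run/initramfs",
                              nm_run_dir="/run/NetworkManager"):
    """Copy the initramfs dhclient lease for iface, if one exists.

    Returns the destination path when the lease was found and copied,
    or None when there is no lease file (the case this test runs into).
    """
    src = os.path.join(initramfs_dir, f"net.{iface}.lease")
    if not os.path.isfile(src):
        # Without the lease file the takeover condition is not met.
        return None
    dst = os.path.join(nm_run_dir, f"dhclient-{uuid}-{iface}.lease")
    os.makedirs(nm_run_dir, exist_ok=True)
    shutil.copyfile(src, dst)
    return dst
```

In this model the test fails at the `os.path.isfile` check, because nothing in the test setup creates `net.testG.lease`.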

How about adjusting the test to have such a file `/var/run/initramfs/net.testG.lease`?

Comment 2 Thomas Haller 2020-06-30 11:01:46 UTC
> How about adjusting the test to have such a file `/var/run/initramfs/net.testG.lease`?

To be precise, I think that creating a file /var/run/initramfs/net.testG.lease should suffice to make the test hit the right condition. The content doesn't really matter, but it should probably be a "proper" dhclient lease file.
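
A test setup step along these lines might be enough. This is a sketch under assumptions: the helper name `write_fake_lease` is hypothetical, the addresses are documentation placeholders (192.0.2.0/24), and the lease fields are only meant to resemble what dhclient writes for an infinite lease, not to reproduce a real one.

```python
import os

# Placeholder lease text loosely modeled on the dhclient.leases format.
# All values are made up for illustration.
LEASE_TEMPLATE = """lease {{
  interface "{iface}";
  fixed-address {address};
  option subnet-mask 255.255.255.0;
  option routers {router};
  option dhcp-lease-time 4294967295;
  option dhcp-server-identifier {router};
  renew never;
  rebind never;
  expire never;
}}
"""


def write_fake_lease(iface, initramfs_dir="/var/run/initramfs",
                     address="192.0.2.10", router="192.0.2.1"):
    """Create net.<iface>.lease in initramfs_dir and return its path."""
    os.makedirs(initramfs_dir, exist_ok=True)
    path = os.path.join(initramfs_dir, f"net.{iface}.lease")
    with open(path, "w") as lease_file:
        lease_file.write(LEASE_TEMPLATE.format(
            iface=iface, address=address, router=router))
    return path
```

Calling `write_fake_lease("testG")` before NetworkManager is restarted would create the file at the path named in this comment; the test would also need to remove it during cleanup.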

Comment 3 Till Maas 2020-07-02 13:48:20 UTC
The test needs to be fixed.