Created attachment 1522090
Journal output of bond99 failure to activate
Description of problem:
nmstate is using libnm to configure a bond profile and activate it.
When using autoconnect=True, the bond fails activation with the following error:
error=nm-manager-error-quark: Connection 'bond99' is not available on the device bond99 at this time. (2)
Version-Release number of selected component (if applicable):
Tested on CentOS container with NM 1.12.0-8.el7_6
Run the nmstate integration tests against https://github.com/nmstate/nmstate/pull/239/commits/8eec4329a8a469b682cc7b4cca9c86392e8752aa
Steps to Reproduce:
Here the new connection gets added:
<trace> [1548051268.2500] ifcfg-rh: write: write connection bond99 (e320e547-051a-4a87-92b6-4925f29aac7c) to file "/etc/sysconfig/network-scripts/ifcfg-bond99"
<debug> [1548051268.2503] ifcfg-rh: loading from file "/etc/sysconfig/network-scripts/ifcfg-bond99"...
Since it has autoconnect=yes, the device gets realized and a new
bond99 link is created:
<debug> [1548051268.2519] device[0x559e5da0ae80] (bond99): unmanaged: flags set to [platform-init,!sleeping=0x10/0x11/unmanaged/unrealized], set-managed [sleeping=0x1])
<trace> [1548051268.2519] dbus-object[0x559e5da0ae80]: export: "/org/freedesktop/NM/Devices/5"
<info> [1548051268.2522] manager: (bond99): new Bond device (/org/freedesktop/NM/Devices/5)
<debug> [1548051268.2523] device[0x559e5da0ae80] (bond99): create (is nm-owned)
<debug> [1548051268.2523] platform: link: adding link 'bond99' of type 'bond' (196609)
At this time the device is unmanaged due to "platform-init,user-conf":
<debug> [1548051268.2580] device[0x559e5da0ae80] (bond99): unmanaged: flags set to [platform-init,user-conf,!sleeping,!loopback=0x210/0x219/unmanaged/unrealized], set-unmanaged [user-conf=0x200])
<info> [1548051268.2587] audit: op="connection-add" uuid="e320e547-051a-4a87-92b6-4925f29aac7c" name="bond99" pid=4316 uid=0 result="success"
and so when a user activation request arrives, it fails because the
device is unmanaged due to platform-init (a flag the user cannot override):
<debug> [1548051268.2637] active-connection[0x559e5d983930]: Failed to activate 'bond99': Connection 'bond99' is not available on the device bond99 at this time.
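To make the "0x210/0x219" notation in the log lines above easier to read, here is a minimal decoding sketch. It assumes the pair is printed as flags/mask, where the mask marks which reasons have been decided and the flags mark which of those are currently set. The bit values are inferred only from the log excerpts in this report (sleeping=0x1 and user-conf=0x200 are printed directly; platform-init=0x10 and loopback=0x8 follow by subtraction) and are not taken from the NetworkManager sources. The function name is hypothetical.

```python
# Bit values inferred from the log lines in this report (an assumption,
# not copied from NetworkManager headers).
UNMANAGED_FLAGS = {
    "sleeping": 0x1,
    "loopback": 0x8,
    "platform-init": 0x10,
    "user-conf": 0x200,
}


def decode_unmanaged(flags, mask):
    """Split a flags/mask pair into (set, cleared) reason names.

    A reason is "set" if its bit is in flags, and "cleared" if its bit
    is in mask but not in flags (shown as "!name" in the log).
    """
    set_names = [n for n, bit in UNMANAGED_FLAGS.items() if flags & bit]
    cleared = [n for n, bit in UNMANAGED_FLAGS.items() if mask & ~flags & bit]
    return set_names, cleared


# The state logged right before the failed activation:
print(decode_unmanaged(0x210, 0x219))
```

Under these assumptions, the device still has platform-init set at the moment the activation request arrives, which matches the failure described above.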
There are two problems here. First, NM should not create a link for a
software device that is declared as unmanaged in configuration. This
is the same issue as bug 1679230 and is already solved on master.
The second problem is that, even with the fix above, this race
condition can still occur when a connection for a software device that
is not unmanaged is added and then activated shortly afterwards. This
problem can also be seen in bug 1700528. I'm attaching a python
reproducer for this.
Created attachment 1555788
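The attachment itself is not reproduced here. As a rough illustration of the add-then-activate sequence, the sketch below adds a bond profile with autoconnect enabled and immediately asks for activation. It uses nmcli through subprocess rather than libnm (which is what nmstate and the failing case use), so it is an assumption that it hits the same window; it may not trigger the race reliably. The helper names are hypothetical.

```python
import subprocess


def build_commands(name="bond99"):
    """Return the nmcli commands: add a bond profile, then activate it."""
    add = [
        "nmcli", "connection", "add",
        "type", "bond",
        "con-name", name,
        "ifname", name,
        "autoconnect", "yes",
    ]
    up = ["nmcli", "connection", "up", name]
    return [add, up]


def reproduce(name="bond99"):
    """Run both commands back to back, with no delay in between.

    Requires root and a running NetworkManager; raises
    subprocess.CalledProcessError if either command fails.
    """
    for cmd in build_commands(name):
        subprocess.run(cmd, check=True)


# Usage (not executed here): reproduce("bond99")
```

The point of the sketch is only the ordering: the activation request is issued right after the profile is added, before the newly created link has finished platform initialization.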
Fix on review:
The issue is gone after building NM with the code here.
Merged to master:
Hi Beniamino Galvani,
Can we expect this bug to be fixed in any version of Fedora?
I can include the fix in the next F30 update, would that be ok?
(In reply to Beniamino Galvani from comment #8)
> I can include the fix in the next F30 update, would that be ok?