Bug 1667874 - Bond activation fails when autoconnect is set to true (using libnm)
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: NetworkManager
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.1
Assignee: Beniamino Galvani
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks: 1689408 1701002
Reported: 2019-01-21 10:30 UTC by Edward Haas
Modified: 2019-07-09 11:21 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments
Journal output of bond99 failure to activate (480.07 KB, text/plain)
2019-01-21 10:30 UTC, Edward Haas
Python reproducer (1.69 KB, text/x-python)
2019-04-17 08:48 UTC, Beniamino Galvani

Description Edward Haas 2019-01-21 10:30:30 UTC
Created attachment 1522090
Journal output of bond99 failure to activate

Description of problem:
nmstate uses libnm to configure a bond profile and then activate it.
When the profile has autoconnect=True, activation of the bond fails with the following error:

error=nm-manager-error-quark: Connection 'bond99' is not available on the device bond99 at this time. (2)

Version-Release number of selected component (if applicable):
Tested on CentOS container with NM 1.12.0-8.el7_6

How reproducible:
Run the nmstate integration tests on https://github.com/nmstate/nmstate/pull/239/commits/8eec4329a8a469b682cc7b4cca9c86392e8752aa


Comment 2 Beniamino Galvani 2019-04-17 08:47:28 UTC
Hi,

Here the new connection gets added:

  <trace> [1548051268.2500] ifcfg-rh: write: write connection bond99 (e320e547-051a-4a87-92b6-4925f29aac7c) to file "/etc/sysconfig/network-scripts/ifcfg-bond99"
  <debug> [1548051268.2503] ifcfg-rh: loading from file "/etc/sysconfig/network-scripts/ifcfg-bond99"...

Since it has autoconnect=yes, the device gets realized and a new
bond99 link is created:

  <debug> [1548051268.2519] device[0x559e5da0ae80] (bond99): unmanaged: flags set to [platform-init,!sleeping=0x10/0x11/unmanaged/unrealized], set-managed [sleeping=0x1])
  <trace> [1548051268.2519] dbus-object[0x559e5da0ae80]: export: "/org/freedesktop/NM/Devices/5"
  <info>  [1548051268.2522] manager: (bond99): new Bond device (/org/freedesktop/NM/Devices/5)
  <debug> [1548051268.2523] device[0x559e5da0ae80] (bond99): create (is nm-owned)
  <debug> [1548051268.2523] platform: link: adding link 'bond99' of type 'bond' (196609)

At this time the device is unmanaged due to "platform-init,user-conf":

  <debug> [1548051268.2580] device[0x559e5da0ae80] (bond99): unmanaged: flags set to [platform-init,user-conf,!sleeping,!loopback=0x210/0x219/unmanaged/unrealized], set-unmanaged [user-conf=0x200])
  <info>  [1548051268.2587] audit: op="connection-add" uuid="e320e547-051a-4a87-92b6-4925f29aac7c" name="bond99" pid=4316 uid=0 result="success"

and so when a user activation request arrives, it fails because the
device is unmanaged by platform-init, which the user cannot override:

  <debug> [1548051268.2637] active-connection[0x559e5d983930]: Failed to activate 'bond99': Connection 'bond99' is not available on the device bond99 at this time.
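
The bracketed summary in the "unmanaged: flags set to" lines above can be decoded mechanically. The sketch below is a hypothetical helper (parse_unmanaged_flags is not part of NM or libnm). It assumes that the two hexadecimal numbers are the flag value followed by a mask of the flags that have been evaluated; that reading is inferred from the quoted lines (0x10/0x11 at first, then 0x210/0x219 once user-conf=0x200 is set) and not taken from NM documentation.

```python
import re

# Hypothetical helper for reading NM debug logs; not part of NetworkManager or libnm.
# Matches "flags set to [<names>=<value>/<mask>/<state>]" as seen in the quoted lines.
_FLAGS_RE = re.compile(
    r"flags set to \[(?P<names>[^=\]]*)=(?P<value>0x[0-9a-fA-F]+)"
    r"/(?P<mask>0x[0-9a-fA-F]+)/(?P<state>[^\]]*)\]"
)


def parse_unmanaged_flags(line):
    """Split the flag summary of one log line into its parts.

    Names prefixed with "!" are reported as cleared, all others as set.
    Treating the two hex numbers as value/mask is an assumption.
    """
    match = _FLAGS_RE.search(line)
    if match is None:
        raise ValueError("no 'flags set to [...]' summary in line")
    names = [n for n in match.group("names").split(",") if n]
    return {
        "set": [n for n in names if not n.startswith("!")],
        "cleared": [n[1:] for n in names if n.startswith("!")],
        "value": int(match.group("value"), 16),
        "mask": int(match.group("mask"), 16),
        "state": match.group("state").split("/"),
    }


# The second "unmanaged" line quoted in this comment.
LINE = (
    "<debug> [1548051268.2580] device[0x559e5da0ae80] (bond99): unmanaged: "
    "flags set to [platform-init,user-conf,!sleeping,!loopback="
    "0x210/0x219/unmanaged/unrealized], set-unmanaged [user-conf=0x200])"
)
info = parse_unmanaged_flags(LINE)
print(info["set"], hex(info["value"]), hex(info["mask"]))
```

Applied to that line, the helper reports platform-init and user-conf as set, which matches the state the device is in when the activation request arrives.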

There are two problems here. First, NM should not create a link for a
software device that is declared as unmanaged in configuration. This
is the same issue as bug 1679230 and is already solved on master.

The second problem is that, even with the fix above, this race
condition can still occur when a connection for a software device
that is not unmanaged is added and then activated shortly afterwards.
The same problem can be seen in bug 1700528. I'm attaching a Python
reproducer for this.
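
For a sense of how short that interval is, the timestamps in the journal excerpt above can be subtracted directly. The sketch below is only an illustration: log_timestamp and elapsed_ms are hypothetical helper names, and it assumes the bracketed number on each line is a time in seconds. Decimal is used so the subtraction is exact.

```python
import re
from decimal import Decimal

# Hypothetical helpers for reading the quoted journal lines; not part of NM.
# Matches "<level> [seconds.fraction]" at the start of a log line.
_TS_RE = re.compile(r"<\w+>\s+\[(\d+\.\d+)\]")


def log_timestamp(line):
    """Return the bracketed timestamp of a log line as a Decimal."""
    match = _TS_RE.search(line)
    if match is None:
        raise ValueError("no '<level> [timestamp]' prefix in line")
    return Decimal(match.group(1))


def elapsed_ms(first_line, second_line):
    """Milliseconds between two log lines, assuming timestamps are in seconds."""
    return (log_timestamp(second_line) - log_timestamp(first_line)) * 1000


# Prefixes of the "write connection" and "Failed to activate" lines quoted above.
ADDED = "<trace> [1548051268.2500] ifcfg-rh: write: write connection bond99"
FAILED = "<debug> [1548051268.2637] active-connection[0x559e5d983930]: Failed to activate 'bond99'"

print(elapsed_ms(ADDED, FAILED))
```

With the two lines from the excerpt, the difference is 13.7 ms between writing the profile and the failed activation.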

Comment 3 Beniamino Galvani 2019-04-17 08:48:52 UTC
Created attachment 1555788
Python reproducer

Comment 4 Beniamino Galvani 2019-05-13 15:23:25 UTC
Fix under review:

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/merge_requests/144

Comment 5 Vladimir Benes 2019-05-14 11:56:36 UTC
The issue is gone after building NM with the code from the merge request above.

Comment 7 Gris Ge 2019-05-29 15:22:42 UTC
Hi Beniamino Galvani,

Can we expect this bug to be fixed in any version of Fedora?

Thank you.

Comment 8 Beniamino Galvani 2019-05-29 16:03:37 UTC
I can include the fix in the next F30 update, would that be ok?

Comment 9 Gris Ge 2019-05-31 15:17:48 UTC
(In reply to Beniamino Galvani from comment #8)
> I can include the fix in the next F30 update, would that be ok?

Yes. Thanks.

