Bug 1394579 - improve handling of unmanaged/assumed devices
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Thomas Haller
QA Contact: Desktop QE
Docs Contact: Ioanna Gkioka
Duplicates: 1400411
Depends On:
Blocks: 1393481 1428406
Reported: 2016-11-13 17:19 EST by Thomas Haller
Modified: 2017-08-01 05:19 EDT

See Also:
Fixed In Version: NetworkManager-1.8.0-0.4.rc1.el7
Doc Type: Enhancement
Doc Text:
*NetworkManager* now better handles device state

With this update, *NetworkManager* now maintains the state of devices after a service restart and takes over interfaces that are set to managed mode during the restart. In addition, *NetworkManager* can handle devices that are not explicitly set as unmanaged but are controlled manually by the user or by another network service.
Story Points: ---
Clone Of:
Clones: 1428406
Environment:
Last Closed: 2017-08-01 05:19:37 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
GNOME Bugzilla | 746440 | None | None | None | 2016-11-13 17:19 EST
Red Hat Product Errata | RHSA-2017:2299 | normal | SHIPPED_LIVE | Moderate: NetworkManager and libnl3 security, bug fix and enhancement update | 2017-08-01 08:40:28 EDT

Description Thomas Haller 2016-11-13 17:19:34 EST
On start/restart, NetworkManager "assumes" a connection on the device.

That is just wrong and causes many issues.

This should be fixed, but it is a large effort and it changes previous behavior.

For details, see upstream bug https://bugzilla.gnome.org/show_bug.cgi?id=746440
Comment 1 Thomas Haller 2016-12-01 05:35:08 EST
*** Bug 1400411 has been marked as a duplicate of this bug. ***
Comment 2 Thomas Haller 2017-03-16 13:39:56 EDT
merged https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=2d1b85f8d7f1e53b581e56f0f542b63e8a80da98 upstream.

This may not yet be the full solution; I think we should separate externally managed devices better and make assuming connections more flexible. But it's another step, and it's all that can be done within the rhel-7.4 time frame.

And it *does* "improve handling of unmanaged/assumed devices" already.
Comment 4 Thomas Haller 2017-03-17 10:40:16 EDT
The change touches areas that were not well defined in NetworkManager or where it would not behave optimally. A lot of that was not properly covered by tests, so as a base requirement, I would already be happy if all old tests succeed (or get adjusted to what we identify as the new, desired, improved behavior).


What matters is start (the first time) and restart of NetworkManager.

Previously, when NM found an already configured interface, it would try to "assume" a connection. It did so for external devices (virbr0) and for devices that are taken over after a restart.
Now, there is a clear distinction between
  "external" (like virbr0): NM now always creates a new in-memory
    connection and pretends that it is active. It's important that in that mode,
    NM does not touch the interface at all.
  "assumed": this means gracefully taking over an already configured interface.
    Currently, that only works after a restart (not on first start), where the
    previous NM run wrote to /var/run/NetworkManager/devices/<IFINDEX> which
    connection to assume. NM then tries to assume that connection, or falls
    back to "external". After the "assumed" activation reaches the ACTIVATED
    state, it becomes identical to "managed". The "assumed" distinction only
    matters initially, during activation.
  "managed": if NM can neither assume the device nor treat it as external, it
    manages it. Meaning: it will try to autoconnect an existing connection.
    This happens especially when the device has no IP configuration yet, so
    that "external" doesn't apply.
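
To make the three cases easier to reason about when writing tests, here is a toy model of the decision in Python. It is not NetworkManager code and only encodes the description above; the names `Mode`, `classify_device`, `has_ip_config`, `saved_connection` and `can_assume` are made up for illustration, and the real logic certainly has more inputs.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    EXTERNAL = "external"
    ASSUMED = "assumed"
    MANAGED = "managed"


def classify_device(has_ip_config: bool,
                    saved_connection: Optional[str],
                    can_assume: bool = True) -> Mode:
    """Toy model of the start/restart decision (hypothetical, simplified).

    has_ip_config:    the interface was already configured when NM started.
    saved_connection: connection recorded by a previous NM run in its
                      run-state directory, or None on a first start.
    can_assume:       whether taking over with saved_connection works.
    """
    # "assumed" needs an already configured interface plus saved state,
    # which only exists after a restart.
    if has_ip_config and saved_connection is not None and can_assume:
        return Mode.ASSUMED
    # A configured interface that cannot be assumed falls back to "external".
    if has_ip_config:
        return Mode.EXTERNAL
    # Nothing to take over: NM manages the device and may autoconnect.
    return Mode.MANAGED
```

For example, `classify_device(True, None)` models a first start with a pre-configured interface such as virbr0, and `classify_device(False, None)` models an unconfigured device.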

So, it's all about starting/restarting NM. A first-start is different from a restart in that there is no state in /var/run/NetworkManager directory. Simulate first-start by removing that directory before starting NM.
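
A minimal sketch of how a test could simulate a first start, assuming a systemd host, root privileges, and that the service unit is called `NetworkManager`. The helper names (`clear_run_state`, `state_file_for`, `simulate_first_start`) are hypothetical; only the directory layout comes from the comment above.

```python
import shutil
import subprocess
from pathlib import Path

# Run-state directory as named in this comment.
STATE_DIR = Path("/var/run/NetworkManager")


def clear_run_state(state_dir: Path = STATE_DIR) -> bool:
    """Remove NM's run-state directory; return True if it existed."""
    existed = state_dir.exists()
    shutil.rmtree(state_dir, ignore_errors=True)
    return existed


def state_file_for(ifindex: int, state_dir: Path = STATE_DIR) -> Path:
    """Path of the per-device state file, devices/<IFINDEX>."""
    return state_dir / "devices" / str(ifindex)


def simulate_first_start(service: str = "NetworkManager") -> None:
    """Stop NM, wipe its run state, then start it again (assumes systemd)."""
    subprocess.run(["systemctl", "stop", service], check=True)
    clear_run_state()
    subprocess.run(["systemctl", "start", service], check=True)
```

Calling `simulate_first_start()` instead of a plain restart should exercise the "external"/"managed" paths rather than "assumed", since no saved connection is left to assume.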


Interesting tests are:

 - have an external interface and start NM. See that the device is in "external" 
   mode and not touched by NM. We already have tests for that. E.g. no DHCP for
   this interface, addresses/routes are preserved.
 - have a device managed by NM and restart NM. See that the connection gets
   assumed (non-destructively) and is afterwards fully managed by NM (e.g. DHCP 
   leases get extended).
 - it gets more interesting when starting with nested slave/master hierarchies
   (bond/vlan/bridge/team). When all interfaces are "external", we would expect
   that NM activates external connections on all of them and does not touch the
   interfaces at all. If they were all activated in a previous run, we would
   expect that NM assumes them all and manages them all fully.
   It gets more complicated when the decision external/assumed/managed differs
   between master/slaves. I suspect there are bugs in this regard that we have
   to figure out.
 - what happens when setting an interface's managed state with
   `nmcli device set $IF managed yes|no`? Does that work as one would expect?
   Also, we now persist the managed state in /var/run/NetworkManager. That
   means the managed state is preserved after a restart of NM (but not across
   a reboot).
   Especially interesting: what happens if you set a device as unmanaged that
   is "external"? It is unclear what is even desired. See bug 1371433.
   How does it work when setting a master/slave as unmanaged?
   What happens when having a set of unmanaged master/slave devices, and then
   managing one of them?
 - activate another connection on an "external" managed device.
 - modify the generated, in-memory connection of an external device. That causes
   the connection to be persisted. It's unclear what should happen with the 
   device. Probably we should now assume it (gracefully).
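
Two of the checks above could be scripted roughly as follows. This is only a sketch: it assumes `ip` (iproute2) and `nmcli` are installed, and it assumes that `nmcli -t -f GENERAL.STATE device show IF` prints a line shaped like `GENERAL.STATE:10 (unmanaged)`. All helper names are hypothetical.

```python
import re
import subprocess
from typing import List


def normalize(line: str) -> str:
    """Drop lifetimes/expiry timers, which tick down and cause false diffs."""
    line = re.sub(r"\b(valid_lft|preferred_lft|expires) \S+", "", line)
    return " ".join(line.replace("\\", " ").split())


def ip_snapshot(ifname: str) -> List[str]:
    """Sorted addresses and routes on ifname, as reported by iproute2."""
    lines: List[str] = []
    for cmd in (["ip", "-o", "addr", "show", "dev", ifname],
                ["ip", "route", "show", "dev", ifname]):
        out = subprocess.run(cmd, check=True, capture_output=True,
                             text=True).stdout
        lines += [normalize(l) for l in out.splitlines() if l.strip()]
    return sorted(lines)


def set_managed(ifname: str, managed: bool) -> None:
    """Wrapper around the nmcli command quoted in the list above."""
    subprocess.run(["nmcli", "device", "set", ifname,
                    "managed", "yes" if managed else "no"], check=True)


def parse_state(terse: str) -> str:
    """Extract the state name from nmcli terse output (assumed shape)."""
    value = terse.strip().split(":", 1)[1]
    return value[value.index("(") + 1:value.rindex(")")]


def device_state(ifname: str) -> str:
    """Current GENERAL.STATE name of ifname according to nmcli."""
    out = subprocess.run(
        ["nmcli", "-t", "-f", "GENERAL.STATE", "device", "show", ifname],
        check=True, capture_output=True, text=True).stdout
    return parse_state(out)
```

For the "external" check, take `ip_snapshot(IF)` before starting NM and again afterwards and compare the two lists. For the persistence check, call `set_managed(IF, False)`, restart NM, and see whether `device_state(IF)` still reports the device as unmanaged.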



Please open new bugs for each defect you find and let's keep this as tracker bug for the individual issues.
Comment 6 Vladimir Benes 2017-06-08 08:06:39 EDT
I think we have a lot of scenarios covered and also fixed, so let's wait for real-world usage failures, as I don't see any improvements to be made now.
Comment 7 errata-xmlrpc 2017-08-01 05:19:37 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2299
