Bug 1376199 - stalled eth1.80 vlan after restart and connection delete
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Beniamino Galvani
QA Contact: Desktop QE
Depends On:
Blocks:
Reported: 2016-09-14 16:30 EDT by Vladimir Benes
Modified: 2017-08-01 05:17 EDT
CC List: 9 users

See Also:
Fixed In Version: NetworkManager-1.8.0-7.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 05:17:07 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
logz (188.96 KB, text/plain)
2016-09-14 16:31 EDT, Vladimir Benes


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2017:2299
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: NetworkManager and libnl3 security, bug fix and enhancement update
Last Updated: 2017-08-01 08:40:28 EDT

Description Vladimir Benes 2016-09-14 16:30:04 EDT
Description of problem:
The vlan device is left behind (stale) after this test scenario:

nmcli connection add type vlan con-name vlan dev eth1 id 80
nmcli connection modify vlan eth.mtu 1450 ipv4.method manual ipv4.addresses 1.2.3.4/24
nmcli connection up id testeth1
nmcli con up id vlan
systemctl restart NetworkManager
nmcli connection
nmcli con del vlan
nmcli device

result:
eth1.80  vlan      unmanaged     --    
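
A minimal sketch of checking for this result automatically, assuming the tabular `nmcli device` output quoted in this report (DEVICE, TYPE, STATE, CONNECTION columns). The helper name `find_device` is hypothetical; it is not part of nmcli or of the NetworkManager test suite, and it only handles connection names without spaces.

```python
def find_device(nmcli_output, name):
    """Return (type, state, connection) for `name`, or None if it is not listed."""
    for line in nmcli_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 4 or fields[0] == "DEVICE":
            continue
        if fields[0] == name:
            return tuple(fields[1:4])
    return None


# Sample text assembled from the output lines quoted in this report.
sample = """DEVICE   TYPE      STATE         CONNECTION
eth1.80  vlan      unmanaged     --
"""

stale = find_device(sample, "eth1.80")
if stale is not None:
    print("device still present:", stale)
```

In a real test the text would come from running `nmcli device` after `nmcli con del vlan`; the expected result is that the device is no longer listed.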

Version-Release number of selected component (if applicable):
NetworkManager-1.4.0-6.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. See above

Actual results:
stalled device

Expected results:
device should be deleted

Additional info:
log attached
Comment 1 Vladimir Benes 2016-09-14 16:31 EDT
Created attachment 1200967 [details]
logz
Comment 2 Aniss 2016-09-15 11:36:40 EDT
Confirmed:

Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
NetworkManager 1.4.0-0.5.beta1.el7

after running
# nmcli con del vlan
# nmcli device
results:
DEVICE   TYPE      STATE         CONNECTION
p6p1.80  vlan      disconnected  --  
-------------
with
Red Hat Enterprise Linux Server release 7.2 
NetworkManager 1.0.6-27.el7

the device was deleted
Comment 3 Aniss 2016-09-15 11:42:05 EDT
(In reply to Aniss from comment #2)

> -------------
> with
> Red Hat Enterprise Linux Server release 7.2 
> NetworkManager 1.0.6-27.el7
> 
> the device was deleted
Not quite: I missed a step here (systemctl restart NetworkManager). After repeating the test with that step, I got:
DEVICE   TYPE      STATE         CONNECTION
em1.80  vlan      disconnected  --
Comment 4 Beniamino Galvani 2016-09-21 10:26:04 EDT
Hi, in this scenario NM tries to fulfill these two goals:

 - keeping the connection up when NM is stopped, to avoid breaking
   connectivity

 - not destroying software devices that already existed when NM started

If both are satisfied, the result is exactly what you see: after a
restart, NM finds a pre-existing vlan device and will not delete it
upon disconnect.

We have planned to rework how the connection assumption works and that
change will probably improve this scenario; see bug [1] for more details.

For now I propose to close this, as NM is behaving as expected.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=746440
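
A simplified model of the behaviour described in this comment, not NetworkManager's actual code. The `Device` class and both helper functions are hypothetical names introduced only for illustration; the assumption is that a device is removed on disconnect only if it is a software device that the running NM process created itself.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    is_software: bool
    nm_owned: bool  # True only if this NM process created the device


def realize(name, is_software, created_by_this_process):
    # A device that already exists when the daemon starts is treated as external.
    return Device(name, is_software, nm_owned=created_by_this_process)


def delete_on_disconnect(dev):
    # Only software devices that NM itself created are removed.
    return dev.is_software and dev.nm_owned


before_restart = realize("eth1.80", True, created_by_this_process=True)
after_restart = realize("eth1.80", True, created_by_this_process=False)

print(delete_on_disconnect(before_restart))  # True
print(delete_on_disconnect(after_restart))   # False
```

After the restart the same vlan link looks pre-existing to the new process, which is why the delete leaves it in place.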
Comment 5 Beniamino Galvani 2017-05-30 12:46:37 EDT
Now that we have a state file to persist the device state across daemon restarts, we could also record there whether the device was created by NM, and do the right thing after a restart. Implementation in branch:

 bg/nm-owned-persist-rh1376199

Please review.
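
The idea can be sketched as follows, as an illustration only. The file layout (one key file per interface index with a `[device]` section and an `nm-owned` key) and the function names `save_state` and `load_nm_owned` are assumptions made for this example, not the format or API used in the branch.

```python
import configparser
import os
import tempfile


def save_state(state_dir, ifindex, nm_owned):
    """Write the ownership flag for one device before the daemon stops."""
    cfg = configparser.ConfigParser()
    cfg["device"] = {"nm-owned": "true" if nm_owned else "false"}
    with open(os.path.join(state_dir, str(ifindex)), "w") as f:
        cfg.write(f)


def load_nm_owned(state_dir, ifindex):
    """Read the flag back after restart; a missing file means 'not ours'."""
    cfg = configparser.ConfigParser()
    cfg.read(os.path.join(state_dir, str(ifindex)))  # ignores missing files
    return cfg.getboolean("device", "nm-owned", fallback=False)


with tempfile.TemporaryDirectory() as state_dir:
    save_state(state_dir, 7, nm_owned=True)   # written before the restart
    restored = load_nm_owned(state_dir, 7)    # read by the new process
    unknown = load_nm_owned(state_dir, 8)     # no state file: external device

print(restored, unknown)  # True False
```

Defaulting to "not owned" when no state exists keeps the conservative behaviour for devices that NM never created.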
Comment 6 Thomas Haller 2017-05-31 05:19:20 EDT
I'm not fond of nm_device_set_nm_owned(): the device gets fully realized, and only then do we set the flag.

How about nm_device_realize_start() and nm_device_create_and_realize() having an argument "nm_owned", and the caller (NMManager) determines for the device whether it is nm-owned -- and it should do so very early when realizing the device.
Comment 7 Beniamino Galvani 2017-06-05 03:29:38 EDT
(In reply to Thomas Haller from comment #6)
> How about nm_device_realize_start() and nm_device_create_and_realize()
> having an argument "nm_owned", and the caller (NMManager) determines for the
> device whether it is nm-owned -- and it should do so very early when
> realizing the device.

These functions are called from 6 different places and I prefer not to
patch all those to load the state. Instead, how about setting nm-owned
in realize_start_setup(), which is called by both functions? Repushed
branch bg/nm-owned-persist-rh1376199.
Comment 8 Thomas Haller 2017-06-06 13:38:00 EDT
yeah, the place is good.

Could we move it up a bit? I think it should be set as early as possible.
Maybe immediately after _add_capabilities() (because we check for NM_DEVICE_CAP_IS_SOFTWARE).

Otherwise lgtm. This might fix CI failure bug 1452062. Will test tomorrow.
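
The structure discussed in comments 6 to 8 can be modelled roughly as below. This is a Python stand-in, not the C code from the branch: the class, the `saved` dictionary and the constant value are hypothetical, and only the function names mirror those mentioned in the comments. It shows one shared setup step that both entry points call, with the flag set right after the capabilities are known because the check depends on them.

```python
CAP_IS_SOFTWARE = 0x1  # stand-in for NM_DEVICE_CAP_IS_SOFTWARE


class Device:
    def __init__(self, name, is_software):
        self.name = name
        self._is_software = is_software
        self.capabilities = 0
        self.nm_owned = False

    def _add_capabilities(self):
        if self._is_software:
            self.capabilities |= CAP_IS_SOFTWARE

    def _realize_start_setup(self, saved):
        self._add_capabilities()
        # Set the flag immediately after the capabilities are known:
        # only software devices can be NM-owned.
        if self.capabilities & CAP_IS_SOFTWARE:
            self.nm_owned = saved.get(self.name, False)
        # ... the rest of the setup would follow here

    def realize_start(self, saved):
        self._realize_start_setup(saved)

    def create_and_realize(self, saved):
        self._realize_start_setup(saved)


saved = {"eth1.80": True, "eth1": True}  # flags restored from persisted state

vlan = Device("eth1.80", is_software=True)
vlan.realize_start(saved)
eth = Device("eth1", is_software=False)
eth.create_and_realize(saved)

print(vlan.nm_owned, eth.nm_owned)  # True False
```

Keeping the lookup in the shared helper means none of the callers of the two entry points needs to load the state itself.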
Comment 9 Beniamino Galvani 2017-06-07 02:33:03 EDT
(In reply to Thomas Haller from comment #8)
> yeah, the place is good.
> 
> Could we move it up a bit? I think it should be set as early as possible.
> Maybe immediately after _add_capabilities() (because we check for
> NM_DEVICE_CAP_IS_SOFTWARE).

Branch bg/nm-owned-persist-rh1376199 updated.
Comment 10 Thomas Haller 2017-06-07 03:53:10 EDT
lgtm
Comment 12 Thomas Haller 2017-06-08 16:06:06 EDT
(In reply to Beniamino Galvani from comment #11)
> Merged to master:
> 
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=3223d92eeaf704f0bed774610f5935b8fcfb1adb

I did a related follow-up patch (merged to master as https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=d83848be9dfd0edb5f318b81854b371133d84f6e )

I also backported bg/nm-owned-persist-rh1376199 branch + the follow-up patch to nm-1-8, as:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=de1c460e586be65f5549c5d705a10888d5f1baae
https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=8e25de8ab360fc973d7222685f107b81dd872dc1
Comment 13 Thomas Haller 2017-06-09 04:02:28 EDT
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1452062#c7 for the rhel-7.4 backport.
Comment 15 errata-xmlrpc 2017-08-01 05:17:07 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2299
