|Summary:||stalled eth1.80 vlan after restart and connection delete|
|Product:||Red Hat Enterprise Linux 7||Reporter:||Vladimir Benes <vbenes>|
|Component:||NetworkManager||Assignee:||Beniamino Galvani <bgalvani>|
|Status:||CLOSED ERRATA||QA Contact:||Desktop QE <desktop-qa-list>|
|Version:||7.3||CC:||aloughla, atragler, bgalvani, lmiksik, lrintel, mleitner, rkhan, sukulkar, thaller|
|Fixed In Version:||NetworkManager-1.8.0-7.el7||Doc Type:||If docs needed, set a value|
|Doc Text:||Story Points:||---|
|Last Closed:||2017-08-01 09:17:07 UTC||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
Description Vladimir Benes 2016-09-14 20:30:04 UTC
Description of problem:
stalled vlan after this test scenario:

nmcli connection add type vlan con-name vlan dev eth1 id 80
nmcli connection modify vlan eth.mtu 1450 ipv4.method manual ipv4.addresses 220.127.116.11/24
nmcli connection up id testeth1
nmcli con up id vlan
systemctl restart NetworkManager
nmcli connection
nmcli con del vlan
nmcli device

result:
eth1.80 vlan unmanaged --

Version-Release number of selected component (if applicable):
NetworkManager-1.4.0-6.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. see above

Actual results:
stalled device

Expected results:
device should be deleted

Additional info:
log attached
Comment 2 Aniss Loughlam 2016-09-15 15:36:40 UTC
Confirmed:
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
NetworkManager 1.4.0-0.5.beta1.el7

after running
# nmcli con del vlan
# nmcli device

results:
DEVICE TYPE STATE CONNECTION
p6p1.80 vlan disconnected --

-------------
with
Red Hat Enterprise Linux Server release 7.2
NetworkManager 1.0.6-27.el7

the device was deleted
Comment 3 Aniss Loughlam 2016-09-15 15:42:05 UTC
(In reply to Aniss from comment #2)
> -------------
> with
> Red Hat Enterprise Linux Server release 7.2
> NetworkManager 1.0.6-27.el7
>
> the device was deleted

Not really, I missed a step here (systemctl restart NetworkManager). I tried it again and I got:

DEVICE TYPE STATE CONNECTION
em1.80 vlan disconnected --
Comment 4 Beniamino Galvani 2016-09-21 14:26:04 UTC
Hi,

in this scenario NM tries to fulfill these two goals:

- keeping the connection up when NM is stopped, to avoid breaking connectivity
- not destroying software devices that already existed when NM started

If both are satisfied, the result is exactly what you see: after a restart, NM finds a pre-existing vlan device and will not delete it upon disconnect.

We have planned to rework how connection assumption works, and that change will probably improve this scenario; see bug [1] for more details.

For now I propose to close this, as NM is behaving as expected.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=746440
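The rule described in this comment can be modelled as a small decision function. This is only an illustrative sketch, not NetworkManager's code: `struct vdev`, its fields and `should_delete_on_disconnect()` are hypothetical names chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a device as seen by the daemon. */
struct vdev {
    bool is_software;        /* virtual device such as a VLAN */
    bool existed_at_startup; /* link was already present when the daemon started */
};

/* Rule as described in the comment: a software device that already
 * existed when the daemon started is never destroyed on disconnect. */
bool should_delete_on_disconnect(const struct vdev *d)
{
    return d->is_software && !d->existed_at_startup;
}
```

Under this model, a VLAN created by the running daemon is removed on disconnect, while the same VLAN found after a daemon restart counts as pre-existing and is left in place, which matches the reported result.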
Comment 5 Beniamino Galvani 2017-05-30 16:46:37 UTC
Now that we have a state file to persist the device state on daemon restart, we could save there whether the device was created by NM or not, and do the right thing after restart. Implementation in branch: bg/nm-owned-persist-rh1376199 Please review.
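The idea of persisting ownership across a restart could look roughly like the following. It is a minimal sketch under stated assumptions: the one-line `nm-owned=true|false` file format, the caller-supplied path and all function names are invented for illustration and are not taken from the branch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Write the ownership flag before the daemon stops.
 * The "nm-owned=" line format is an assumption of this sketch. */
bool save_nm_owned(const char *path, bool nm_owned)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return false;
    fprintf(f, "nm-owned=%s\n", nm_owned ? "true" : "false");
    return fclose(f) == 0;
}

/* Read the flag back after a restart. A missing or unreadable file
 * means ownership is unknown, so the safe default is "not owned". */
bool load_nm_owned(const char *path)
{
    char line[64];
    bool owned = false;
    FILE *f = fopen(path, "r");
    if (!f)
        return false;
    while (fgets(line, sizeof line, f)) {
        if (strcmp(line, "nm-owned=true\n") == 0)
            owned = true;
    }
    fclose(f);
    return owned;
}

/* With the flag restored, a software device created by the daemon can
 * be deleted on disconnect even though it predates the current run. */
bool delete_on_disconnect_with_state(bool is_software, bool nm_owned)
{
    return is_software && nm_owned;
}
```

Defaulting to "not owned" when no state is available keeps the earlier guarantee that externally created devices are never destroyed.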
Comment 6 Thomas Haller 2017-05-31 09:19:20 UTC
I dislike a bit that there is nm_device_set_nm_owned(), so the device gets fully realized, and only then we set the flag. How about nm_device_realize_start() and nm_device_create_and_realize() having an argument "nm_owned", and the caller (NMManager) determines for the device whether it is nm-owned -- and it should do so very early when realizing the device?
Comment 7 Beniamino Galvani 2017-06-05 07:29:38 UTC
(In reply to Thomas Haller from comment #6)
> How about nm_device_realize_start() and nm_device_create_and_realize()
> having an argument "nm_owned", and the caller (NMManager) determines for the
> device whether it is nm-owned -- and it should do so very early when
> realizing the device.

These functions are called from 6 different places and I prefer not to patch all those to load the state. Instead, how about setting nm-owned in realize_start_setup(), which is called by both functions?

Repushed branch bg/nm-owned-persist-rh1376199.
Comment 8 Thomas Haller 2017-06-06 17:38:00 UTC
Yeah, the place is good.

Could we move it up a bit? I think it should be set as early as possible. Maybe immediately after _add_capabilities() (because we check for NM_DEVICE_CAP_IS_SOFTWARE).

Otherwise lgtm.

This might fix CI failure bug 1452062. Will test tomorrow.
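The ordering constraint mentioned here (the flag depends on the software capability, so capabilities have to be known first) can be sketched as follows. The struct, the `CAP_IS_SOFTWARE` constant and every function below are simplified stand-ins invented for this example; they do not reproduce the real realize_start_setup() or _add_capabilities().

```c
#include <stdbool.h>

#define CAP_IS_SOFTWARE 0x1u /* stand-in for NM_DEVICE_CAP_IS_SOFTWARE */

struct sketch_device {
    unsigned capabilities;
    bool nm_owned;
};

/* Stand-in for the capability setup step. */
void add_capabilities(struct sketch_device *d, bool is_software)
{
    if (is_software)
        d->capabilities |= CAP_IS_SOFTWARE;
}

/* Only software devices can be owned, so this check is meaningful
 * only once the capabilities have been filled in. */
void set_nm_owned_from_state(struct sketch_device *d, bool persisted)
{
    d->nm_owned = (d->capabilities & CAP_IS_SOFTWARE) && persisted;
}

/* Setup order suggested in the comment: capabilities first, then the
 * ownership flag, then everything else. */
void realize_start_setup_sketch(struct sketch_device *d,
                                bool is_software, bool persisted)
{
    add_capabilities(d, is_software);
    set_nm_owned_from_state(d, persisted);
    /* ... remaining setup would follow here ... */
}
```

Calling `set_nm_owned_from_state()` before `add_capabilities()` would leave `nm_owned` false even for a software device with a persisted flag, which is why the flag cannot be set any earlier than right after the capability step.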
Comment 9 Beniamino Galvani 2017-06-07 06:33:03 UTC
(In reply to Thomas Haller from comment #8)
> yeah, the place is good.
>
> could we move it a bit up, I think it should set as early as possible.
> Maybe immediately after _add_capabilities() (because we check for
> NM_DEVICE_CAP_IS_SOFTWARE).

Branch bg/nm-owned-persist-rh1376199 updated.
Comment 10 Thomas Haller 2017-06-07 07:53:10 UTC
Comment 11 Beniamino Galvani 2017-06-07 08:31:56 UTC
Comment 12 Thomas Haller 2017-06-08 20:06:06 UTC
(In reply to Beniamino Galvani from comment #11)
> Merged to master:
>
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=3223d92eeaf704f0bed774610f5935b8fcfb1adb

I did a related follow-up patch (merged to master as https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=d83848be9dfd0edb5f318b81854b371133d84f6e ).

I also backported the bg/nm-owned-persist-rh1376199 branch + the follow-up patch to nm-1-8, as:
https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=de1c460e586be65f5549c5d705a10888d5f1baae
https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=8e25de8ab360fc973d7222685f107b81dd872dc1
Comment 13 Thomas Haller 2017-06-09 08:02:28 UTC
please see https://bugzilla.redhat.com/show_bug.cgi?id=1452062#c7 for rhel-7.4 backport.
Comment 15 errata-xmlrpc 2017-08-01 09:17:07 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2299