Bug 1376199
Field | Value
---|---
Summary | stalled eth1.80 vlan after restart and connection delete
Product | Red Hat Enterprise Linux 7
Component | NetworkManager
Version | 7.3
Status | CLOSED ERRATA
Severity | medium
Priority | medium
Reporter | Vladimir Benes <vbenes>
Assignee | Beniamino Galvani <bgalvani>
QA Contact | Desktop QE <desktop-qa-list>
CC | aloughla, atragler, bgalvani, lmiksik, lrintel, mleitner, rkhan, sukulkar, thaller
Target Milestone | rc
Target Release | ---
Hardware | Unspecified
OS | Unspecified
Fixed In Version | NetworkManager-1.8.0-7.el7
Doc Type | If docs needed, set a value
Story Points | ---
Last Closed | 2017-08-01 09:17:07 UTC
Type | Bug
Regression | ---
Mount Type | ---
Documentation | ---
Category | ---
oVirt Team | ---
Cloudforms Team | ---

Description
Vladimir Benes
2016-09-14 20:30:04 UTC
Created attachment 1200967
logz
Confirmed:
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
NetworkManager 1.4.0-0.5.beta1.el7

after running

    # nmcli con del vlan
    # nmcli device

results:

    DEVICE    TYPE  STATE         CONNECTION
    p6p1.80   vlan  disconnected  --

-------------

with
Red Hat Enterprise Linux Server release 7.2
NetworkManager 1.0.6-27.el7

the device was deleted

(In reply to Aniss from comment #2)
> -------------
> with
> Red Hat Enterprise Linux Server release 7.2
> NetworkManager 1.0.6-27.el7
>
> the device was deleted

Not really, I missed a step here (systemctl restart NetworkManager). I tried it again and I got:

    DEVICE   TYPE  STATE         CONNECTION
    em1.80   vlan  disconnected  --

Hi,

in this scenario NM tries to fulfill these two goals:

- keeping the connection up when NM is stopped, to avoid breaking connectivity
- not destroying software devices that already existed when NM started

If both are satisfied, the result is exactly what you see: after a restart, NM finds a pre-existing vlan device and will not delete it upon disconnect.

We have planned to rework how connection assumption works, and that change will probably improve this scenario; see bug [1] for more details. For now I propose to close this, as NM is behaving as expected.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=746440

Now that we have a state file to persist the device state across a daemon restart, we could save there whether the device was created by NM or not, and do the right thing after restart.

Implementation in branch: bg/nm-owned-persist-rh1376199

Please review.

I dislike a bit that there is nm_device_set_nm_owned(): the device gets fully realized, and only then do we set the flag.

How about nm_device_realize_start() and nm_device_create_and_realize() having an argument "nm_owned", and the caller (NMManager) determines for the device whether it is nm-owned -- and it should do so very early when realizing the device.

(In reply to Thomas Haller from comment #6)
> How about nm_device_realize_start() and nm_device_create_and_realize()
> having an argument "nm_owned", and the caller (NMManager) determines for the
> device whether it is nm-owned -- and it should do so very early when
> realizing the device.

These functions are called from 6 different places and I prefer not to patch all of them to load the state. Instead, how about setting nm-owned in realize_start_setup(), which is called by both functions?

Repushed branch bg/nm-owned-persist-rh1376199.

Yeah, the place is good.

Could we move it up a bit? I think it should be set as early as possible. Maybe immediately after _add_capabilities() (because we check for NM_DEVICE_CAP_IS_SOFTWARE).

Otherwise lgtm.

This might fix CI failure bug 1452062. Will test tomorrow.

(In reply to Thomas Haller from comment #8)
> yeah, the place is good.
>
> could we move it a bit up, I think it should set as early as possible.
> Maybe immediately after _add_capabilities() (because we check for
> NM_DEVICE_CAP_IS_SOFTWARE).

Branch bg/nm-owned-persist-rh1376199 updated.

lgtm

Merged to master:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=3223d92eeaf704f0bed774610f5935b8fcfb1adb

(In reply to Beniamino Galvani from comment #11)
> Merged to master:
>
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=3223d92eeaf704f0bed774610f5935b8fcfb1adb

I did a related follow-up patch, merged to master as:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=d83848be9dfd0edb5f318b81854b371133d84f6e

I also backported the bg/nm-owned-persist-rh1376199 branch plus the follow-up patch to nm-1-8, as:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=de1c460e586be65f5549c5d705a10888d5f1baae
https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=8e25de8ab360fc973d7222685f107b81dd872dc1

Please see https://bugzilla.redhat.com/show_bug.cgi?id=1452062#c7 for the rhel-7.4 backport.

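For readers who want the idea from the discussion above in compact form, here is a minimal sketch in Python. It is not the NetworkManager implementation (which is in C, in the commits linked above). The state-file layout, the `Device` class and the function names `save_state`, `realize` and `delete_on_disconnect` are all hypothetical and chosen only for illustration. The sketch assumes one key file per interface that records an "nm-owned" flag, which is read back as early as possible when the device is realized again after a restart.

```python
import configparser
import os
import tempfile
from dataclasses import dataclass


@dataclass
class Device:
    ifname: str
    is_software: bool       # e.g. a VLAN, as opposed to a physical NIC
    nm_owned: bool = False  # True only if the daemon itself created the link


def save_state(dev, state_dir):
    """Persist the per-device flag that must survive a daemon restart.

    The file name and the [device] section are assumptions of this sketch.
    """
    cp = configparser.ConfigParser()
    cp["device"] = {"nm-owned": "true" if dev.nm_owned else "false"}
    with open(os.path.join(state_dir, dev.ifname), "w") as f:
        cp.write(f)


def realize(ifname, is_software, state_dir):
    """Re-discover an existing link and restore ownership before anything else."""
    dev = Device(ifname, is_software)
    cp = configparser.ConfigParser()
    # read() skips a missing file, so devices without saved state stay not-owned.
    cp.read(os.path.join(state_dir, ifname))
    # Ownership only makes sense for software devices, hence the check first.
    if dev.is_software:
        dev.nm_owned = cp.getboolean("device", "nm-owned", fallback=False)
    return dev


def delete_on_disconnect(dev):
    """Remove the link on disconnect only if the daemon created it."""
    return dev.is_software and dev.nm_owned


if __name__ == "__main__":
    state_dir = tempfile.mkdtemp()
    # Before the restart: the daemon created eth1.80 itself and saves that fact.
    save_state(Device("eth1.80", is_software=True, nm_owned=True), state_dir)
    # After the restart: the flag is restored, so the VLAN may be deleted.
    print(delete_on_disconnect(realize("eth1.80", True, state_dir)))
    # A VLAN with no saved state is treated as pre-existing and left alone.
    print(delete_on_disconnect(realize("em1.80", True, state_dir)))
```

Without the saved flag, every software device found after a restart looks pre-existing, which is the stalled-VLAN behaviour this bug reports.
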
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2299