Bug 1309899
Summary: | RHEL7.2: default route for vlan devices does not get added on boot | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jonathan Maxwell <jmaxwell> |
Component: | NetworkManager | Assignee: | Thomas Haller <thaller> |
Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | | |
Version: | 7.3 | CC: | bgalvani, dcbw, gerhard.stenzel, jmaxwell, lrintel, msugaya, ptalbert, rkhan, rmanes, sferguso, thaller, vbenes |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | NetworkManager-1.4.0-0.1.git20160606.b769b4df.el7 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1343609 | Environment: | |
Last Closed: | 2016-11-03 19:07:25 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1343609 | | |
Attachments: | | | |
Description
Jonathan Maxwell
2016-02-18 22:28:24 UTC
(In reply to Jonathan Maxwell from comment #0)
> From my reproducer we have:
>
> # nmcli con show team0.619|grep ipv4
> ipv4.method: manual
> ipv4.dns:

it's slightly confusing that you added 3 connections with the same name. It's not really an issue, but I would clean that up. Hint: use `nmcli connection show` and `nmcli connection delete`.

Could you please
- enable DEBUG logging (see [1] how to set "logging" in /etc/NetworkManager/NetworkManager.conf)
- reproduce the issue by rebooting and
- show the logfile (journalctl -b 0 -u NetworkManager)

Thank you!!

[1] https://cgit.freedesktop.org/NetworkManager/NetworkManager/plain/contrib/fedora/rpm/NetworkManager.conf

It looks like team0.619 does initially get a default route.

Feb 18 19:35:14 ibm-hs21-04.lab.bos.redhat.com NetworkManager[634]: <debug> [1455842114.252271] [platform/nm-platform.c:2914] log_ip4_route(): platform: signal: route 4 added: 0.0.0.0/0 via 192.168.2.1 dev 7 metric 400 mss 0 src user scope global

It appears the issue is triggered by the enslavement of enp6s0 to team0, which changes team0's MAC address and thus requires a change to team0.619's MAC address. Perhaps the default-route-manager doesn't cope as well with route deletions showing up later than it expects?
Feb 18 19:35:14 ibm-hs21-04.lab.bos.redhat.com NetworkManager[634]: <info> (team0): enslaved team port enp6s0
Feb 18 19:35:14 ibm-hs21-04.lab.bos.redhat.com NetworkManager[634]: <debug> [1455842114.548095] [nm-default-route-manager.c:640] _entry_at_idx_update(): default-route4: entry[0/dev:0x7fb8e89308b0:team0.619:1:-sync]: record:update 0.0.0.0/0 via 192.168.2.1 dev 7 metric 400 mss 0 src user scope global (400)
Feb 18 19:35:14 ibm-hs21-04.lab.bos.redhat.com NetworkManager[634]: <debug> [1455842114.627667] [platform/nm-platform.c:2914] log_ip4_route(): platform: signal: route 4 removed: 0.0.0.0/0 via 192.168.2.1 dev 7 metric 400 mss 0 src user scope global
Feb 18 19:35:14 ibm-hs21-04.lab.bos.redhat.com NetworkManager[634]: <debug> [1455842114.637098] [nm-default-route-manager.c:661] _entry_at_idx_remove(): default-route4: entry[1/dev:0x7fb8e89308b0:team0.619:1:-sync]: record:remove 0.0.0.0/0 via 192.168.2.1 dev 7 metric 400 mss 0 src user scope global (400)

Hey guys,

I have another customer reporting the same issue with NM/team/VLAN and no gateway being assigned. Is there any movement here?

Thank you,
Patrick

How about https://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=th/device-ip-config-on-link-up-rh1309899 ?

(In reply to Thomas Haller from comment #11)
> How about
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=th/device-ip-config-on-link-up-rh1309899 ?

(there is also th/device-ip-config-on-link-up-rh1309899 for nm-1-0 branch)

Still need to do more testing...

> device: improve logging when changing IP configuration

+ _LOGD (LOGD_IP6, "ip4-config: update (commit=%d, routes-full-sync=%d, new-config=%p)",

LOGD_IP4

> device: restore IP configuration when link comes up

I understand this is necessary for IPv6; but for IPv4, addresses and routes stay configured when the link goes down and so probably this is not needed?
Also, NMDeviceVlan already does something similar in parent_hwaddr_maybe_changed(), I think after this commit that code can be removed.

> platform: ensure refetching routes and link goes down

s/and/when/ ?

(In reply to Beniamino Galvani from comment #13)
> > device: improve logging when changing IP configuration
>
> + _LOGD (LOGD_IP6, "ip4-config: update (commit=%d, routes-full-sync=%d, new-config=%p)",
>
> LOGD_IP4

Fixed.

> > device: restore IP configuration when link comes up
>
> I understand this is necessary for IPv6; but for IPv4, addresses and
> routes stay configured when the link goes down and so probably this is
> not needed?

You are right. Maybe not needed, but does it hurt? There is obviously something going on on the device.

> Also, NMDeviceVlan already does something similar in
> parent_hwaddr_maybe_changed(), I think after this commit that code can
> be removed.

Maybe. But I am not confident that this really covers all conditions as parent_hwaddr_maybe_changed(). Is there a problem, if we leave them both?

> > platform: ensure refetching routes and link goes down
>
> s/and/when/ ?

fixed.

Repushed.

(In reply to Thomas Haller from comment #14)
> > I understand this is necessary for IPv6; but for IPv4, addresses and
> > routes stay configured when the link goes down and so probably this is
> > not needed?
>
> You are right. Maybe not needed, but does it hurt? There is obviously
> something going on on the device.

If addresses and routes remain configured, it should do nothing, so it doesn't hurt.

> > Also, NMDeviceVlan already does something similar in
> > parent_hwaddr_maybe_changed(), I think after this commit that code can
> > be removed.
>
> Maybe. But I am not confident that this really covers all conditions as
> parent_hwaddr_maybe_changed(). Is there a problem, if we leave them both?

I guess not.
parent_hwaddr_maybe_changed() performs a reconfiguration because it has to bring the interface down to change the MAC, and device_link_changed() should cover this, but I'm not sure this will work in all cases (especially because of the condition priv->ip_state == IP_DONE). It's fine with me to leave both.

> Repushed.

Looks good.

merged branch upstream

master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=05010747b2818bcf08cf93f52f4ce24dc02d10c2
nm-1-2: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=0c3dc9d32621f4eefc87000720b6b39b0e97ab84
nm-1-0: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=e7c70c4ab2dffc15511d4977becfa4a2a7af8726

A scratch build for RHEL-7.2 + the fix would be here: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11021046

Any chance to confirm that this fixes the issue?

Hi, I observed the same or similar problem on RHEL 7.2 with a bond/vlan configuration and would be able to verify.

Thanks. But I don't have access to: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11021046

Here there is a new scratch-build as the previous expired: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11074512 (it is the same as from comment 17: it's a RHEL-7.2 build + the few relevant patches).

@Gerhard: I also copied the x86_64 builds to https://people.redhat.com/~bgalvani/NM/rh1309899/ (I signed the archive with my GPG key 49EA 7C67 0E08 50E7 4195 14F6 29C2 366E 4DFC 5728).

Thank you. Sorry for the inconvenience of letting you wait.

Thanks. The situation has definitely improved but seems not completely fixed. The default route was added 17 times out of 20 reboots.

(In reply to Gerhard Stenzel from comment #24)
> Thanks. The situation has definitely improved but seems not completely
> fixed. The default route was added 17 times out of 20 reboots.

Could you attach a logfile of a failing case with TRACE level enabled?
/etc/NetworkManager/NetworkManager.conf

[logging]
level=TRACE

Thank you.

Created attachment 1161060 [details]
/var/log/messages when adding default route fails
Done .. hope this helps.
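The logging request in the comments above can be sketched as two small shell helpers. This is a minimal sketch under stated assumptions, not the procedure actually used in this bug: the function names `print_trace_logging_section` and `filter_default_route_events` are hypothetical, and the filter assumes log lines in the `route 4 added:` / `route 4 removed:` form quoted earlier in this report. The `[logging]` section, the `level=TRACE` key, and the `journalctl -b 0 -u NetworkManager` command are taken from the comments themselves.

```shell
#!/bin/sh
# Hypothetical helper: print the [logging] section requested in this bug.
# The output is meant to be merged by hand into
# /etc/NetworkManager/NetworkManager.conf; nothing is written automatically.
print_trace_logging_section() {
    printf '[logging]\nlevel=TRACE\n'
}

# Hypothetical helper: read a NetworkManager log on stdin and keep only the
# platform signals that report an IPv4 default route (0.0.0.0/0) being
# added or removed.
filter_default_route_events() {
    grep -E 'route 4 (added|removed): 0\.0\.0\.0/0 '
}

# Example (assumption: run as root on the affected host after a reboot):
#   journalctl -b 0 -u NetworkManager | filter_default_route_events
```

An "added" line followed shortly by a "removed" line for the same gateway, as in the excerpt from the original reproducer, would point at the race discussed in this report.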
(In reply to Gerhard Stenzel from comment #26)
> Created attachment 1161060 [details]
> /var/log/messages when adding default route fails
>
> Done .. hope this helps.

Thank you, Gerhard. Very helpful.

How about patch https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=th%2Fdevice-ip-config-on-link-up-rh1309899-pt2 ?

A new version of the scratch build is here: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11101806

and again, RPMs are available here: https://people.redhat.com/~thaller/NM/rh1309899/

No fails in 20 reboots. So this looks fixed. Thank you.

(In reply to Thomas Haller from comment #27)
> How about patch
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=th%2Fdevice-ip-config-on-link-up-rh1309899-pt2 ?

Looks good to me.

@Gerhard: thank you!

pt-2 patch now also merged upstream:

master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=63571b266634c6d8bbbb37f26e502c2df759fc65
nm-1-2: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=951013d1e1182510366fb0de088179a3303ea65d
nm-1-0: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=nm-1-0

[root@amd-dinar-04 ~]# rpm -q NetworkManager
NetworkManager-1.0.6-27.el7.x86_64
[root@amd-dinar-04 ~]# nmcli device
DEVICE     TYPE      STATE         CONNECTION
enp1s0f0   ethernet  connected     team-slave
team0      team      connected     team0
team0.619  vlan      connected     VLAN team0
enp1s0f1   ethernet  disconnected  --
lo         loopback  unmanaged     --
[root@amd-dinar-04 ~]# ip r
default via 10.16.47.254 dev team0 proto static metric 350
10.16.40.0/21 dev team0 proto kernel scope link src 10.16.42.37 metric 350
192.168.2.0/24 dev team0.619 proto kernel scope link src 192.168.2.10
[root@amd-dinar-04 ~]# nmcli connection show VLAN\ team0
connection.id: VLAN team0
connection.uuid: fe5270f4-79b0-4a3a-b3a4-ea2c0516e3d1
connection.interface-name: team0.619
connection.type: vlan
connection.autoconnect: yes
connection.autoconnect-priority: 0
connection.timestamp: 1474621817
connection.read-only: no
connection.permissions:
connection.zone: --
connection.master: --
connection.slave-type: --
connection.autoconnect-slaves: -1 (default)
connection.secondaries:
connection.gateway-ping-timeout: 0
connection.metered: unknown
802-3-ethernet.port: --
802-3-ethernet.speed: 0
802-3-ethernet.duplex: --
802-3-ethernet.auto-negotiate: yes
802-3-ethernet.mac-address: --
802-3-ethernet.cloned-mac-address: --
802-3-ethernet.mac-address-blacklist:
802-3-ethernet.mtu: auto
802-3-ethernet.s390-subchannels:
802-3-ethernet.s390-nettype: --
802-3-ethernet.s390-options:
802-3-ethernet.wake-on-lan: 1 (default)
802-3-ethernet.wake-on-lan-password: --
ipv4.method: manual
ipv4.dns:
ipv4.dns-search:
ipv4.addresses: 192.168.2.10/24
ipv4.gateway: 192.168.2.1
ipv4.routes:
ipv4.route-metric: -1
ipv4.ignore-auto-routes: no
ipv4.ignore-auto-dns: no
ipv4.dhcp-client-id: --
ipv4.dhcp-send-hostname: yes
ipv4.dhcp-hostname: --
ipv4.never-default: no
ipv4.may-fail: yes
ipv6.method: auto
ipv6.dns:
ipv6.dns-search:
ipv6.addresses:
ipv6.gateway: --
ipv6.routes:
ipv6.route-metric: -1
ipv6.ignore-auto-routes: no
ipv6.ignore-auto-dns: no
ipv6.never-default: no
ipv6.may-fail: yes
ipv6.ip6-privacy: -1 (unknown)
ipv6.dhcp-send-hostname: yes
ipv6.dhcp-hostname: --
vlan.parent: team0
vlan.id: 619
vlan.flags: 0 (NONE)
vlan.ingress-priority-map:
vlan.egress-priority-map:
GENERAL.NAME: VLAN team0
GENERAL.UUID: fe5270f4-79b0-4a3a-b3a4-ea2c0516e3d1
GENERAL.DEVICES: team0.619
GENERAL.STATE: activated
GENERAL.DEFAULT: no
GENERAL.DEFAULT6: no
GENERAL.VPN: no
GENERAL.ZONE: --
GENERAL.DBUS-PATH: /org/freedesktop/NetworkManager/ActiveConnection/2
GENERAL.CON-PATH: /org/freedesktop/NetworkManager/Settings/0
GENERAL.SPEC-OBJECT: /
GENERAL.MASTER-PATH: --
IP4.ADDRESS[1]: 192.168.2.10/24
IP4.GATEWAY: 0.0.0.0
IP6.GATEWAY:

and with the new one:

[root@amd-dinar-04 ~]# rpm -q NetworkManager
NetworkManager-1.4.0-10.el7.x86_64
[root@amd-dinar-04 ~]# ip r
default via 10.16.47.254 dev team0 proto static metric 350
default via 192.168.2.1 dev team0.619 proto static metric 400
10.16.40.0/21 dev team0 proto kernel scope link src 10.16.42.37 metric 350
192.168.2.0/24 dev team0.619 proto kernel scope link src 192.168.2.10
192.168.2.0/24 dev team0.619 proto kernel scope link src 192.168.2.10 metric 400
[root@amd-dinar-04 ~]# nmcli connection show VLAN\ team0
connection.id: VLAN team0
connection.uuid: fe5270f4-79b0-4a3a-b3a4-ea2c0516e3d1
connection.stable-id: --
connection.interface-name: team0.619
connection.type: vlan
connection.autoconnect: yes
connection.autoconnect-priority: 0
connection.timestamp: 1474625285
connection.read-only: no
connection.permissions:
connection.zone: --
connection.master: --
connection.slave-type: --
connection.autoconnect-slaves: -1 (default)
connection.secondaries:
connection.gateway-ping-timeout: 0
connection.metered: unknown
connection.lldp: -1 (default)
802-3-ethernet.port: --
802-3-ethernet.speed: 0
802-3-ethernet.duplex: --
802-3-ethernet.auto-negotiate: yes
802-3-ethernet.mac-address: --
802-3-ethernet.cloned-mac-address: --
802-3-ethernet.generate-mac-address-mask:--
802-3-ethernet.mac-address-blacklist:
802-3-ethernet.mtu: auto
802-3-ethernet.s390-subchannels:
802-3-ethernet.s390-nettype: --
802-3-ethernet.s390-options:
802-3-ethernet.wake-on-lan: 1 (default)
802-3-ethernet.wake-on-lan-password: --
ipv4.method: manual
ipv4.dns:
ipv4.dns-search:
ipv4.dns-options: (default)
ipv4.dns-priority: 0
ipv4.addresses: 192.168.2.10/24
ipv4.gateway: 192.168.2.1
ipv4.routes:
ipv4.route-metric: -1
ipv4.ignore-auto-routes: no
ipv4.ignore-auto-dns: no
ipv4.dhcp-client-id: --
ipv4.dhcp-timeout: 0
ipv4.dhcp-send-hostname: yes
ipv4.dhcp-hostname: --
ipv4.dhcp-fqdn: --
ipv4.never-default: no
ipv4.may-fail: yes
ipv4.dad-timeout: -1 (default)
ipv6.method: auto
ipv6.dns:
ipv6.dns-search:
ipv6.dns-options: (default)
ipv6.dns-priority: 0
ipv6.addresses:
ipv6.gateway: --
ipv6.routes:
ipv6.route-metric: -1
ipv6.ignore-auto-routes: no
ipv6.ignore-auto-dns: no
ipv6.never-default: no
ipv6.may-fail: yes
ipv6.ip6-privacy: -1 (unknown)
ipv6.addr-gen-mode: eui64
ipv6.dhcp-send-hostname: yes
ipv6.dhcp-hostname: --
ipv6.token: --
vlan.parent: team0
vlan.id: 619
vlan.flags: 1 (REORDER_HEADERS)
vlan.ingress-priority-map:
vlan.egress-priority-map:
GENERAL.NAME: VLAN team0
GENERAL.UUID: fe5270f4-79b0-4a3a-b3a4-ea2c0516e3d1
GENERAL.DEVICES: team0.619
GENERAL.STATE: activated
GENERAL.DEFAULT: no
GENERAL.DEFAULT6: no
GENERAL.VPN: no
GENERAL.ZONE: --
GENERAL.DBUS-PATH: /org/freedesktop/NetworkManager/ActiveConnection/2
GENERAL.CON-PATH: /org/freedesktop/NetworkManager/Settings/0
GENERAL.SPEC-OBJECT: /
GENERAL.MASTER-PATH: --
IP4.ADDRESS[1]: 192.168.2.10/24
IP4.GATEWAY: 192.168.2.1
IP6.GATEWAY:

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2581.html
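
The before/after comparison of `ip r` in the verification above, and the "17 out of 20 reboots" style of testing, could be automated along the following lines. This is only a sketch: the function name `has_default_route_for_dev` is hypothetical, and it assumes `ip route` prints default routes in the `default via <gateway> dev <interface> ...` form seen in this report.

```shell
#!/bin/sh
# Hypothetical helper: read `ip route` output on stdin and exit 0 only if it
# contains a default route whose "dev" is the interface named in $1.
has_default_route_for_dev() {
    awk -v want="$1" '
        $1 == "default" {
            # Look for the "dev <interface>" pair anywhere on the line.
            for (i = 2; i < NF; i++)
                if ($i == "dev" && $(i + 1) == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example (assumption: run on the affected host after each reboot):
#   if ip route | has_default_route_for_dev team0.619; then
#       echo "default route present"
#   else
#       echo "default route missing"
#   fi
```

Fed the two `ip r` listings from the verification, the check fails for team0.619 on NetworkManager-1.0.6-27.el7 and succeeds on NetworkManager-1.4.0-10.el7.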