Bug 1505893
Summary: | [NMCI] show_zones_after_firewalld_install test failure | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Vladimir Benes <vbenes> |
Component: | NetworkManager | Assignee: | Beniamino Galvani <bgalvani> |
Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.5 | CC: | aloughla, atragler, bgalvani, fgiudici, lmiksik, lrintel, rkhan, sukulkar, thaller, vbenes |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | NetworkManager-1.10.2-6.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-04-10 13:31:31 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | ||
Description
Vladimir Benes
2017-10-24 13:57:20 UTC
I think this is a kernel issue. DNS requests are sent through eth1:

[root@testhostname NetworkManager-ci]# ping download.eng.bos.redhat.com &
[1] 21610
[root@testhostname NetworkManager-ci]# tcpdump -i eth0 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
04:22:21.328980 IP6 fe80::527b:9dff:fed8:38e6 > ff02::1:ffd8:38e6: HBH ICMP6, multicast listener reportmax resp delay: 0 addr: ff02::1:ffd8:38e6, length 24
04:22:21.515952 IP6 fe80::527b:9dff:fed8:30fe > ff02::1:ffd8:30fe: HBH ICMP6, multicast listener reportmax resp delay: 0 addr: ff02::1:ffd8:30fe, length 24
...
[root@testhostname NetworkManager-ci]# tcpdump -i eth1 -n -xx
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
04:22:51.270410 IP 10.16.122.90.52940 > 10.16.36.29.domain: 41986+ A? download.eng.bos.redhat.com.wlan.rhts.eng.bos.redhat.com. (74)
    0x0000: 829d 6753 84e0 f011 2233 4455 0800 4500
    0x0010: 0066 4b53 4000 4011 3c9d 0a10 7a5a 0a10
    0x0020: 241d cecc 0035 0052 b2fa a402 0100 0001
    0x0030: 0000 0000 0000 0864 6f77 6e6c 6f61 6403
    0x0040: 656e 6703 626f 7306 7265 6468 6174 0363
    0x0050: 6f6d 0477 6c61 6e04 7268 7473 0365 6e67
    0x0060: 0362 6f73 0672 6564 6861 7403 636f 6d00
    0x0070: 0001 0001

The source address belongs to another interface; also, the default route is through eth0:

[root@testhostname NetworkManager-ci]# ip r
default via 10.16.122.254 dev eth0 proto dhcp metric 100
default via 192.168.100.1 dev eth1 proto dhcp metric 100
10.16.122.0/24 dev eth0 proto kernel scope link src 10.16.122.90 metric 100
192.168.100.0/24 dev eth1 proto kernel scope link src 192.168.100.20 metric 100
[root@testhostname NetworkManager-ci]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 50:7b:9d:d8:38:e6 brd ff:ff:ff:ff:ff:ff
    inet 10.16.122.90/24 brd 10.16.122.255 scope global noprefixroute dynamic eth0
       valid_lft 85905sec preferred_lft 85905sec
    inet6 2620:52:0:107a:527b:9dff:fed8:38e6/64 scope global noprefixroute dynamic
       valid_lft 2591848sec preferred_lft 604648sec
    inet6 fe80::527b:9dff:fed8:38e6/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
10: eth1@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether f0:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.100.20/24 brd 192.168.100.255 scope global noprefixroute dynamic eth1
       valid_lft 217sec preferred_lft 217sec
    inet6 fe80::7232:87a3:eb24:325e/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
[root@testhostname NetworkManager-ci]# ip n
192.168.100.1 dev eth1 lladdr 82:9d:67:53:84:e0 REACHABLE
10.16.122.254 dev eth0 lladdr 28:8a:1c:09:a5:c1 STALE
fe80:52:0:107a::fe dev eth0 lladdr 28:8a:1c:09:a5:c1 router STALE
fe80::1875:2aff:fe6a:ce23 dev eth10 lladdr 1a:75:2a:6a:ce:23 router STALE

The kernel is 3.10.0-730.el7.sgruszka1.x86_64. I'll try with a more recent kernel.

Hi Vladimir, the problem doesn't seem to happen with kernel 3.10.0-783. Can you update the CI scripts to use a more recent kernel?

When the connection is added on eth1 we get the following default routes:

default via 10.16.122.254 dev eth0 proto dhcp metric 100
default via 192.168.100.1 dev eth1 proto dhcp metric 100

while in the past the device that was activated first got a lower metric:

default via 10.16.122.254 dev eth0 proto dhcp metric 100
default via 192.168.100.1 dev eth1 proto dhcp metric 101

The change is the result of commit [1], which removed the default-route-manager and started to add default routes without tweaking the metric. The previous behavior is described in [2].

The effect of having multiple routes with the same metric is that ECMP (multi-path routing) is used and packets flow through one gateway or the other based on a layer-3 hash. In the test scenario, eth1 is added as a default route but doesn't actually route packets, and this leads to the test failure.

We can easily fix the test by specifying a higher metric for eth1 (or by disabling the default route for eth1), but I wonder if the change in behavior is acceptable.

[1] https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=77ec302714795f905301d500b9aab6c88001f32e
[2] https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=e8824f6a5205ffcf761abd3e0897a22b254c7797

How about th/device-route-metric-rh1505893?

LGTM (didn't test).

Merged upstream.

master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=b2273ce3dd13c467be2f5b7bca82ee97b11cf805
nm-1-10: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=1cfbfde38d09885c44482b10d4ef1ec22c99d463

Working on an altered branch with tests.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0778
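
For anyone reproducing the equal-metric condition discussed in this report, here is a minimal Python sketch that scans `ip r` output for default routes sharing a metric. It is not part of NetworkManager or the NMCI suite. The function names (`parse_default_routes`, `find_metric_conflicts`, `bump_conflicting_metrics`) are hypothetical, and the bumping helper only mimics the 100/101 result quoted in the report; it makes no claim about how NetworkManager actually assigns metrics.

```python
import re
from collections import defaultdict

# Sample input copied from the `ip r` output quoted in this report.
IP_ROUTE_OUTPUT = """\
default via 10.16.122.254 dev eth0 proto dhcp metric 100
default via 192.168.100.1 dev eth1 proto dhcp metric 100
10.16.122.0/24 dev eth0 proto kernel scope link src 10.16.122.90 metric 100
192.168.100.0/24 dev eth1 proto kernel scope link src 192.168.100.20 metric 100
"""

# Matches "default via <gw> dev <dev> ... metric <n>"; the metric is optional.
DEFAULT_ROUTE_RE = re.compile(
    r"^default via (?P<gateway>\S+) dev (?P<dev>\S+)(?:.*? metric (?P<metric>\d+))?"
)


def parse_default_routes(text):
    """Return a list of (dev, gateway, metric) for every default route line."""
    routes = []
    for line in text.splitlines():
        match = DEFAULT_ROUTE_RE.match(line.strip())
        if match:
            # A route printed without "metric" is treated as metric 0 here.
            metric = int(match.group("metric") or 0)
            routes.append((match.group("dev"), match.group("gateway"), metric))
    return routes


def find_metric_conflicts(routes):
    """Map each metric used by more than one default route to its devices."""
    by_metric = defaultdict(list)
    for dev, _gateway, metric in routes:
        by_metric[metric].append(dev)
    return {m: devs for m, devs in by_metric.items() if len(devs) > 1}


def bump_conflicting_metrics(routes):
    """Hypothetical helper: give each later route the next free metric.

    This only reproduces the 100/101 outcome shown in the report; it is an
    illustration, not NetworkManager's real algorithm.
    """
    used = set()
    result = []
    for dev, gateway, metric in routes:
        while metric in used:
            metric += 1
        used.add(metric)
        result.append((dev, gateway, metric))
    return result


if __name__ == "__main__":
    routes = parse_default_routes(IP_ROUTE_OUTPUT)
    print("conflicts:", find_metric_conflicts(routes))
    print("bumped:", bump_conflicting_metrics(routes))
```

A non-empty result from `find_metric_conflicts` means the two default routes are candidates for the multi-path behavior described in the analysis, which is the condition the test tripped over.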