Bug 1445499
| Summary: | team with link_watch = nsna_ping does not stay up | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Amit Supugade <asupugad> |
| Component: | libteam | Assignee: | Xin Long <lxin> |
| Status: | CLOSED ERRATA | QA Contact: | Rick Alongi <ralongi> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.4 | CC: | aiyengar, atragler, lxin, ralongi, rkhan, sukulkar |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libteam-1.27-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-10 18:48:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
(In reply to Amit Supugade from comment #0)
> Description of problem:
> A team with link_watch = nsna_ping does not stay up. The team interface tries
> to come up, stays up for a short time, and then fails again.
> ...
> 2: enp7s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master
> team0 portid 0100000000000000000000333135384643 state UP qlen 1000
> link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
> 3: enp7s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master
> team0 portid 0200000000000000000000333135384643 state UP qlen 1000
> link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff

These two NICs have no IPv6 link-local addresses, so there is no route for the NS packets. I think they are managed by NM, but NM has no connections for them. Please try one of the following:

1. Take them out of NM's control:
# nmcli dev set enp7s0f0 managed no
# nmcli dev set enp7s0f1 managed no

2. Let NM ignore their IPv6:
# nmcli con add type ethernet ifname enp7s0f0 ipv6.method ignore
# nmcli con add type ethernet ifname enp7s0f1 ipv6.method ignore

3. Just disable NM for this test.

Hi Xin,
I tried it by disabling NetworkManager and it still fails.
LOG-
[root@sam ~]# systemctl stop NetworkManager
[root@sam ~]# systemctl disable NetworkManager
Removed symlink /etc/systemd/system/multi-user.target.wants/NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service.
[root@sam ~]# systemctl status NetworkManager
● NetworkManager.service - Network Manager
Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; disabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2017-05-04 10:57:21 EDT; 10s ago
Docs: man:NetworkManager(8)
Main PID: 824 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/NetworkManager.service
└─884 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp5s0f0.pid -lf /var/l...
May 04 10:51:08 localhost.localdomain NetworkManager[824]: <info> [1493909468.2944] policy: set 'enp5s0f0' (enp...DNS
May 04 10:51:08 localhost.localdomain NetworkManager[824]: <info> [1493909468.3195] device (enp5s0f0): Activati...ed.
May 04 10:51:08 localhost.localdomain NetworkManager[824]: <info> [1493909468.3211] manager: NetworkManager sta...BAL
May 04 10:51:08 localhost.localdomain NetworkManager[824]: <info> [1493909468.3527] policy: set-hostname: set h...up)
May 04 10:51:09 sam.knqe.lab.eng.bos.redhat.com NetworkManager[824]: <info> [1493909469.0762] manager: startup c...te
May 04 10:51:10 sam.knqe.lab.eng.bos.redhat.com NetworkManager[824]: <info> [1493909470.2102] policy: set 'enp5s...NS
May 04 10:57:21 sam.knqe.lab.eng.bos.redhat.com systemd[1]: Stopping Network Manager...
May 04 10:57:21 sam.knqe.lab.eng.bos.redhat.com NetworkManager[824]: <info> [1493909841.8774] caught SIGTERM, sh...y.
May 04 10:57:21 sam.knqe.lab.eng.bos.redhat.com NetworkManager[824]: <info> [1493909841.9240] exiting (success)
May 04 10:57:21 sam.knqe.lab.eng.bos.redhat.com systemd[1]: Stopped Network Manager.
Hint: Some lines were ellipsized, use -l to show in full.
[root@sam ~]# port0=enp7s0f0
[root@sam ~]# port1=enp7s0f1
[root@sam ~]# teamd -d -t team0 -c '{ "runner" : { "name": "roundrobin" }, "link_watch" : { "name": "nsna_ping", "interval": 500, "target_host": "2001::254" } }'
This program is not intended to be run as root.
[root@sam ~]# ip link set team0 up
[root@sam ~]# teamdctl team0 port add $port0
[root@sam ~]# teamdctl team0 port add $port1
[root@sam ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether e4:11:5b:dd:e6:6c brd ff:ff:ff:ff:ff:ff
inet 10.19.15.26/24 brd 10.19.15.255 scope global dynamic enp5s0f0
valid_lft 85975sec preferred_lft 85975sec
inet6 2620:52:0:130b:e611:5bff:fedd:e66c/64 scope global noprefixroute dynamic
valid_lft 2591577sec preferred_lft 604377sec
inet6 fe80::e611:5bff:fedd:e66c/64 scope link
valid_lft forever preferred_lft forever
3: enp7s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 portid 0100000000000000000000333135384643 state UP qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
inet6 fe80::e4bb:31ff:fe93:ddf9/64 scope link tentative
valid_lft forever preferred_lft forever
4: enp5s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
link/ether e4:11:5b:dd:e6:6d brd ff:ff:ff:ff:ff:ff
5: enp7s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 portid 0200000000000000000000333135384643 state UP qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
inet6 fe80::290:faff:fe8a:5bfa/64 scope link tentative
valid_lft forever preferred_lft forever
7: team0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
inet6 fe80::290:faff:fe8a:5bfa/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
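
The `ip a` output above can be checked mechanically for the condition raised earlier in the thread: whether each team port has an IPv6 link-local address, and in what state. The following is a minimal sketch, assuming input in the `ip a` format shown in this report. The function name `link_local_status`, the regular expressions, and the abridged `SAMPLE` text are illustrative and not part of any existing tool.

```python
import ipaddress
import re

# Abridged from the log above (portid fields and lifetimes omitted).
SAMPLE = """\
3: enp7s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 state UP qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
inet6 fe80::e4bb:31ff:fe93:ddf9/64 scope link tentative
7: team0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
inet6 fe80::290:faff:fe8a:5bfa/64 scope link tentative dadfailed
"""


def link_local_status(ip_a_output):
    """Map each interface name to a list of (link-local address, flags)."""
    result = {}
    current = None
    for line in ip_a_output.splitlines():
        # Interface header lines look like "3: enp7s0f0: <...>".
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        # Address lines look like "inet6 ADDR/PLEN scope SCOPE [flags...]".
        m = re.match(r"^\s*inet6\s+(\S+)/\d+\s+scope\s+\S+\s*(.*)$", line)
        if m and current is not None:
            addr = ipaddress.IPv6Address(m.group(1))
            if addr.is_link_local:
                result[current].append((str(addr), m.group(2).split()))
    return result


if __name__ == "__main__":
    for ifname, addrs in link_local_status(SAMPLE).items():
        print(ifname, addrs or "no IPv6 link-local address")
```

An interface that maps to an empty list has no link-local address at all; flags such as `tentative` or `dadfailed` indicate an address that exists but is not yet (or no longer) usable.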
I could see many "missed" counts in your environment. Before starting teamd, can you check whether "target_host" is reachable from your host with:

ndisc6 2001::254 enp7s0f0
ndisc6 2001::254 enp7s0f1

If that works, I think something is changing the NS/NA packets when they are forwarded by your switch. Can I check on your environment, or could you please provide the packets captured on enp7s0f0 and enp7s0f1?

Thanks.

As we expected, the switch indeed behaves differently from Linux:
1. The DSCP of the IPv6 NS packet from your switch is 0xc, whereas it is 0x0 in Linux.
2. The source address of the IPv6 NS packet from your switch is the link-local address (fe80::de38:e1ff:fe9c:4d41), whereas Linux uses the global target address (2001::254).
These two differences caused teamd to fail validation and not process the packet.
I will post the following fix upstream:
--- a/teamd/teamd_lw_nsna_ping.c
+++ b/teamd/teamd_lw_nsna_ping.c
@@ -247,11 +247,11 @@ static int lw_nsnap_receive(struct lw_psr_port_priv *psr_ppriv)
                 return err;
         /* check IPV6 header */
-        if (nap.ip6h.ip6_vfc != 0x60 /* IPV6 */ ||
+        if ((nap.ip6h.ip6_vfc & 0xf0) != 0x60 /* IPV6 */ ||
             nap.ip6h.ip6_plen != htons(sizeof(nap) - sizeof(nap.ip6h)) ||
             nap.ip6h.ip6_nxt != IPPROTO_ICMPV6 ||
             nap.ip6h.ip6_hlim != 255 /* Do not route */ ||
-            memcmp(&nap.ip6h.ip6_src, &nsnap_ppriv->dst.sin6_addr,
+            memcmp(&nap.nah.nd_na_target, &nsnap_ppriv->dst.sin6_addr,
                    sizeof(struct in6_addr)))
                 return 0;
btw, is your switch/router cisco or something else ?
Thanks.
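
The effect of the two changed conditions in the patch can be illustrated outside teamd. The following is a rough Python sketch, not teamd's code: it models only the version, next-header, hop-limit, and address comparisons, and omits the payload-length check. The first header byte `0x6c` is an assumption (one reading of the "dscp 0xc" observation: version 6 with a non-zero traffic-class nibble), and the destination address `ff02::1`, the payload length, and all function names are hypothetical.

```python
import ipaddress
import struct

IPPROTO_ICMPV6 = 58
# vfc byte, remaining class/flow-label bytes, payload length,
# next header, hop limit, source address, destination address.
HEADER_FORMAT = "!B3sHBB16s16s"


def build_header(vfc, src, dst, payload_len=32):
    """Pack a 40-byte IPv6 header; payload_len is an illustrative value."""
    return struct.pack(
        HEADER_FORMAT, vfc, b"\x00" * 3, payload_len, IPPROTO_ICMPV6, 255,
        ipaddress.IPv6Address(src).packed, ipaddress.IPv6Address(dst).packed)


def parse_header(header):
    vfc, _, _plen, nxt, hlim, src, _dst = struct.unpack(HEADER_FORMAT, header)
    return vfc, nxt, hlim, ipaddress.IPv6Address(src)


def old_check(header, na_target, expected):
    """Pre-patch logic: exact vfc byte and IPv6 source must match the target."""
    vfc, nxt, hlim, src = parse_header(header)
    return (vfc == 0x60 and nxt == IPPROTO_ICMPV6
            and hlim == 255 and src == expected)


def new_check(header, na_target, expected):
    """Post-patch logic: only the version nibble and the NA target field matter."""
    vfc, nxt, hlim, _src = parse_header(header)
    return ((vfc & 0xF0) == 0x60 and nxt == IPPROTO_ICMPV6
            and hlim == 255 and na_target == expected)


TARGET = ipaddress.IPv6Address("2001::254")
# Reply as described for the switch: non-zero class bits, link-local source.
switch_reply = build_header(0x6C, "fe80::de38:e1ff:fe9c:4d41", "ff02::1")
# Reply as described for Linux: class bits zero, source equals the target.
linux_reply = build_header(0x60, "2001::254", "ff02::1")

if __name__ == "__main__":
    for name, hdr in (("switch", switch_reply), ("linux", linux_reply)):
        print(name, old_check(hdr, TARGET, TARGET), new_check(hdr, TARGET, TARGET))
```

Under these assumptions the switch-style reply is rejected by the old comparison and accepted by the new one, while the Linux-style reply passes both.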
upstream fix:
https://github.com/jpirko/libteam/commit/9a9fbff3e75f78cbff76e9dbd1cfa0a05fd1f120
https://github.com/jpirko/libteam/commit/49c1de9b67a5a26f120294743d206f3a9286a314

Hi Xin,
Test failed on RHEL-7.3. As we discussed, it could be because of switch software update. Removing Regression tag.
Thanks!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1011

Description of problem:
A team with link_watch = nsna_ping does not stay up. The team interface tries to come up, stays up for a short time, and then fails again.

Version-Release number of selected component (if applicable):
kernel-3.10.0-656.el7.x86_64
libteam-1.25-5.el7.x86_64
teamd-1.25-5.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
port0=enp7s0f0
port1=enp7s0f1
teamd -d -t team0 -c '{ "runner" : { "name": "roundrobin" }, "link_watch" : { "name": "nsna_ping", "interval": 500, "target_host": "2001::254" } }'
ip link set team0 up
teamdctl team0 port add $port0
teamdctl team0 port add $port1
ip a

Actual results:
team0 does not stay up

Expected results:
team0 should stay up

Additional info:
[root@sam ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp7s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 portid 0100000000000000000000333135384643 state UP qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
3: enp7s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master team0 portid 0200000000000000000000333135384643 state UP qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
4: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether e4:11:5b:dd:e6:6c brd ff:ff:ff:ff:ff:ff
inet 10.19.15.26/24 brd 10.19.15.255 scope global dynamic enp5s0f0
valid_lft 85452sec preferred_lft 85452sec
inet6 2620:52:0:130b:e611:5bff:fedd:e66c/64 scope global noprefixroute dynamic
valid_lft 2591958sec preferred_lft 604758sec
inet6 fe80::e611:5bff:fedd:e66c/64 scope link
valid_lft forever preferred_lft forever
5: enp5s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
link/ether e4:11:5b:dd:e6:6d brd ff:ff:ff:ff:ff:ff
10: team0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000
link/ether 00:90:fa:8a:5b:fa brd ff:ff:ff:ff:ff:ff
inet6 2001::290:faff:fe8a:5bfa/64 scope global mngtmpaddr dynamic
valid_lft 2591796sec preferred_lft 604596sec
inet6 fe80::290:faff:fe8a:5bfa/64 scope link
valid_lft forever preferred_lft forever
[root@sam ~]# teamdctl team0 state dump
{
    "ports": {
        "enp7s0f0": {
            "ifinfo": { "dev_addr": "00:90:fa:8a:5b:fa", "dev_addr_len": 6, "ifindex": 2, "ifname": "enp7s0f0" },
            "link": { "duplex": "half", "speed": 0, "up": true },
            "link_watches": {
                "list": {
                    "link_watch_0": { "down_count": 20, "init_wait": 0, "interval": 500, "missed": 6, "missed_max": 3, "name": "nsna_ping", "target_host": "2001::254", "up": false }
                },
                "up": false
            }
        },
        "enp7s0f1": {
            "ifinfo": { "dev_addr": "00:90:fa:8a:5b:fa", "dev_addr_len": 6, "ifindex": 3, "ifname": "enp7s0f1" },
            "link": { "duplex": "half", "speed": 0, "up": true },
            "link_watches": {
                "list": {
                    "link_watch_0": { "down_count": 20, "init_wait": 0, "interval": 500, "missed": 6, "missed_max": 3, "name": "nsna_ping", "target_host": "2001::254", "up": false }
                },
                "up": false
            }
        }
    },
    "setup": { "daemonized": true, "dbus_enabled": false, "debug_level": 0, "kernel_team_mode_name": "roundrobin", "pid": 21443, "pid_file": "/var/run/teamd/team0.pid", "runner_name": "roundrobin", "zmq_enabled": false },
    "team_device": {
        "ifinfo": { "dev_addr": "00:90:fa:8a:5b:fa", "dev_addr_len": 6, "ifindex": 10, "ifname": "team0" }
    }
}
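
The state dump shows both ports with carrier up ("link" is up) while the nsna_ping link watch is down with "missed" above "missed_max". A small sketch of how that condition could be pulled out of such a dump follows. It assumes JSON shaped like the `teamdctl team0 state dump` output in this report. The helper name `failing_watches` is hypothetical, and the embedded `STATE` is an abridged copy of the dump above.

```python
import json

# Abridged from the state dump above; only the fields used below are kept.
STATE = json.loads("""
{
  "ports": {
    "enp7s0f0": {
      "link": {"duplex": "half", "speed": 0, "up": true},
      "link_watches": {
        "list": {"link_watch_0": {"down_count": 20, "interval": 500,
                                  "missed": 6, "missed_max": 3,
                                  "name": "nsna_ping",
                                  "target_host": "2001::254", "up": false}},
        "up": false}
    },
    "enp7s0f1": {
      "link": {"duplex": "half", "speed": 0, "up": true},
      "link_watches": {
        "list": {"link_watch_0": {"down_count": 20, "interval": 500,
                                  "missed": 6, "missed_max": 3,
                                  "name": "nsna_ping",
                                  "target_host": "2001::254", "up": false}},
        "up": false}
    }
  }
}
""")


def failing_watches(state):
    """Return (port, watch name, missed, missed_max) for every watch that is down."""
    rows = []
    for port, info in sorted(state["ports"].items()):
        for _key, watch in sorted(info["link_watches"]["list"].items()):
            if not watch["up"]:
                rows.append((port, watch["name"],
                             watch["missed"], watch["missed_max"]))
    return rows


if __name__ == "__main__":
    for port, name, missed, missed_max in failing_watches(STATE):
        carrier = STATE["ports"][port]["link"]["up"]
        print(f"{port}: carrier up={carrier}, {name} down "
              f"(missed {missed} > missed_max {missed_max})")
```

On a live system the same function could be fed the parsed output of the state dump command instead of the embedded sample.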