1450205 – Gratuitous ARP updates received in span of 2-3 seconds time frame are all ignored



Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	7.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	7.3
Assignee:	Eric Garver
QA Contact:	Jianlin Shi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1438662

Reported:	2017-05-11 19:55 UTC by Ihar Hrachyshka
Modified:	2020-09-10 10:33 UTC (History)
CC List:	6 users (show)
Fixed In Version:	kernel-3.10.0-764.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1554608
Environment:
Last Closed:	2018-04-10 20:01:25 UTC
Target Upstream Version:
Embargoed:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2018:1062	0	None	None	None	2018-04-10 20:03:05 UTC

Description Ihar Hrachyshka 2017-05-11 19:55:31 UTC

The OpenStack Neutron L3 agent sends three gratuitous ARP replies when it moves a "floating" IP address from one device to another. When the Linux kernel receives the first reply, it usually processes it correctly and updates the existing ARP entry with the new lladdr. The second reply is then usually ignored because it arrives within the locktime interval since the previous update. The third reply is also ignored: although the second reply is ignored, it still bumps the neigh->updated field, which is used to determine whether an ARP frame falls within the locktime interval. Since the first reply was honoured, this causes no problem.

The problem occurs when the first gratuitous ARP is itself ignored because it also falls within the locktime interval. This can happen either because another ARP reply arrived just before the gratuitous one (in which case ignoring it is correct) or because the kernel transitioned the ARP table entry to another state just before the reply was received. Any state transition bumps neigh->updated. If such a transition happens, all of the gratuitous updates are ignored and the ARP entry is left with the old, wrong MAC address. After the delay time (5s by default), the kernel will usually issue an ARP probe that will hopefully heal the entry. While that is mostly OK, we have wasted 5s of service availability plus the ARP probe round-trip.
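To make the timing argument above concrete, here is a toy model of the locktime check in POSIX shell. It is only a sketch and not kernel code: the `simulate` function, the "buggy"/"fixed" mode names, the 1000 ms locktime, and the arrival times of 100, 1050 and 2000 ms after the STALE->DELAY transition are all assumptions chosen for illustration. In "buggy" mode an ignored reply still bumps the timestamp, as described above; in "fixed" mode the timestamp is bumped only when the update takes effect.

```shell
#!/bin/sh
# Toy model of the locktime check (illustration only, NOT kernel code).
# Assumption: locktime of 1000 ms; all times are in milliseconds.
LOCKTIME_MS=1000

# simulate MODE T1 T2 ...
#   MODE is "buggy" (ignored replies still bump the timestamp) or
#   "fixed" (timestamp bumped only when the update is effective).
#   T1.. are gratuitous ARP arrival times, measured from a state
#   transition at t=0 that set the entry's "updated" timestamp.
# Prints "<lladdr> <number of honoured replies>".
simulate() (
    mode=$1; shift
    updated=0        # state transition just bumped the timestamp
    lladdr=old
    honoured=0
    for now in "$@"; do
        if [ "$now" -gt $((updated + LOCKTIME_MS)) ]; then
            # Outside the locktime window: the reply overrides the entry.
            lladdr=new
            honoured=$((honoured + 1))
            updated=$now
        elif [ "$mode" = buggy ]; then
            # Reply ignored, but the timestamp is bumped anyway,
            # which restarts the locktime window.
            updated=$now
        fi
    done
    echo "$lladdr $honoured"
)

simulate buggy 100 1050 2000
simulate fixed 100 1050 2000
```

In this model the "buggy" run never leaves the locktime window because every ignored reply restarts it, while the "fixed" run honours the second reply once the window opened by the state transition has expired.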

The problem is aggravated by the fact that, in an unfortunate scenario, the kernel sometimes proliferates incorrect ARP entries without issuing a single probe; see, for example: https://bugzilla.redhat.com/show_bug.cgi?id=1450203

Version-Release number of selected component (if applicable): 3.10.0-514.22.1.el7

How reproducible: always.

Steps to Reproduce:
Issue 3 gratuitous ARP replies right after the corresponding ARP entry has transitioned STALE->DELAY. Observe that not a single reply is honoured.

Actual results: none of the consecutive gratuitous ARPs triggers an update in the local ARP table if the first one arrives just after the entry changed state.

Expected results: Ideally, the first reply is honoured, because no ARP reply has been received before it, so locktime should not apply. At the very least, the second reply should be honoured.

Additional info:

I posted a fix upstream: https://patchwork.ozlabs.org/patch/760372/
This bug + https://bugzilla.redhat.com/show_bug.cgi?id=1450203 are causes for OpenStack CI failures: https://bugzilla.redhat.com/show_bug.cgi?id=1438662

Comment 2 Ihar Hrachyshka 2017-05-16 16:15:02 UTC

Marked the bug for the OpenStack layered product since it affects OpenStack CI. I also asked to target 7.3 because we probably can't wait for 7.5 (?) to fix OpenStack CI (we have some OpenStack-side workarounds, but they are very fragile).

Comment 3 Ihar Hrachyshka 2017-05-17 17:15:42 UTC

Note: the patch was accepted by David Miller, see: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=77d7123342dcf6442341b67816321d71da8b2b16

Comment 4 Eric Garver 2017-09-21 18:54:03 UTC

Ihar,

Is there any reason to not also include your follow up series?

776ee323ddf1 ("Merge branch 'arp-always-override-existing-neigh-entries-with-gratuitous-ARP'")
7d472a59c0e5 ("arp: always override existing neigh entries with gratuitous ARP")
d9ef2e7bf99f ("arp: postpone addr_type calculation to as late as possible")
6fd05633bdaf ("arp: decompose is_garp logic into a separate function")
34eb5fe07831 ("arp: fixed error in a comment")

Comment 5 Ihar Hrachyshka 2017-09-26 13:55:07 UTC

Eric, I think it's a good idea, but the bug would be mostly fixed by the other patch, and I am not sure whether RHEL kernel policy allows backporting nice-to-haves. It would definitely help in dealing with gARPs, at least by handling them more efficiently.

Comment 6 Jianlin Shi 2017-10-18 02:24:00 UTC

Steps to reproduce are presented in the description; set ack+.

Comment 7 Rafael Aquini 2017-10-31 12:32:43 UTC

Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Comment 9 Rafael Aquini 2017-11-01 10:34:37 UTC

Patch(es) available on kernel-3.10.0-764.el7

Comment 11 Jianlin Shi 2017-11-02 02:49:14 UTC

test topo:
#         br0
# Host A --|-- Host B
#   0.1          0.2
# 2000::1     2000::2

test setting:

# increase locktime to make the issue easier to reproduce
ip netns exec ha sysctl -w net.ipv4.neigh.ha_veth0.locktime=10000
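
The topology and setting above could be built roughly as follows. This is a sketch under assumptions, not the script actually used for the test: the bridge-side peer names (ha_br, hb_br), the /24 and /64 prefix lengths, the 192.168.0.1 address for Host A, and the `run`/`setup_topology` helpers are hypothetical. By default the script only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Sketch of the test topology: two namespaces (ha, hb) attached to br0.
# Assumptions: peer names ha_br/hb_br and the prefix lengths are made up.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_topology() {
    run ip netns add ha
    run ip netns add hb
    run ip link add br0 type bridge
    run ip link set br0 up
    for ns in ha hb; do
        # One veth pair per host: one end in the namespace, one on br0.
        run ip link add "${ns}_veth0" type veth peer name "${ns}_br"
        run ip link set "${ns}_veth0" netns "$ns"
        run ip link set "${ns}_br" master br0
        run ip link set "${ns}_br" up
        run ip netns exec "$ns" ip link set "${ns}_veth0" up
    done
    run ip netns exec ha ip addr add 192.168.0.1/24 dev ha_veth0
    run ip netns exec ha ip addr add 2000::1/64 dev ha_veth0
    run ip netns exec hb ip addr add 192.168.0.2/24 dev hb_veth0
    run ip netns exec hb ip addr add 2000::2/64 dev hb_veth0
    # Increase locktime on ha, as in the test setting above.
    run ip netns exec ha sysctl -w net.ipv4.neigh.ha_veth0.locktime=10000
}
```

Calling `setup_topology` with the default DRY_RUN=1 lists the commands for review before anything is changed on the host.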


reproduced on 3.10.0-760:

[root@ibm-x3650m4-01-vm-05 ~]# uname -a
Linux ibm-x3650m4-01-vm-05.lab.eng.bos.redhat.com 3.10.0-760.el7.x86_64 #1 SMP Fri Oct 27 07:23:03 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

1. confirmed that neigh on ha is stale:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec ha ip neigh show
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 STALE

2. change mac on hb:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec hb ip link set dev hb_veth0 address 6e:03:b7:8d:e1:12

3. ping hb on ha:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec ha ping 192.168.0.2 -c 1                         
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.

--- 192.168.0.2 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

4. send gratuitous ARP on hb when neigh state changed to DELAY:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec hb arping -A -c 3 -I hb_veth0 -U 192.168.0.2
ARPING 192.168.0.2 from 192.168.0.2 hb_veth0
Sent 3 probes (3 broadcast(s))
Received 0 response(s)

5. watch the neigh state on ha
[root@ibm-x3650m4-01-vm-05 ~]# while :; do ip netns exec ha ip neigh show; sleep 1; done
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 STALE
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 STALE
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 STALE
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 STALE
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 STALE

<=== become DELAY

192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 DELAY
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 DELAY
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 DELAY
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 DELAY
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 DELAY

<=== DELAY to PROBE, gratuitous ARP not honored

192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 PROBE
192.168.0.2 dev ha_veth0 lladdr 6e:03:b7:8d:e1:11 PROBE
192.168.0.2 dev ha_veth0  FAILED


verified on 3.10.0-766:

[root@ibm-x3650m4-01-vm-05 ~]# uname -a
Linux ibm-x3650m4-01-vm-05.lab.eng.bos.redhat.com 3.10.0-766.el7.x86_64 #1 SMP Wed Nov 1 07:08:44 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

1. confirmed that neigh on ha is stale:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec ha ip neigh show
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE

2. change mac on hb:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec hb ip link set dev hb_veth0 address e2:6d:81:cc:30:13

3. ping hb on ha:
[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec ha ping 192.168.0.2 -c 1
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.

--- 192.168.0.2 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

4. send gratuitous ARP on hb when neigh state changed to DELAY:

[root@ibm-x3650m4-01-vm-05 ~]# ip netns exec hb arping -A -c 3 -I hb_veth0 -U 192.168.0.2
ARPING 192.168.0.2 from 192.168.0.2 hb_veth0
Sent 3 probes (3 broadcast(s))
Received 0 response(s)

5. watch the neigh state on ha

[root@ibm-x3650m4-01-vm-05 ~]# while :; do ip netns exec ha ip neigh show; sleep 1; done       
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 STALE

<=== to DELAY

192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 DELAY
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:12 DELAY
192.168.0.2 dev ha_veth0 lladdr e2:6d:81:cc:30:13 STALE

<=== updated after receive gratuitous ARP
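
The state transitions in step 5 can be easier to spot if only changes are printed. The helper below is a hypothetical sketch, not part of the original test: `neigh_summary` and `watch_neigh` are made-up names, and the parsing assumes the `ip neigh show` line format seen in the transcripts above (state as the last field, MAC after the `lladdr` keyword).

```shell
#!/bin/sh
# Print "<lladdr> <state>" for one line of `ip neigh show` output.
# A missing lladdr (e.g. a FAILED entry) is reported as "-".
neigh_summary() {
    printf '%s\n' "$1" | awk '{
        mac = "-"
        for (i = 1; i < NF; i++) if ($i == "lladdr") mac = $(i + 1)
        print mac, $NF
    }'
}

# Poll the entry for 192.168.0.2 in namespace ha once per second and
# print a timestamped line only when the MAC or the state changes.
watch_neigh() {
    prev=""
    while :; do
        cur=$(neigh_summary "$(ip netns exec ha ip neigh show 192.168.0.2)")
        [ "$cur" != "$prev" ] && echo "$(date +%T) $cur"
        prev=$cur
        sleep 1
    done
}
```

Running `watch_neigh` as root in place of the plain loop would print one line per transition (STALE, DELAY, and the lladdr change) instead of one line per second.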

Comment 12 Jianlin Shi 2017-11-02 07:50:06 UTC

reproducer failed on RHEL-7.4:
https://beaker.engineering.redhat.com/jobs/2122415

passed on 3.10.0-766:
https://beaker.engineering.redhat.com/jobs/2122416

Comment 13 errata-xmlrpc 2018-04-10 20:01:25 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1062
