Bug 1708717

Summary: backport 5.x IPv6 fix: neighbour: arp_cache: neighbor table overflow!
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 29
CC: airlied, bskeggs, edgar.hoch, hdegoede, ichavero, itamar, jan.kratochvil, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kas, kernel-maint, linville, mchehab, mjg59, steved
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: kernel-5.1.8-300.fc30 kernel-5.1.8-200.fc29
Last Closed: 2019-06-12 14:47:57 UTC
Type: Bug
Attachments:
kernel.spec patch to build F-29 kernel (flags: none)

Description Jan Kratochvil 2019-05-10 16:39:38 UTC
1. Please describe the problem:

After a few days of running kernel-5.0.x, my IPv6 connectivity becomes very unreliable and the kernel log is full of:
  neighbour: arp_cache: neighbor table overflow

2. What is the Version-Release number of the kernel:

kernel-5.0.6-200.fc29.x86_64 fails
kernel-5.0.11-200.fc29.x86_64 fails

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

kernel-4.20.16-200.fc29.x86_64 works fine

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes. Run IPv6 over OpenVPN on local gigabit Ethernet and ping the peer; sometimes it stops responding:
17:24:08.288597 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 717, length 1408
17:24:08.288632 IP6 fc00::67:1 > fc00::67:2: ICMP6, echo reply, seq 717, length 1408
17:24:08.298252 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 718, length 1408
17:24:08.298287 IP6 fc00::67:1 > fc00::67:2: ICMP6, echo reply, seq 718, length 1408
17:24:08.308274 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 719, length 1408
17:24:08.319391 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 720, length 1408
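
To pick the unanswered sequence numbers out of a longer capture, something like the following Python sketch could be used. It assumes tcpdump's one-line ICMP6 output exactly as shown above; the function name `unanswered_requests` is hypothetical, and the sketch ignores addresses and sequence-number wraparound.

```python
import re

# Matches tcpdump lines such as:
#   17:24:08.288597 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 717, length 1408
ECHO_RE = re.compile(r"ICMP6, echo (request|reply), seq (\d+)")


def unanswered_requests(lines):
    """Return the sequence numbers of echo requests that have no echo reply."""
    requests = []
    replies = set()
    for line in lines:
        match = ECHO_RE.search(line)
        if not match:
            continue  # not an ICMP6 echo line
        kind, seq = match.group(1), int(match.group(2))
        if kind == "request":
            requests.append(seq)
        else:
            replies.add(seq)
    # Keep the original order of the requests.
    return [seq for seq in requests if seq not in replies]


# The capture excerpt from this report.
capture = """\
17:24:08.288597 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 717, length 1408
17:24:08.288632 IP6 fc00::67:1 > fc00::67:2: ICMP6, echo reply, seq 717, length 1408
17:24:08.298252 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 718, length 1408
17:24:08.298287 IP6 fc00::67:1 > fc00::67:2: ICMP6, echo reply, seq 718, length 1408
17:24:08.308274 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 719, length 1408
17:24:08.319391 IP6 fc00::67:2 > fc00::67:1: ICMP6, echo request, seq 720, length 1408
""".splitlines()

print(unanswered_requests(capture))  # [719, 720]
```

For the excerpt above, requests 719 and 720 are the ones left without a reply.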


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

The latest Rawhide kernel is kernel-5.1.0-1.fc31. It does not yet contain the fix, so I expect it to fail as well, but I have not verified that.

6. Are you running any modules that are not shipped directly with Fedora's kernel?

No.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
May 10 17:42:04 host1 named[3391702]: ../../../../lib/isc/unix/socket.c:2173: unexpected error:
May 10 17:42:04 host1 named[3391702]: internal_send: 2001:500:c::1#53: Invalid argument
May 10 17:42:04 host1 kernel: net_ratelimit: 14 callbacks suppressed
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
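
For anyone who wants to quantify this in their own `dmesg.txt`, here is a minimal Python sketch that counts the overflow messages per neighbour table. It assumes the message format shown in the log above; the function name `summarize` is hypothetical. Note that the `net_ratelimit` counter covers all rate-limited network messages, so it is only an upper bound on the hidden overflow messages.

```python
import re
from collections import Counter

# Message formats as they appear in the kernel log quoted in this report.
OVERFLOW_RE = re.compile(r"neighbour: (\w+): neighbor table overflow!")
SUPPRESSED_RE = re.compile(r"net_ratelimit: (\d+) callbacks suppressed")


def summarize(lines):
    """Return (overflow count per table, total rate-limited callbacks)."""
    per_table = Counter()
    suppressed = 0
    for line in lines:
        match = OVERFLOW_RE.search(line)
        if match:
            per_table[match.group(1)] += 1
            continue
        match = SUPPRESSED_RE.search(line)
        if match:
            suppressed += int(match.group(1))
    return per_table, suppressed


# The log excerpt from this report.
log = """\
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
May 10 17:42:04 host1 named[3391702]: ../../../../lib/isc/unix/socket.c:2173: unexpected error:
May 10 17:42:04 host1 named[3391702]: internal_send: 2001:500:c::1#53: Invalid argument
May 10 17:42:04 host1 kernel: net_ratelimit: 14 callbacks suppressed
May 10 17:42:04 host1 kernel: neighbour: ndisc_cache: neighbor table overflow!
""".splitlines()

per_table, suppressed = summarize(log)
for table, count in per_table.items():
    print(f"{table}: {count} overflow messages")
print(f"rate-limited callbacks: {suppressed}")
```

The table name (`ndisc_cache` here, `arp_cache` in other logs) is taken straight from the message, so the same sketch covers both.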

The fix is:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64c6f4bbca748c3b2101469a76d88b7cd1c00476

Comment 1 Edgar Hoch 2019-05-18 18:40:10 UTC
I get a similar error running Fedora 29:

kernel-5.0.16-200.fc29.x86_64
bind-chroot-9.11.6-2.P1.fc29.x86_64
dhcp-server-4.3.6-31.fc29.x86_64

MAC and IP addresses are replaced by XXX and YYY:

Mai 18 20:34:16 kernel: neighbour: arp_cache: neighbor table overflow!
Mai 18 20:34:16 kernel: neighbour: arp_cache: neighbor table overflow!
Mai 18 20:34:16 kernel: neighbour: arp_cache: neighbor table overflow!
Mai 18 20:34:16 named[1367]: ../../../../lib/isc/unix/socket.c:2176: unexpected error:
Mai 18 20:34:16 kernel: neighbour: arp_cache: neighbor table overflow!
Mai 18 20:34:16 named[1367]: internal_send: YYY#59267: Invalid argument
Mai 18 20:34:16 named[1367]: client @0xXXX YYY#59267 (fedoraproject.org): view ims: error sending response: invalid file
Mai 18 20:34:16 named[1367]: ../../../../lib/isc/unix/socket.c:2176: unexpected error:
Mai 18 20:34:16 named[1367]: internal_send: YYY#43658: Invalid argument

It seems that the patch linked in the previous comment is still not in a released kernel?

Is there a workaround for this problem (besides restarting the server very often)?

Comment 2 Jan Kratochvil 2019-05-18 19:07:47 UTC
Created attachment 1570674
kernel.spec patch to build F-29 kernel

(In reply to Edgar Hoch from comment #1)
> It seems that the patch linked in the previous comment is still not in a released
> kernel?

It is present in the following build (but not in 5.1.x):
kernel-5.2.0-0.rc0.git6.1.fc31 = cbd87613cc306a7ccd5ce4006daf8dc737922c3f = has the IPv6 fix

However, fc31 kernels cannot be installed on F-29. They can be installed on F-30 (but my upgrade from F-29 to F-30 failed today because of LUKS).


> Is there a workaround for this problem (besides restarting the server very
> often)?

I build my own F-29 kernels with the attached patch.

Comment 3 Jan Kratochvil 2019-05-28 21:48:17 UTC
The problem also still affects kernel-5.0.19-200.fc29.x86_64

Comment 4 Jan "Yenya" Kasprzak 2019-06-05 17:37:24 UTC
Also affected:  5.1.5-300.fc30.x86_64

On my server/router the problem manifests 3 to 4 days after reboot.

Comment 5 Jan "Yenya" Kasprzak 2019-06-10 08:58:14 UTC
FWIW, also affected: 5.1.6-300.fc30.x86_64

Comment 6 Jan Kratochvil 2019-06-10 09:05:46 UTC
And also affected: kernel-5.1.7-200.fc29.x86_64 :-)

Comment 7 Fedora Update System 2019-06-10 15:19:53 UTC
FEDORA-2019-c03eda3cc6 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-c03eda3cc6

Comment 8 Fedora Update System 2019-06-10 15:19:54 UTC
FEDORA-2019-83858fc57b has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-83858fc57b

Comment 9 Fedora Update System 2019-06-11 01:19:17 UTC
kernel-5.1.8-300.fc30 and kernel-headers-5.1.8-300.fc30 have been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-c03eda3cc6

Comment 10 Fedora Update System 2019-06-11 01:45:36 UTC
kernel-5.1.8-200.fc29 and kernel-headers-5.1.8-200.fc29 have been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-83858fc57b

Comment 11 Fedora Update System 2019-06-12 14:47:57 UTC
kernel-5.1.8-300.fc30 and kernel-headers-5.1.8-300.fc30 have been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2019-06-13 01:38:25 UTC
kernel-5.1.8-200.fc29 and kernel-headers-5.1.8-200.fc29 have been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.
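
To check whether a given Fedora kernel build is at or after the fixed 5.1.8 builds named in this report, a small Python sketch along these lines might help. The helper names `kernel_version` and `at_or_after_fixed_build` are hypothetical, and the check only compares the upstream version number. A `False` result does not by itself mean the build is affected (kernel-4.20.16 predates the problem, for example).

```python
import re

# From "Fixed In Version": kernel-5.1.8-300.fc30 and kernel-5.1.8-200.fc29.
FIXED = (5, 1, 8)


def kernel_version(name):
    """Extract (major, minor, patch) from a build name.

    Accepts names such as 'kernel-5.0.19-200.fc29.x86_64'
    or '5.1.5-300.fc30.x86_64'.
    """
    match = re.search(r"(?:^|-)(\d+)\.(\d+)\.(\d+)-", name)
    if not match:
        raise ValueError(f"cannot parse kernel version from {name!r}")
    return tuple(int(part) for part in match.groups())


def at_or_after_fixed_build(name):
    """True if the build's upstream version is 5.1.8 or later."""
    return kernel_version(name) >= FIXED


# Builds mentioned in this report.
builds = [
    "kernel-5.0.6-200.fc29.x86_64",
    "kernel-5.0.19-200.fc29.x86_64",
    "5.1.5-300.fc30.x86_64",
    "kernel-5.1.7-200.fc29.x86_64",
    "kernel-5.1.8-200.fc29",
    "kernel-5.1.8-300.fc30",
]
for build in builds:
    status = "at or after fixed build" if at_or_after_fixed_build(build) else "before fixed build"
    print(f"{build}: {status}")
```

On a running system, the string to test would typically come from `uname -r`.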