Bug 1813691 - Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
Summary: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 35
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-15 14:12 UTC by Juan Orti
Modified: 2022-12-13 15:14 UTC (History)
45 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-13 15:14:31 UTC
Type: Bug
Embargoed:


Attachments
journalctl --no-hostname -k (143.97 KB, text/plain)
2020-03-15 14:12 UTC, Juan Orti
no flags Details

Description Juan Orti 2020-03-15 14:12:51 UTC
Created attachment 1670309 [details]
journalctl --no-hostname -k

1. Please describe the problem:

I'm seeing many of these errors on a dual-stack Fedora 31 server:

~~~
mar 15 11:45:02 kernel: dst_alloc: 576 callbacks suppressed
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:02 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mar 15 11:45:11 kernel: dst_alloc: 813 callbacks suppressed
~~~

sysctl:

net.ipv4.route.max_size = 2147483647
net.ipv6.route.max_size = 4096

# ip route show cache
(empty)

# ip -6 route show cache
(empty)

# ip route show table all
default via 192.168.10.1 dev br-lan proto dhcp metric 425 
192.168.10.0/24 dev br-lan proto kernel scope link src 192.168.10.10 metric 425 
192.168.102.0/24 dev storage proto kernel scope link src 192.168.102.1 linkdown 
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
broadcast 192.168.10.0 dev br-lan table local proto kernel scope link src 192.168.10.10 
local 192.168.10.10 dev br-lan table local proto kernel scope host src 192.168.10.10 
broadcast 192.168.10.255 dev br-lan table local proto kernel scope link src 192.168.10.10 
broadcast 192.168.102.0 dev storage table local proto kernel scope link src 192.168.102.1 linkdown 
local 192.168.102.1 dev storage table local proto kernel scope host src 192.168.102.1 
broadcast 192.168.102.255 dev storage table local proto kernel scope link src 192.168.102.1 linkdown 
::1 dev lo proto kernel metric 256 pref medium
2001:db8::10 dev br-lan proto kernel metric 425 pref medium
2001:db8::/64 dev br-lan proto ra metric 425 pref medium
2001:db8::/56 via fe80::5054:ff:fe4f:611f dev br-lan proto ra metric 425 pref high
fdcf:5e58:5158::10 dev br-lan proto kernel metric 425 pref medium
fdcf:5e58:5158::/64 dev br-lan proto ra metric 425 pref medium
fdcf:5e58:5158::/48 via fe80::5054:ff:fe4f:611f dev br-lan proto ra metric 425 pref high
fe80::/64 dev br-dmz proto kernel metric 256 pref medium
fe80::/64 dev vnet0 proto kernel metric 256 pref medium
fe80::/64 dev br-lan proto kernel metric 425 pref medium
default via fe80::5054:ff:fe4f:611f dev br-lan proto ra metric 425 pref high
local ::1 dev lo table local proto kernel metric 0 pref medium
local 2001:db8::10 dev br-lan table local proto kernel metric 0 pref medium
local 2001:db8:6514:a449:b7e9:df4f dev br-lan table local proto kernel metric 0 pref medium
local fdcf:5e58:5158::10 dev br-lan table local proto kernel metric 0 pref medium
local fdcf:5e58:5158:0:697d:e027:aee5:d688 dev br-lan table local proto kernel metric 0 pref medium
local fe80::9c61:30ff:fe49:e82a dev br-dmz table local proto kernel metric 0 pref medium
local fe80::b073:50de:bc39:f289 dev br-lan table local proto kernel metric 0 pref medium
local fe80::fc54:ff:fe48:9748 dev vnet0 table local proto kernel metric 0 pref medium
ff00::/8 dev enp5s0.5 table local metric 256 pref medium
ff00::/8 dev enp5s0.1 table local metric 256 pref medium
ff00::/8 dev enp5s0 table local metric 256 pref medium
ff00::/8 dev br-dmz table local metric 256 pref medium
ff00::/8 dev vnet0 table local metric 256 pref medium
ff00::/8 dev br-lan table local metric 256 pref medium

# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether f4:6d:04:61:ab:b4 brd ff:ff:ff:ff:ff:ff
3: enp5s0.5@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-dmz state UP group default qlen 1000
    link/ether f4:6d:04:61:ab:b4 brd ff:ff:ff:ff:ff:ff
4: enp5s0.1@enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-lan state UP group default qlen 1000
    link/ether f4:6d:04:61:ab:b4 brd ff:ff:ff:ff:ff:ff
5: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 9a:e8:af:f0:c9:47 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.10/24 brd 192.168.10.255 scope global dynamic noprefixroute br-lan
       valid_lft 1158sec preferred_lft 1158sec
    inet6 2001:db8::10/128 scope global dynamic noprefixroute 
       valid_lft 1457sec preferred_lft 1457sec
    inet6 fdcf:5e58:5158::10/128 scope global dynamic noprefixroute 
       valid_lft 1457sec preferred_lft 1457sec
    inet6 fdcf:5e58:5158:0:697d:e027:aee5:d688/64 scope global noprefixroute 
       valid_lft forever preferred_lft forever
    inet6 2001:db8:6514:a449:b7e9:df4f/64 scope global dynamic noprefixroute 
       valid_lft 47554sec preferred_lft 47554sec
    inet6 fe80::b073:50de:bc39:f289/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
6: br-dmz: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 9e:61:30:49:e8:2a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::9c61:30ff:fe49:e82a/64 scope link 
       valid_lft forever preferred_lft forever
7: storage: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:05:9f:a6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.102.1/24 brd 192.168.102.255 scope global storage
       valid_lft forever preferred_lft forever
8: storage-nic: <BROADCAST,MULTICAST> mtu 9000 qdisc fq_codel master storage state DOWN group default qlen 1000
    link/ether 52:54:00:05:9f:a6 brd ff:ff:ff:ff:ff:ff
9: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br-dmz state UNKNOWN group default qlen 1000
    link/ether fe:54:00:48:97:48 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe48:9748/64 scope link 
       valid_lft forever preferred_lft forever

2. What is the Version-Release number of the kernel:

kernel 5.5.8-200.fc31.x86_64

3. Did it work previously in Fedora? If so, in which kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

The first time I see this in the journal is on 2020-02-07, with kernel 5.4.15-200.fc31.x86_64.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Not sure how to reproduce it. I run a Transmission server that opens many connections and have some VMs on two bridges, but I don't do any routing on this box.

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Not tested.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 sheepdestroyer 2020-03-28 11:35:45 UTC
Same problem for me, also on Fedora 31:

mars 28 20:08:05 sheepora-X230 kernel: dst_alloc: 25 callbacks suppressed
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:05 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: dst_alloc: 6 callbacks suppressed
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:14 sheepora-X230 kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
mars 28 20:08:19 sheepora-X230 kernel: dst_alloc: 206 callbacks suppressed


sysctl:

net.ipv4.route.max_size = 2147483647
net.ipv6.route.max_size = 4096

# ip route show cache
(empty)

# ip -6 route show cache
(empty)

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1452 qdisc fq_codel state UP group default qlen 1000
    link/ether 3c:97:0e:a9:41:60 brd ff:ff:ff:ff:ff:ff
    inet 192.168.11.8/24 brd 192.168.11.255 scope global dynamic noprefixroute enp0s25
       valid_lft 163712sec preferred_lft 163712sec
    inet6 240b:250:84c0:a00:77c4:bfdf:9b37:9215/64 scope global dynamic noprefixroute 
       valid_lft 86382sec preferred_lft 14382sec
    inet6 fe80::ee46:207e:6ec9:a7a6/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 06:83:1e:dd:dc:bf brd ff:ff:ff:ff:ff:ff
4: wg1: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1412 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet 192.168.255.2/32 scope global wg1
       valid_lft forever preferred_lft forever
    inet6 fd00:7::2/48 scope global 
       valid_lft forever preferred_lft forever


$ ip route show table all
default dev wg1 table 51820 scope link 
default via 192.168.11.1 dev enp0s25 proto dhcp metric 100 
192.168.11.0/24 dev enp0s25 proto kernel scope link src 192.168.11.8 metric 100 
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
broadcast 192.168.11.0 dev enp0s25 table local proto kernel scope link src 192.168.11.8 
local 192.168.11.8 dev enp0s25 table local proto kernel scope host src 192.168.11.8 
broadcast 192.168.11.255 dev enp0s25 table local proto kernel scope link src 192.168.11.8 
local 192.168.255.2 dev wg1 table local proto kernel scope host src 192.168.255.2 
default dev wg1 table 51820 metric 1024 pref medium
::1 dev lo proto kernel metric 256 pref medium
240b:250:84c0:a00::/64 dev enp0s25 proto ra metric 100 pref medium
fd00:7::/48 dev wg1 proto kernel metric 256 pref medium
fe80::/64 dev enp0s25 proto kernel metric 100 pref medium
default via fe80::1ac2:bfff:fe07:7eee dev enp0s25 proto ra metric 20100 pref medium
local ::1 dev lo table local proto kernel metric 0 pref medium
local 240b:250:84c0:a00:77c4:bfdf:9b37:9215 dev enp0s25 table local proto kernel metric 0 pref medium
local fd00:7::2 dev wg1 table local proto kernel metric 0 pref medium
local fe80::ee46:207e:6ec9:a7a6 dev enp0s25 table local proto kernel metric 0 pref medium
ff00::/8 dev enp0s25 table local metric 256 pref medium
ff00::/8 dev wg1 table local metric 256 pref medium

Comment 2 Juan Orti 2020-04-01 11:07:45 UTC
It happens in Fedora 32 too.

kernel-5.6.0-300.fc32.x86_64

Comment 3 Christian Kujau 2020-05-24 21:57:31 UTC
Happened here in F32 (5.6.12-300.fc32.x86_64) shortly after I created a tunnel with wireguard-tools:

$ wg-quick up vpn-provider
[#] ip link add vpn-provider type wireguard
[#] wg setconf vpn-provider /dev/fd/63
[#] ip -4 address add 10.60.140.40/32 dev vpn-provider
[#] ip -6 address add fc00:bbbb:bbbb:bb01::1:9f27/128 dev vpn-provider
[#] ip link set mtu 1420 up dev vpn-provider
[#] mount `1.1.1.1' /etc/resolv.conf
[#] wg set vpn-provider fwmark 51820
[#] ip -6 route add ::/0 dev vpn-provider table 51820
[#] ip -6 rule add not fwmark 51820 table 51820
[#] ip -6 rule add table main suppress_prefixlength 0
[#] nft -f /dev/fd/63
[#] ip -4 route add 0.0.0.0/0 dev vpn-provider table 51820
[#] ip -4 rule add not fwmark 51820 table 51820
[#] ip -4 rule add table main suppress_prefixlength 0
[#] sysctl -q net.ipv4.conf.all.src_valid_mark=1
[#] nft -f /dev/fd/63

$ dmesg | tail -1
kernel: Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.

$ sysctl net.ipv{4,6}.route.max_size
net.ipv4.route.max_size = 2147483647
net.ipv6.route.max_size = 4096


Both "ip route show cache" and "ip -6 route show cache" show an empty result. For IPv4 that is to be expected, because ip-route(8) states:

  Starting with Linux kernel version 3.6, there is no routing cache for IPv4 anymore. Hence ip route show cached 
  will never print any entries on systems with this or newer kernel versions.


Similarly, Documentation/networking/ip-sysctl.txt explains:

 route/max_size - INTEGER
        Maximum number of routes allowed in the kernel.  Increase
        this when using large numbers of interfaces and/or routes.
        From linux kernel 3.6 onwards, this is deprecated for ipv4
        as route cache is no longer used.


So, if net.ipv4.route.max_size isn't used any more, why is it set to its maximum value (2^31-1)? Raising net.ipv6.route.max_size to 2147483647 (the same as the IPv4 setting) makes the warning go away, but it's unclear what the implications of that setting are. And why does "ip -6 route show cache" not show any entries, even though the kernel claims the route cache is full?
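For anyone who wants to try raising the IPv6 limit, a minimal sketch (the value 2147483647 simply mirrors the IPv4 default; the drop-in file name is arbitrary, and the long-term implications of the setting are unclear):

```shell
# Raise the IPv6 route cache limit at runtime (takes effect immediately, needs root)
sudo sysctl -w net.ipv6.route.max_size=2147483647

# Persist the setting across reboots via a sysctl.d drop-in
echo 'net.ipv6.route.max_size = 2147483647' | sudo tee /etc/sysctl.d/90-ipv6-route.conf

# Load the drop-in now to verify it parses
sudo sysctl -p /etc/sysctl.d/90-ipv6-route.conf
```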

Comment 4 Christian Kujau 2020-05-24 22:06:53 UTC
FWIW, that printk was added by the following commit (around v5.1-rc1):

 commit 22c2ad616b74f3de2256b242572ab449d031d941
 Author: Peter Oskolkov <posk>
 Date:   Wed Jan 16 08:50:28 2019 -0800

    net: add a route cache full diagnostic message
    
    In some testing scenarios, dst/route cache can fill up so quickly
    that even an explicit GC call occasionally fails to clean it up. This leads
    to sporadically failing calls to dst_alloc and "network unreachable" errors
    to the user, which is confusing.
    
    This patch adds a diagnostic message to make the cause of the failure
    easier to determine.
    
    Signed-off-by: Peter Oskolkov <posk>
    Signed-off-by: David S. Miller <davem>

Maybe this is something to be discussed upstream?

Comment 5 Garry T. Williams 2020-08-23 19:26:50 UTC
FYI, messages only started appearing here after installing 5.7.16-200.fc32.x86_64.
5.7.15-200.fc32.x86_64 and earlier were clean.

Comment 6 Garry T. Williams 2020-10-31 15:16:52 UTC
I just noticed that these messages are gone. I think it happened quite a while ago. I am running 5.8.16-300.fc33.x86_64 at the moment, but there is no trace of these in my entire journal, which begins with 5.6.6-300.fc32.x86_64. I think this bug needs to be closed. (The CC added by @mat prompted me to update this bug.)

Comment 7 Mathieu Chouquet-Stringer 2020-10-31 17:57:39 UTC
Still have them with 5.8.16-300.fc33.x86_64...

I just don't know what triggers them though.

journalctl|grep 'Route cache is full'|cut -c1-15|sort|uniq -c shows the following:
      4 Oct 12 01:47:37
      2 Oct 19 01:40:53
      3 Oct 19 01:44:43
      3 Oct 26 01:05:51
      9 Oct 26 01:06:06
      3 Oct 26 01:16:39
      1 Oct 26 01:16:40
      3 Oct 26 01:24:59

So for me it's always at night, from Sunday to Monday...

Comment 8 Juan Orti 2020-11-01 07:03:11 UTC
For me the messages have disappeared, with the last ones on September 19 (kernel-5.8.9-200.fc32.x86_64)

Comment 9 Dustin C. Hatch 2020-12-17 23:45:22 UTC
I am getting these messages today on 5.9.8-100.fc32.x86_64

Comment 10 Arne 2020-12-21 12:43:46 UTC
Same here. I am using Wireguard, in case that could be the reason.
Linux lenovo2 5.9.14-200.fc33.x86_64 #1 SMP Fri Dec 11 14:30:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Comment 11 Anthony Messina 2021-02-25 16:29:54 UTC
I also see this issue on 5.10.18-200.fc33.x86_64

Comment 12 Anthony Messina 2021-02-27 05:39:15 UTC
(In reply to Anthony Messina from comment #11)
> I also see this issue on 5.10.18-200.fc33.x86_64

Sorry for not being more complete in my previous post.

I see this in 5.10.19-200.fc33.x86_64 when using a wireguard network connection.  When it occurs, not just the wireguard connection is affected; the underlying network is unable to route as well:

dmesg output:
Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

Setting net.ipv6.route.max_size = 2147483647 as in comment #3 removes the warning and allows connections to resume.

Comment 13 Jens Reimann 2021-03-02 14:38:01 UTC
I just ran into the same issue:

The `dmesg` log is spammed with:
~~~
[113050.599558] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113052.663065] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113052.663341] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113056.148709] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113056.148722] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113056.149862] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113056.149866] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[113056.150473] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
~~~

I am using Wireguard as well. However, even disabling the wireguard connection doesn't bring back connectivity. Reconnecting the network doesn't help either.

Fedora: 33 - Linux xxx 5.10.18-200.fc33.x86_64 #1 SMP Tue Feb 23 22:06:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Comment 14 Anthony Messina 2021-03-26 23:15:49 UTC
This continues with kernel-5.11.9-200.fc33.x86_64

Comment 15 Juan Orti 2021-04-06 13:55:55 UTC
Still happening with kernel-5.11.11-300.fc34.x86_64

# sysctl net.ipv4.route.max_size
net.ipv4.route.max_size = 2147483647

# sysctl net.ipv6.route.max_size
net.ipv6.route.max_size = 4096

# ip -s route show cache
(empty)

# ip route show table all | wc -l
103

# lnstat -s1 -c1 -f rt_cache
rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|in_slow_|in_no_ro|  in_brd|in_marti|in_marti| out_hit|out_slow|out_slow|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|out_hlis|
        |        |     tot|      mc|     ute|        |  an_dst|  an_src|        |    _tot|     _mc|        |      ed|    miss| verflow| _search|t_search|
      37|       0|    1539|    2307|       5|     505|       0|       5|       0|   47605|   44968|       0|       0|       0|       0|       0|       0|


# journalctl -b -k --no-hostname -o short-monotonic
[11554.010238] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.010256] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.018918] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.021575] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.027614] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.036264] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.044940] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.053597] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.062262] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11554.070916] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.209053] kernel: dst_alloc: 762 callbacks suppressed
[11559.209060] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.209105] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.214157] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.222898] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.231516] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.240227] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.248859] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.257557] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.266206] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11559.274901] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[11623.752407] kernel: dst_alloc: 626 callbacks suppressed

Comment 16 Ewoud Kohl van Wijngaarden 2021-04-13 22:16:16 UTC
What helped for me was https://serverfault.com/questions/902161/linux-host-randomly-stops-answering-ipv6-neighbor-solicitation-requests/907895#907895

tl;dr: set IPv6_rpfilter to no in /etc/firewalld/firewalld.conf
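A sketch of applying that non-interactively (this assumes the file currently contains the default `IPv6_rpfilter=yes`; the `-i.bak` keeps a backup):

```shell
# Turn off firewalld's IPv6 reverse-path filter, which the linked answer
# identifies as the cause of the dropped packets
sudo sed -i.bak 's/^IPv6_rpfilter=yes/IPv6_rpfilter=no/' /etc/firewalld/firewalld.conf

# firewalld only reads firewalld.conf at startup, so restart it
sudo systemctl restart firewalld
```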

Comment 17 Benjamin Xiao 2021-04-26 21:11:31 UTC
Thanks Ewoud, your workaround appears to be working for me on Arch Linux running firewalld 0.9.3 and kernel 5.11.16. Prior to the workaround, I would get the "Route cache is full" messages and my IPv6 connection would stop working entirely.

Comment 18 Harry Coin 2021-05-12 15:50:30 UTC
Issue exists on fc34 workstation running dual stack freeipa, bind9 / named

May 12 10:48:13 registry1.1.quietfountain.com kernel: dst_alloc: 1474 callbacks suppressed
May 12 10:48:14 registry1.1.quietfountain.com kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
May 12 10:48:14 registry1.1.quietfountain.com kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
May 12 10:48:14 registry1.1.quietfountain.com kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

Comment 19 Michael Catanzaro 2021-06-28 19:45:19 UTC
FWIW I started seeing these errors in ABRT after enabling Wireguard as well.

(In reply to Ewoud Kohl van Wijngaarden from comment #16)
> What helped for me was
> https://serverfault.com/questions/902161/linux-host-randomly-stops-answering-
> ipv6-neighbor-solicitation-requests/907895#907895
> 
> tl;dr: set IPv6_rpfilter to no in /etc/firewalld/firewalld.conf

It's all gobbledygook to me, but I assume the firewalld devs will understand what's going on, so I'm going to reassign this to the firewalld component for further triage, even though this might not be a bug in firewalld.

Comment 20 Eric Garver 2021-06-28 20:36:04 UTC
(In reply to Ewoud Kohl van Wijngaarden from comment #16)
> What helped for me was
> https://serverfault.com/questions/902161/linux-host-randomly-stops-answering-
> ipv6-neighbor-solicitation-requests/907895#907895
> 
> tl;dr: set IPv6_rpfilter to no in /etc/firewalld/firewalld.conf

Neighbor solicitations have been explicitly allowed since v0.6.0; this was added to address bug 1575431.
https://github.com/firewalld/firewalld/commit/3d6a5063566319b5df58c6f738f203e88724961e

So something else is going on for this bug.

Comment 21 Michael Catanzaro 2021-06-28 20:42:52 UTC
I almost reassigned this back to the kernel, but comment #17 indicates the firewalld workaround is effective with firewalld 0.9.3. Something's up!

Comment 22 Eric Garver 2021-06-29 12:21:03 UTC
Can anyone experiencing this issue enable --set-log-denied=all and then post the rejected packets that get logged in dmesg? It would help to know which packets are being rejected so we can determine what might be going on.
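A minimal sketch of collecting that information (the grep pattern uses the rpfilter_DROP and FINAL_REJECT prefixes firewalld attaches to its log rules; needs root):

```shell
# Log all denied packets (applies to both runtime and permanent configuration)
sudo firewall-cmd --set-log-denied=all

# Watch the kernel log for the denial messages alongside the route cache warnings
sudo dmesg --follow | grep -E 'rpfilter_DROP|FINAL_REJECT|Route cache is full'
```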

Comment 23 Eric Garver 2021-06-29 12:24:49 UTC
Possibly related is this firewalld fix that addresses issues with IPv6_rpfilter and wireguard: https://github.com/firewalld/firewalld/commit/f250c2c507d63419a2c263f3adb47cef93613a5f
This fix has been backported to the stable-0.9 branch, but no release contains the fix yet.

Comment 24 Michael Catanzaro 2021-06-29 14:52:18 UTC
(In reply to Eric Garver from comment #22)
> Can anyone experiencing this issue enable --set-log-denied=all, then post
> the rejected packets that get logged in dmesg. It would help to know what
> packets are being rejected so we can determine what might be going on.

I ran 'firewall-cmd --set-log-denied=all' and then captured the following output:

[10031.187217] rpfilter_DROP: IN=enp4s0 OUT= MAC=33:33:00:00:00:01:d8:58:d7:00:1f:b0:86:dd SRC=fe80:0000:0000:0000:da58:d7ff:fe00:1fb0 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=72 TC=0 HOPLIMIT=1 FLOWLBL=0 PROTO=ICMPv6 TYPE=130 CODE=0 
[10034.247688] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10034.877687] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10037.255625] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10037.405620] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10046.546713] FINAL_REJECT: IN=enp4s0 OUT= MAC=01:00:5e:00:00:01:d8:58:d7:00:1f:b0:08:00 SRC=0.0.0.0 DST=224.0.0.1 LEN=32 TOS=0x00 PREC=0xC0 TTL=1 ID=0 DF PROTO=2 
[10046.737481] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10047.741342] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10051.642251] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10052.349237] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10054.646178] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10054.949155] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10061.418146] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10061.620976] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10069.084802] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10070.012788] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10081.086495] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10081.660479] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10083.340990] STATE_INVALID_DROP: IN=wg0 OUT= MAC= SRC=140.82.113.5 DST=192.168.2.2 LEN=40 TOS=0x00 PREC=0x00 TTL=51 ID=0 DF PROTO=TCP SPT=443 DPT=60324 WINDOW=0 RES=0x00 RST URGP=0 
[10083.344884] STATE_INVALID_DROP: IN=wg0 OUT= MAC= SRC=140.82.114.3 DST=192.168.2.2 LEN=40 TOS=0x00 PREC=0x00 TTL=52 ID=0 DF PROTO=TCP SPT=443 DPT=47630 WINDOW=0 RES=0x00 RST URGP=0 
[10083.344901] STATE_INVALID_DROP: IN=wg0 OUT= MAC= SRC=140.82.114.3 DST=192.168.2.2 LEN=40 TOS=0x00 PREC=0x00 TTL=52 ID=0 DF PROTO=TCP SPT=443 DPT=47630 WINDOW=0 RES=0x00 RST URGP=0 
[10090.726252] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10090.820254] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10095.515129] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10096.252097] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10103.894922] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10104.173025] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10104.173034] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10104.173038] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10104.173048] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2a04:4e42:0400:0000:0000:0000:0000:0323 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=58 FLOWLBL=650978 PROTO=TCP SPT=443 DPT=45298 WINDOW=65535 RES=0x00 ACK SYN URGP=0 
[10104.763888] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10105.202256] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10105.202264] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10105.202269] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10105.202279] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2a04:4e42:0400:0000:0000:0000:0000:0323 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=58 FLOWLBL=244046 PROTO=TCP SPT=443 DPT=45298 WINDOW=65535 RES=0x00 ACK SYN URGP=0 
[10107.222567] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10107.222575] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10107.222589] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2a04:4e42:0400:0000:0000:0000:0000:0323 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=58 FLOWLBL=300177 PROTO=TCP SPT=443 DPT=45298 WINDOW=65535 RES=0x00 ACK SYN URGP=0 
[10111.345993] dst_alloc: 1 callbacks suppressed
[10111.345997] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10111.346004] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10111.346009] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[10111.346018] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2a04:4e42:0400:0000:0000:0000:0000:0323 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=52 FLOWLBL=461719 PROTO=TCP SPT=443 DPT=45298 WINDOW=65535 RES=0x00 ACK SYN URGP=0

There's more, but that seems like probably enough.

(In reply to Eric Garver from comment #23)
> Possibly related is this firewalld fix that addresses issues with
> IPv6_rpfilter and wireguard:
> https://github.com/firewalld/firewalld/commit/
> f250c2c507d63419a2c263f3adb47cef93613a5f
> This fix has been backported to the stable-0.9 branch, but no release
> contains the fix yet.

Yeah probably. If you want, I could test a scratch-build...?

Comment 25 Eric Garver 2021-06-30 14:14:44 UTC
(In reply to Michael Catanzaro from comment #24)
> (In reply to Eric Garver from comment #23)
> > Possibly related is this firewalld fix that addresses issues with
> > IPv6_rpfilter and wireguard:
> > https://github.com/firewalld/firewalld/commit/
> > f250c2c507d63419a2c263f3adb47cef93613a5f
> > This fix has been backported to the stable-0.9 branch, but no release
> > contains the fix yet.
> 
> Yeah probably. If you want, I could test a scratch-build...?

I was due to cut upstream releases anyways. So here is the bodhi update for v0.9.4 package rebase. It includes the fix mentioned in comment 23.

https://bodhi.fedoraproject.org/updates/FEDORA-2021-5282b5cafd

Comment 26 Eric Garver 2021-07-01 12:00:32 UTC
(In reply to Eric Garver from comment #25)
> (In reply to Michael Catanzaro from comment #24)
> > (In reply to Eric Garver from comment #23)
> > > Possibly related is this firewalld fix that addresses issues with
> > > IPv6_rpfilter and wireguard:
> > > https://github.com/firewalld/firewalld/commit/
> > > f250c2c507d63419a2c263f3adb47cef93613a5f
> > > This fix has been backported to the stable-0.9 branch, but no release
> > > contains the fix yet.
> > 
> > Yeah probably. If you want, I could test a scratch-build...?
> 
> I was due to cut upstream releases anyways. So here is the bodhi update for
> v0.9.4 package rebase. It includes the fix mentioned in comment 23.
> 
> https://bodhi.fedoraproject.org/updates/FEDORA-2021-5282b5cafd

Was anyone willing to test this build? It's on its way to stable.

Comment 27 Michael Catanzaro 2021-07-01 14:25:29 UTC
I'm planning to test it today.

Comment 28 Michael Catanzaro 2021-07-01 18:55:54 UTC
(In reply to Michael Catanzaro from comment #27)
> I'm planning to test it today.

Unfortunately it's definitely not fixed.

(In reply to Michael Catanzaro from comment #24) 
> [10031.187217] rpfilter_DROP: IN=enp4s0 OUT=
> MAC=33:33:00:00:00:01:d8:58:d7:00:1f:b0:86:dd
> SRC=fe80:0000:0000:0000:da58:d7ff:fe00:1fb0
> DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=72 TC=0 HOPLIMIT=1 FLOWLBL=0
> PROTO=ICMPv6 TYPE=130 CODE=0 

I still see a lot of these.

Comment 29 Eric Garver 2021-07-02 12:47:53 UTC
(In reply to Michael Catanzaro from comment #28)
> (In reply to Michael Catanzaro from comment #27)
> > I'm planning to test it today.
> 
> Unfortunately it's definitely not fixed.

Do you mean you're still seeing the route cache warnings?

> 
> (In reply to Michael Catanzaro from comment #24) 
> > [10031.187217] rpfilter_DROP: IN=enp4s0 OUT=
> > MAC=33:33:00:00:00:01:d8:58:d7:00:1f:b0:86:dd
> > SRC=fe80:0000:0000:0000:da58:d7ff:fe00:1fb0
> > DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=72 TC=0 HOPLIMIT=1 FLOWLBL=0
> > PROTO=ICMPv6 TYPE=130 CODE=0 
> 
> I still see a lot of these.

That's an MLD packet (multicast). I don't think that being dropped is a concern.

Comment 30 Michael Catanzaro 2021-07-02 13:24:49 UTC
(In reply to Eric Garver from comment #29)
> Do you mean you're still seeing the route cache warnings?

Yes, loads of them, on both enp4s0 and wg0.

Comment 31 Michael Catanzaro 2021-07-02 14:42:27 UTC
(In reply to Ewoud Kohl van Wijngaarden from comment #16)
> tl;dr: set IPv6_rpfilter to no in /etc/firewalld/firewalld.conf

I tested this workaround and confirmed it "fixes" this bug.
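
The workaround from comment 16 can be applied as follows (a minimal sketch; it assumes the default firewalld config path and that the key currently reads `IPv6_rpfilter=yes`). Note that this disables IPv6 reverse-path filtering, so it relaxes anti-spoofing checks:

```shell
# Disable the IPv6 reverse-path filter in firewalld (workaround, not a fix):
sudo sed -i 's/^IPv6_rpfilter=yes/IPv6_rpfilter=no/' /etc/firewalld/firewalld.conf
sudo systemctl restart firewalld

# Verify the setting took effect:
grep '^IPv6_rpfilter' /etc/firewalld/firewalld.conf
```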

Comment 32 Eric Garver 2021-07-02 14:48:02 UTC
(In reply to Michael Catanzaro from comment #30)
> (In reply to Eric Garver from comment #29)
> > Do you mean you're still seeing the route cache warnings?
> 
> Yes, loads of them, on both enp4s0 and wg0.

Can you show the dropped packets where IN=wg0 ?

Comment 33 Michael Catanzaro 2021-07-02 15:09:04 UTC
I've reverted back to IPv6_rpfilter=yes and rebooted, but the warnings haven't returned yet. I'll check again later today and see if they're back.

Note this means my assumption that IPv6_rpfilter=yes fixed the issue may be wrong. I assumed that lack of warnings meant the bug was definitely gone. But if reverting the change didn't immediately introduce warnings, it may be that I just need to run for longer....

Comment 34 Michael Catanzaro 2021-07-02 16:58:33 UTC
(In reply to Michael Catanzaro from comment #33)
> I've reverted back to IPv6_rpfilter=yes and rebooted, but the warnings
> haven't returned yet. I'll check again later today and see if they're back.

No luck yet. needinfo? me I suppose. Let's see if it happens again or not....

I am almost certain I saw this bug yesterday after installing the new firewalld update and rebooting, but it doesn't want to trigger today....

Comment 35 Juan Orti 2021-07-02 20:45:01 UTC
I haven't experienced this bug for months. I've looked in the journal of several of my Fedora machines, one of them acting as a router with firewalld, and the message is not there anymore.

Comment 36 Michael Catanzaro 2021-07-02 22:21:14 UTC
(In reply to Michael Catanzaro from comment #34)
> I am almost certain I saw this bug yesterday after installing the new
> firewalld update and rebooting, but it doesn't want to trigger today....

OK, a few hours later, I've got lots of it. Here are the last few that involve wg0 (rather than the other interfaces):

[25466.838893] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2604:1580:fe00:0000:dead:beef:cafe:fed1 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=54 FLOWLBL=993086 PROTO=TCP SPT=80 DPT=35096 WINDOW=64260 RES=0x00 ACK SYN URGP=0 
[25468.414482] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25469.282464] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25473.294451] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[25474.834325] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25475.126373] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25475.126381] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25475.126386] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25475.126397] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2604:1580:fe00:0000:dead:beef:cafe:fed1 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=54 FLOWLBL=885495 PROTO=TCP SPT=80 DPT=35096 WINDOW=64260 RES=0x00 ACK SYN URGP=0 
[25475.194322] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25481.323172] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25481.498151] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25485.403055] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25486.242391] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25490.622920] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25491.509616] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25491.509623] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25491.509625] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25491.509633] rpfilter_DROP: IN=wg0 OUT= MAC= SRC=2604:1580:fe00:0000:dead:beef:cafe:fed1 DST=fdae:2785:cc95:0000:0000:0000:0000:0002 LEN=80 TC=0 HOPLIMIT=54 FLOWLBL=818984 PROTO=TCP SPT=80 DPT=35096 WINDOW=64260 RES=0x00 ACK SYN URGP=0 
[25491.553900] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25493.633873] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25494.433832] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25500.352891] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25500.396701] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25503.362595] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[25504.353572] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

I'm not certain what 2604:1580:fe00:0000:dead:beef:cafe:fed1 is, though it's similar to the public address of my Wireguard server running on DigitalOcean. At the risk of revealing my ignorance: maybe a router? :D

fdae:2785:cc95:0000:0000:0000:0000:0002 is the internal (private) address of my Wireguard server.

I see some similar drops on my enp4s0 as well, which is surely unrelated to Wireguard:

[25943.478366] rpfilter_DROP: IN=enp4s0 OUT= MAC=33:33:00:00:00:01:d8:58:d7:00:1f:b0:86:dd SRC=fe80:0000:0000:0000:da58:d7ff:fe00:1fb0 DST=ff02:0000:0000:0000:0000:0000:0000:0001 LEN=72 TC=0 HOPLIMIT=1 FLOWLBL=0 PROTO=ICMPv6 TYPE=130 CODE=0 

The SRC here is my router.

Comment 37 Eric Garver 2021-07-12 18:39:26 UTC
(In reply to Michael Catanzaro from comment #36)
> (In reply to Michael Catanzaro from comment #34)
[..]
> I'm not certain what 2604:1580:fe00:0000:dead:beef:cafe:fed1 is, though it's
> similar to the public address of my Wireguard server running on
> DigitalOcean. At the risk of revealing my ignorance: maybe a router? :D

A lookup says the address belongs to a cloud provider.

> fdae:2785:cc95:0000:0000:0000:0000:0002 is the internal (private) address of
> my Wireguard server.

Can you share your routing tables? Any chance there is asymmetric routing?

Comment 38 Michael Catanzaro 2021-07-12 18:48:35 UTC
(In reply to Eric Garver from comment #37)
> Can you share your routing tables? 

Sure. I'll do this in a private comment, just out of paranoia.

> Any chance there is asymmetric routing?

I don't know what that is.

Comment 59 Florian Westphal 2021-07-28 13:24:06 UTC
Works fine for me.  No FIB drops.

I have:

<internet> === <vmhost> ===  <fedora2>  === wg tunnel === <fedora1>

fedora1 and fedora2 are largely identical, only difference are their IP addresses, host name and
the wg configuration. fedora1 is configured to connect to fedora2 wg endpoint and sets
AllowedIps to 0.0.0.0/0, ::/0. Both are Fedora 34 (kernel 5.13.4-200 x86_64).
So fedora1 is a "client" that forwards all traffic via wg0 and fedora2 is the server that has indirect
internet connectivity via vmhost (router).

ip -6 route get <internet_ip6_address>
... on fedora1 shows "dev wg0".
If I do "ping6 <internet_ip6_address>" on fedora1, "tcpdump -n -i wg0" shows
the incoming icmpv6 echo requests, plus its reply.
Doing the same on public interface shows wg UDP packets plus the plaintext icmpv6 request/reply.
Using ping -s 1500 shows fragmented traffic, as expected.

firewalld is running on both fedora VMs and I can see the "fib" rule with nft.
No drops happen.

fedora1 has the "not from all fwmark ..." rule.  This rule is not present on fedora2, but
I guess that's expected since only fedora1 is configured with "AllowedIPs" set to /0.
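
For anyone wanting to repeat these checks, a rough sketch of the inspection commands (standard iproute2 and nftables tooling assumed; rule names vary by firewalld version):

```shell
# Inspect the rpfilter rule firewalld installs (the "fib" rule mentioned above):
sudo nft list ruleset | grep -B2 -A2 'fib'

# Check for the policy-routing rule wg-quick adds when AllowedIPs covers ::/0
# (the "not from all fwmark ..." rule):
ip -6 rule show
```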

Comment 60 Eric Garver 2021-07-28 14:01:50 UTC
Thanks for the testing in comment 59, fwestpha. This still isn't making sense.

Based on mcatanza's information we have the following:

 1. IPv6 route cache entries that should not exist
   - there shouldn't be any PMTU exceptions

 2. `ip -6 route show cache` yields nothing
   - even though the kernel is complaining the cache is full
   - possible the cache is GC'd fully before this command is run

Pinging sbrivio since he's recently spent time in this area.
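
A sketch of the diagnostics implied by the two points above, for anyone still reproducing (output is environment-dependent):

```shell
# Configured cache limit (default 4096 on these kernels):
sysctl net.ipv6.route.max_size

# Cached route exceptions (e.g. PMTU); per point 2 this may show nothing
# even while the kernel complains, if GC empties it first:
ip -6 route show cache

# Raw IPv6 routing statistics maintained by the kernel (hex counters):
cat /proc/net/rt6_stats
```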

Comment 61 Stefano Brivio 2021-10-25 15:01:28 UTC
Sorry for the enormous delay here. I tried on and off to reproduce this with Wireguard in the background and couldn't find a way to hit it so far. There's no indication this comes from route exceptions (even though that would sound more likely); it might also be regular destination entries.

I wonder, for anyone reading this ticket who can still reproduce the issue: how many CPU threads does your system have? Because, before:

commit d8882935fcae28bceb5f6f56f09cded8d36d85e6
Author: Eric Dumazet <edumazet>
Date:   Fri May 8 07:34:14 2020 -0700

    ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()

administratively-set routing entries would be counted per CPU -- I haven't checked yet what happens with IPv6_rpfilter set to true in firewalld (I guess that wouldn't result in additional per-CPU entries, but I'm not entirely sure yet).

This might be also interesting by the way: https://git.sbruder.de/simon/nixos-config/issues/26 -- it's not firewalld, but the logic appears to be similar. I'll try to replicate something like that with Wireguard next.

Comment 62 Michael Catanzaro 2021-10-25 16:12:16 UTC
(In reply to Stefano Brivio from comment #61)
> I wonder, if somebody reading this ticket can still reproduce the issue: how
> many CPU (threads) does your system have?

I've been running with IPv6_rpfilter=no in my firewalld.conf as a workaround, but I'll remove it now, and after a few hours we'll see if the bug is still happening (I assume so).

My system has 32 CPU threads.

Comment 63 Anthony Messina 2021-10-26 14:40:44 UTC
I've been able to reproduce the issue on F34 when a host's underlying connection has both IPv4 *and* IPv6 and the WireGuard connection has AllowedIPs = 0.0.0.0/0, ::/0

I do not see this issue when the host's underlying connection only has IPv4.

Comment 64 Michael Catanzaro 2021-10-26 14:54:16 UTC
(In reply to Anthony Messina from comment #63)
> I've been able to reproduce the issue on F34 when a host's underlying
> connection has both IPv4 *and* IPv6 and the WireGuard connection has
> AllowedIPs = 0.0.0.0/0, ::/0

I can confirm that my host has IPv6 enabled and my WireGuard connection has this same entry in AllowedIPs.

Comment 65 Florian Westphal 2021-10-29 10:41:27 UTC
Most likely caused by the bug described here:
https://lore.kernel.org/netdev/e022d597-302d-c061-0830-6ed20aa61e56@qtmlabs.xyz/T/#u

Quote: "The kernel leaks memory when a `fib` rule is present in ipv6 nftables firewall rules and a suppress_prefix rule
is present in the IPv6 routing rules (used by certain tools such as wg-quick)."
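
Based on that description, one can check whether a given host has both triggering conditions (a sketch; `nft` requires root, and the exact rule text depends on the firewalld and wg-quick versions):

```shell
# Condition 1: a suppress_prefixlength policy-routing rule (added by wg-quick):
ip -6 rule show | grep suppress

# Condition 2: a "fib" expression in the nftables ruleset
# (added by firewalld's IPv6_rpfilter):
sudo nft list ruleset | grep fib
```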

Comment 66 Eric Garver 2021-10-29 12:40:48 UTC
(In reply to Florian Westphal from comment #65)
> Most likely caused by the bug described here:
> https://lore.kernel.org/netdev/e022d597-302d-c061-0830-6ed20aa61e56@qtmlabs.
> xyz/T/#u
> 
> Quote: "The kernel leaks memory when a `fib` rule is present in ipv6
> nftables firewall rules and a suppress_prefix rule
> is present in the IPv6 routing rules (used by certain tools such as
> wg-quick)."

Reassigning this bug to the kernel based on this finding. Thanks Florian!

Comment 67 Qiyu Yan 2021-12-12 15:48:38 UTC
I am also getting this error. Since I am not in a trusted network environment, disabling rp_filter is not an option for me. My observations:
 - Whether wireguard is enabled or not (no suppress_prefix rule is set), I see `Route cache is full: consider increasing sysctl net.ipv6.route.max_size.` under high IPv6 traffic (~200 Mbps)
 - With rp_filter for IPv6 disabled, no "Route cache is full" warning is seen
 - It doesn't seem to have a great impact on network speed (not seriously tested)

I also logged all firewall denials and saw some interesting entries (the following was recorded with rp_filter set to yes):

rpfilter_DROP: IN=enp1s0 OUT= MAC=<> SRC=<A global address> DST=<A local ULA address> LEN=160 TC=0 HOPLIMIT=49 FLOWLBL=39131 PROTO=UDP SPT=2408 DPT=45235 LEN=120 
rpfilter_DROP: IN=enp2s0 OUT= MAC=<> SRC=<A local ULA address> DST=<A global address> LEN=1500 TC=0 HOPLIMIT=64 FLOWLBL=0 PROTO=UDP SPT=45235 DPT=2408 LEN=1460 

Interface enp1s0 is connected to the internet and carries the default route; enp2s0 is the interface for the local network. I would expect rpfilter not to drop packets coming from a global address on the WAN port, nor from a local address on the LAN port. Maybe something is wrong in rpfilter?

Also, I tried enlarging net.ipv6.route.max_size to a much larger value, 131072; the warning seems to be gone (or has become much harder to trigger?).
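
For reference, the mitigation described here can be made persistent with a sysctl.d drop-in (a sketch; the file name is arbitrary):

```shell
# One-off change (lost on reboot):
sudo sysctl -w net.ipv6.route.max_size=131072

# Persist across reboots:
echo 'net.ipv6.route.max_size = 131072' | sudo tee /etc/sysctl.d/90-ipv6-route.conf
sudo sysctl -p /etc/sysctl.d/90-ipv6-route.conf
```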

I am not using asymmetric routing; my IPv6 routing table is:

::1 dev lo proto kernel metric 256 pref medium
<global prefix>::/64 dev enp1s0 proto kernel metric 100 pref medium
fccd::/64 dev enp2s0 proto kernel metric 101 pref medium
fdc2::/64 dev cni-podman0 proto kernel metric 256 pref medium
<gateway of internet> dev enp1s0 proto static metric 100 pref medium
fe80::/64 dev enp1s0 proto kernel metric 100 pref medium
fe80::/64 dev enp2s0 proto kernel metric 101 pref medium
fe80::/64 dev vethf67356c8 proto kernel metric 256 pref medium
fe80::/64 dev cni-podman0 proto kernel metric 256 pref medium
default via <gateway of internet> dev enp1s0 proto static metric 100 pref medium

Comment 68 Juan Orti 2022-02-03 15:40:10 UTC
I'm reproducing the error messages again on a server that I use to store backups. It only has one network interface with IPv4 and IPv6, no wireguard. It has some disks in a btrfs filesystem that I use to receive the backups of other machines using btrfs send/receive via SSH. A few seconds after starting to receive the backups, the kernel logs many of these "Route cache is full" messages.

Environment:
  Fedora 35 Server
  kernel-5.15.18-200.fc35.x86_64


[  466.893983] sshd[1938]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
[  466.969437] ssh_filter_btrbk.sh[1961]: btrbk ACCEPT (Name: root; Remote: 2001:db8:0:10::3d4a 38386 22): btrfs receive /mnt/backups/send-receive/host1/
[  488.595535] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595548] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595557] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595569] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595581] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595593] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595608] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595618] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595631] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
[  488.595643] kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

Comment 69 Ben Cotton 2022-11-29 16:47:56 UTC
This message is a reminder that Fedora Linux 35 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '35'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 35 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 70 Christian Kujau 2022-11-29 21:37:16 UTC
I can't reproduce this any more with Fedora 37 (kernel-6.0.9-300.fc37.x86_64) and an active Wireguard connection, even with lots of traffic generated and net.ipv6.route.max_size set to its default value of 4096.

Comment 71 Ben Cotton 2022-12-13 15:14:31 UTC
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13.

Fedora Linux 35 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.