1677323 – iptables -X returns iptables: No buffer space available when huge amount of chains are used.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	iptables
Sub Component:
Version:	8.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	8.0
Assignee:	Phil Sutter
QA Contact:	Jiri Peska
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-02-14 14:41 UTC by Robin Hack
Modified:	2020-11-14 06:47 UTC (History)
CC List:	5 users (show)
Fixed In Version:	iptables-1.8.2-16.el8
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-11-05 22:17:43 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:
Flags:	pm-rhel: mirror+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2019:3573	0	None	None	None	2019-11-05 22:17:52 UTC

Description Robin Hack 2019-02-14 14:41:09 UTC

Description of problem:
With kernel 4.18.0-64
time iptables -X
takes 47 minutes and then dies on:
iptables: No buffer space available.

With kernel 4.18.0-68
time iptables -X
takes 1m5s (improvement here!) but
it also dies with:
iptables: No buffer space available.



Version-Release number of selected component (if applicable):
iptables-1.8.2-9.el8.x86_64
kernel-4.18.0-68.el8.x86_64
libnftnl-1.1.1-4.el8.x86_64

How reproducible:
always

Steps to Reproduce:
1. follow reproducer from:
https://bugzilla.redhat.com/show_bug.cgi?id=1647306
(create 200000 chains)
2. try to remove 200000 chains by invoking: iptables -X

Actual results:
iptables -X dies with:
iptables: No buffer space available.

Expected results:
Removed chains.


Additional info:

Comment 1 Phil Sutter 2019-02-15 09:34:57 UTC

Hi Robin,

I can't reproduce this on a beaker machine (admittedly a pretty fat one):

[root@wsfd-netdev13 ~]# ./iptables-restore-reproducer.sh 
calling iptables-nft-restore

real	0m13.036s
user	0m12.262s
sys	0m4.245s
calling iptables-nft -F

real	1m2.740s
user	0m0.885s
sys	1m1.427s
calling iptables-nft -X

real	0m8.028s
user	0m0.266s
sys	0m7.734s
[root@wsfd-netdev13 ~]# uname -a
Linux wsfd-netdev13.ntdv.lab.eng.bos.redhat.com 4.18.0-68.el8.x86_64 #1 SMP Wed Feb 13 14:25:59 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@wsfd-netdev13 ~]# rpm -q iptables
iptables-1.8.2-9.el8.x86_64
[root@wsfd-netdev13 ~]# rpm -q libnftnl
libnftnl-1.1.1-4.el8.x86_64
[root@wsfd-netdev13 ~]# cat iptables-restore-reproducer.sh 
#! /bin/bash
echo "calling iptables-nft-restore"
time iptables-restore <(

echo "*filter"
for i in $(seq 0 200000);do
        printf ":chain_%06x - [0:0]\n" $i
done
for i in $(seq 0 200000);do
        printf -- "-A INPUT -j chain_%06x\n" $i
        printf -- "-A INPUT -j chain_%06x\n" $i
done
echo COMMIT

)
echo "calling iptables-nft -F"
time iptables -F
echo "calling iptables-nft -X"
time iptables -X
[root@wsfd-netdev13 ~]#

Comment 2 Robin Hack 2019-02-15 10:22:33 UTC

Hi Phil.

# free -m
              total        used        free      shared  buff/cache   available
Mem:           1829         141        1435           8         251        1534
Swap:             0           0           0

strace output of iptables -X:
sendto(3, {{len=20, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_GETCHAIN, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=0, pid=0}, {nfgen_family=AF_INET, version=NFNETLINK_V0, res_id=htons(0)}, 20, 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 20
recvmsg(3, ... tons of structs ) = 16488
brk(NULL)                               = 0x56131b143000
brk(0x56131b164000)                     = 0x56131b164000

... skipped ...

sendmsg(3, tons of data ... = 12000040
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(3, {msg_namelen=12}, 0)         = -1 ENOBUFS (No buffer space available)

I can attach the whole strace output :).

Comment 3 Phil Sutter 2019-02-15 15:00:16 UTC

Hi Robin,

(In reply to Robin Hack from comment #2)
> Hi Phil.
> 
> # free -m
>               total        used        free      shared  buff/cache  
> available
> Mem:           1829         141        1435           8         251       
> 1534
> Swap:             0           0           0
> 
> strace output of iptables -X:
> sendto(3, {{len=20, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_GETCHAIN,
> flags=NLM_F_REQUEST|NLM_F_DUMP, seq=0, pid=0}, {nfgen_family=AF_INET,
> version=NFNETLINK_V0, res_id=htons(0)}, 20, 0, {sa_family=AF_NETLINK,
> nl_pid=0, nl_groups=00000000}, 12) = 20
> recvmsg(3, ... tons of structs ) = 16488
> brk(NULL)                               = 0x56131b143000
> brk(0x56131b164000)                     = 0x56131b164000
> 
> ... skipped ...
> 
> sendmsg(3, tons of data ... = 12000040
> select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left
> {tv_sec=0, tv_usec=0})
> recvmsg(3, {msg_namelen=12}, 0)         = -1 ENOBUFS (No buffer space
> available)

Receiving ENOBUFS when calling recvmsg() typically happens if a single netlink
message exceeds the 32k max buffer size supported by the kernel. I wonder why
this doesn't happen for me though. Could you perhaps try with a smaller number
of chains and rules? The difference should still be noticeable, and for use in
a test script a delay of over a minute is probably too large anyway.

Thanks, Phil

Comment 4 Robin Hack 2019-02-18 13:23:54 UTC

Hello.

OK. It looks like
    for ((i = 0; i < 300000; ++i)); do
        printf ":chain_%06x - [0:0]\n" $i
    done
is not an issue by itself, even with that many chains.

However, in combination with:
    for ((i = 0; i < 1000; ++i)); do
        printf -- "-A INPUT -j chain_%06x\n" $i
        printf -- "-A INPUT -j chain_%06x\n" $i
    done
it returns "iptables: No buffer space available."
With smaller numbers, though, it starts to return:
100 chains - iptables v1.8.2 (nf_tables):  CHAIN_USER_DEL failed (Device or resource busy): chain chain_000000
10 chains - iptables v1.8.2 (nf_tables):  CHAIN_USER_DEL failed (Device or resource busy): chain chain_000000

kernel-4.18.0-69.el8.x86_64
iptables-1.8.2-9.el8.x86_64
libnftnl-1.1.1-4.el8.x86_64

Comment 5 Phil Sutter 2019-07-02 15:15:59 UTC

I managed to reproduce the issue. Turned out I missed the fact that it happens
only if one doesn't call 'iptables -F' before calling 'iptables -X'. 

Sent a patch upstream, mostly to ask for advice on how to properly fix it:
https://marc.info/?l=netfilter-devel&m=156208033321053&w=2
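
The condition described in this comment (deleting chains that are still referenced, with no flush first) can be sketched as a size-adjustable variant of the reproducer from comment 1. This is only an illustration: the function names gen_ruleset and run_reproducer are hypothetical, the default count of 1000 is borrowed from comment 4, and run_reproducer needs root and modifies the live ruleset.

```shell
#!/bin/bash

# Hypothetical helper: print an iptables-restore ruleset with COUNT
# user-defined chains, each jumped to twice from INPUT.
gen_ruleset() {
    local count=${1:-1000}
    local i
    echo "*filter"
    for ((i = 0; i < count; ++i)); do
        printf ":chain_%06x - [0:0]\n" "$i"
    done
    for ((i = 0; i < count; ++i)); do
        printf -- "-A INPUT -j chain_%06x\n" "$i"
        printf -- "-A INPUT -j chain_%06x\n" "$i"
    done
    echo "COMMIT"
}

# Hypothetical wrapper (run as root): load the ruleset, then delete the
# chains WITHOUT calling 'iptables -F' first, so every chain is still
# referenced from INPUT when the delete is attempted.
run_reproducer() {
    gen_ruleset "$1" | iptables-nft-restore
    iptables-nft -X
}
```

Calling, for example, `run_reproducer 200000` on a test machine should exercise the same path as the original report; inserting `iptables-nft -F` before the `-X` call corresponds to the sequence that did not fail in comment 1.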

Comment 6 Phil Sutter 2019-07-02 18:10:08 UTC

Fix sent upstream: https://marc.info/?l=netfilter-devel&m=156209061024148&w=2

Comment 7 Phil Sutter 2019-07-03 07:37:50 UTC

Upstream commit to backport:

commit d3e39e9c457f452540359e42fb58d64a28fe3e18 (origin/master, origin/HEAD)
Author: Phil Sutter <phil>
Date:   Tue Jul 2 20:30:49 2019 +0200

    nft: Set socket receive buffer
    
    When trying to delete user-defined chains in a large ruleset,
    iptables-nft aborts with "No buffer space available". This can be
    reproduced using the following script:
    
    | #! /bin/bash
    | iptables-nft-restore <(
    |
    | echo "*filter"
    | for i in $(seq 0 200000);do
    |         printf ":chain_%06x - [0:0]\n" $i
    | done
    | for i in $(seq 0 200000);do
    |         printf -- "-A INPUT -j chain_%06x\n" $i
    |         printf -- "-A INPUT -j chain_%06x\n" $i
    | done
    | echo COMMIT
    |
    | )
    | iptables-nft -X
    
    The problem seems to be the sheer amount of netlink error messages sent
    back to user space (one EBUSY for each chain). To solve this, set
    receive buffer size depending on number of commands sent to kernel.
    
    Suggested-by: Pablo Neira Ayuso <pablo>
    Signed-off-by: Phil Sutter <phil>
    Signed-off-by: Pablo Neira Ayuso <pablo>
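
As a rough illustration of why the volume of error replies matters: a netlink error reply is at least 36 bytes (a 16-byte nlmsghdr followed by a 20-byte nlmsgerr), so one EBUSY per chain adds up quickly. The sketch below is not part of the patch. Its function names are hypothetical, the 36-byte figure is a lower bound that ignores any echoed payload and the kernel's per-message bookkeeping, and the buffer size to compare against is an input the reader supplies.

```shell
#!/bin/bash

# Assumption: minimum size of one netlink error reply,
# nlmsghdr (16 bytes) + nlmsgerr (4-byte error code + 16-byte echoed header).
NLMSG_ERR_MIN_BYTES=36

# Hypothetical helper: lower-bound estimate of reply bytes when the kernel
# answers each chain deletion with one error message.
estimate_reply_bytes() {
    local num_chains=$1
    echo $(( num_chains * NLMSG_ERR_MIN_BYTES ))
}

# Hypothetical helper: succeeds (exit 0) if the estimated reply volume is
# larger than the given receive buffer size in bytes.
exceeds_rcvbuf() {
    local num_chains=$1 rcvbuf_bytes=$2
    [ "$(estimate_reply_bytes "$num_chains")" -gt "$rcvbuf_bytes" ]
}
```

On a Linux host the comparison can be made against the system default, for example `exceeds_rcvbuf 200001 "$(cat /proc/sys/net/core/rmem_default)" && echo "replies may overflow the receive buffer"`. This only shows the order of magnitude; the actual sizing rule is the one in the upstream commit.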

Comment 8 Phil Sutter 2019-08-07 23:18:31 UTC

Tomas,

Please consider providing qa_ack+ here. We're a bit late with RHEL8.1 but since it is a bug fix I think it's worth trying.

Cheers, Phil

Comment 13 errata-xmlrpc 2019-11-05 22:17:43 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3573
