Bug 1840936 - iptables-restore --test hangs
Summary: iptables-restore --test hangs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: iptables
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: Phil Sutter
QA Contact: Tomas Dolezal
URL:
Whiteboard:
Depends On:
Blocks: 1845725
 
Reported: 2020-05-27 23:57 UTC by Dan
Modified: 2023-12-15 18:01 UTC
CC List: 4 users

Fixed In Version: iptables-1.8.4-13.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1845725
Environment:
Last Closed: 2020-11-04 01:54:58 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments: none


Links:
Red Hat Product Errata RHBA-2020:4518 (last updated 2020-11-04 01:55:11 UTC)

Description Dan 2020-05-27 23:57:35 UTC
Description of problem:
When executing iptables-restore --test on a very simple multi-table firewall, the command hangs forever. The command does not hang on a firewall with a single table. This does not happen on RHEL 8.0 or 8.1. Most likely this is a regression in the iptables compatibility layer on top of nftables.


Version-Release number of selected component (if applicable):
iptables-1.8.4-10.el8.x86_64

How reproducible:
always


Steps to Reproduce:
1. Create a very simple multi-table iptables file to restore.
2. Execute iptables-restore --test <file>
3. The command hangs

Actual results:
The command hangs.


Expected results:
The command runs to completion, as documented.


Additional info:
This simple ruleset hangs when passed to --test:

*filter
:FORWARD ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
*mangle
:FORWARD ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:PREROUTING ACCEPT [0:0]
COMMIT

Comment 1 Phil Sutter 2020-05-28 16:56:04 UTC
This is apparently a side-effect of backported commit 200bc39965149 ("nft:
cache: Fix iptables-save segfault under stress"). Due to other changes,
upstream doesn't suffer from the problem anymore.

While it wasn't easy to figure out what exactly goes wrong, the fix is rather
trivial: if the cache is rebuilt, the stored generation ID must be reset to zero,
as it is never updated when fetching the cache while it is non-zero. The change is a one-liner:

--- a/iptables/nft-cache.c
+++ b/iptables/nft-cache.c
@@ -629,6 +629,7 @@ void nft_rebuild_cache(struct nft_handle *h)
        if (h->cache_level)
                __nft_flush_cache(h);

+       h->nft_genid = 0;
        h->cache_level = NFT_CL_NONE;
        __nft_build_cache(h, level, NULL, NULL, NULL);
 }

Comment 2 Dan 2020-05-28 18:28:47 UTC
(In reply to Phil Sutter from comment #1)
> This is apparently a side-effect of backported commit 200bc39965149 ("nft:
> cache: Fix iptables-save segfault under stress"). [...]

Thank you for the update.  Will this fix be part of 8.2?  We use "iptables-restore --test" and this is causing considerable headaches.

Comment 3 Phil Sutter 2020-05-29 11:58:27 UTC
Hi,

(In reply to Dan from comment #2)
> Thank you for the update.  Will this fix be part of 8.2?  We use
> "iptables-restore --test" and this is causing considerable headaches.

Not by default. This ticket targets 8.3; backporting into 8.2 requires an explicit request. I'll discuss this with QA, as they'll need to have capacity for it.

Cheers, Phil

Comment 5 Dan 2020-05-29 16:13:43 UTC
(In reply to Phil Sutter from comment #3)
> Not by default. This ticket targets 8.3; backporting into 8.2 requires an
> explicit request. I'll discuss this with QA, as they'll need to have
> capacity for it.

Yes please - this causes a pretty nasty 100% CPU condition and the command is basically broken.
Do I have to do anything to request it be backported to 8.2?

Dan

Comment 6 Phil Sutter 2020-05-29 17:00:44 UTC
(In reply to Dan from comment #5)
> Yes please - this causes a pretty nasty 100% CPU condition and the command
> is basically broken.
> Do I have to do anything to request it be backported to 8.2?

Opening a customer case linking to this ticket is always helpful to justify
the capacity. Is backporting to 8.2 sufficient for you, so we can leave 8.1
as-is?

Comment 12 Dan 2020-05-29 19:09:19 UTC
(In reply to Phil Sutter from comment #6)
> Opening a customer case linking to this ticket is always helpful to justify
> the capacity. Is backporting to 8.2 sufficient for you, so we can leave 8.1
> as-is?

Yes - we would need this backported to the earliest RHEL 8 version it affects.
When I tried in ECS using RHEL 8.0 and 8.1, it reproduced on both.

I assume that is because when I installed iptables, they both installed:
iptables-1.8.4-10.el8.x86_64

So I assume this is tied to the iptables or iptables-libs RPM?

Do you know the versions of the RPMs that are affected?

Thanks,
Dan

Comment 13 Dan 2020-05-29 20:19:05 UTC
Customer case 02666328 has been opened referring to this issue.

Comment 14 Phil Sutter 2020-06-02 09:42:58 UTC
Hi Dan,

(In reply to Dan from comment #12)
> So I assume this is tied to the iptables or iptables-libs RPM?
>
> Do you know the versions of the RPMs that are affected?

Yes, it is a bug in the iptables RPM; iptables-1.8.4-10.el8 is the first
affected build. I just checked, and the iptables-restore shipped in RHEL 8.1
doesn't expose the problem, hence backporting to RHEL 8.2 is sufficient. Sorry
for the confusion, I assumed the problematic patch had been in RHEL 8 for
longer than that.

Cheers, Phil

Comment 19 Jerry Rassier 2020-06-12 19:24:05 UTC
Just to throw some more weight behind backporting to 8.2, we've also been bitten by this one. Please see customer case 02672005.


Thanks
Jerry

Comment 23 errata-xmlrpc 2020-11-04 01:54:58 UTC
Since the problem described in this bug report should be resolved in a recent
advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (iptables bug fix and enhancement update), and
where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4518

