Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1883299

Summary: Improve IP address match performance
Product: Red Hat Enterprise Linux 8 Reporter: Phil Sutter <psutter>
Component: iptablesAssignee: Phil Sutter <psutter>
Status: CLOSED ERRATA QA Contact: Tomas Dolezal <todoleza>
Severity: high Docs Contact:
Priority: unspecified    
Version: 8.4CC: alchan, cmarches, dornelas, dyocum, egarver, hmatsumo, iptables-maint-list, jmaxwell, ltitov, todoleza
Target Milestone: rcKeywords: Triaged, Upstream, ZStream
Target Release: 8.4Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: iptables-1.8.4-16.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1894619 1894620 (view as bug list) Environment:
Last Closed: 2021-05-18 14:58:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1186913, 1875967, 1894575, 1894619, 1894620    

Description Phil Sutter 2020-09-28 17:03:51 UTC
Due to a broken check, iptables-nft inserts a pointless bitwise expression when matching source or destination address. Fix submitted upstream:

https://lore.kernel.org/netfilter-devel/20200928170547.13857-1-phil@nwl.cc/

Note: This is not a regression within RHEL8 (the fixed commit introduces the check). Performance-wise, this is a regression from RHEL7.

Comment 2 Phil Sutter 2020-10-05 09:40:52 UTC
Upstream commit to backport:

commit 72ed608bf1ea550ac13b5b880afc7ad3ffa0afd0
Author: Phil Sutter <phil>
Date:   Mon Sep 28 18:57:18 2020 +0200

    nft: Fix for broken address mask match detection
    
    Trying to decide whether a bitwise expression is needed to match parts
    of a source or destination address only, add_addr() checks if all bytes
    in 'mask' are 0xff or not. The check is apparently broken though as each
    byte in 'mask' is cast to a signed char before comparing against 0xff,
    therefore the bitwise is always added:
    
    | # ./bad/iptables-nft -A foo -s 10.0.0.1 -j ACCEPT
    | # ./good/iptables-nft -A foo -s 10.0.0.2 -j ACCEPT
    | # nft --debug=netlink list chain ip filter foo
    | ip filter foo 5
    |   [ payload load 4b @ network header + 12 => reg 1 ]
    |   [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
    |   [ cmp eq reg 1 0x0100000a ]
    |   [ counter pkts 0 bytes 0 ]
    |   [ immediate reg 0 accept ]
    |
    | ip filter foo 6 5
    |   [ payload load 4b @ network header + 12 => reg 1 ]
    |   [ cmp eq reg 1 0x0200000a ]
    |   [ counter pkts 0 bytes 0 ]
    |   [ immediate reg 0 accept ]
    |
    | table ip filter {
    |       chain foo {
    |               ip saddr 10.0.0.1 counter packets 0 bytes 0 accept
    |               ip saddr 10.0.0.2 counter packets 0 bytes 0 accept
    |       }
    | }
    
    Fix the cast, safe an extra op and gain 100% performance in ideal cases.
    
    Fixes: 56859380eb328 ("xtables-compat: avoid unneeded bitwise ops")
    Signed-off-by: Phil Sutter <phil>

There's a soft follow-up which should be backported as well once it got accepted:

https://lore.kernel.org/netfilter-devel/20201002090334.29788-1-phil@nwl.cc/

Comment 11 Phil Sutter 2020-10-30 14:34:17 UTC
Just sent a v2 of the follow-up patch upstream which fixes output of address prefixes in arptables-nft:

https://lore.kernel.org/netfilter-devel/20201030132449.5576-1-phil@nwl.cc/

Meanwhile I discovered that ebtables-nft may benefit from the optimized mask matching as well, also submitted upstream:

https://lore.kernel.org/netfilter-devel/20201030132449.5576-2-phil@nwl.cc/

Comment 13 Phil Sutter 2020-11-04 15:01:24 UTC
Follow-ups are upstream:

commit 323259001d617ae359430a03ee3d3e7f107684e0
Author: Phil Sutter <phil>
Date:   Fri Oct 2 09:44:38 2020 +0200

    nft: Optimize class-based IP prefix matches
    
    Payload expression works on byte-boundaries, leverage this with suitable
    prefix lengths.
    
    Signed-off-by: Phil Sutter <phil>

commit 274cb05edc58d6fa982a34c84b2f4cf6acc3e335
Author: Phil Sutter <phil>
Date:   Fri Oct 30 14:08:33 2020 +0100

    ebtables: Optimize masked MAC address matches
    
    Just like with class-based prefix matches in iptables-nft, optimize
    masked MAC address matches if the mask is on a byte-boundary.
    
    To reuse the logic in add_addr(), extend it to accept the payload base
    value via parameter.
    
    Signed-off-by: Phil Sutter <phil>

Comment 17 Phil Sutter 2020-11-05 17:21:15 UTC
In addition to the test described in commit message of the first backport, one should make sure listing of added IP address/prefix matches is still correct after applying the backports:

# iptables -A FORWARD -s 10.0.0.1
# iptables -A FORWARD -s 10.0.0.1/32
# iptables -A FORWARD -s 10.0.0.1/30
# iptables -A FORWARD -s 10.0.0.1/24
# iptables -S FORWARD
-P FORWARD ACCEPT
-A FORWARD -s 10.0.0.1/32
-A FORWARD -s 10.0.0.1/32
-A FORWARD -s 10.0.0.0/30
-A FORWARD -s 10.0.0.0/24

Same for arptables:

# arptables -A OUTPUT -s 10.0.0.1
# arptables -A OUTPUT -s 10.0.0.1/32
# arptables -A OUTPUT -s 10.0.0.1/30
# arptables -A OUTPUT -s 10.0.0.1/24
# arptables -nL OUTPUT
Chain OUTPUT (policy ACCEPT)
-s 10.0.0.1
-s 10.0.0.1
-s 10.0.0.0/30
-s 10.0.0.0/24

Before this backport, arptables would create not optimal rules:

# nft --debug=netlink list chain arp filter OUTPUT
[...]
arp filter OUTPUT 24
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x0100000a ]
  [ counter pkts 0 bytes 0 ]

arp filter OUTPUT 25 24
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x0100000a ]
  [ counter pkts 0 bytes 0 ]

arp filter OUTPUT 26 25
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xfcffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x0000000a ]
  [ counter pkts 0 bytes 0 ]

arp filter OUTPUT 27 26
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00ffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x0000000a ]
  [ counter pkts 0 bytes 0 ]
[...]

With this backport in place, the VM code looks like this:

# nft --debug=netlink list chain arp filter OUTPUT
[...]
arp filter OUTPUT 28
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ cmp eq reg 1 0x0100000a ]
  [ counter pkts 0 bytes 0 ]

arp filter OUTPUT 29 28
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ cmp eq reg 1 0x0100000a ]
  [ counter pkts 0 bytes 0 ]

arp filter OUTPUT 30 29
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 4b @ network header + 14 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xfcffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x0000000a ]
  [ counter pkts 0 bytes 0 ]

arp filter OUTPUT 31 30
  [ payload load 2b @ network header + 0 => reg 1 ]
  [ cmp eq reg 1 0x00000100 ]
  [ payload load 1b @ network header + 4 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 1b @ network header + 5 => reg 1 ]
  [ cmp eq reg 1 0x00000004 ]
  [ payload load 3b @ network header + 14 => reg 1 ]
  [ cmp eq reg 1 0x0000000a ]
  [ counter pkts 0 bytes 0 ]
[...]

Similar with ebtables, here's the test input and expected output:

# ebtables -A FORWARD -s fe:ed:0f:c0:ff:ee
# ebtables -A FORWARD -s fe:ed:0f:c0:ff:ee/ff:ff:ff:ff:ff:ff
# ebtables -A FORWARD -s fe:ed:0f:c0:ff:ee/ff:ff:ff:ff:ff:f0
# ebtables -A FORWARD -s fe:ed:0f:c0:ff:ee/ff:ff:ff:ff:ff:00
# ebtables -L FORWARD
Bridge table: filter

Bridge chain: FORWARD, entries: 4, policy: ACCEPT
-s fe:ed:0f:c0:ff:ee -j CONTINUE
-s fe:ed:0f:c0:ff:ee -j CONTINUE
-s fe:ed:0f:c0:ff:e0/ff:ff:ff:ff:ff:f0 -j CONTINUE
-s fe:ed:0f:c0:ff:00/ff:ff:ff:ff:ff:00 -j CONTINUE

Unoptimized VM code:

# nft --debug=netlink list chain bridge filter FORWARD
bridge filter FORWARD 4
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ cmp eq reg 1 0xc00fedfe 0x0000eeff ]
  [ counter pkts 0 bytes 0 ]

bridge filter FORWARD 5 4
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ cmp eq reg 1 0xc00fedfe 0x0000eeff ]
  [ counter pkts 0 bytes 0 ]

bridge filter FORWARD 6 5
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff 0x0000f0ff ) ^ 0x00000000 0x00000000 ]
  [ cmp eq reg 1 0xc00fedfe 0x0000e0ff ]
  [ counter pkts 0 bytes 0 ]

bridge filter FORWARD 7 6
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff 0x000000ff ) ^ 0x00000000 0x00000000 ]
  [ cmp eq reg 1 0xc00fedfe 0x000000ff ]
  [ counter pkts 0 bytes 0 ]
[...]

New package VM code should look like this:

# nft --debug=netlink list chain bridge filter FORWARD
bridge filter FORWARD 8
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ cmp eq reg 1 0xc00fedfe 0x0000eeff ]
  [ counter pkts 0 bytes 0 ]

bridge filter FORWARD 9 8
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ cmp eq reg 1 0xc00fedfe 0x0000eeff ]
  [ counter pkts 0 bytes 0 ]

bridge filter FORWARD 10 9
  [ payload load 6b @ link header + 6 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff 0x0000f0ff ) ^ 0x00000000 0x00000000 ]
  [ cmp eq reg 1 0xc00fedfe 0x0000e0ff ]
  [ counter pkts 0 bytes 0 ]

bridge filter FORWARD 11 10
  [ payload load 5b @ link header + 6 => reg 1 ]
  [ cmp eq reg 1 0xc00fedfe 0x000000ff ]
  [ counter pkts 0 bytes 0 ]
[...]

Note that only the last rule changed as ebtables wasn't affected by the
incorrect bitwise expression bug. So we only see the optimization for masks
covering full bytes.

Tomas, is the above sufficient in getting you started with writing tests for
this ticket and the two z-stream clones? Please ping back if you need
additional info.

Comment 19 Phil Sutter 2020-11-10 13:56:33 UTC
Extended upstream testsuite to test generated bytecode: https://lore.kernel.org/netfilter-devel/20201110135247.22586-1-phil@nwl.cc/
I will backport this one as well, so we should get test coverage for free.

Comment 24 Ben Bennett 2021-03-22 14:51:37 UTC
*** Bug 1937949 has been marked as a duplicate of this bug. ***

Comment 26 errata-xmlrpc 2021-05-18 14:58:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (iptables bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1642