RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1921624 - 'nft -f' takes about triple times to restore nftables rules
Summary: 'nft -f' takes about triple times to restore nftables rules
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.5
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: 8.4
Assignee: Richard Guy Briggs
QA Contact: Li Shuang
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-01-28 10:25 UTC by Li Shuang
Modified: 2022-06-28 12:30 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-28 12:30:06 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Li Shuang 2021-01-28 10:25:39 UTC
Description of problem:
'nft -f' takes about triple times to restore nftables rules with the following steps:
# nft delete table bridge filter; time nft -f 1.rules; time nft -f 2.rules;

+----------------+---------+------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | real | 0m0.082s | 0m0.006s | 0m0.006s | 0m0.006s | 0m0.006s | 0m0.007s | 0m0.006s | 0m0.006s | 0m0.006s | 0m0.005s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                | 1.rules | user | 0m0.003s | 0m0.004s | 0m0.002s | 0m0.004s | 0m0.004s | 0m0.005s | 0m0.003s | 0m0.002s | 0m0.005s | 0m0.003s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | sys  | 0m0.005s | 0m0.002s | 0m0.004s | 0m0.002s | 0m0.002s | 0m0.002s | 0m0.003s | 0m0.004s | 0m0.001s | 0m0.002s |
| 4.18.0-240.el8 +---------+------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | real | 0m0.408s | 0m0.392s | 0m0.417s | 0m0.399s | 0m0.403s | 0m0.408s | 0m0.403s | 0m0.420s | 0m0.424s | 0m0.415s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                | 2.rules | user | 0m0.296s | 0m0.302s | 0m0.287s | 0m0.300s | 0m0.296s | 0m0.312s | 0m0.308s | 0m0.321s | 0m0.313s | 0m0.306s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | sys  | 0m0.104s | 0m0.089s | 0m0.125s | 0m0.098s | 0m0.104s | 0m0.094s | 0m0.094s | 0m0.097s | 0m0.109s | 0m0.107s |
+----------------+---------+------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | real | 0m0.088s | 0m0.006s | 0m0.007s | 0m0.005s | 0m0.006s | 0m0.006s | 0m0.006s | 0m0.006s | 0m0.006s | 0m0.005s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                | 1.rules | user | 0m0.003s | 0m0.005s | 0m0.002s | 0m0.001s | 0m0.005s | 0m0.002s | 0m0.002s | 0m0.004s | 0m0.003s | 0m0.001s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | sys  | 0m0.004s | 0m0.002s | 0m0.005s | 0m0.004s | 0m0.002s | 0m0.004s | 0m0.003s | 0m0.002s | 0m0.003s | 0m0.004s |
| 4.18.0-277.el8 +---------+------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | real | 0m1.142s | 0m1.071s | 0m1.136s | 0m1.084s | 0m1.125s | 0m1.138s | 0m1.144s | 0m1.157s | 0m1.129s | 0m1.129s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                | 2.rules | user | 0m0.311s | 0m0.310s | 0m0.319s | 0m0.316s | 0m0.310s | 0m0.301s | 0m0.303s | 0m0.320s | 0m0.303s | 0m0.286s |
|                |         +------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|                |         | sys  | 0m0.234s | 0m0.250s | 0m0.217s | 0m0.253s | 0m0.217s | 0m0.236s | 0m0.236s | 0m0.231s | 0m0.229s | 0m0.244s |
+----------------+---------+------+----------+----------+----------+----------+----------+----------+-----------+---------+----------+----------+


Version-Release number of selected component (if applicable):
RHEL-8.4.0-20210127.n.0 ==> kernel-4.18.0-277.el8
nftables-0.9.3-16.el8.x86_64


How reproducible:
always


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Li Shuang 2021-01-28 10:27:45 UTC
Created attachment 1751634 [details]
1.rules

Comment 2 Li Shuang 2021-01-28 10:28:52 UTC
Created attachment 1751635 [details]
2.rules

Comment 3 Li Shuang 2021-01-28 10:30:50 UTC
'nft -f 2.rules' takes about triple times on kernel-4.18.0-277.el8 compared with kernel-4.18.0-240.el8.

Comment 4 Phil Sutter 2021-01-29 13:58:02 UTC
When reproducing this, I got even much worse results and it looks like recently
added audit logging is killing performance:

[root@wsfd-netdev13 nft_performance]# nft flush ruleset
[root@wsfd-netdev13 nft_performance]# nft -f 1.rules
[root@wsfd-netdev13 nft_performance]# auditctl -e0
enabled 0
failure 1
pid 1126
rate_limit 0
backlog_limit 8192
lost 0
backlog 3
backlog_wait_time 60000
[root@wsfd-netdev13 nft_performance]# time nft -f 2.rules

real	0m0.538s
user	0m0.407s
sys	0m0.126s
[root@wsfd-netdev13 nft_performance]# nft flush ruleset
[root@wsfd-netdev13 nft_performance]# nft -f 1.rules
[root@wsfd-netdev13 nft_performance]# auditctl -e1
enabled 1
failure 1
pid 1126
rate_limit 0
backlog_limit 8192
lost 0
backlog 0
backlog_wait_time 60000
[root@wsfd-netdev13 nft_performance]# time nft -f 2.rules

real	0m23.890s
user	0m0.397s
sys	0m0.238s

IIUC, the kernel emits an audit log for each and every one of the 70k elements
being added. What's worse, they all look identical, so there's not even a
benefit in doing that:

type=NETFILTER_CFG msg=audit(1611927321.092:155): table=filter:13;xtest:2 family=7 entries=2 op=nft_register_setelem pid=2844 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm="nft"
type=NETFILTER_CFG msg=audit(1611927321.092:155): table=filter:13;xtest:2 family=7 entries=2 op=nft_register_setelem pid=2844 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm="nft"
type=NETFILTER_CFG msg=audit(1611927321.092:155): table=filter:13;xtest:2 family=7 entries=2 op=nft_register_setelem pid=2844 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm="nft"

Li Shuang, I wonder why your numbers are much better. Did you run the tests
with auditd disabled?

Richard, can we reduce the logging (for elements at least) to once per commit
instead of once per element?

Comment 5 Richard Guy Briggs 2021-01-29 15:48:47 UTC
(In reply to Phil Sutter from comment #4)
> When reproducing this, I got even much worse results and it looks like recently added audit logging is killing performance:

Yes, this was added to address the upstream issue: https://github.com/linux-audit/audit-kernel/issues/124

> IIUC, the kernel emits an audit log for each and every one of the 70k elements being added. What's worse, they all look identical, so there's not even a benefit in doing that:

This is similar to a recent fedora issue, and your suggestion will help there too.  https://bugzilla.redhat.com/show_bug.cgi?id=1918013
The records aren't identical if we add the network namspace identifiers (dev/inode).  I already have a patch for that, but is waiting for other work to go upstream first:
  https://github.com/linux-audit/audit-kernel/issues/79 "RFE: add net namespace ID to NETFILTER_CFG records"

> Richard, can we reduce the logging (for elements at least) to once per commit instead of once per element?

I sought that kind of feedback when the original patchset was posted upstream.  Can we work on this together to keep the essentials while ditching the noise?

Comment 6 Li Shuang 2021-02-01 07:50:04 UTC
Hi Phil,

I used the default auditd configure to do the above tests, so I think it's auditd serivce started status with default configure. I also rerun the tests on both kernel 240 and 279 with auditd disabled/enabled status, more details are as follows:

+----------------+----------+----------+-------------------------------------------------------------+
|                |  auditd  | auditctl |                         2.rules                             |
+----------------+----------+----------+------+----------+----------+----------+----------+----------+
|                |          |          | real | 0m0.394s | 0m0.395s | 0m0.413s | 0m0.389s | 0m0.417s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                | disabled |    -     | user | 0m0.299s | 0m0.301s | 0m0.314s | 0m0.291s | 0m0.311s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |          | sys  | 0m0.089s | 0m0.093s | 0m0.097s | 0m0.097s | 0m0.102s |
| 4.18.0-240.el8 +----------+----------+------+----------+----------+----------+----------+----------+
|                |          |          | real | 0m0.397s | 0m0.396s | 0m0.399s | 0m0.395s | 0m0.401s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                | enabled  |    -     | user | 0m0.298s | 0m0.298s | 0m0.304s | 0m0.296s | 0m0.299s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |          | sys  | 0m0.093s | 0m0.096s | 0m0.093s | 0m0.096s | 0m0.102s |
+----------------+----------+----------+------+----------+----------+----------+----------+----------+
|                |          |          | real | 0m0.421s | 0m0.439s | 0m0.418s | 0m0.432s | 0m0.430s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                | disabled |    -     | user | 0m0.306s | 0m0.303s | 0m0.305s | 0m0.304s | 0m0.304s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |          | sys  | 0m0.110s | 0m0.134s | 0m0.112s | 0m0.126s | 0m0.124s |
|                +----------+----------+------+----------+----------+----------+----------+----------+
|                |          |          | real | 0m1.093s | 0m1.075s | 0m1.137s | 0m1.127s | 0m1.128s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |    -     | user | 0m0.306s | 0m0.305s | 0m0.299s | 0m0.311s | 0m0.302s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |          | sys  | 0m0.258s | 0m0.252s | 0m0.285s | 0m0.260s | 0m0.247s |
+ 4.18.0-279.el8 +          +----------+------+----------+----------+----------+----------+----------+
|                |          |          | real | 0m0.447s | 0m0.433s | 0m0.431s | 0m0.425s | 0m0.427s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                | enabled  |    e0    | user | 0m0.306s | 0m0.304s | 0m0.309s | 0m0.305s | 0m0.307s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |          | sys  | 0m0.139s | 0m0.127s | 0m0.120s | 0m0.118s | 0m0.118s |
|                +          +----------+------+----------+----------+----------+----------+----------+
|                |          |          | real | 0m1.215s | 0m1.123s | 0m1.207s | 0m1.220s | 0m1.199s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |    e1    | user | 0m0.303s | 0m0.285s | 0m0.315s | 0m0.310s | 0m0.297s |
|                |          +          +------+----------+----------+----------+----------+----------+
|                |          |          | sys  | 0m0.246s | 0m0.259s | 0m0.220s | 0m0.233s | 0m0.236s |
+----------------+----------+----------+------+----------+----------+----------+----------+----------+

I'm not sure why the results are better in my tests, and I can provide my environment as follows:

dell-per740-09.rhts.eng.pek2.redhat.com root/redhat

Hope this helps.

Thanks,
Shuang

Comment 7 Phil Sutter 2021-02-01 14:21:25 UTC
Hi,

(In reply to Richard Guy Briggs from comment #5)
[...]
> > Richard, can we reduce the logging (for elements at least) to once per commit instead of once per element?
> 
> I sought that kind of feedback when the original patchset was posted
> upstream.  Can we work on this together to keep the essentials while
> ditching the noise?

Richard, yes, I have this on my TODO for this week. Will get back to you later.

Li Shuang, thanks for verifying. It is still unclear why your bad results are
so much less bad than mine, but I guess we have found the culprit as your
results show the relation to auditd as well.

Thanks, Phil

Comment 8 Li Shuang 2021-02-03 04:12:13 UTC
Created attachment 1754558 [details]
iptables.rules

Hi Phil,

I did "iptables -F; time iptables-restore < iptables.rules;" test on kernel-4.18.0-280.el8.x86_64 and it seems that iptables-restore also spend about double time to add the same number of iptables stateless rules when audit is enabled:

+----------------+----------+-------------------------------------------------------------+
|                | auditctl |                      iptables.rules                         |
+----------------+----------+------+----------+----------+----------+----------+----------+
|                |          | real | 0m2.467s | 0m2.313s | 0m2.385s | 0m2.455s | 0m2.477s |
|                |          +------+----------+----------+----------+----------+----------+
|                |    e0    | user | 0m1.212s | 0m1.159s | 0m1.188s | 0m1.187s | 0m1.215s |
|                |          +------+----------+----------+----------+----------+----------+
|                |          | sys  | 0m1.242s | 0m1.141s | 0m1.183s | 0m1.256s | 0m1.246s |
| 4.18.0-280.el8 +----------+------+----------+----------+----------+----------+----------+
|                |          | real | 0m4.078s | 0m3.867s | 0m3.848s | 0m3.922s | 0m3.880s |
|                |          +------+----------+----------+----------+----------+----------+
|                |    e1    | user | 0m1.270s | 0m1.171s | 0m1.172s | 0m1.234s | 0m1.210s |
|                |          +------+----------+----------+----------+----------+----------+
|                |          | sys  | 0m1.543s | 0m1.420s | 0m1.428s | 0m1.497s | 0m1.479s |
+----------------+----------+------+----------+----------+----------+----------+----------+

Do you think the bug need a RHEL-8.4.0 blocker?

Thanks,
Shuang

Comment 9 Phil Sutter 2021-02-03 13:01:41 UTC
Hi Li Shuang,

(In reply to Li Shuang from comment #8)
> I did "iptables -F; time iptables-restore < iptables.rules;" test on
> kernel-4.18.0-280.el8.x86_64 and it seems that iptables-restore also spend
> about double time to add the same number of iptables stateless rules when
> audit is enabled:

Thanks a lot for doing these tests!

I wonder if there's a functional difference to legacy iptables which presumably
has the auditd calls as well. Maybe there's just a single log per commit
instead of one per rule? You do see one line of audit log for each rule
applied, right?

> Do you think the bug need a RHEL-8.4.0 blocker?

I am not sure. On one hand it is clearly a performance regression. On the
other, not having audit logging technically qualifies as regression from legacy
iptables. Also, I assume there's a business case for having this.

Cheers, Phil

Comment 10 Richard Guy Briggs 2021-02-03 17:48:05 UTC
(In reply to Phil Sutter from comment #9)
> I wonder if there's a functional difference to legacy iptables which presumably has the auditd calls as well. Maybe there's just a single log per commit instead of one per rule? You do see one line of audit log for each rule applied, right?

Legacy iptables has a single audit event per table change.  I was hoping for the same when auditting nft, but it wasn't so simple.

> > Do you think the bug need a RHEL-8.4.0 blocker?
> 
> I am not sure. On one hand it is clearly a performance regression. On the other, not having audit logging technically qualifies as regression from legacy iptables. Also, I assume there's a business case for having this.

Yes, security certification.

Comment 11 Li Shuang 2021-02-04 12:51:53 UTC
Created attachment 1755060 [details]
audit.tar.gz

Hi Phil,

I run "iptables -F; time iptables-restore < iptables.rules;" again with "auditctl -e1" status and found that all log files under /var/log/audit folder have been rewritten. I attached the log files as audit.tar.gz.

It seems that a lot of entries as follows are added in /var/log/audit/audit.log during iptables rules restoring:
type=NETFILTER_CFG msg=audit(1612442571.873:119): table=filter:6;FORWARD:2 family=2 entries=35545 op=nft_unregister_rule pid=2461 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm="iptables"
type=NETFILTER_CFG msg=audit(1612442571.873:119): table=filter:6;FORWARD:2 family=2 entries=35546 op=nft_unregister_rule pid=2461 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm="iptables"
type=NETFILTER_CFG msg=audit(1612442571.873:119): table=filter:6;FORWARD:2 family=2 entries=35547 op=nft_unregister_rule pid=2461 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm="iptables"

Thanks,
Shuang

Comment 14 Richard Guy Briggs 2021-03-22 21:05:39 UTC
2021-03-18 v1 patch posted upstream https://listman.redhat.com/archives/linux-audit/2021-March/msg00109.html
2021-03-22 v2 patch posted upstream https://listman.redhat.com/archives/linux-audit/2021-March/msg00149.html

Comment 15 Richard Guy Briggs 2021-03-23 20:08:07 UTC
2021-03-23 v3 patch posted upstream https://listman.redhat.com/archives/linux-audit/2021-March/msg00152.html

Comment 16 Richard Guy Briggs 2021-03-25 04:10:57 UTC
v4 posted upstream https://listman.redhat.com/archives/linux-audit/2021-March/msg00157.html

Comment 17 Richard Guy Briggs 2021-03-26 18:00:41 UTC
v5 posted upstream  https://listman.redhat.com/archives/linux-audit/2021-March/msg00159.html

Comment 18 Richard Guy Briggs 2021-03-31 20:49:56 UTC
2021-03-31 v5 merged upstream in nf-next bb4052e57b5b ("audit: log nftables configuration change events once per table")
2021-03-31 FW found UAF: nf-next fix e4d272948d25 <pablo> ("netfilter: nf_tables: use-after-free")

Comment 19 Richard Guy Briggs 2021-03-31 21:33:17 UTC
2021-03-31 v5 merged upstream in nf-next forced update c520292f29b8 ("audit: log nftables configuration change events once per table")

Comment 20 Richard Guy Briggs 2021-04-15 01:21:08 UTC
2021-04-02 07:45 Dan Carpenter dadf33c9f6b5 ("netfilter: nftables: fix a warning message in nf_tables_commit_audit_collect()")

Comment 22 Richard Guy Briggs 2022-06-28 12:30:06 UTC
resolved by bz1957052, in rhel8.5 kernel-4.18.0-323.el8, also in rhel8.4.z


Note You need to log in before you can comment on or make changes to this bug.