1280435 – 'ip addr flush' much slower than in RHEL6?


Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	iproute
Sub Component:
Version:	7.6
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	low
Target Milestone:	rc
Target Release:	---
Assignee:	Andrea Claudi
QA Contact:	BaseOS QE Security Team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1641094

Reported:	2015-11-11 17:51 UTC by Phil Sutter
Modified:	2019-12-05 16:31 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1641094 (view as bug list)
Environment:
Last Closed:	2019-12-05 16:31:11 UTC
Target Upstream Version:
Embargoed:

Attachments
ip_addr_flush_reproducer.sh (558 bytes, application/x-shellscript) 2016-01-18 12:58 UTC, Phil Sutter
perf record for promote_secondaries=1 (10.39 MB, application/octet-stream) 2016-04-21 11:17 UTC, Phil Sutter
perf record for promote_secondaries=0 (2.45 MB, application/octet-stream) 2016-04-21 11:18 UTC, Phil Sutter

Description Phil Sutter 2015-11-11 17:51:02 UTC

This might be a regression: during testing, flushing many (~40k) addresses from an interface took much longer on RHEL7 than on RHEL6.

Comment 3 Phil Sutter 2016-01-18 12:58:53 UTC

Created attachment 1115847 [details]
ip_addr_flush_reproducer.sh
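
The attached script is not reproduced in this report. As a rough illustration of the scenario only, here is a hypothetical Python sketch that puts many addresses from one subnet on a dummy interface and times the flush. The interface name `dummy0`, the `10.0.0.0/16` range, and both function names are assumptions made for this sketch and are not taken from the attachment.

```python
import ipaddress
import itertools
import subprocess
import time


def batch_lines(dev, count=40000, network="10.0.0.0/16"):
    """Return `ip -batch` lines adding `count` addresses from `network` to `dev`."""
    net = ipaddress.ip_network(network)
    hosts = list(itertools.islice(net.hosts(), count))
    if len(hosts) < count:
        raise ValueError(f"{network} has fewer than {count} host addresses")
    return [f"addr add {host}/{net.prefixlen} dev {dev}" for host in hosts]


def time_flush(dev="dummy0", count=40000):
    """Populate `dev`, flush it, and return the flush duration in seconds.

    Needs root. The dummy interface is created here and removed afterwards.
    """
    subprocess.run(["ip", "link", "add", dev, "type", "dummy"], check=True)
    try:
        commands = "\n".join(batch_lines(dev, count)) + "\n"
        # `ip -batch -` reads the commands from standard input.
        subprocess.run(["ip", "-batch", "-"], input=commands, text=True, check=True)
        start = time.monotonic()
        subprocess.run(["ip", "addr", "flush", "dev", dev], check=False)
        return time.monotonic() - start
    finally:
        subprocess.run(["ip", "link", "del", dev], check=False)
```

Calling `time_flush()` as root on a test machine should give a number comparable in spirit to the `real` times quoted in this report, although the exact setup of the original reproducer may differ.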

Comment 7 Jaroslav Aster 2016-04-14 15:11:35 UTC

Hi Phil,

I can confirm it. Tested on all rhel-6 architectures and x86_64 rhel-7. There is a huge performance regression between rhel-6 and rhel-7.

Comment 10 Phil Sutter 2016-04-15 10:51:24 UTC

I ran a few more tests:

RHEL7, upstream kernel, RHEL iproute: 0m53.384s
RHEL7, upstream kernel, upstream iproute: 1m2.577s
RHEL6, RHEL6 kernel, RHEL iproute: 0m8.199s
RHEL6, RHEL6 kernel, upstream iproute: 0m7.594s

So this very much looks like a kernel issue; at least, the iproute version seems unrelated.
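
To put the four timings above into proportion, the following small sketch converts the `time`-style durations to seconds and prints the slowdown of each RHEL7 run relative to the RHEL6 run that used the same iproute build. The helper names are made up for this illustration; the figures are copied from the list above.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '0m53.384s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(slow, fast):
    """Return how many times longer `slow` took than `fast`."""
    return parse_time(slow) / parse_time(fast)


# (RHEL7 with upstream kernel, RHEL6 with RHEL6 kernel), keyed by iproute build.
runs = {
    "RHEL iproute": ("0m53.384s", "0m8.199s"),
    "upstream iproute": ("1m2.577s", "0m7.594s"),
}

for label, (rhel7, rhel6) in runs.items():
    print(f"{label}: RHEL7 run took {slowdown(rhel7, rhel6):.1f}x as long")
```

Swapping the iproute build changes each platform's time only slightly, while the gap between the platforms stays large, which is what points away from iproute.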

As the output in comment 4 shows, iproute flushes many more addresses on RHEL7 than on RHEL6. This reminded me of the 'promote_secondaries' sysctl setting, which is indeed disabled in RHEL6 and enabled in RHEL7. Running the test again on RHEL7 with promote_secondaries disabled reduces the run time, but produces a new error:

flushing all ip addresses
Failed to send flush request: No buffer space available
Flush terminated

real	0m31.644s
user	0m0.000s
sys	0m31.305s

This needs further investigation, as does a possible way to temporarily disable promote_secondaries while flushing the interface, since promotion slows the operation down.
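
One way the temporary switch could look from userspace is sketched below. It is only an illustration and not something iproute does: it saves the current promote_secondaries values, writes 0, runs the flush, and restores the saved values afterwards. It assumes that both the `all` entry and the per-device entry need to be 0 for promotion to be off, and the function names and the `root` parameter are inventions of this sketch.

```python
import contextlib
import pathlib
import subprocess

SYSCTL_ROOT = pathlib.Path("/proc/sys/net/ipv4/conf")


@contextlib.contextmanager
def promote_secondaries_disabled(dev, root=SYSCTL_ROOT):
    """Set promote_secondaries to 0 for `all` and `dev`, restoring both on exit."""
    paths = [pathlib.Path(root) / name / "promote_secondaries" for name in ("all", dev)]
    # Remember the current values before touching anything.
    saved = [(path, path.read_text().strip()) for path in paths]
    try:
        for path in paths:
            path.write_text("0\n")
        yield
    finally:
        for path, value in saved:
            path.write_text(value + "\n")


def flush_without_promotion(dev):
    """Run `ip addr flush dev <dev>` with promotion switched off (needs root)."""
    with promote_secondaries_disabled(dev):
        subprocess.run(["ip", "addr", "flush", "dev", dev], check=False)
```

Note that the setting is changed for the whole duration of the flush, so any other address removal on the same device in that window would also skip promotion.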

Comment 11 Phil Sutter 2016-04-21 11:15:33 UTC

Regarding the error message printed with promote_secondaries=0:

recv() in rtnl_send_check() sets errno to ENOBUFS. This call is just an early check for errors, so the error comes from the kernel. Further proof of this is that the error message does not appear with the upstream kernel.

Despite what one might think, the flush completes in both cases, so this is again rather a cosmetic issue.

As for the performance issue, I have created perf records for promote_secondaries on/off on the RHEL kernel, but preliminary analysis has not yielded a result yet. Address deletion is obviously quite complicated due to the necessary management of routing table adjustments.

Comment 12 Phil Sutter 2016-04-21 11:17:20 UTC

Created attachment 1149429 [details]
perf record for promote_secondaries=1

Comment 13 Phil Sutter 2016-04-21 11:18:45 UTC

Created attachment 1149430 [details]
perf record for promote_secondaries=0

Comment 14 Phil Sutter 2016-08-04 12:44:04 UTC

Removing devel_ack+ since it is still unclear where the performance regression comes from and whether it can be fixed or simply has to be accepted as a side effect of the increased complexity of the RHEL7 kernel compared to RHEL6.

Comment 17 Jakub Sitnicki 2018-03-23 16:12:24 UTC

Thanks to Phil for providing a reproducer.

Gave it a run on RHEL7 and FC27 VMs. With 40k addresses the flush took:

* 1m2.566s on RHEL7, 3.10.0-861.el7, iproute-4.11.0-14.el7
* 0m15.713s on FC27, 4.15.7-300.fc27, iproute-4.15.0-1.fc27

So the latest(ish) stable kernel is not as fast as RHEL6 (based on Phil's report here; I haven't tested it yet), but much better than RHEL7.

Also, on Fedora the flush happens in just 2 rounds, just as on RHEL6, and not all the addresses are accounted for.

Will look into what has changed upstream that could have brought back the performance.

Comment 19 Jakub Sitnicki 2018-05-19 21:20:21 UTC

(In reply to Jakub Sitnicki from comment #17)
> Thanks to Phil for providing a reproducer.
> 
> Gave it a run on RHEL7 and FC27 VMs. With 40k addresses the flush took:
> 
> * 1m2.566s on RHEL7, 3.10.0-861.el7, iproute-4.11.0-14.el7
> * 0m15.713s on FC27, 4.15.7-300.fc27, iproute-4.15.0-1.fc27
> 

The above measurement on FC27 was wrong. While I was running the FC27 kernel and FC27 user-space, systemd was not running, so net.ipv4.conf.default.promote_secondaries was not enabled. This was the reason for the observed speed-up. Once I enabled promoting secondaries, the flush took as long as on RHEL7.

NB: I did not get the "Failed to send flush request: No buffer space available" error message brought up by Phil on either RHEL7 or FC27.

In the end is this a performance regression at all?

Comparing flushing with promoting secondaries disabled (the default on RHEL6) with flushing with promoting secondaries enabled (the default on RHEL7) does not look to me like comparing apples to apples.

In the kernel we can't remove all addresses in one run when we have to do promotions. Essentially we are testing and comparing two different operations. 
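
The difference between the two operations can be shown with a deliberately simplified toy model. This is not kernel code; it only assumes one primary address with the rest as secondaries in the same subnet. With promotion off, deleting the primary takes the secondaries with it. With promotion on, each deletion promotes the next secondary, so every address has to be removed individually.

```python
from collections import deque


def delete_primary(addresses, promote):
    """Delete the primary (first) address of a single-subnet address list."""
    if not addresses:
        return
    if promote:
        # The next secondary is promoted and becomes the new primary.
        addresses.popleft()
    else:
        # The secondaries are removed together with their primary.
        addresses.clear()


def deletions_needed(count, promote):
    """Count the primary deletions required to empty an interface."""
    addresses = deque(range(count))
    deletions = 0
    while addresses:
        delete_primary(addresses, promote)
        deletions += 1
    return deletions


for promote in (False, True):
    needed = deletions_needed(40000, promote)
    print(f"promote_secondaries={int(promote)}: {needed} deletion(s)")
```

The model ignores routing table updates and the way iproute batches its requests, so it says nothing about absolute run times, only about why the amount of work differs between the two settings.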

Please note that the default does not come from changes in the kernel, it was introduced in systemd 216 as stated in NEWS file:

CHANGES WITH 216:
[...]
        * The default sysctl.d/ snippets will now set

                net.ipv4.conf.default.promote_secondaries=1

          This has the benefit of no flushing secondary IP addresses
          when primary addresses are removed.

And this is indeed what we have in RHEL7 (and FC27):

[root@localhost ~]# rpm -q systemd
systemd-219-57.el7.x86_64
[root@localhost ~]# grep promote_secondaries /usr/lib/sysctl.d/50-default.conf
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1

@Jaroslav Aster, what is your take on this? Since the default sysctl has changed between RHEL6 and RHEL7, I'm inclined to close it as NOTABUG. At least from kernel/iproute point of view.

Comment 20 Jakub Sitnicki 2018-05-21 14:05:42 UTC

As per Phil's earlier suggestion, I think we can turn this into an RFE and attempt to convince upstream to make the flush with promote_secondaries on faster. I've prepared a PoC that introduces a new flag to be used with RTM_DELADDR requests to let the kernel know that we're going for a flush, and it seems to work.

It will make it easier to motivate the change if there is a use-case (among Layered Products perhaps?) for it. Now, I'm not sure if this BZ is a result of QE investigation or feedback from a customer. Can anyone shed some light on the background of this BZ?

Comment 21 Jakub Sitnicki 2018-06-07 10:16:54 UTC

Proposed upstream a new flag for RTM_DELADDR requests that would speed up the flush:

https://marc.info/?l=linux-netdev&m=152836638902888

Comment 22 Jaroslav Aster 2018-06-12 10:52:55 UTC

Hi Jakub,

I agree with you, it does not seem like a regression. My suggestion is to close this bug as NOTABUG and create a new one for the RFE.

I think there is no customer behind it. It is a result of QE work.

Comment 23 Jakub Sitnicki 2018-08-02 10:25:07 UTC

(In reply to Jakub Sitnicki from comment #21)
> Proposed a new flag for RTM_DELADDR requests that would speed-up the flush
> upstream:
> 
> https://marc.info/?l=linux-netdev&m=152836638902888

It has been suggested by Michal Kubecek that the new 'FLUSH' flag handling can be improved so that all interface addresses are flushed in one go in response to a single request from userspace:

https://marc.info/?l=linux-netdev&m=152836923403698&w=2

To differentiate a flush request from an address removal request, the former would carry no addresses as netlink attributes and would have the flush flag set.

It seems worth pursuing. The expected benefits are:
- increased performance of the flush operation when secondary address promotion is enabled (as demonstrated in proposed patch),
- easier to use API for userspace to remove all addresses from an interface; at the moment multiple requests have to be issued to flush all addresses.

Moving the BZ back to NEW as I will no longer be working on it.

Comment 25 Phil Sutter 2018-11-19 16:32:25 UTC

Interestingly, current upstream kernel already supports RTM_DELADDR request without any address specified (at least in IPv4). The effect is that kernel's inet_rtm_deladdr() removes all addresses. Sadly, this doesn't increase performance over the current method.
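
For reference, an RTM_DELADDR request without any address attribute is just a netlink header followed by a bare `ifaddrmsg`. The sketch below only builds such a message and does not send it. The constants are the values I know from `<linux/rtnetlink.h>` and `<linux/netlink.h>`; the function name and the choice of `NLM_F_REQUEST | NLM_F_ACK` as flags are assumptions of this sketch, and the FLUSH flag proposed earlier in this report is not included because its value is not given here.

```python
import socket
import struct

RTM_DELADDR = 21      # message type, from <linux/rtnetlink.h>
NLM_F_REQUEST = 0x01  # from <linux/netlink.h>
NLM_F_ACK = 0x04      # ask the kernel to acknowledge the request

NLMSGHDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
IFADDRMSG = struct.Struct("=BBBBI")  # family, prefixlen, flags, scope, index


def build_deladdr_request(ifindex, seq=1, family=socket.AF_INET):
    """Return an RTM_DELADDR message for `ifindex` with no address attributes."""
    body = IFADDRMSG.pack(family, 0, 0, 0, ifindex)
    header = NLMSGHDR.pack(
        NLMSGHDR.size + len(body),  # total message length
        RTM_DELADDR,
        NLM_F_REQUEST | NLM_F_ACK,
        seq,
        0,                          # pid 0: let the kernel fill it in
    )
    return header + body
```

On Linux, the interface index can be looked up with `socket.if_nametoindex()`, and the message would be sent over a `socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)` socket by a privileged process.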

Assuming that address promotion in kernel is the actual bottleneck, this is what has to be improved (or eliminated) while flushing (obviously).

OTOH, if the above has worked since reasonably old kernels, a fallback from passing the new FLUSH flag and omitting addresses from the request might not be required.

Comment 26 Andrea Claudi 2019-06-06 13:47:05 UTC

Moving it to RHEL-7.8.

Comment 27 Andrea Claudi 2019-12-05 16:31:11 UTC

This will be fixed for rhel8 (issue cloned here: https://bugzilla.redhat.com/show_bug.cgi?id=1641094).
Feel free to reopen if needed.
