Bug 2173996 - systemd update causes network performance regression
Summary: systemd update causes network performance regression
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: systemd
Version: 9.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Michal Sekletar
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks: 2176899
Reported: 2023-02-28 16:09 UTC by Adam Okuliar
Modified: 2023-05-09 10:34 UTC (History)

Fixed In Version: systemd-252-10.el9_2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2176899
Environment:
Last Closed: 2023-05-09 08:22:35 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | redhat-plumbers systemd-rhel9 pull 147 | 0 | None | open | (#2173996) Revert "user: delegate cpu controller, assign weights to user slices" | 2023-03-09 09:41:42 UTC
Red Hat Issue Tracker | RHELPLAN-150082 | 0 | None | None | None | 2023-02-28 16:11:16 UTC
Red Hat Product Errata | RHBA-2023:2531 | 0 | None | None | None | 2023-05-09 08:22:52 UTC

Comment 1 Adam Okuliar 2023-02-28 16:27:27 UTC
Please note that this regression is visible only under specific conditions.

Hardware used:
Intel IceLake CPUs (EPYCs are unaffected)
Mellanox ConnectX-6 200 Gbit NIC (100G NICs are not affected)

The regression appears only when running 16 parallel iperf streams on 16-core CPUs. IRQs are pinned; the exact command sequence used:

# tuna --irqs=mlx5* --cpus=0-15 --spread

# iperf3 --json --client 172.16.1.26 --time 30 --port 5201  --affinity 0,0 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5202  --affinity 1,1 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5203  --affinity 2,2 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5204  --affinity 3,3 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5205  --affinity 4,4 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5206  --affinity 5,5 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5207  --affinity 6,6 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5208  --affinity 7,7 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5209  --affinity 8,8 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5210  --affinity 9,9 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5211  --affinity 10,10 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5212  --affinity 11,11 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5213  --affinity 12,12 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5214  --affinity 13,13 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5215  --affinity 14,14 --parallel 8
# iperf3 --json --client 172.16.1.26 --time 30 --port 5216  --affinity 15,15 --parallel 8
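
The sixteen iperf3 invocations above differ only in the port (5201 to 5216) and the CPU index used for --affinity (0 to 15). As a minimal sketch, not the reporter's actual script, they could be generated with a loop. The function name gen_iperf_cmds and the variables SERVER and BASE_PORT are assumptions introduced here for illustration; the function only prints the commands and does not run them.

```shell
#!/bin/sh
# Sketch: print the 16 iperf3 client commands listed in the report.
# SERVER and BASE_PORT mirror the values shown above; gen_iperf_cmds is a
# hypothetical helper name, not part of iperf3 or tuna.
SERVER=172.16.1.26
BASE_PORT=5201

gen_iperf_cmds() {
    cpu=0
    while [ "$cpu" -le 15 ]; do
        # One client per CPU: port = BASE_PORT + cpu, affinity pinned to cpu,cpu.
        port=$((BASE_PORT + cpu))
        echo "iperf3 --json --client $SERVER --time 30 --port $port --affinity $cpu,$cpu --parallel 8"
        cpu=$((cpu + 1))
    done
}

gen_iperf_cmds
```

Printing rather than executing keeps the sketch safe to run anywhere and lets the generated list be compared against the commands above. The report does not say how the clients were launched relative to each other, so that part is left out.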

Comment 4 Michal Sekletar 2023-03-03 09:17:22 UTC
The performance regression is caused by ksoftirqd consuming far more CPU time than in the case where NIC bandwidth is as expected. A systemd change introduced between v250 and v252 has this side effect on ksoftirqd:

https://github.com/systemd/systemd/commit/b8df7f8629cb310beac982a4779b27eabe5362c6

After reverting the change, performance recovers. The change effectively enables the CPU cgroup controller globally, which adds some overhead on the kernel side, and that overhead shows up in this test case. I have an intuitive understanding of why this happens, but a more detailed explanation from a kernel cgroup expert would be welcome. On the systemd side, we will revert the change until we fully understand the performance regression and possibly have fixes on the kernel side.
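
One way to check whether the CPU controller is enabled on a given system is to look for "cpu" in the root cgroup's cgroup.subtree_control file. The sketch below is an assumption-laden illustration, not a step taken in this bug: it assumes the cgroup v2 unified hierarchy is mounted at /sys/fs/cgroup, and has_cpu_controller is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether the "cpu" controller is enabled for children of the
# root cgroup. Assumes cgroup v2 mounted at /sys/fs/cgroup (an assumption);
# has_cpu_controller is a hypothetical helper, not a systemd or kernel tool.

has_cpu_controller() {
    # $1: space-separated controller list, as read from cgroup.subtree_control.
    # Exact word match, so "cpuset" alone does not count as "cpu".
    for c in $1; do
        [ "$c" = "cpu" ] && return 0
    done
    return 1
}

CTRL_FILE=/sys/fs/cgroup/cgroup.subtree_control
if [ -r "$CTRL_FILE" ]; then
    if has_cpu_controller "$(cat "$CTRL_FILE")"; then
        echo "cpu controller enabled at the root cgroup"
    else
        echo "cpu controller not enabled at the root cgroup"
    fi
fi
```

Comparing the output before and after the systemd update (or the revert) would show whether the controller state changed on that machine.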

Comment 11 errata-xmlrpc 2023-05-09 08:22:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (systemd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2531

