This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at the Red Hat Issue Tracker.
Bug 2270408 - 20% performance regression on XDP drop with mlx5 in ELN kernels (compared to RHEL 9 candidates)
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-03-20 07:51 UTC by Samuel Dobroň
Modified: 2024-07-27 22:42 UTC (History)
CC List: 21 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-07-22 08:37:03 UTC
Type: ---
Embargoed:


Links
System: Red Hat Issue Tracker
ID: FC-1196
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2024-07-27 22:42:44 UTC

Description Samuel Dobroň 2024-03-20 07:51:18 UTC
1. Please describe the problem:
We compared the performance of ELN and RHEL 9 candidate kernels and noticed a significant drop in XDP drop [1] throughput on mlx5 (25G).

On any RHEL 9 candidate kernel we are able to drop 19-20M pkts/sec, but on ELN kernels we reach just 15M pkts/sec (CPU utilization remains the same, around 100%).
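
For reference, the relative size of the regression can be worked out from the figures above. This is only a sketch of the arithmetic; the helper name regression_pct is hypothetical and the inputs are the rounded rates quoted in this report (in millions of pkts/sec), not new measurements.

```shell
#!/bin/sh
# Hypothetical helper: relative drop of a current rate versus a baseline
# rate, as a percentage of the baseline.
#   $1 = baseline rate (e.g. Mpps on a RHEL 9 candidate kernel)
#   $2 = current rate  (e.g. Mpps on the ELN kernel)
regression_pct() {
    awk -v base="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", (base - cur) / base * 100 }'
}

# Rounded figures from the report: 19-20 Mpps baseline, 15 Mpps on ELN.
regression_pct 19 15   # prints 21.1
regression_pct 20 15   # prints 25.0
```

With these rounded numbers the drop is roughly 21-25% of the baseline, which is in line with the "20%" in the bug title.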

We don't see such a regression on ixgbe or i40e.
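
Because the regression shows up only with one driver, it may be worth confirming which driver is bound to the interface under test before comparing results. The sketch below assumes ethtool is installed on the DUT and that the mlx5 ports report their driver as mlx5_core; parse_driver is a hypothetical helper, and INF is the same interface placeholder used in the reproduction steps.

```shell
#!/bin/sh
# Hypothetical helper: extract the driver name from `ethtool -i` output
# supplied on stdin (the line looks like "driver: mlx5_core").
parse_driver() {
    awk '/^driver:/ { print $2 }'
}

# Usage on the DUT (replace INF with the real interface name):
#   ethtool -i INF | parse_driver    # driver bound to the interface
#   uname -r                         # kernel release actually booted
```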


[1] https://github.com/xdp-project/xdp-tools/tree/master/xdp-bench#the-drop-command

2. What is the Version-Release number of the kernel:

kernel-6.4.0-0.rc6.20230616git40f71e7cd3c6.50.eln126
  https://koji.fedoraproject.org/koji/buildinfo?buildID=2217297

We tested only x86_64; we are not sure about other architectures.

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Yes, it most likely comes from a patch in the mentioned kernel; the previous build (kernel-6.4.0-0.rc6.20230614gitb6dad5178cea.49.eln126 - https://koji.fedoraproject.org/koji/taskinfo?taskID=102148156) is OK.
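
The two build names above embed git snapshot hashes, which gives a narrow range to inspect. The following is a sketch only, under two assumptions that this report does not confirm: that both hashes are commits in the upstream torvalds/linux tree, and that the regression comes from an upstream change rather than from an ELN-specific patch or config change. The function names and the clone path argument are hypothetical.

```shell
#!/bin/sh
# Snapshot hashes taken from the build names in this report.
GOOD=b6dad5178cea   # kernel-6.4.0-0.rc6.20230614gitb6dad5178cea.49.eln126
BAD=40f71e7cd3c6    # kernel-6.4.0-0.rc6.20230616git40f71e7cd3c6.50.eln126

# List commits in the range that touch the mlx5 driver directory.
#   $1 = path to a local clone of the kernel tree (assumed to exist)
list_mlx5_commits() {
    git -C "$1" log --oneline "$GOOD..$BAD" -- drivers/net/ethernet/mellanox/mlx5
}

# Start a bisection over the same range (bad commit first, then good).
#   $1 = path to a local clone of the kernel tree (assumed to exist)
start_bisect() {
    git -C "$1" bisect start "$BAD" "$GOOD"
}

# Usage (the path is a placeholder):
#   list_mlx5_commits ./linux
#   start_bisect ./linux
```

Restricting the log to the mlx5 directory is a shortcut; a change elsewhere in the networking stack could also affect only this driver, in which case the full range would need to be bisected.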



4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Reproducible always.

Steps:
- Install the affected kernel along with the kernel-modules-extra and kernel-modules-internal packages (for pktgen).

Traffic generator machine (wsfd-advnetlab65.anl.eng.rdu2.dc.redhat.com):
git clone https://github.com/torvalds/linux.git
cd linux/samples/pktgen/
./pktgen_sample03_burst_single_flow.sh -m MAC -d IP -i INF

DUT machine (receiver) (wsfd-advnetlab66.anl.eng.rdu2.dc.redhat.com):
dnf install -y git nano cmake clang llvm elfutils-libelf-devel libpcap-devel perf bpftool m4
git clone https://github.com/xdp-project/xdp-tools.git
cd xdp-tools/
./configure
make
./xdp-bench drop INF


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Not sure, I have not checked.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No.


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
-

Reproducible: Always

Comment 5 Samuel Dobroň 2024-07-22 08:37:03 UTC
Migrated to Jira - https://issues.redhat.com/browse/FC-1196

Closing.

