Bug 1945040
Summary: 18% rx pps performance regression with rhel9 guest compared with rhel8 guest

Product: Red Hat Enterprise Linux 9
Component: kernel
Kernel sub component: KVM
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: high
Keywords: Regression, Triaged
Target Milestone: beta
Flags: pm-rhel: mirror+
Reporter: Quan Wenli <wquan>
Assignee: Laurent Vivier <lvivier>
QA Contact: Quan Wenli <wquan>
Docs Contact: Daniel Vozenilek <davozeni>
CC: aadam, chayang, jasowang, jherrman, lulu, lvivier, mst, mvanderw, nilal, pasik, virt-maint, ymao
Doc Type: Bug Fix
Doc Text:
.Network traffic performance in virtual machines is no longer reduced when under heavy load
Previously, RHEL virtual machines in some cases showed decreased performance when handling high levels of network traffic. The underlying code has been fixed, and network traffic performance is now as expected in the described circumstances.
Last Closed: 2022-08-23 10:34:08 UTC
Type: Bug
Comment 1
Quan Wenli
2021-03-31 09:55:01 UTC
After bisect, I found the first bad commit.

Bad commit: 3226b158e67cfaa677fd180152bfb28989cb2fac is the first bad commit

commit 3226b158e67cfaa677fd180152bfb28989cb2fac
Author: Eric Dumazet <edumazet>
Date:   Wed Jan 13 08:18:19 2021 -0800

    net: avoid 32 x truesize under-estimation for tiny skbs

    Both virtio net and napi_get_frags() allocate skbs with a very small skb->head

    While using page fragments instead of a kmalloc backed skb->head might give
    a small performance improvement in some cases, there is a huge risk of
    under estimating memory usage.

    For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations
    per page (order-3 page in x86), or even 64 on PowerPC

    We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
    but consuming far more memory for TCP buffers than instructed in tcp_mem[2]

    Even if we force napi_alloc_skb() to only use order-0 pages, the issue
    would still be there on arches with PAGE_SIZE >= 32768

    This patch makes sure that small skb head are kmalloc backed, so that
    other objects in the slab page can be reused instead of being held as long
    as skbs are sitting in socket queues.

    Note that we might in the future use the sk_buff napi cache, instead of
    going through a more expensive __alloc_skb()

    Another idea would be to use separate page sizes depending on the allocated
    length (to never have more than 4 frags per page)

    I would like to thank Greg Thelen for his precious help on this matter,
    analysing crash dumps is always a time consuming task.

    Fixes: fd11a83dd363 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
    Signed-off-by: Eric Dumazet <edumazet>
    Cc: Paolo Abeni <pabeni>
    Cc: Greg Thelen <gthelen>
    Reviewed-by: Alexander Duyck <alexanderduyck>
    Acked-by: Michael S. Tsirkin <mst>
    Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski <kuba>

 net/core/skbuff.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Results:

With commit 3226b158e67cfaa677fd180152bfb28989cb2fac: 1.84 mpps on rx
With commit 7da17624e7948d5d9660b910f8079d26d26ce453: 2.37 mpps on rx

@Ariel, could you look at this?

Thanks,
wenli
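For reference, a bisect like the one above is normally driven with git bisect. The sketch below is purely illustrative: the good/bad starting points and the way the rx pps is checked at each step are assumptions, not the exact procedure used for this bug.

# Minimal git bisect sketch (illustrative only; revisions and the test step are assumed)
cd linux
git bisect start
git bisect bad v5.12-rc1        # assumed first known-bad kernel
git bisect good v5.10           # assumed last known-good kernel
# At each step: build and boot the candidate kernel, measure rx pps, then mark the result:
git bisect good                 # rx pps still around 2.37 mpps
git bisect bad                  # rx pps dropped to around 1.84 mpps
# Repeat until git prints "<sha> is the first bad commit", then:
git bisect reset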
Comment 3
jason wang

Note that this has been fixed with the following commits upstream:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=0f6925b3e8da0dbbb52447ca8a8b42b371aac7db
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=38ec4944b593fd90c5ef42aaaa53e66ae5769d04

Thanks

Comment 5
Quan Wenli

(In reply to jason wang from comment #3)
> Note that this has been fixed with the following commits upstream:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=0f6925b3e8da0dbbb52447ca8a8b42b371aac7db
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=38ec4944b593fd90c5ef42aaaa53e66ae5769d04

I applied the above two patches on our latest downstream kernel (5.12.0-rc5); the rx pps went up from 1.84 to 1.93 mpps, but that is still not as good as the 2.37 mpps in comment#2.

jason wang

(In reply to Quan Wenli from comment #5)
> I applied the above two patches on our latest downstream kernel (5.12.0-rc5); the rx pps
> went up from 1.84 to 1.93 mpps, but that is still not as good as the 2.37 mpps in comment#2.

Thanks for the testing. I've proposed another idea to increase the performance; an engineer from Ali Cloud is working on that. I will give you the commit IDs once it is applied. (Actually the patches have been applied but have bugs; we're working on solving them.)

Thanks

Comment 7
jason wang

Here are the patches:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=fb32856b16ad9d5bcd75b76a274e2c515ac7b9d7
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f5d7872a8b8a3176e65dc6f7f0705ce7e9a699e6
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=af39c8f72301b268ad8b04bae646b6025918b82b

Thanks

Quan Wenli

(In reply to jason wang from comment #7)
> Here are the patches:

Applying the above 3 patches on top of the kernel in comment#5, the rx pps drops back to 1.84 mpps.

Comment 9
jason wang

Can you remove the check:

    len > GOOD_COPY_LEN

in page_to_skb() and retry? This check basically suppresses the optimization for small packets (e.g. 64B).

Thanks

Quan Wenli

(In reply to jason wang from comment #9)
> Can you remove the check "len > GOOD_COPY_LEN" in page_to_skb() and retry?

Cool, after removing it and rebuilding the kernel, the performance is back to 2.33 mpps.
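For anyone repeating that experiment, the steps below are a rough sketch of how the check can be located and a test kernel rebuilt inside the guest. The tree path is an assumption, and the one-line edit in page_to_skb() itself is done by hand; the build/install commands are standard kernel steps, not taken from this bug.

# Rough sketch (assumed source path; the actual edit is manual)
cd ~/src/linux
grep -n "GOOD_COPY_LEN" drivers/net/virtio_net.c   # locate the check used in page_to_skb()
$EDITOR drivers/net/virtio_net.c                   # drop the "len > GOOD_COPY_LEN" condition
make -j"$(nproc)" bzImage modules
sudo make modules_install install                  # install the test kernel in the guest
sudo reboot
# After reboot, re-run the same pktgen rx test and compare the mpps numbers.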
Comment 23
Laurent Vivier

According to BZ 2069047 comment 10,

RHEL 9.0 guest on a RHEL 8.6 host works well.
RHEL 9.0 guest on a RHEL 9.0 host hits the regression.

host                  guest                 rx pps
4.18.0-353            5.14.0-70.2.1.el9_0   2.39 mpps  (8.6 host with 9.0 guest)
5.14.0-70.2.1.el9_0   5.14.0-70.2.1.el9_0   1.74 mpps  (9.0 host with 9.0 guest)

As the ITR is 9.1, could you test:

host                  guest
4.18.0-402.el8        5.14.0-127.el9
5.14.0-127.el9        5.14.0-127.el9

Thanks

Quan Wenli

(In reply to Laurent Vivier from comment #23)
> As the ITR is 9.1, could you test:
>
> host                  guest
> 4.18.0-402.el8        5.14.0-127.el9
> 5.14.0-127.el9        5.14.0-127.el9

I will update the results when I get them. Currently the ITM is 18; could you help review it and re-set the ITM?

Thanks,
wenli

Quan Wenli

(In reply to Laurent Vivier from comment #23)
> As the ITR is 9.1, could you test:
>
> host                  guest
> 4.18.0-402.el8        5.14.0-127.el9
> 5.14.0-127.el9        5.14.0-127.el9

host                  guest                rx results
4.18.0-402.el8        5.14.0-127.el9       1.97 mpps
5.14.0-127.el9        5.14.0-127.el9       1.71 mpps

Detail results: http://10.73.60.69/results/request/Bug1945040/rhel9.0host/kernel-5.14.0-127/pktgen_perf.html

Comment 29
Laurent Vivier

I've not been able to reproduce the problem on my system; I get better performance with the latest upstream kernel (v5.19+, b2a88c212e65) than with a kernel without 3226b158e67c (5.11.0-rc2+, 7da17624e794).

My command line is:

/usr/libexec/qemu-kvm \
    -nodefaults \
    -nographic \
    -machine q35 \
    -m 4066 \
    -smp 4 \
    -blockdev node-name=file_image1,driver=file,filename=$IMAGE \
    -blockdev node-name=drive_image1,driver=qcow2,file=file_image1 \
    -device virtio-blk,id=virtioblk0,drive=drive_image1 \
    -enable-kvm \
    -cpu host \
    -serial mon:stdio \
    -device virtio-net,mac=52:54:00:7b:3f:6b,id=virtionet0,netdev=tap0 \
    -netdev tap,id=tap0,vhost=on

My results are:

HOST 5.11.0-rc2+

rhel870  4.18.0-411.el8.x86_64   TX tap0: 0 pkts/s  RX tap0: 928046 pkts/s
rhel910  5.14.0-136.el9.x86_64   TX tap0: 0 pkts/s  RX tap0: 927145 pkts/s

HOST 5.19.0+

rhel870  4.18.0-411.el8.x86_64   TX tap0: 1 pkts/s  RX tap0: 1106153 pkts/s
rhel910  5.14.0-136.el9.x86_64   TX tap0: 1 pkts/s  RX tap0: 1088796 pkts/s

What did I miss?
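The TX/RX pkts/s lines in this thread come from a small polling script (pps.sh). The script itself is not attached to the bug, so the version below is only a reconstruction from the output format: it reads the interface packet counters once per second and prints the delta.

#!/bin/bash
# pps.sh-style monitor (reconstructed from the output format; not the original script)
# Usage: ./pps.sh <interface>
DEV=${1:-eth0}
while true; do
    rx1=$(cat /sys/class/net/"$DEV"/statistics/rx_packets)
    tx1=$(cat /sys/class/net/"$DEV"/statistics/tx_packets)
    sleep 1
    rx2=$(cat /sys/class/net/"$DEV"/statistics/rx_packets)
    tx2=$(cat /sys/class/net/"$DEV"/statistics/tx_packets)
    echo "TX $DEV: $((tx2 - tx1)) pkts/s RX $DEV: $((rx2 - rx1)) pkts/s"
done

Run as "./pps.sh tap0" on the host or "./pps.sh eth1" inside the guest, matching the outputs quoted in this bug.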
Comment 30
Quan Wenli

(In reply to Laurent Vivier from comment #29)
> My results are:
>
> HOST 5.11.0-rc2+
>
> rhel870  4.18.0-411.el8.x86_64   TX tap0: 0 pkts/s  RX tap0: 928046 pkts/s
> rhel910  5.14.0-136.el9.x86_64   TX tap0: 0 pkts/s  RX tap0: 927145 pkts/s

Your data are around 0.9 mpps; maybe the rx performance issue cannot be reproduced at such a low pps rate?

Comment 31
Laurent Vivier

(In reply to Quan Wenli from comment #30)
> Your data are around 0.9 mpps; maybe the rx performance issue cannot be reproduced at such
> a low pps rate?

So you mean it depends on the machine performance?
Could you try to reproduce the problem with the QEMU command line above? I'm trying to get a simplified reproducer (no libvirt, minimal devices).
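The mpps figures in this bug come from a pktgen-based test (pktgen_perf in the results link above), which is not attached here. The sketch below is an assumed equivalent that drives the in-kernel pktgen on one interface; the device name, destination IP and destination MAC are placeholders, and which side runs it (host tap or guest NIC) depends on the traffic direction being measured.

#!/bin/bash
# Minimal pktgen sender sketch (not the pktgen_perf harness used in this bug)
# Usage: ./pktgen_send.sh <dev> <dst_ip> <dst_mac>   (all three are placeholders)
DEV=${1:-eth1}
DST_IP=${2:-192.168.100.1}
DST_MAC=${3:-52:54:00:7b:3f:6b}

modprobe pktgen
pg=/proc/net/pktgen

echo "rem_device_all"   > $pg/kpktgend_0
echo "add_device $DEV"  > $pg/kpktgend_0

echo "count 0"          > $pg/$DEV      # 0 = keep sending until stopped
echo "pkt_size 60"      > $pg/$DEV      # small packets (64B on the wire)
echo "delay 0"          > $pg/$DEV
echo "dst $DST_IP"      > $pg/$DEV
echo "dst_mac $DST_MAC" > $pg/$DEV

echo "start" > $pg/pgctrl               # blocks while the generator runs

While this runs, the pps monitor on the other end reads the resulting packet rate.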
(In reply to Laurent Vivier from comment #31)
> Could you try to reproduce the problem with the QEMU command line above? I'm trying to get
> a simplified reproducer (no libvirt, minimal devices).

Hi wenli,

I have tried the same steps, but I hit the same problem: I cannot reproduce the issue. Would you help verify this? Also, would you help verify this with the latest el9 kernel? Since the code sync, the commits mentioned in this bz are all included in the rhel9 source code.

Thanks,
cindy

host 5.11 / guest 5.11 (without the commits)

[root@localhost ~]# ./pps.sh eth1
TX eth1: 182062 pkts/s RX eth1: 0 pkts/s
TX eth1: 185255 pkts/s RX eth1: 0 pkts/s
TX eth1: 181486 pkts/s RX eth1: 0 pkts/s
TX eth1: 182976 pkts/s RX eth1: 0 pkts/s
TX eth1: 182016 pkts/s RX eth1: 0 pkts/s
TX eth1: 181440 pkts/s RX eth1: 0 pkts/s
TX eth1: 181858 pkts/s RX eth1: 0 pkts/s
TX eth1: 182922 pkts/s RX eth1: 0 pkts/s
TX eth1: 173002 pkts/s RX eth1: 0 pkts/s
TX eth1: 183056 pkts/s RX eth1: 0 pkts/s
TX eth1: 184366 pkts/s RX eth1: 0 pkts/s
TX eth1: 183168 pkts/s RX eth1: 0 pkts/s
TX eth1: 183045 pkts/s RX eth1: 0 pkts/s

host 5.12+ / guest 5.11 (after the commit merged)

TX eth1: 203760 pkts/s RX eth1: 0 pkts/s
TX eth1: 203136 pkts/s RX eth1: 0 pkts/s
TX eth1: 186802 pkts/s RX eth1: 0 pkts/s
TX eth1: 182638 pkts/s RX eth1: 0 pkts/s
TX eth1: 188829 pkts/s RX eth1: 0 pkts/s
TX eth1: 193052 pkts/s RX eth1: 0 pkts/s
TX eth1: 204710 pkts/s RX eth1: 0 pkts/s
TX eth1: 202590 pkts/s RX eth1: 0 pkts/s
TX eth1: 203758 pkts/s RX eth1: 0 pkts/s
TX eth1: 203273 pkts/s RX eth1: 0 pkts/s
TX eth1: 199357 pkts/s RX eth1: 0 pkts/s
TX eth1: 190603 pkts/s RX eth1: 0 pkts/s
TX eth1: 199098 pkts/s RX eth1: 0 pkts/s
TX eth1: 197656 pkts/s RX eth1: 0 pkts/s