Bug 2215362
| Summary: | [Azure][RHEL-9][CVM][Network] Very low TCP throughput between 2 CVMs | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Li Tian <litian> |
| Component: | kernel | Assignee: | Vitaly Kuznetsov <vkuznets> |
| kernel sub component: | Hyper-V | QA Contact: | Li Tian <litian> |
| Status: | VERIFIED --- | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | andavis, bdas, litian, vkuznets, xuli, xxiong, yacao, yuxisun |
| Version: | 9.3 | Keywords: | Triaged |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | kernel-5.14.0-351.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Li Tian
2023-06-15 17:03:12 UTC
(In reply to Li Tian from comment #0)
> 1. The issue has been bisected to kernel 5.14.0-195.el9.x86_64. In other
> words, 5.14.0-194.el9.x86_64 was still good:

Did you figure out if the issue is on the receiver side or on the sender's? I.e. did you try downgrading the kernel to -194 on one side only? Alternatively, you can try to use only one CVM and a 'normal' VM on the other side.

I'm a bit surprised the issue seems to be between 5.14.0-194.el9 and 5.14.0-195.el9 as I don't see much besides https://bugzilla.redhat.com/show_bug.cgi?id=2136491 but maybe that's the one? In case it is, the question is why only CVMs are affected...

So the issue is on the client side. When I have -194 on the client and -195 on the server, the issue is gone.

Another observation is that the issue is only reproducible with a large buffer length. With 'iperf3 -l 32' the performance is always ~0.2Gbps regardless of kernel version. With 'iperf3 -l 4096' the issue comes to light: throughput drops from ~3Gbps on -194 to ~0.2Gbps on -195.

5.14.0-325.el9.x86_64 (client) on Standard_D64s_v4 does not have this issue:

    # iperf3 -c 10.0.0.4 -b 0 -f g -i10 -l 4096 -t 30 -p 750 -P 1 -4
    Connecting to host 10.0.0.4, port 750
    [ 5] local 10.0.0.5 port 39402 connected to 10.0.0.4 port 750
    [ ID] Interval Transfer Bitrate Retr Cwnd
    [ 5] 0.00-10.00 sec 3.55 GBytes 3.05 Gbits/sec 87 1.57 MBytes
    [ 5] 10.00-20.00 sec 3.71 GBytes 3.18 Gbits/sec 2 2.14 MBytes

(same server - -195 on CVM)

*** Bug 2227799 has been marked as a duplicate of this bug. ***

Tested on 5.14.0-349.2880_954641462.el9.x86_64:

TCP throughput is good:

    [ 5] 0.00-10.00 sec 3.83 GBytes 3.29 Gbits/sec 15 2.26 MBytes
    [ 5] 10.00-20.00 sec 3.79 GBytes 3.25 Gbits/sec 4 2.63 MBytes

Disk IOPS is good:

    bw ( KiB/s): min=26712, max=55272, per=100.00%, avg=49412.88, stdev=5239.23, samples=59
    iops : min= 6678, max=13818, avg=12353.25, stdev=1309.81, samples=59

Tested good on 5.14.0-351.el9.x86_64:

TCP throughput is good:

    [ 5] 0.00-10.00 sec 3.84 GBytes 3.30 Gbits/sec 31 2.25 MBytes
    [ 5] 10.00-20.00 sec 3.87 GBytes 3.33 Gbits/sec 9 1.96 MBytes

Disk IOPS is good:

    bw ( KiB/s): min=26360, max=55272, per=100.00%, avg=50867.22, stdev=5781.01, samples=59
    iops : min= 6590, max=13818, avg=12716.76, stdev=1445.23, samples=59
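For anyone repeating this comparison across kernel builds, the per-interval iperf3 lines quoted above can be averaged and checked against a baseline with a short script. The following is only a rough sketch, not part of the reported test procedure: it assumes the human-readable client output produced with `-f g` (so bitrates are in Gbits/sec), and the names `parse_intervals`, `mean_gbps`, `is_regressed` and the 50% tolerance are hypothetical choices made for illustration.

```python
import re

# Matches per-interval lines of iperf3's human-readable client output, e.g.
# "[ 5] 0.00-10.00 sec 3.55 GBytes 3.05 Gbits/sec 87 1.57 MBytes".
# Assumption: the run used "-f g", so the bitrate unit is always Gbits/sec.
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+"
    r"\d+\.\d+-\d+\.\d+\s+sec\s+"
    r"\S+\s+\S+\s+"                      # transfer amount and unit (ignored)
    r"(?P<rate>\d+(?:\.\d+)?)\s+Gbits/sec\s+"
    r"(?P<retr>\d+)"
)


def parse_intervals(text):
    """Return (gbits_per_sec, retransmits) for every interval line in text."""
    return [(float(m["rate"]), int(m["retr"])) for m in INTERVAL_RE.finditer(text)]


def mean_gbps(text):
    """Average bitrate over all interval lines found in text."""
    rows = parse_intervals(text)
    if not rows:
        raise ValueError("no iperf3 interval lines found")
    return sum(rate for rate, _ in rows) / len(rows)


def is_regressed(candidate_gbps, baseline_gbps, tolerance=0.5):
    """Flag a run whose bitrate is below tolerance * baseline.

    The 0.5 default is a hypothetical threshold, not one taken from this bug.
    """
    return candidate_gbps < baseline_gbps * tolerance


# Interval lines copied from the comments above.
RUN_325 = """\
[ 5] local 10.0.0.5 port 39402 connected to 10.0.0.4 port 750
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-10.00 sec 3.55 GBytes 3.05 Gbits/sec 87 1.57 MBytes
[ 5] 10.00-20.00 sec 3.71 GBytes 3.18 Gbits/sec 2 2.14 MBytes
"""
RUN_351 = """\
[ 5] 0.00-10.00 sec 3.84 GBytes 3.30 Gbits/sec 31 2.25 MBytes
[ 5] 10.00-20.00 sec 3.87 GBytes 3.33 Gbits/sec 9 1.96 MBytes
"""

if __name__ == "__main__":
    baseline = mean_gbps(RUN_325)
    candidate = mean_gbps(RUN_351)
    print(f"baseline={baseline:.3f} Gbits/sec candidate={candidate:.3f} Gbits/sec")
    print("regressed:", is_regressed(candidate, baseline))
```

Feeding it the approximate figures from the bisection (about 0.2 Gbits/sec on -195 against about 3 Gbits/sec on -194) would flag a regression under the assumed tolerance, while the -351 run above would not.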