Description of problem:
TCP throughput is very low between two CVMs, e.g. Standard_DC96as_v5. Tested with the latest RHEL 9.3.

# iperf3 -c 10.0.0.4 -b 0 -f g -i 10 -l 4096 -t 30 -p 750 -P 1 -4
Connecting to host 10.0.0.4, port 750
[  5] local 10.0.0.5 port 49690 connected to 10.0.0.4 port 750
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec   253 MBytes  0.21 Gbits/sec    0    261 KBytes
[  5]  10.00-20.00  sec   260 MBytes  0.22 Gbits/sec    0    261 KBytes

ntttcp with 32 connections:
Throughput in Gbps: Tx: .43 , Rx: 0.43

Version-Release number of selected component (if applicable):
5.14.0-316.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. With two CVMs, run the command below on VM#1:
   iperf3 -s -1 -i10 -f g -p 750
   Then run the command below on VM#2:
   iperf3 -c 10.0.0.4 -b 0 -f g -i10 -l 4096 -t 300 -p 750 -P 1 -4
   Or test with ntttcp using multiple connections.

Actual results:
Less than 1 Gbps throughput.

Expected results:
Throughput reaches or comes close to the advertised throughput of DC96as_v5.

Additional info:
1. The issue has been bisected to kernel 5.14.0-195.el9.x86_64. In other words, 5.14.0-194.el9.x86_64 was still good:

# iperf3 -c 10.0.0.4 -b 0 -f g -i10 -l 4096 -t 30 -p 750 -P 1 -4
Connecting to host 10.0.0.4, port 750
[  5] local 10.0.0.5 port 47966 connected to 10.0.0.4 port 750
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  4.02 GBytes  3.45 Gbits/sec   40   2.01 MBytes
[  5]  10.00-20.00  sec  3.97 GBytes  3.41 Gbits/sec    0   2.64 MBytes

ntttcp with 32 connections:
Thu Jun 15 04:35:33 2023 : Throughput in Gbps: Tx: 17.68 , Rx: 17.68

2. No such issue on RHEL 8.8 (4.18.0-477.el8.x86_64):

# iperf3 -c 10.0.0.4 -b 0 -f g -i30 -l 4096 -t 300 -p 750 -P 1 -4
Connecting to host 10.0.0.4, port 750
[  5] local 10.0.0.5 port 58400 connected to 10.0.0.4 port 750
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-30.00  sec  12.1 GBytes  3.47 Gbits/sec   76   2.63 MBytes
[  5]  30.00-60.00  sec  12.1 GBytes  3.45 Gbits/sec   49   2.23 MBytes
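For comparing runs like the ones quoted above, the per-interval bitrates can be pulled out of the iperf3 text output and averaged. The following is only a minimal sketch: the helper names (interval_gbps, mean_gbps) are hypothetical, and the regular expression assumes the interval-line layout shown in this report. It would also match iperf3's final sender/receiver summary lines, so those should be stripped first if they are present.

```python
import re

# One iperf3 interval line, e.g.
# [  5]   0.00-10.00  sec   253 MBytes  0.21 Gbits/sec    0    261 KBytes
# Only the bitrate value and its unit prefix are captured.
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+"
    r"[\d.]+\s+[KMGT]?Bytes\s+"
    r"([\d.]+)\s+([KMGT]?)bits/sec"
)

# Scale factors from the reported unit prefix to Gbit/s.
TO_GBPS = {"": 1e-9, "K": 1e-6, "M": 1e-3, "G": 1.0, "T": 1e3}


def interval_gbps(text):
    """Return the bitrate of every interval line in text, in Gbit/s."""
    return [float(v) * TO_GBPS[u] for v, u in INTERVAL_RE.findall(text)]


def mean_gbps(text):
    """Return the mean interval bitrate in Gbit/s, or None if nothing matched."""
    rates = interval_gbps(text)
    return sum(rates) / len(rates) if rates else None


# Interval lines copied from this report: client on -195 vs. client on -194.
BAD_195 = """
[  5]   0.00-10.00  sec   253 MBytes  0.21 Gbits/sec    0    261 KBytes
[  5]  10.00-20.00  sec   260 MBytes  0.22 Gbits/sec    0    261 KBytes
"""
GOOD_194 = """
[  5]   0.00-10.00  sec  4.02 GBytes  3.45 Gbits/sec   40   2.01 MBytes
[  5]  10.00-20.00  sec  3.97 GBytes  3.41 Gbits/sec    0   2.64 MBytes
"""

if __name__ == "__main__":
    bad, good = mean_gbps(BAD_195), mean_gbps(GOOD_194)
    print(f"-195: {bad:.2f} Gbit/s, -194: {good:.2f} Gbit/s, "
          f"ratio {good / bad:.1f}x")
```

Header lines such as "[ ID] Interval ..." and the "connected to" line are skipped because they do not fit the interval pattern.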
(In reply to Li Tian from comment #0)
> 1. The issue has been bisected to kernel 5.14.0-195.el9.x86_64. In other
> words, 5.14.0-194.el9.x86_64 was still good:

Did you figure out if the issue is on the receiver side or on the sender's? I.e. did you try downgrading the kernel to -194 on one side only? Alternatively, you can try to use only one CVM and a 'normal' VM on the other side.

I'm a bit surprised the issue seems to be between 5.14.0-194.el9 and 5.14.0-195.el9 as I don't see much besides
https://bugzilla.redhat.com/show_bug.cgi?id=2136491
but maybe that's the one? In case it is, the question is why only CVMs are affected...
So the issue is on the client side: with -194 on the client and -195 on the server, the issue is gone.

Another observation is that the regression is only visible with a large buffer length. With 'iperf3 -l 32' the performance is always ~0.2Gbps regardless of kernel version. With 'iperf3 -l 4096' the throughput drops from ~3Gbps on -194 to ~0.2Gbps on -195.

5.14.0-325.el9.x86_64 (client) on Standard_D64s_v4 does not have this issue:

# iperf3 -c 10.0.0.4 -b 0 -f g -i10 -l 4096 -t 30 -p 750 -P 1 -4
Connecting to host 10.0.0.4, port 750
[  5] local 10.0.0.5 port 39402 connected to 10.0.0.4 port 750
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  3.55 GBytes  3.05 Gbits/sec   87   1.57 MBytes
[  5]  10.00-20.00  sec  3.71 GBytes  3.18 Gbits/sec    2   2.14 MBytes

(same server: -195 on CVM)
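The client-kernel and buffer-length observations above amount to a small test matrix. As a rough sketch, the combinations can be enumerated as follows. The helper names (client_cmd, test_matrix) are hypothetical, the flag set is copied from the client command in this report, and the script only prints a plan: booting each kernel and running iperf3 is still a manual step.

```python
import shlex


def client_cmd(server, length, seconds=30, port=750, interval=10):
    """Build the iperf3 client command used in this report for one -l value."""
    return [
        "iperf3", "-c", server, "-b", "0", "-f", "g",
        "-i", str(interval), "-l", str(length), "-t", str(seconds),
        "-p", str(port), "-P", "1", "-4",
    ]


def test_matrix(server, client_kernels, lengths):
    """Yield (client kernel, buffer length, command line) for every combination."""
    for kernel in client_kernels:
        for length in lengths:
            yield kernel, length, shlex.join(client_cmd(server, length))


if __name__ == "__main__":
    # Kernels and buffer lengths taken from the comments in this bug.
    kernels = ["5.14.0-194.el9.x86_64", "5.14.0-195.el9.x86_64"]
    for kernel, length, cmd in test_matrix("10.0.0.4", kernels, [32, 4096]):
        print(f"client kernel {kernel}, -l {length}: {cmd}")
```

Keeping the server side fixed (here, -195 on the CVM) while varying only the client kernel and -l isolates the two factors reported so far.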
*** Bug 2227799 has been marked as a duplicate of this bug. ***
Tested on 5.14.0-349.2880_954641462.el9.x86_64:

TCP throughput is good:
[  5]   0.00-10.00  sec  3.83 GBytes  3.29 Gbits/sec   15   2.26 MBytes
[  5]  10.00-20.00  sec  3.79 GBytes  3.25 Gbits/sec    4   2.63 MBytes

Disk IOPS is good:
   bw (  KiB/s): min=26712, max=55272, per=100.00%, avg=49412.88, stdev=5239.23, samples=59
   iops        : min= 6678, max=13818, avg=12353.25, stdev=1309.81, samples=59
Tested good on 5.14.0-351.el9.x86_64:

TCP throughput is good:
[  5]   0.00-10.00  sec  3.84 GBytes  3.30 Gbits/sec   31   2.25 MBytes
[  5]  10.00-20.00  sec  3.87 GBytes  3.33 Gbits/sec    9   1.96 MBytes

Disk IOPS is good:
   bw (  KiB/s): min=26360, max=55272, per=100.00%, avg=50867.22, stdev=5781.01, samples=59
   iops        : min= 6590, max=13818, avg=12716.76, stdev=1445.23, samples=59