Bug 1902855
Summary: | [RHEL8.4] performance degradation with "ib_send_lat RC" test when tested on mlx5 MT27700 CX-4 ROCE device | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Brian Chae <bchae>
Component: | perftest | Assignee: | Honggang LI <honli>
Status: | CLOSED ERRATA | QA Contact: | Brian Chae <bchae>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 8.4 | CC: | dledford, knweiss, mstowe, rdma-dev-team, tmichael
Target Milestone: | rc | Keywords: | Triaged
Target Release: | 8.4 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | perftest-4.4-8.el8 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-05-18 14:45:12 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1903942 | |
Description
Brian Chae
2020-11-30 19:57:08 UTC
[root@rdma-dev-21 ~]$ lspci -nn | grep 8086:6f0
00:00.0 Host bridge [0600]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D DMI2 [8086:6f00] (rev 01)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 1 [8086:6f02] (rev 01)
00:02.0 PCI bridge [0604]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 2 [8086:6f04] (rev 01)
00:03.0 PCI bridge [0604]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 3 [8086:6f08] (rev 01)
00:03.1 PCI bridge [0604]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 3 [8086:6f09] (rev 01)
80:01.0 PCI bridge [0604]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 1 [8086:6f02] (rev 01)
80:03.0 PCI bridge [0604]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 3 [8086:6f08] (rev 01)

https://lore.kernel.org/patchwork/patch/820922/

The CPU of rdma-dev-21/22 is not PCIe Relaxed Ordering compliant, so please run perftest with '--disable_pcie_relaxed':

[root@rdma-dev-21 ~]$ ib_send_lat --disable_pcie_relaxed -a -c RC -d mlx5_0 -i 1 -F -R
<snip>
 PCIe relax order: OFF   <====
<snip>

[root@rdma-dev-22 ~]$ ib_send_lat --disable_pcie_relaxed -a -c RC -d mlx5_0 -i 1 -F -R 172.31.45.121

The perftest was re-tested with the latest build, RHEL-8.4.0-20210205.n.0, on the mlx5 MT27700 CX-4 ROCE device.

o RDMA lab hosts: rdma-dev-21 (server) / rdma-dev-22 (client) host pair.
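As a quick way to see whether Relaxed Ordering is currently enabled for a given device, the DevCtl line of `lspci -vvv` can be inspected for the `RlxdOrd+`/`RlxdOrd-` flag. A minimal sketch, parsing a captured sample line rather than live hardware output (the sample line and the BDF in the comment are illustrative, not from this bug):

```shell
# Check the Enable Relaxed Ordering flag in a device's PCIe DevCtl line.
# On a live system the line would come from, e.g.:
#   lspci -vvv -s 04:00.0 | grep DevCtl
# A captured sample line stands in for real hardware output here.
devctl='DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq- RlxdOrd+ ExtTag+'
case "$devctl" in
  *RlxdOrd+*) status=enabled ;;
  *RlxdOrd-*) status=disabled ;;
  *)          status=unknown ;;
esac
echo "relaxed ordering: $status"
```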
o Build info

DISTRO=RHEL-8.4.0-20210205.n.0
+ [21-02-05 06:32:51] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
+ [21-02-05 06:32:51] uname -a
Linux rdma-dev-22.lab.bos.redhat.com 4.18.0-282.el8.x86_64 #1 SMP Tue Feb 2 14:09:52 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
+ [21-02-05 06:32:51] cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-282.el8.x86_64 root=/dev/mapper/rhel_rdma--dev--22-root ro intel_idle.max_cstate=0 processor.max_cstate=0 intel_iommu=on iommu=on console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=/dev/mapper/rhel_rdma--dev--22-swap rd.lvm.lv=rhel_rdma-dev-22/root rd.lvm.lv=rhel_rdma-dev-22/swap console=ttyS1,115200n81
+ [21-02-05 06:32:51] rpm -q rdma-core linux-firmware
rdma-core-32.0-4.el8.x86_64
linux-firmware-20201218-102.git05789708.el8.noarch
+ [21-02-05 06:32:51] tail /sys/class/infiniband/mlx5_0/fw_ver /sys/class/infiniband/mlx5_1/fw_ver /sys/class/infiniband/mlx5_2/fw_ver
==> /sys/class/infiniband/mlx5_0/fw_ver <==
12.28.1002
==> /sys/class/infiniband/mlx5_1/fw_ver <==
12.28.1002
==> /sys/class/infiniband/mlx5_2/fw_ver <==
12.28.1002
+ [21-02-05 06:32:51] lspci
+ [21-02-05 06:32:51] grep -i -e ethernet -e infiniband -e omni -e ConnectX
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
04:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
82:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
82:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]

o Test result

Test results for perftest on rdma-dev-22: 4.18.0-282.el8.x86_64, rdma-core-32.0-4.el8, mlx5, roce.45, & mlx5_0

Result   | Status | Test
---------+--------+------------------------------------
PASS     | 0      | ib_atomic_bw RC
PASS     | 0      | ib_atomic_lat RC
PASS     | 0      | ib_read_bw RC
PASS     | 0      | ib_read_lat RC
PASS     | 0      | ib_send_bw RC
PASS     | 0      | ib_send_lat RC
PASS     | 0      | ib_write_bw RC
PASS     | 0      | ib_write_lat RC
PASS     | 0      | raw_ethernet_bw RC
PASS     | 0      | raw_ethernet_lat RC

Checking for failures and known issues:
no test failures

o ib_send_lat perftest result, showing the performance data

+ [21-02-05 06:34:37] timeout 3m ib_send_lat -a -c RC -d mlx5_0 -i 1 -F -R 172.31.45.121   <<<=============
---------------------------------------------------------------------------------------
                    Send Latency Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: Unsupported
 ibv_wr* API     : ON
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 7
 Max inline data : 236[B]
 rdma_cm QPs     : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0111 PSN 0x1be678 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:45:122
 remote address: LID 0000 QPN 0x0111 PSN 0x88219a GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:40:121
---------------------------------------------------------------------------------------
 #bytes  #iterations  t_min[usec]  t_max[usec]  t_typical[usec]  t_avg[usec]  t_stdev[usec]  99% percentile[usec]  99.9% percentile[usec]
 2       1000         1.18         1.97         1.22             1.22         0.02           1.31                  1.97
 4       1000         1.18         2.09         1.22             1.22         0.04           1.27                  2.09
 8       1000         1.18         2.14         1.22             1.22         0.04           1.27                  2.14
 16      1000         1.17         2.05         1.22             1.22         0.04           1.27                  2.05
 32      1000         1.18         3.22         1.22             1.22         0.07           1.28                  3.22
 64      1000         1.25         2.43         1.28             1.29         0.04           1.38                  2.43
 128     1000         1.26         2.25         1.30             1.31         0.04           1.37                  2.25
 256     1000         1.63         2.91         1.67             1.68         0.06           1.83                  2.91
 512     1000         1.70         3.00         1.75             1.76         0.06           1.95                  3.00
 1024    1000         1.82         3.19         1.88             1.91         0.08           2.08                  3.19
 2048    1000         2.05         2.38         2.11             2.12         0.04           2.30                  2.38
 4096    1000         2.53         3.21         2.58             2.60         0.06           2.74                  3.21
 8192    1000         2.89         4.05         2.95             2.97         0.07           3.17                  4.05
 16384   1000         3.56         5.01         3.65             3.72         0.13           4.15                  5.01
 32768   1000         4.91         6.11         5.02             5.09         0.16           5.59                  6.11
 65536   1000         8.01         9.37         8.26             8.28         0.13           8.65                  9.37
 131072  1000         17.81        19.05        18.17            18.19        0.13           18.51                 19.05
 262144  1000         28.55        29.96        29.18            29.20        0.32           29.85                 29.96
 524288  1000         50.20        55.86        52.25            52.61        1.03           55.51                 55.86
 1048576 1000         92.93        97.86        94.78            94.62        0.79           96.69                 97.86
 2097152 1000         178.64       184.00       180.88           181.31       1.38           183.81                184.00
 4194304 1000         349.77       356.88       350.70           351.28       1.41           356.23                356.88   <<<============
 8388608 1000         692.43       699.28       694.66           694.76       1.25           698.49                699.28   <<<============
---------------------------------------------------------------------------------------

The "ib_send_lat" perftest output above shows that performance is now on par with the RHEL8.3 perftest results.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
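As a rough cross-check of the large-message rows above, the time needed just to serialize the message onto the wire gives a lower bound on the one-way latency. A sketch, assuming a 100 Gb/s link rate (the link speed is an assumption for illustration; it is not stated in this report):

```shell
# Lower bound on one-way latency: serialization time of a 4 MiB message.
# The 100 Gb/s link rate is an assumed figure, not taken from the bug report.
awk 'BEGIN {
  bytes = 4194304                      # 4 MiB row from the latency table
  gbps  = 100                          # assumed link rate
  usec  = bytes * 8 / (gbps * 1000)    # bits / (bits per microsecond)
  printf "%.1f usec minimum\n", usec
}'
```

Under that assumption the bound comes out near 335 usec, so the measured t_min of 349.77 usec for 4 MiB sits plausibly just above it.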
For information on the advisory (RDMA stack bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1594

Question: So, if I understand this correctly, --disable_pcie_relaxed is an option that (manually) works around the issue in perftest(!) on affected platforms. However, what about other InfiniBand-using software with similar traffic patterns? Is every other program supposed to introduce such an option, too? I would appreciate it if someone could explain the situation.
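For context on the question above: relaxed ordering is not only a perftest knob. The PCIe Device Control register has an Enable Relaxed Ordering bit (bit 4), which can be cleared per device with setpci and then applies to everything that device transmits, not just one application. A hedged sketch of the bit arithmetic, operating on a sample DevCtl value instead of touching real hardware (the BDF and sample value are illustrative only):

```shell
# Clear bit 4 (Enable Relaxed Ordering) of a PCIe Device Control value.
# On real hardware the value would be read and written with, e.g.:
#   setpci -s 04:00.0 CAP_EXP+8.w          # read DevCtl
#   setpci -s 04:00.0 CAP_EXP+8.w=$new     # write it back
# A sample value stands in for the register read here.
devctl=2910                                    # sample DevCtl value (hex)
new=$(printf '%04x' $(( 0x$devctl & ~0x0010 )))
echo "DevCtl: $devctl -> $new"
```

Whether that is an appropriate workaround for a given workload is a separate question; it disables relaxed ordering for the whole function rather than for one tool.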