Bug 1916670 - [RHEL8.4] VMA incurs performance degradation on UDP test cases when run over network team interface
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: libvma
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.4
Assignee: Honggang LI
QA Contact: Brian Chae
URL:
Whiteboard:
Depends On:
Blocks: 1903942
 
Reported: 2021-01-15 11:42 UTC by Brian Chae
Modified: 2021-05-18 14:46 UTC
CC List: 1 user

Fixed In Version: libvma-9.2.2-2.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-18 14:46:02 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Brian Chae 2021-01-15 11:42:52 UTC
Description of problem:

With the new libvma on RHEL 8.4, libvma-9.2.2-1.el8.x86_64, all VMA test cases failed on the MLX5 IB0 and RoCE devices. However, this happens only when rdma-dev-19 and rdma-dev-20 are used as the RDMA client.

This is a REGRESSION ISSUE - with RHEL 8.3.0, all VMA test cases PASSED on the rdma-dev-19 and rdma-dev-20 hosts.

Notably, these two hosts are equipped with bonding/teaming interfaces.
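
For reference, a minimal sketch for confirming whether a host uses a kernel bonding device or a teamd-managed team; the interface name is taken from the logs later in this report, and the exact commands are illustrative:

    # List link devices by kind; bonds and teams are separate link types.
    ip -d link show type bond
    ip -d link show type team

    # For a teamd-managed team, the runner (the team's "mode") is reported by teamdctl
    # (teamd tools must be installed).
    teamdctl mlx5_team_roce state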


Version-Release number of selected component (if applicable):

DISTRO=RHEL-8.4.0-20210108.n.0

+ [21-01-13 11:59:07] cat /etc/redhat-release

Red Hat Enterprise Linux release 8.4 Beta (Ootpa)

+ [21-01-13 11:59:07] uname -a
Linux rdma-dev-20.lab.bos.redhat.com 4.18.0-270.el8.x86_64 #1 SMP Wed Jan 6 07:28:47 EST 2021 x86_64 x86_64 x86_64 GNU/Linux

+ [21-01-13 11:59:07] cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-270.el8.x86_64 root=/dev/mapper/rhel_rdma--dev--20-root ro intel_idle.max_cstate=0 processor.max_cstate=0 intel_iommu=on iommu=on console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=/dev/mapper/rhel_rdma--dev--20-swap rd.lvm.lv=rhel_rdma-dev-20/root rd.lvm.lv=rhel_rdma-dev-20/swap console=ttyS1,115200n81

+ [21-01-13 11:59:07] rpm -q rdma-core linux-firmware
rdma-core-32.0-3.el8.x86_64

linux-firmware-20201022-100.gitdae4b4cd.el8.noarch
+ [21-01-13 11:59:07] tail /sys/class/infiniband/mlx5_2/fw_ver /sys/class/infiniband/mlx5_3/fw_ver /sys/class/infiniband/mlx5_bond_0/fw_ver
==> /sys/class/infiniband/mlx5_2/fw_ver <==
12.23.1020

==> /sys/class/infiniband/mlx5_3/fw_ver <==
12.23.1020

==> /sys/class/infiniband/mlx5_bond_0/fw_ver <==
14.25.1020

04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
04:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
82:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
82:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]


How reproducible:

100%

Steps to Reproduce:

server interface [ rdma-dev-19 ] :

31: mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:2e:ef:fe:80:00:00:00:00:00:00:24:8a:07:03:00:49:d3:38 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.0.119/24 brd 172.31.0.255 scope global dynamic noprefixroute mlx5_ib0
       valid_lft 2638sec preferred_lft 2638sec
    inet6 fe80::268a:703:49:d338/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

client interface [ rdma-dev-20 ] :

31: mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:2e:4e:fe:80:00:00:00:00:00:00:24:8a:07:03:00:49:d4:68 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.0.120/24 brd 172.31.0.255 scope global dynamic noprefixroute mlx5_ib0
       valid_lft 2759sec preferred_lft 2759sec
    inet6 fe80::268a:703:49:d468/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever



1. Set up the hosts with the above build and libvma-9.2.2-1.el8.x86_64.
2. Download and install sockperf.
3. Run the following command on the VMA server host:

timeout --preserve-status --kill-after=5m 3m sockperf server -i 172.31.0.121

4. Run the following command on the VMA client host (see the sketch after these steps for how sockperf runs under VMA):

vma_test sockperf tp -i 172.31.0.119 -t 10 --msg-size=1472
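
For context, a minimal sketch of what the vma_test wrapper appears to do according to the logs below: preload libvma.so and bound each sockperf run with timeout. The addresses, message size, and timeout values come from this report; the direct invocation itself is illustrative, not the wrapper's exact code.

    # Server side (hypothetical direct invocation, without the vma_test wrapper):
    LD_PRELOAD=libvma.so timeout --preserve-status --kill-after=5m 3m \
        sockperf server -i 172.31.0.119

    # Client side, UDP throughput test with 1472-byte messages:
    LD_PRELOAD=libvma.so timeout --preserve-status --kill-after=5m 3m \
        sockperf tp -i 172.31.0.119 -t 10 --msg-size=1472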

Actual results:


+ [21-01-11 16:33:22] vma_test sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
+ [21-01-11 16:33:22] rm -rf /tmp/vma.txt
+ [21-01-11 16:33:22] tee /tmp/vma.txt
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.2.2-1 Release built on Dec 16 2020 18:55:02
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [21-01-11 16:33:22] LD_PRELOAD=libvma.so
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.2.2-1 Release built on Dec 16 2020 18:55:02
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [21-01-11 16:33:22] timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.2.2-1 Release built on Dec 16 2020 18:55:02
 VMA INFO: Cmd Line: timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.2.2-1 Release built on Dec 16 2020 18:55:02
 VMA INFO: Cmd Line: sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/mode (errno 2 No such file or directory)
 VMA WARNING: ******************************************************************************
 VMA WARNING: VMA doesn't support current bonding configuration of mlx5_team_roce.
 VMA WARNING: The only supported bonding mode is "802.3ad 4(#4)" or "active-backup(#1)"
 VMA WARNING: with "fail_over_mac=1" or "fail_over_mac=0".
 VMA WARNING: The effect of working in unsupported bonding mode is undefined.
 VMA WARNING: Read more about Bonding in the VMA's User Manual
 VMA WARNING: ******************************************************************************
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/active_slave (errno 2 No such file or directory)
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/slaves (errno 2 No such file or directory)
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/mode (errno 2 No such file or directory)
 VMA WARNING: ******************************************************************************
 VMA WARNING: VMA doesn't support current bonding configuration of mlx5_team_roce.
 VMA WARNING: The only supported bonding mode is "802.3ad 4(#4)" or "active-backup(#1)"
 VMA WARNING: with "fail_over_mac=1" or "fail_over_mac=0".
 VMA WARNING: The effect of working in unsupported bonding mode is undefined.
 VMA WARNING: Read more about Bonding in the VMA's User Manual
 VMA WARNING: ******************************************************************************
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/active_slave (errno 2 No such file or directory)
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/slaves (errno 2 No such file or directory)
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/mode (errno 2 No such file or directory)
 VMA WARNING: ******************************************************************************
 VMA WARNING: VMA doesn't support current bonding configuration of mlx5_team_roce.
 VMA WARNING: The only supported bonding mode is "802.3ad 4(#4)" or "active-backup(#1)"
 VMA WARNING: with "fail_over_mac=1" or "fail_over_mac=0".
 VMA WARNING: The effect of working in unsupported bonding mode is undefined.
 VMA WARNING: Read more about Bonding in the VMA's User Manual
 VMA WARNING: ******************************************************************************
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/active_slave (errno 2 No such file or directory)
 VMA ERROR: utils:425:priv_read_file() ERROR while opening file /sys/class/net/mlx5_team_roce/bonding/slaves (errno 2 No such file or directory)
sockperf: == version #3.7-1.gitb741ab3c60b1 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)

[ 0] IP = 172.31.0.119    PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=10.000 sec; Warm up time=400 msec; SentMessages=1700071; ReceivedMessages=1700070
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=9.550 sec; SentMessages=1658447; ReceivedMessages=1658447
sockperf: ====> avg-latency=2.867 (std-dev=0.282)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 2.867 usec
sockperf: Total 1658447 observations; each percentile contains 16584.47 observations
sockperf: ---> <MAX> observation =  146.062
sockperf: ---> percentile 99.999 =   40.779
sockperf: ---> percentile 99.990 =    4.418
sockperf: ---> percentile 99.900 =    3.903
sockperf: ---> percentile 99.000 =    3.139
sockperf: ---> percentile 90.000 =    2.955
sockperf: ---> percentile 75.000 =    2.899
sockperf: ---> percentile 50.000 =    2.852
sockperf: ---> percentile 25.000 =    2.811
sockperf: ---> <MIN> observation =    2.670

-----------------------------------------------------------
Test results for vma on rdma-dev-20:
4.18.0-270.el8.x86_64, rdma-core-32.0-3.el8, mlx5, ib0, & mlx5_2
    Result | Status | Test
  ---------+--------+------------------------------------
      FAIL |      1 | sockperf pingpong multicast
      FAIL |      1 | sockperf throughput multicast
      FAIL |      1 | sockperf throughput unicast
      FAIL |      1 | sockperf pingpong unicast
      FAIL |      1 | sockperf (100 sockets) pingpong multicast
      FAIL |      1 | sockperf (100 sockets) pingpong unicast
      FAIL |      1 | sockperf pingpong multicast pkey/vlan
      FAIL |      1 | sockperf pingpong unicast pkey/vlan
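
The VMA ERROR lines above show libvma probing /sys/class/net/mlx5_team_roce/bonding/*, which is provided only by the kernel bonding driver; mlx5_team_roce appears to be a teamd-managed team interface, so those reads fail and VMA warns that the configuration is unsupported. A minimal sketch, assuming the interface name from the logs, for checking both cases on the client host:

    # bonding/* sysfs attributes exist only for kernel bond devices:
    cat /sys/class/net/mlx5_team_roce/bonding/mode 2>/dev/null \
        || echo "not a kernel bond device"

    # For a teamd-managed team, query the runner (the team's "mode") instead:
    teamdctl mlx5_team_roce state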

Expected results:

+ [21-01-14 16:17:57] vma_test sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
+ [21-01-14 16:17:57] rm -rf /tmp/vma.txt
+ [21-01-14 16:17:57] tee /tmp/vma.txt
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.0.2-0 Development Snapshot built on Apr 15 2020 11:33:59
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: Current Time: Thu Jan 14 16:17:57 2021
 VMA INFO: Pid: 54636
 VMA INFO: Architecture: x86_64
 VMA INFO: Node: rdma-dev-20.lab.bos.redhat.com
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [21-01-14 16:17:57] LD_PRELOAD=libvma.so
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.0.2-0 Development Snapshot built on Apr 15 2020 11:33:59
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: Current Time: Thu Jan 14 16:17:57 2021
 VMA INFO: Pid: 54642
 VMA INFO: Architecture: x86_64
 VMA INFO: Node: rdma-dev-20.lab.bos.redhat.com
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [21-01-14 16:17:57] timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.0.2-0 Development Snapshot built on Apr 15 2020 11:33:59
 VMA INFO: Cmd Line: timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
 VMA INFO: Current Time: Thu Jan 14 16:17:58 2021
 VMA INFO: Pid: 54634
 VMA INFO: Architecture: x86_64
 VMA INFO: Node: rdma-dev-20.lab.bos.redhat.com
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.0.2-0 Development Snapshot built on Apr 15 2020 11:33:59
 VMA INFO: Cmd Line: sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
 VMA INFO: Current Time: Thu Jan 14 16:17:58 2021
 VMA INFO: Pid: 54651
 VMA INFO: Architecture: x86_64
 VMA INFO: Node: rdma-dev-20.lab.bos.redhat.com
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.7-3.gita722495b9d92 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)

[ 0] IP = 172.31.0.119    PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=10.000 sec; Warm up time=400 msec; SentMessages=1700465; ReceivedMessages=1700464
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=9.550 sec; SentMessages=1662049; ReceivedMessages=1662049
sockperf: ====> avg-latency=2.859 (std-dev=0.303)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 2.859 usec
sockperf: Total 1662049 observations; each percentile contains 16620.49 observations
sockperf: ---> <MAX> observation =  171.997
sockperf: ---> percentile 99.999 =   38.738
sockperf: ---> percentile 99.990 =    4.969
sockperf: ---> percentile 99.900 =    3.907
sockperf: ---> percentile 99.000 =    3.144
sockperf: ---> percentile 90.000 =    2.980
sockperf: ---> percentile 75.000 =    2.883
sockperf: ---> percentile 50.000 =    2.830
sockperf: ---> percentile 25.000 =    2.800
sockperf: ---> <MIN> observation =    2.687

------------------------------------------------------------

Test results for vma on rdma-dev-20:
4.18.0-240.el8.x86_64, rdma-core-29.0-3.el8, mlx5, ib0, & mlx5_2
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | sockperf pingpong multicast
      PASS |      0 | sockperf throughput multicast
      PASS |      0 | sockperf throughput unicast
      PASS |      0 | sockperf pingpong unicast
      PASS |      0 | sockperf (100 sockets) pingpong multicast
      PASS |      0 | sockperf (100 sockets) pingpong unicast
      PASS |      0 | sockperf pingpong multicast pkey/vlan
      PASS |      0 | sockperf pingpong unicast pkey/vlan

Checking for failures and known issues:
  no test failures



Additional info:

The results shown above under "Expected results" are from a run on the RHEL 8.3.0 build.

Comment 2 Honggang LI 2021-01-18 12:00:28 UTC
The real issue is a UDP performance regression.

https://beaker.engineering.redhat.com/jobs/4986532 (RHEL-8.3 GA distro)
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2021/01/49865/4986532/9397386/120339614/562438453/resultoutputfile.log

https://beaker.engineering.redhat.com/jobs/4997634 (RHEL-8.3 GA distro with update RHEL-8.4 libvma-9.2.2 scratch build)
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2021/01/49976/4997634/9412263/120454213/563145553/resultoutputfile.log

[test 1] vma_test sockperf pp -i 172.31.40.119 -t 10 --msg-size=1472
-sockperf: Summary: Latency is 6.420 usec
+sockperf: Summary: Latency is 163.464 usec

[test 2] vma_test sockperf tp -i 172.31.40.119 -t 10 --msg-size=1472
-sockperf: Summary: Message Rate is 1987027 [msg/sec]
-sockperf: Summary: BandWidth is 2789.406 MBps (22315.245 Mbps)
+sockperf: Summary: Message Rate is 548598 [msg/sec]
+sockperf: Summary: BandWidth is 770.127 MBps (6161.013 Mbps)

[test 3] vma_test sockperf tp -i 172.31.40.119 -t 10 --msg-size=1472 --giga-size
-sockperf: Summary: Message Rate is 1986003 [msg/sec]
-sockperf: Summary: BandWidth is 2.788 GBps (22.304 Gbps)
+sockperf: Summary: Message Rate is 544454 [msg/sec]
+sockperf: Summary: BandWidth is 0.764 GBps (6.114 Gbps)

[test 4] vma_test sockperf pp -i 172.31.40.119 --tcp -t 10 --msg-size=1472
-sockperf: Summary: Latency is 7.058 usec
+sockperf: Summary: Latency is 7.219 usec

[test 5] vma_test sockperf pp -f /tmp/feed.txt -t 10 -F e --msg-size=1472
-sockperf: Summary: Latency is 8.424 usec
+sockperf: Summary: Latency is 12.042 usec

[test 6] vma_test sockperf tp -f /tmp/feed.txt -t 10 -F e --msg-size=1472
-sockperf: Summary: Message Rate is 1933027 [msg/sec]
-sockperf: Summary: BandWidth is 2713.600 MBps (21708.799 Mbps)
+sockperf: Summary: Message Rate is 466121 [msg/sec]
+sockperf: Summary: BandWidth is 654.345 MBps (5234.757 Mbps)

[test 7] vma_test sockperf pp -i 172.31.43.119 -t 10 --msg-size=1472
-sockperf: Summary: Latency is 22.315 usec
+sockperf: Summary: Latency is 176.737 usec

[test 8] vma_test sockperf pp -i 172.31.43.119 --tcp -t 10 --msg-size=1472
-sockperf: Summary: Latency is 10.114 usec
+sockperf: Summary: Latency is 7.764 usec

NOTE: lines starting with '-' are libvma-9.0.2-1.el8.x86_64 output; lines starting with '+' are libvma-9.2.2-1.scratch.el8_3.x86_64 output.
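
For reference, an illustrative helper (not part of the original test run) for collecting the Summary lines from the same set of sockperf cases under each installed libvma build, so the two outputs can be diffed as above. The vma_test wrapper, addresses, and options are taken from this comment; the loop itself is a sketch.

    # Run each sockperf case via the vma_test wrapper and keep only the
    # Summary lines; repeat once per installed libvma build, then diff.
    for args in \
        "pp -i 172.31.40.119 -t 10 --msg-size=1472" \
        "tp -i 172.31.40.119 -t 10 --msg-size=1472" \
        "tp -i 172.31.40.119 -t 10 --msg-size=1472 --giga-size" \
        "pp -i 172.31.40.119 --tcp -t 10 --msg-size=1472"
    do
        echo "== sockperf $args =="
        vma_test sockperf $args 2>&1 | grep '^sockperf: Summary:'
    done > "summary-$(rpm -q libvma).txt"

    # Then, for example: diff summary-libvma-9.0.2-*.txt summary-libvma-9.2.2-*.txt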

Comment 4 Honggang LI 2021-01-25 09:37:41 UTC
https://github.com/Mellanox/libvma/pull/935

Comment 5 Honggang LI 2021-01-27 08:13:37 UTC
(In reply to Honggang LI from comment #4)
> https://github.com/Mellanox/libvma/pull/935

Upstream approved this PR. Setting the devel+ flag.

Comment 11 Brian Chae 2021-02-08 14:27:51 UTC
Tested as follows:

1. versions

DISTRO=RHEL-8.4.0-20210205.n.0
+ [21-02-08 07:54:14] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
+ [21-02-08 07:54:14] uname -a
Linux rdma-dev-20.lab.bos.redhat.com 4.18.0-282.el8.x86_64 #1 SMP Tue Feb 2 14:09:52 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
+ [21-02-08 07:54:14] cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-282.el8.x86_64 root=/dev/mapper/rhel_rdma--dev--20-root ro intel_idle.max_cstate=0 processor.max_cstate=0 intel_iommu=on iommu=on console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=/dev/mapper/rhel_rdma--dev--20-swap rd.lvm.lv=rhel_rdma-dev-20/root rd.lvm.lv=rhel_rdma-dev-20/swap console=ttyS1,115200n81
+ [21-02-08 07:54:14] rpm -q rdma-core linux-firmware
rdma-core-32.0-4.el8.x86_64
linux-firmware-20201218-102.git05789708.el8.noarch
+ [21-02-08 07:54:14] tail /sys/class/infiniband/mlx5_2/fw_ver /sys/class/infiniband/mlx5_3/fw_ver /sys/class/infiniband/mlx5_bond_0/fw_ver
==> /sys/class/infiniband/mlx5_2/fw_ver <==
12.23.1020

==> /sys/class/infiniband/mlx5_3/fw_ver <==
12.23.1020

==> /sys/class/infiniband/mlx5_bond_0/fw_ver <==
14.25.1020
+ [21-02-08 07:54:14] lspci
+ [21-02-08 07:54:14] grep -i -e ethernet -e infiniband -e omni -e ConnectX
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
04:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
82:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
82:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]

libvma-9.2.2-2.el8.x86_64
+ [21-02-08 07:54:14] vma_setup

RDMA hosts:

Clients: rdma-dev-20

31: mlx5_team_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:fe:90:cb:76:2a brd ff:ff:ff:ff:ff:ff
    inet 172.31.40.120/24 brd 172.31.40.255 scope global dynamic noprefixroute mlx5_team_roce
       valid_lft 3391sec preferred_lft 3391sec
    inet6 fe80::7efe:90ff:fecb:762a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
32: mlx5_team_ro.43@mlx5_team_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:fe:90:cb:76:2a brd ff:ff:ff:ff:ff:ff
    inet 172.31.43.120/24 brd 172.31.43.255 scope global dynamic noprefixroute mlx5_team_ro.43
       valid_lft 3455sec preferred_lft 3455sec
    inet6 fe80::7efe:90ff:fecb:762a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
33: mlx5_team_ro.45@mlx5_team_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:fe:90:cb:76:2a brd ff:ff:ff:ff:ff:ff
    inet 172.31.45.120/24 brd 172.31.45.255 scope global dynamic noprefixroute mlx5_team_ro.45
       valid_lft 3459sec preferred_lft 3459sec
    inet6 fe80::7efe:90ff:fecb:762a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

Servers: rdma-dev-19

25: mlx5_bond_roce: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:fe:90:cb:74:3a brd ff:ff:ff:ff:ff:ff
    inet 172.31.40.119/24 brd 172.31.40.255 scope global dynamic noprefixroute mlx5_bond_roce
       valid_lft 3259sec preferred_lft 3259sec
    inet6 fe80::7efe:90ff:fecb:743a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
26: mlx5_bond_ro.45@mlx5_bond_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:fe:90:cb:74:3a brd ff:ff:ff:ff:ff:ff
    inet 172.31.45.119/24 brd 172.31.45.255 scope global dynamic noprefixroute mlx5_bond_ro.45
       valid_lft 3346sec preferred_lft 3346sec
    inet6 fe80::7efe:90ff:fecb:743a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
27: mlx5_bond_ro.43@mlx5_bond_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:fe:90:cb:74:3a brd ff:ff:ff:ff:ff:ff
    inet 172.31.43.119/24 brd 172.31.43.255 scope global dynamic noprefixroute mlx5_bond_ro.43
       valid_lft 3342sec preferred_lft 3342sec
    inet6 fe80::7efe:90ff:fecb:743a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


2. Results:

MLX5 ROCE:

Test results for vma on rdma-dev-20:
4.18.0-282.el8.x86_64, rdma-core-32.0-4.el8, mlx5, roce.45, & mlx5_bond_0
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | sockperf pingpong multicast
      PASS |      0 | sockperf throughput multicast
      PASS |      0 | sockperf throughput unicast
      PASS |      0 | sockperf pingpong unicast
      PASS |      0 | sockperf (100 sockets) pingpong multicast
      PASS |      0 | sockperf (100 sockets) pingpong unicast
      PASS |      0 | sockperf pingpong multicast pkey/vlan
      PASS |      0 | sockperf pingpong unicast pkey/vlan

Checking for failures and known issues:
  no test failures

Performance INFO:

+ [21-02-08 07:59:32] vma_test sockperf pp -i 172.31.40.119 -t 10 --msg-size=1472
sockperf: Summary: Latency is 3.656 usec

+ [21-02-08 07:59:58] vma_test sockperf tp -i 172.31.40.119 -t 10 --msg-size=1472
sockperf: Summary: Message Rate is 1988059 [msg/sec]
sockperf: Summary: BandWidth is 2790.854 MBps (22326.834 Mbps)

+ [21-02-08 08:00:24] vma_test sockperf tp -i 172.31.40.119 -t 10 --msg-size=1472 --giga-size
sockperf: Summary: Message Rate is 1988267 [msg/sec]
sockperf: Summary: BandWidth is 2.791 GBps (22.329 Gbps)


+ [21-02-08 08:00:50] vma_test sockperf pp -i 172.31.40.119 --tcp -t 10 --msg-size=1472
sockperf: Summary: Latency is 3.736 usec

+ [21-02-08 08:01:15] vma_test sockperf pp -f /tmp/feed.txt -t 10 -F e --msg-size=1472
sockperf: Summary: Latency is 4.310 usec


+ [21-02-08 08:01:42] vma_test sockperf tp -f /tmp/feed.txt -t 10 -F e --msg-size=1472
sockperf: Summary: Message Rate is 1935439 [msg/sec]
sockperf: Summary: BandWidth is 2716.986 MBps (21735.887 Mbps)

+ [21-02-08 08:02:08] vma_test sockperf pp -i 172.31.43.119 -t 10 --msg-size=1472
sockperf: Summary: Latency is 22.688 usec

+ [21-02-08 08:02:34] vma_test sockperf pp -i 172.31.43.119 --tcp -t 10 --msg-size=1472
sockperf: Summary: Latency is 7.206 usec


                   --------------------------------------


MLX5 IB0

Test results for vma on rdma-dev-20:
4.18.0-282.el8.x86_64, rdma-core-32.0-4.el8, mlx5, ib0, & mlx5_2
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | sockperf pingpong multicast
      PASS |      0 | sockperf throughput multicast
      PASS |      0 | sockperf throughput unicast
      PASS |      0 | sockperf pingpong unicast
      PASS |      0 | sockperf (100 sockets) pingpong multicast
      PASS |      0 | sockperf (100 sockets) pingpong unicast
      PASS |      0 | sockperf pingpong multicast pkey/vlan
      PASS |      0 | sockperf pingpong unicast pkey/vlan

Checking for failures and known issues:
  no test failures


Performance INFO:
+ [21-02-08 08:34:05] vma_test sockperf pp -i 172.31.0.119 -t 10 --msg-size=1472
sockperf: Summary: Latency is 2.890 usec

+ [21-02-08 08:34:31] vma_test sockperf tp -i 172.31.0.119 -t 10 --msg-size=1472
sockperf: Summary: Message Rate is 3053068 [msg/sec]
sockperf: Summary: BandWidth is 4285.923 MBps (34287.385 Mbps)

+ [21-02-08 08:34:57] vma_test sockperf tp -i 172.31.0.119 -t 10 --msg-size=1472 --giga-size
sockperf: Summary: Message Rate is 3513380 [msg/sec]
sockperf: Summary: BandWidth is 4.932 GBps (39.457 Gbps)

+ [21-02-08 08:35:23] vma_test sockperf pp -i 172.31.0.119 --tcp -t 10 --msg-size=1472
sockperf: Summary: Latency is 2.989 usec

+ [21-02-08 08:35:49] vma_test sockperf pp -f /tmp/feed.txt -t 10 -F e --msg-size=1472
sockperf: Summary: Latency is 3.225 usec

+ [21-02-08 08:36:15] vma_test sockperf tp -f /tmp/feed.txt -t 10 -F e --msg-size=1472
sockperf: Summary: Message Rate is 2422217 [msg/sec]
sockperf: Summary: BandWidth is 3400.329 MBps (27202.632 Mbps)

+ [21-02-08 08:36:42] vma_test sockperf pp -i 172.31.2.119 -t 10 --msg-size=1472
sockperf: Summary: Latency is 2.871 usec

+ [21-02-08 08:37:08] vma_test sockperf pp -i 172.31.2.119 --tcp -t 10 --msg-size=1472
sockperf: Summary: Latency is 2.818 usec


o The performance numbers are on par with those of the RHEL 8.3 VMA test results.

Comment 13 errata-xmlrpc 2021-05-18 14:46:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RDMA stack bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1594

