
Bug 2034948

Summary: [RHEL9.0.0] ib_send_lat RC and ib_write_lat RC tests on BXNT ROCE device always ends with error - failed status 2
Product: Red Hat Enterprise Linux 9
Reporter: Brian Chae <bchae>
Component: rdma-core
Assignee: Nobody <nobody>
Status: CLOSED ERRATA
QA Contact: Brian Chae <bchae>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 9.0
CC: anantha.subramanyam, dledford, rdma-dev-team, selvin.xavier, zguo
Target Milestone: rc
Keywords: Regression, Triaged
Target Release: 9.0
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rdma-core-37.2-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2024865
Environment:
Last Closed: 2022-05-17 15:53:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2024865
Bug Blocks: 2026666, 2027125

Description Brian Chae 2021-12-22 14:41:25 UTC
+++ This bug was initially created as a clone of Bug #2024865 +++

Description of problem:

When perftest is run on a BNXT RoCE device, the "ib_send_lat RC" test always ends with the following error on the client side:

+ [21-11-18 05:19:24] timeout 3m ib_send_lat -a -c RC -d bnxt_re0 -i 1 -F -R 172.31.45.125
 Completion with error at client
 Failed status 2: wr_id 0 syndrom 0x0
scnt=1, ccnt=1


On the server side, the "ib_write_lat RC" test always ends with the same "Failed status 2" error:

[root@rdma-dev-25 ~]$ timeout 10m ib_write_lat -a -c RC -d bnxt_re0 -i 1 -F -R

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write Latency Test
 Dual-port       : OFF          Device         : bnxt_re0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: OFF
 ibv_wr* API     : OFF
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 7
 Max inline data : 96[B]
 rdma_cm QPs     : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
 Waiting for client rdma_cm QP to connect
 Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x060a PSN 0x8e58c6
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:43:125
 remote address: LID 0000 QPN 0x0609 PSN 0xd6ed72
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:43:126
---------------------------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]    t_avg[usec]    t_stdev[usec]   99% percentile[usec]   99.9% percentile[usec] 
 Completion with error at client
 Failed status 2: wr_id 0 syndrom 0xc6212cb0
scnt=1, ccnt=0
Test exited with Error
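
For triage, the number in the "Failed status" line can be read as an ibv_wc_status value from libibverbs, assuming perftest prints the work completion status there. In that enum, 2 is IBV_WC_LOC_QP_OP_ERR (local QP operation error). The following is a minimal Python sketch, not part of perftest, that pulls the status, wr_id, syndrome, and scnt/ccnt counters out of log text like the above. The function name parse_failure and the deliberately abbreviated status table are illustrative assumptions.

```python
import re

# First few values of libibverbs' enum ibv_wc_status (abbreviated on purpose).
WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
}

# Patterns match the spelling used in the log ("syndrom", not "syndrome").
FAIL_RE = re.compile(r"Failed status (\d+): wr_id (\d+) syndrom (0x[0-9a-fA-F]+)")
COUNT_RE = re.compile(r"scnt=(\d+), ccnt=(\d+)")


def parse_failure(log_text):
    """Return a dict describing the first failure in log_text, or None."""
    fail = FAIL_RE.search(log_text)
    if fail is None:
        return None
    status = int(fail.group(1))
    result = {
        "status": status,
        "status_name": WC_STATUS.get(status, "unknown"),
        "wr_id": int(fail.group(2)),
        "syndrome": int(fail.group(3), 16),
    }
    counts = COUNT_RE.search(log_text)
    if counts:
        result["scnt"] = int(counts.group(1))
        result["ccnt"] = int(counts.group(2))
    return result


if __name__ == "__main__":
    sample = (
        " Completion with error at client\n"
        " Failed status 2: wr_id 0 syndrom 0xc6212cb0\n"
        "scnt=1, ccnt=0\n"
    )
    print(parse_failure(sample))
```

Run against the two logs in this description, the sketch would report the same status for both failures but different syndrome and ccnt values.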



This is a regression from RHEL 8.5, where this test always passes.



Version-Release number of selected component (if applicable):


Clients: rdma-dev-26
Servers: rdma-dev-25


DISTRO=RHEL-8.6.0-20211112.1

+ [21-11-18 05:18:42] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.6 Beta (Ootpa)


+ [21-11-18 05:18:42] uname -a
Linux rdma-dev-26.rdma.lab.eng.rdu2.redhat.com 4.18.0-348.6.el8.x86_64 #1 SMP Mon Nov 8 09:36:54 EST 2021 x86_64 x86_64 x86_64 GNU/Linux


+ [21-11-18 05:18:42] cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-348.6.el8.x86_64 root=/dev/mapper/rhel_rdma--dev--26-root ro intel_idle.max_cstate=0 intremap=no_x2apic_optout processor.max_cstate=0 console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=/dev/mapper/rhel_rdma--dev--26-swap rd.lvm.lv=rhel_rdma-dev-26/root rd.lvm.lv=rhel_rdma-dev-26/swap console=ttyS1,115200n81


+ [21-11-18 05:18:42] rpm -q rdma-core linux-firmware
rdma-core-37.1-1.el8.x86_64
linux-firmware-20211007-104.git7a300505.el8.noarch


+ [21-11-18 05:18:42] tail /sys/class/infiniband/bnxt_re0/fw_ver /sys/class/infiniband/bnxt_re1/fw_ver
==> /sys/class/infiniband/bnxt_re0/fw_ver <==
218.0.153.0

==> /sys/class/infiniband/bnxt_re1/fw_ver <==
218.0.153.0


+ [21-11-18 05:18:42] lspci
+ [21-11-18 05:18:42] grep -i -e ethernet -e infiniband -e omni -e ConnectX

04:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet (rev 11)
04:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet (rev 11)
+ [21-11-18 05:18:42] lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              16
On-line CPU(s) list: 0-15
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz
Stepping:            2
CPU MHz:             3500.000
CPU max MHz:         3500.0000
CPU min MHz:         1200.0000
BogoMIPS:            5993.33
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            10240K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear flush_l1d


+ [21-11-18 05:18:42] ibstat
CA 'bnxt_re0'
	CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
	Number of ports: 1
	Firmware version: 218.0.153.0
	Hardware version: 0x14e4
	Node GUID: 0xbe97e1fffe703d80
	System image GUID: 0xbe97e1fffe703d80
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 100
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x001d0000
		Port GUID: 0xbe97e1fffe703d80
		Link layer: Ethernet
CA 'bnxt_re1'
	CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
	Number of ports: 1
	Firmware version: 218.0.153.0
	Hardware version: 0x14e4
	Node GUID: 0xbe97e1fffe703d81
	System image GUID: 0xbe97e1fffe703d81
	Port 1:
		State: Down
		Physical state: Disabled
		Rate: 2.5
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x001d0000
		Port GUID: 0xbe97e1fffe703d81
		Link layer: Ethernet


+ [21-11-18 05:18:43] rpm -q perftest
perftest-4.5-12.el8.x86_64




How reproducible:

100%


Server host:

9: bnxt_roce.45@bnxt_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether bc:97:e1:70:2d:20 brd ff:ff:ff:ff:ff:ff
    inet 172.31.45.125/24 brd 172.31.45.255 scope global dynamic noprefixroute bnxt_roce.45
       valid_lft 3367sec preferred_lft 3367sec
    inet6 fe80::be97:e1ff:fe70:2d20/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


Client host:

9: bnxt_roce.45@bnxt_roce: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether bc:97:e1:70:3d:80 brd ff:ff:ff:ff:ff:ff
    inet 172.31.45.126/24 brd 172.31.45.255 scope global dynamic noprefixroute bnxt_roce.45
       valid_lft 3384sec preferred_lft 3384sec
    inet6 fe80::be97:e1ff:fe70:3d80/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


Steps to Reproduce:
1. Bring up a pair of BNXT RoCE devices so that they are ready for perftest on both the RDMA server and client hosts.

2. Issue the following command on the server:

timeout 3m ib_send_lat -a -c RC -d bnxt_re0 -i 1 -F -R

3. Issue the following command on the client:

timeout 3m ib_send_lat -a -c RC -d bnxt_re0 -i 1 -F -R 172.31.45.125
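
The server and client commands differ only in the trailing server address, so a test harness could build both from one template. A minimal Python sketch under that assumption follows. build_cmd is a hypothetical helper, and it uses only the flags already shown in the steps above.

```python
import shlex


def build_cmd(tool, device, server_ip=None, port=1, limit="3m"):
    """Build the perftest argv used in the reproduction steps.

    With server_ip=None the result is the server-side command; otherwise
    it is the client-side command, which appends the server's address.
    """
    argv = ["timeout", limit, tool, "-a", "-c", "RC",
            "-d", device, "-i", str(port), "-F", "-R"]
    if server_ip is not None:
        argv.append(server_ip)
    return argv


if __name__ == "__main__":
    # Print the command pairs for both latency tests named in this report.
    for tool in ("ib_send_lat", "ib_write_lat"):
        print("server:", shlex.join(build_cmd(tool, "bnxt_re0")))
        print("client:", shlex.join(build_cmd(tool, "bnxt_re0", "172.31.45.125")))
```

The argv lists can be passed to subprocess.run on each host. The sketch leaves the tests unlaunched because they need the RoCE hardware described here.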


Actual results:

 Completion with error at client
 Failed status 2: wr_id 0 syndrom 0x0
scnt=1, ccnt=1
---------------------------------------------------------------------------------------
                    Send Latency Test
 Dual-port       : OFF		Device         : bnxt_re0
 Number of qps   : 1		Transport type : IB
 Connection type : RC		Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : OFF
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 7
 Max inline data : 96[B]
 rdma_cm QPs	 : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0609 PSN 0x7c8ce6
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:43:126
 remote address: LID 0000 QPN 0x0609 PSN 0x32afa4
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:43:125
---------------------------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]    t_avg[usec]    t_stdev[usec]   99% percentile[usec]   99.9% percentile[usec] 
+ [21-11-18 05:19:24] RQA_check_result -r 17 -t 'ib_send_lat RC'



Expected results:



The following output is from the same perftest "ib_send_lat" test on RHEL 8.5.


+ [21-11-18 12:06:28] timeout 3m ib_send_lat -a -c RC -d bnxt_re0 -i 1 -F -R 172.31.45.125
---------------------------------------------------------------------------------------
                    Send Latency Test
 Dual-port       : OFF		Device         : bnxt_re0
 Number of qps   : 1		Transport type : IB
 Connection type : RC		Using SRQ      : OFF
 PCIe relax order: Unsupported
 ibv_wr* API     : OFF
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 7
 Max inline data : 96[B]
 rdma_cm QPs	 : ON
 Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0609 PSN 0x4f28fe
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:43:126
 remote address: LID 0000 QPN 0x0609 PSN 0x985a0d
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:43:125
---------------------------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]    t_avg[usec]    t_stdev[usec]   99% percentile[usec]   99.9% percentile[usec] 
 2       1000          3.17           4.79         3.31     	       3.32        	0.11   		3.51    		4.79   
 4       1000          3.19           4.10         3.33     	       3.31        	0.09   		3.50    		4.10   
 8       1000          3.14           4.18         3.33     	       3.32        	0.09   		3.60    		4.18   
 16      1000          3.15           4.79         3.33     	       3.32        	0.11   		3.72    		4.79   
 32      1000          3.19           4.37         3.33     	       3.32        	0.10   		3.73    		4.37   
 64      1000          3.21           4.47         3.33     	       3.33        	0.10   		3.63    		4.47   
 128     1000          3.69           5.29         3.86     	       3.87        	0.09   		4.06    		5.29   
 256     1000          3.75           5.18         3.92     	       3.91        	0.09   		4.31    		5.18   
 512     1000          3.77           4.76         3.92     	       3.96        	0.08   		4.17    		4.76   
 1024    1000          4.01           5.07         4.17     	       4.14        	0.10   		4.43    		5.07   
 2048    1000          4.44           5.88         4.65     	       4.62        	0.09   		4.82    		5.88   
 4096    1000          5.49           6.83         5.68     	       5.68        	0.14   		6.02    		6.83   
 8192    1000          6.53           7.76         6.80     	       6.81        	0.12   		7.10    		7.76   
 16384   1000          8.15           9.39         8.53     	       8.54        	0.15   		8.94    		9.39   
 32768   1000          11.86          13.38        12.40    	       12.40       	0.16   		12.78   		13.38  
 65536   1000          19.18          20.45        19.74    	       19.75       	0.22   		20.29   		20.45  
 131072  1000          33.79          36.26        34.55    	       34.55       	0.26   		35.18   		36.26  
 262144  1000          62.67          65.25        63.59    	       63.60       	0.31   		64.33   		65.25  
 524288  1000          119.92         123.24       121.34   	       121.36      	0.40   		122.35  		123.24 
 1048576 1000          238.79         242.06       240.46   	       240.48      	0.53   		241.72  		242.06 
 2097152 1000          530.41         558.80       556.45   	       556.44      	0.98   		557.82  		558.80 
 4194304 1000          1103.42        1108.17      1105.78  	       1105.79     	0.74   		1107.81 		1108.17
 8388608 1000          2478.89        3053.86      2999.97  	       2997.29     	31.79  		3021.85 		3053.86
---------------------------------------------------------------------------------------
+ [21-11-18 12:06:49] RQA_check_result -r 0 -t 'ib_send_lat RC'


Additional info:

To reproduce the "ib_write_lat RC" failure, substitute "ib_write_lat" for "ib_send_lat" in the commands above, on both the server and client sides.

--- Additional comment from Brian Chae on 2021-12-04 20:57:24 UTC ---

This same bug exists on RHEL9.0.0, as well.

RHEL-9.0.0-20211026.10

      FAIL |     17 | ib_send_lat RC
      FAIL |    124 | ib_write_lat RC

      - rdma-core-37.1-1.el9.x86_64
        perftest-4.5-12.el9.x86_64

However, as of RHEL-9.0.0-20211026.10 

      - perftest-4.5-3.el9.x86_64   
        rdma-core-35.0-3.el9.x86_64

        o all passed

Comment 1 Honggang LI 2021-12-24 11:53:55 UTC
The bad rdma-core commit is 66aba73d4a7a025689154676048d34e8915bd74b.

commit 66aba73d4a7a025689154676048d34e8915bd74b
Author: Selvin Xavier <selvin.xavier>
Date:   Mon Aug 2 10:03:07 2021 -0700

    bnxt_re/lib: Move hardware queue to 16B aligned indices
    
    Move SQ and RQ indices from WQE boundary to
    16B boundary alignment. Changing the SQ-wqe posting
    algorithm accordingly. The new alignment needs to pull
    a 16B slot from the hardware queue and initialize the
    current 16B into the hardware buffer. Depending on the
    max possible wqe size supported by hardware, the number
    of 16B slots are calculated and pulled for initialization.
    Currently 128B wqe is supported and it requires 8 slots.
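
The slot count in the last sentence of the commit message is a ceiling division: a 128B WQE spans 128 / 16 = 8 slots. The Python sketch below only illustrates that arithmetic. It is not the libbnxt_re implementation, and SLOT_SIZE and slots_for_wqe are names made up for the example.

```python
SLOT_SIZE = 16  # bytes per hardware queue slot, per the commit message


def slots_for_wqe(wqe_size):
    """Number of 16B slots needed to hold a WQE of wqe_size bytes."""
    if wqe_size <= 0:
        raise ValueError("wqe_size must be positive")
    return (wqe_size + SLOT_SIZE - 1) // SLOT_SIZE  # ceiling division


if __name__ == "__main__":
    print(slots_for_wqe(128))  # 8
```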

Comment 2 Honggang LI 2021-12-24 12:20:03 UTC
*** Bug 2029137 has been marked as a duplicate of this bug. ***

Comment 3 Honggang LI 2021-12-28 11:33:43 UTC
Hi Selvin,

This is a serious libbnxt_re regression. It impacts perftest, openmpi, libfabric/fabtests, and librdmacm.

Comment 4 selvin.xavier 2022-01-04 10:03:10 UTC
I created an rdma-core pull request for this regression. The patch needs to be picked up once the fix is merged into rdma-core.

https://github.com/linux-rdma/rdma-core/pull/1120

thanks,
Selvin

Comment 5 selvin.xavier 2022-01-05 16:59:54 UTC
Honggang,

The patch is merged to rdma-core. Can you pull this patch to 9.0?

Thanks,
Selvin

Comment 6 Honggang LI 2022-01-06 11:53:16 UTC
> The patch is merged to rdma-core. Can you pull this patch to 9.0?

Yes, thanks.

Comment 9 Brian Chae 2022-01-30 14:56:48 UTC
Verification was conducted as follows:

1. Build and packages

DISTRO=RHEL-9.0.0-20220128.1

+ [22-01-30 08:40:07] cat /etc/redhat-release
Red Hat Enterprise Linux release 9.0 Beta (Plow)

+ [22-01-30 08:40:07] uname -a
Linux rdma-qe-25.rdma.lab.eng.rdu2.redhat.com 5.14.0-48.el9.x86_64 #1 SMP PREEMPT Mon Jan 24 22:40:42 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

+ [22-01-30 08:40:07] cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-48.el9.x86_64 root=/dev/mapper/rhel_rdma--qe--25-root ro crashkernel=1G-4G:192M,4G-64G:256M,64G-102400T:512M resume=/dev/mapper/rhel_rdma--qe--25-swap rd.lvm.lv=rhel_rdma-qe-25/root rd.lvm.lv=rhel_rdma-qe-25/swap console=ttyS0,115200n81

+ [22-01-30 08:40:07] rpm -q rdma-core linux-firmware
rdma-core-37.2-1.el9.x86_64
linux-firmware-20211027-123.el9.noarch

+ [22-01-30 08:40:07] tail /sys/class/infiniband/bnxt_re0/fw_ver /sys/class/infiniband/bnxt_re1/fw_ver /sys/class/infiniband/bnxt_re2/fw_ver /sys/class/infiniband/bnxt_re3/fw_ver
==> /sys/class/infiniband/bnxt_re0/fw_ver <==
20.8.30.0

==> /sys/class/infiniband/bnxt_re1/fw_ver <==
20.8.30.0

==> /sys/class/infiniband/bnxt_re2/fw_ver <==
216.0.51.0

==> /sys/class/infiniband/bnxt_re3/fw_ver <==
216.0.51.0
+ [22-01-30 08:40:07] lspci
+ [22-01-30 08:40:07] grep -i -e ethernet -e infiniband -e omni -e ConnectX
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
1a:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
1a:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
5e:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
5e:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)

+ [22-01-30 08:40:09] rpm -q perftest
perftest-4.5-12.el9.x86_64


2. Result


Test results for perftest on rdma-qe-25:
5.14.0-48.el9.x86_64, rdma-core-37.2-1.el9, bnxt_en, roce.45, & bnxt_re3
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | ib_read_bw RC
      PASS |      0 | ib_read_lat RC
      PASS |      0 | ib_send_bw RC
      PASS |      0 | ib_send_lat RC
      PASS |      0 | ib_write_bw RC
      PASS |      0 | ib_write_lat RC

  - successful
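
A summary table in this format can be checked mechanically. The Python sketch below uses two hypothetical helpers, parse_results and failures. It splits each row into result, exit status, and test name so that any row other than PASS with status 0 can be flagged. It assumes the three-column, "|"-separated layout shown above.

```python
def parse_results(table_text):
    """Parse 'Result | Status | Test' rows into (result, status, test) tuples."""
    rows = []
    for line in table_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3 or parts[0] not in ("PASS", "FAIL"):
            continue  # skip the header, the separator, and unrelated lines
        rows.append((parts[0], int(parts[1]), parts[2]))
    return rows


def failures(rows):
    """Return the rows that did not pass with exit status 0."""
    return [r for r in rows if r[0] != "PASS" or r[1] != 0]


if __name__ == "__main__":
    sample = (
        "    Result | Status | Test\n"
        "  ---------+--------+------------------------------------\n"
        "      PASS |      0 | ib_send_lat RC\n"
        "      PASS |      0 | ib_write_lat RC\n"
    )
    print(failures(parse_results(sample)))
```

Applied to the RHEL 9.0.0 rows reported earlier in this bug, the same check would flag status 17 and status 124. GNU timeout returns 124 when its time limit expires, which may be what happened in that ib_write_lat run.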

Comment 12 errata-xmlrpc 2022-05-17 15:53:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: RDMA stack), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:3950