Description of problem:

All of the perftest test cases failed with a return code of 1 and the following error:

Unexpected CM event bl blka 7
 Unable to perform rdma_client function
 Unable to init the socket connection

perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7883970:
4.18.0-492.el8.x86_64, rdma-core-44.0-2.el8.1, i40e, roce.45, E810-XXV & irdma0
    Result | Status | Test
  ---------+--------+------------------------------------
      FAIL |      1 | ib_read_bw RC
      FAIL |      1 | ib_read_lat RC
      FAIL |      1 | ib_send_bw RC
      FAIL |      1 | ib_send_lat RC
      FAIL |      1 | ib_write_bw RC
      FAIL |      1 | ib_write_lat RC

This is a regression from RHEL-8.8.0-20230228.22.

Version-Release number of selected component (if applicable):

Clients: rdma-qe-39
Servers: rdma-qe-38

DISTRO=RHEL-8.9.0-20230521.41

+ [23-05-24 12:44:25] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.9 Beta (Ootpa)

+ [23-05-24 12:44:25] uname -a
Linux rdma-qe-39.rdma.lab.eng.rdu2.redhat.com 4.18.0-492.el8.x86_64 #1 SMP Tue May 9 14:50:21 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

+ [23-05-24 12:44:25] cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-492.el8.x86_64 root=UUID=94414d8d-4218-4f56-85b5-9b558923a596 ro crashkernel=auto resume=UUID=bdafcf70-7355-4bad-b6ae-07711eee4ce1 console=ttyS0,115200n81

+ [23-05-24 12:44:25] rpm -q rdma-core linux-firmware
rdma-core-44.0-2.el8.1.x86_64
linux-firmware-20230515-115.gitd1962891.el8.noarch

+ [23-05-24 12:44:25] tail /sys/class/infiniband/irdma0/fw_ver /sys/class/infiniband/irdma1/fw_ver
==> /sys/class/infiniband/irdma0/fw_ver <==
1.57

==> /sys/class/infiniband/irdma1/fw_ver <==
1.57

+ [23-05-24 12:44:25] lspci
+ [23-05-24 12:44:25] grep -i -e ethernet -e infiniband -e omni -e ConnectX
41:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
41:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
c1:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
c1:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe

+ [23-05-24 12:44:25] rpm -q perftest
perftest-4.5.0.20-4.el8.x86_64

How reproducible:

100%

Steps to Reproduce:

1. On the server host, issue the following perftest commands:

timeout 3m ib_read_bw -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_read_lat -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_send_bw -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_send_lat -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_write_bw -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_write_lat -a -c RC -d irdma0 -i 1 -F -R

2. On the client host, issue the following perftest commands:

timeout 3m ib_read_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_read_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_send_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_send_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_write_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_write_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39

3.

Actual results:

All of the above perftest commands resulted in the same errors:

+ [23-05-24 12:44:29] timeout 3m ib_read_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
Unexpected CM event bl blka 7 <<<==================
 Unable to perform rdma_client function <<<==================
 Unable to init the socket connection <<<==================
+ [23-05-24 12:44:29] RQA_check_result -r 1 -t 'ib_read_bw RC'

+ [23-05-24 12:48:01] timeout 3m ib_read_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
Unexpected CM event bl blka 7 <<<==================
 Unable to perform rdma_client function <<<==================
 Unable to init the socket connection <<<==================
+ [23-05-24 12:48:01] RQA_check_result -r 1 -t 'ib_read_lat RC'

The same errors repeat down to the last command in the test sequence:

+ [23-05-24 13:00:01] timeout 3m ib_write_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
Unexpected CM event bl blka 7 <<<==================
 Unable to perform rdma_client function <<<==================
 Unable to init the socket connection <<<==================
+ [23-05-24 13:00:01] RQA_check_result -r 1 -t 'ib_write_lat RC'

Expected results:

With the RHEL-8.8.0-20230531.2 build, the following results are expected.

perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7927291:
4.18.0-477.10.1.el8_8.x86_64, rdma-core-44.0-2.el8.1, i40e, roce.45, E810-XXV & irdma0
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | ib_read_bw RC
      PASS |      0 | ib_read_lat RC
      PASS |      0 | ib_send_bw RC
      FAIL |    135 | ib_send_lat RC
      PASS |      0 | ib_write_bw RC
      FAIL |    135 | ib_write_lat RC

Refer to the following Beaker test job ID:
https://beaker.engineering.redhat.com/jobs/7927291

for perftest:
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/06/79272/7927291/14023642/161221433/754135404/resultoutputfile.log

Additional info:
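The trailing number in "Unexpected CM event bl blka 7" looks like a raw rdma_cm event code. As a rough triage aid, the sketch below maps such a code to its name, assuming perftest prints the numeric value of enum rdma_cm_event_type and that the enum order matches rdma-core's rdma/rdma_cma.h. The function name cm_event_name is hypothetical and not part of perftest or rdma-core.

```shell
# Hypothetical helper: translate a numeric rdma_cm event code to its enum name.
# Assumption: the order follows enum rdma_cm_event_type in <rdma/rdma_cma.h>.
cm_event_name() {
    case "$1" in
        0)  echo RDMA_CM_EVENT_ADDR_RESOLVED ;;
        1)  echo RDMA_CM_EVENT_ADDR_ERROR ;;
        2)  echo RDMA_CM_EVENT_ROUTE_RESOLVED ;;
        3)  echo RDMA_CM_EVENT_ROUTE_ERROR ;;
        4)  echo RDMA_CM_EVENT_CONNECT_REQUEST ;;
        5)  echo RDMA_CM_EVENT_CONNECT_RESPONSE ;;
        6)  echo RDMA_CM_EVENT_CONNECT_ERROR ;;
        7)  echo RDMA_CM_EVENT_UNREACHABLE ;;
        8)  echo RDMA_CM_EVENT_REJECTED ;;
        9)  echo RDMA_CM_EVENT_ESTABLISHED ;;
        10) echo RDMA_CM_EVENT_DISCONNECTED ;;
        11) echo RDMA_CM_EVENT_DEVICE_REMOVAL ;;
        12) echo RDMA_CM_EVENT_MULTICAST_JOIN ;;
        13) echo RDMA_CM_EVENT_MULTICAST_ERROR ;;
        14) echo RDMA_CM_EVENT_ADDR_CHANGE ;;
        15) echo RDMA_CM_EVENT_TIMEWAIT_EXIT ;;
        *)  echo "UNKNOWN($1)"; return 1 ;;   # not a code this sketch knows
    esac
}
```

Under those assumptions, "cm_event_name 7" prints RDMA_CM_EVENT_UNREACHABLE, which would mean the client reported the server as unreachable during connection setup.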
With the inline data size limited to 96 (-I 96), the tests pass consistently with perftest-23.04.0.0.23-1.el8.

Some examples of test commands on the client side are as follows:

$ timeout 3m ib_send_bw -a -c RC -d irdma1 -i 1 -F -R -I 96 172.31.50.38
$ timeout 3m ib_write_lat -a -c RC -d irdma1 -i 1 -F -R -I 96 172.31.50.38
$ timeout 3m ib_send_bw -a -c RC -d irdma0 -i 1 -F -R -I 96 172.31.45.38
$ timeout 3m ib_send_lat -a -c RC -d irdma0 -i 1 -F -R -I 96 172.31.45.38

perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7983292:
4.18.0-497.el8.x86_64, rdma-core-46.0-1.el8.1, i40e, iw, E810-XXV & irdma1
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | ib_read_bw RC
      PASS |      0 | ib_read_lat RC
      PASS |      0 | ib_send_bw RC
      PASS |      0 | ib_send_lat RC
      PASS |      0 | ib_write_bw RC
      PASS |      0 | ib_write_lat RC

Checking for failures and known issues:
  no test failures

perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7983292:
4.18.0-497.el8.x86_64, rdma-core-46.0-1.el8.1, i40e, roce.45, E810-XXV & irdma0
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | ib_read_bw RC
      PASS |      0 | ib_read_lat RC
      PASS |      0 | ib_send_bw RC
      PASS |      0 | ib_send_lat RC
      PASS |      0 | ib_write_bw RC
      PASS |      0 | ib_write_lat RC

Checking for failures and known issues:
  no test failures
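For anyone repeating the -I 96 runs by hand, the following dry-run sketch prints the client-side command lines instead of executing them. The function name perftest_inline_cmds is hypothetical. It covers only the send and write tests, on the assumption that those are the ones the inline limit applies to, since the examples in this comment show -I only on send and write commands.

```shell
# Hypothetical dry-run helper: prints client-side commands, runs nothing.
# Usage: perftest_inline_cmds DEVICE SERVER_IP [INLINE_SIZE]
perftest_inline_cmds() {
    dev=$1
    server=$2
    inline=${3:-96}   # 96 is the value reported as passing in this comment
    # Assumption: only send/write tests take -I; read tests are left out.
    for t in ib_send_bw ib_send_lat ib_write_bw ib_write_lat; do
        echo "timeout 3m $t -a -c RC -d $dev -i 1 -F -R -I $inline $server"
    done
}
```

For example, "perftest_inline_cmds irdma1 172.31.50.38" prints four command lines, the first matching the ib_send_bw example above. The output can be reviewed and then pasted into the client shell once the matching server-side commands are running.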
Closing this bz as NOTABUG.