Bug 2212215
| Summary: | [RHEL8.9] all test cases in perftest failed with "Unexpected CM event bl blka 7" error on iRDMA RoCE devices | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Brian Chae <bchae> |
| Component: | perftest | Assignee: | Kamal Heib <kheib> |
| Status: | CLOSED NOTABUG | QA Contact: | Infiniband QE <infiniband-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 8.9 | CC: | dacampbe, dledford, hwkernel-mgr, rdma-dev-team, tmichael |
| Target Milestone: | rc | Keywords: | Regression |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Last Closed: | 2023-06-19 02:24:31 UTC | Type: | Bug |
With "limit inline data size" set to 96 (-I 96), the tests pass consistently with perftest-23.04.0.0.23-1.el8.
Some examples of client-side test commands:
$ timeout 3m ib_send_bw -a -c RC -d irdma1 -i 1 -F -R -I 96 172.31.50.38
$ timeout 3m ib_write_lat -a -c RC -d irdma1 -i 1 -F -R -I 96 172.31.50.38
$ timeout 3m ib_send_bw -a -c RC -d irdma0 -i 1 -F -R -I 96 172.31.45.38
$ timeout 3m ib_send_lat -a -c RC -d irdma0 -i 1 -F -R -I 96 172.31.45.38
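The client commands above differ only in the tool name, device, and server address, so a small generator can produce the full set for all six tests. This is a minimal sketch: the helper names `build_client_cmd` and `client_commands` are hypothetical and not part of perftest or the test harness, and the flags are copied from the commands in this report rather than taken from any other source.

```python
import shlex

# The six perftest tools exercised in this report.
PERFTEST_TOOLS = [
    "ib_read_bw", "ib_read_lat",
    "ib_send_bw", "ib_send_lat",
    "ib_write_bw", "ib_write_lat",
]


def build_client_cmd(tool, device, server_ip, inline_size=None, limit="3m"):
    """Return the client-side argv for one run (hypothetical helper)."""
    argv = ["timeout", limit, tool, "-a", "-c", "RC",
            "-d", device, "-i", "1", "-F", "-R"]
    if inline_size is not None:
        # -I limits the inline data size; 96 is the value used in this report.
        argv += ["-I", str(inline_size)]
    argv.append(server_ip)
    return argv


def client_commands(device, server_ip, inline_size=96):
    """Return one shell-ready command line per perftest tool."""
    return [
        shlex.join(build_client_cmd(tool, device, server_ip, inline_size))
        for tool in PERFTEST_TOOLS
    ]


if __name__ == "__main__":
    for line in client_commands("irdma1", "172.31.50.38"):
        print(line)
```

Passing `inline_size=None` omits `-I` and reproduces the original failing invocations, which may help when comparing the two behaviours side by side.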
perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7983292:
4.18.0-497.el8.x86_64, rdma-core-46.0-1.el8.1, i40e, iw, E810-XXV & irdma1
Result | Status | Test
---------+--------+------------------------------------
PASS | 0 | ib_read_bw RC
PASS | 0 | ib_read_lat RC
PASS | 0 | ib_send_bw RC
PASS | 0 | ib_send_lat RC
PASS | 0 | ib_write_bw RC
PASS | 0 | ib_write_lat RC
Checking for failures and known issues:
no test failures
perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7983292:
4.18.0-497.el8.x86_64, rdma-core-46.0-1.el8.1, i40e, roce.45, E810-XXV & irdma0
Result | Status | Test
---------+--------+------------------------------------
PASS | 0 | ib_read_bw RC
PASS | 0 | ib_read_lat RC
PASS | 0 | ib_send_bw RC
PASS | 0 | ib_send_lat RC
PASS | 0 | ib_write_bw RC
PASS | 0 | ib_write_lat RC
Checking for failures and known issues:
no test failures
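The result tables in this report share one layout (Result | Status | Test), so the check for failures can be scripted. The following is a rough sketch that assumes the table text is available as a string; `parse_results` and `failures` are illustrative names, not functions of the test harness.

```python
def parse_results(text):
    """Parse 'Result | Status | Test' rows into (result, status, test) tuples."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Skip the header, the '-----+-----' separator, and unrelated lines.
        if len(parts) != 3 or parts[0] not in ("PASS", "FAIL"):
            continue
        rows.append((parts[0], int(parts[1]), parts[2]))
    return rows


def failures(rows):
    """Return the names of tests that did not pass with status 0."""
    return [test for result, status, test in rows
            if result != "PASS" or status != 0]


if __name__ == "__main__":
    sample = """\
Result | Status | Test
---------+--------+------------------------------------
PASS | 0 | ib_read_bw RC
PASS | 0 | ib_send_lat RC
"""
    failed = failures(parse_results(sample))
    print("no test failures" if not failed else "failed: " + ", ".join(failed))
```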
Closing this bz as NOTABUG.
Description of problem:

All of perftest testcases failed with the following error:

Unexpected CM event bl blka 7
Unable to perform rdma_client function
Unable to init the socket connection

and return code of 1.

perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7883970:
4.18.0-492.el8.x86_64, rdma-core-44.0-2.el8.1, i40e, roce.45, E810-XXV & irdma0
Result | Status | Test
---------+--------+------------------------------------
FAIL | 1 | ib_read_bw RC
FAIL | 1 | ib_read_lat RC
FAIL | 1 | ib_send_bw RC
FAIL | 1 | ib_send_lat RC
FAIL | 1 | ib_write_bw RC
FAIL | 1 | ib_write_lat RC

This is a regression from RHEL-8.8.0-20230228.22.

Version-Release number of selected component (if applicable):

Clients: rdma-qe-39
Servers: rdma-qe-38

DISTRO=RHEL-8.9.0-20230521.41

+ [23-05-24 12:44:25] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.9 Beta (Ootpa)
+ [23-05-24 12:44:25] uname -a
Linux rdma-qe-39.rdma.lab.eng.rdu2.redhat.com 4.18.0-492.el8.x86_64 #1 SMP Tue May 9 14:50:21 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
+ [23-05-24 12:44:25] cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-492.el8.x86_64 root=UUID=94414d8d-4218-4f56-85b5-9b558923a596 ro crashkernel=auto resume=UUID=bdafcf70-7355-4bad-b6ae-07711eee4ce1 console=ttyS0,115200n81
+ [23-05-24 12:44:25] rpm -q rdma-core linux-firmware
rdma-core-44.0-2.el8.1.x86_64
linux-firmware-20230515-115.gitd1962891.el8.noarch
+ [23-05-24 12:44:25] tail /sys/class/infiniband/irdma0/fw_ver /sys/class/infiniband/irdma1/fw_ver
==> /sys/class/infiniband/irdma0/fw_ver <==
1.57
==> /sys/class/infiniband/irdma1/fw_ver <==
1.57
+ [23-05-24 12:44:25] lspci
+ [23-05-24 12:44:25] grep -i -e ethernet -e infiniband -e omni -e ConnectX
41:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
41:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
c1:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
c1:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
+ [23-05-24 12:44:25] rpm -q perftest
perftest-4.5.0.20-4.el8.x86_64

How reproducible:

100%

Steps to Reproduce:

1. On the server host, issue the following perftest commands
timeout 3m ib_read_bw -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_read_lat -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_send_bw -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_send_lat -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_write_bw -a -c RC -d irdma0 -i 1 -F -R
timeout 3m ib_write_lat -a -c RC -d irdma0 -i 1 -F -R

2. On the client host, issue the following perftest commands
timeout 3m ib_read_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_read_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_send_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_send_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_write_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
timeout 3m ib_write_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39

Actual results:

All of the above perftest commands resulted in the same errors:

+ [23-05-24 12:44:29] timeout 3m ib_read_bw -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
Unexpected CM event bl blka 7 <<<==================
Unable to perform rdma_client function <<<==================
Unable to init the socket connection <<<==================
+ [23-05-24 12:44:29] RQA_check_result -r 1 -t 'ib_read_bw RC'
+ [23-05-24 12:48:01] timeout 3m ib_read_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
Unexpected CM event bl blka 7 <<<==================
Unable to perform rdma_client function <<<==================
Unable to init the socket connection <<<==================
+ [23-05-24 12:48:01] RQA_check_result -r 1 -t 'ib_read_lat RC'

Down to the last in the test sequence...

+ [23-05-24 13:00:01] timeout 3m ib_write_lat -a -c RC -d irdma0 -i 1 -F -R 172.31.45.39
Unexpected CM event bl blka 7 <<<==================
Unable to perform rdma_client function <<<==================
Unable to init the socket connection <<<==================
+ [23-05-24 13:00:01] RQA_check_result -r 1 -t 'ib_write_lat RC'

Expected results:

With RHEL-8.8.0-20230531.2 build, the following results are expected.

perftest test results on rdma-qe-38/rdma-qe-39 & Beaker job J:7927291:
4.18.0-477.10.1.el8_8.x86_64, rdma-core-44.0-2.el8.1, i40e, roce.45, E810-XXV & irdma0
Result | Status | Test
---------+--------+------------------------------------
PASS | 0 | ib_read_bw RC
PASS | 0 | ib_read_lat RC
PASS | 0 | ib_send_bw RC
FAIL | 135 | ib_send_lat RC
PASS | 0 | ib_write_bw RC
FAIL | 135 | ib_write_lat RC

Refer to the following beaker test job ID:
https://beaker.engineering.redhat.com/jobs/7927291

for perftest :
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/06/79272/7927291/14023642/161221433/754135404/resultoutputfile.log

Additional info:
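The number at the end of "Unexpected CM event bl blka 7" can be mapped to a symbolic name when triaging logs. The sketch below rests on two assumptions that should be checked against the installed sources: that perftest prints the raw `enum rdma_cm_event_type` value, and that the enumerator order listed here matches librdmacm's `rdma/rdma_cma.h`. Under those assumptions, 7 would correspond to `RDMA_CM_EVENT_UNREACHABLE`. The function name `decode_cm_events` is hypothetical.

```python
import re

# Enumerator order of `enum rdma_cm_event_type`, as recalled from
# librdmacm's <rdma/rdma_cma.h>. ASSUMPTION: verify against the header
# shipped with the rdma-core version under test.
RDMA_CM_EVENTS = [
    "RDMA_CM_EVENT_ADDR_RESOLVED",
    "RDMA_CM_EVENT_ADDR_ERROR",
    "RDMA_CM_EVENT_ROUTE_RESOLVED",
    "RDMA_CM_EVENT_ROUTE_ERROR",
    "RDMA_CM_EVENT_CONNECT_REQUEST",
    "RDMA_CM_EVENT_CONNECT_RESPONSE",
    "RDMA_CM_EVENT_CONNECT_ERROR",
    "RDMA_CM_EVENT_UNREACHABLE",
    "RDMA_CM_EVENT_REJECTED",
    "RDMA_CM_EVENT_ESTABLISHED",
    "RDMA_CM_EVENT_DISCONNECTED",
    "RDMA_CM_EVENT_DEVICE_REMOVAL",
    "RDMA_CM_EVENT_MULTICAST_JOIN",
    "RDMA_CM_EVENT_MULTICAST_ERROR",
    "RDMA_CM_EVENT_ADDR_CHANGE",
    "RDMA_CM_EVENT_TIMEWAIT_EXIT",
]

# Matches the error line exactly as it appears in the logs of this report.
EVENT_RE = re.compile(r"Unexpected CM event bl blka (\d+)")


def decode_cm_events(log_text):
    """Return the symbolic name for every CM event number found in the log."""
    names = []
    for match in EVENT_RE.finditer(log_text):
        number = int(match.group(1))
        if number < len(RDMA_CM_EVENTS):
            names.append(RDMA_CM_EVENTS[number])
        else:
            names.append(f"unknown({number})")
    return names


if __name__ == "__main__":
    log = "Unexpected CM event bl blka 7 <<<==================\n"
    print(decode_cm_events(log))
```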