Description of problem:

This may be related to Bug #1971174 and Bug #2014054, but in this case the core files from fabtests are observed on all BNXT RoCE devices, including BCM57414.

Version-Release number of selected component (if applicable):

Clients: rdma-qe-25
Servers: rdma-qe-24

DISTRO=RHEL-8.6.0-20220131.1

+ [22-02-01 09:29:32] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.6 Beta (Ootpa)

+ [22-02-01 09:29:32] uname -a
Linux rdma-qe-24.rdma.lab.eng.rdu2.redhat.com 4.18.0-361.el8.x86_64 #1 SMP Mon Jan 24 10:45:51 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

+ [22-02-01 09:29:32] cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-361.el8.x86_64 root=UUID=cb9a3f0d-b595-409b-a8f3-47752a9f1e96 ro crashkernel=auto resume=UUID=f61be3cc-7371-41ce-832f-274b8cbae8b3 console=ttyS0,115200n81

+ [22-02-01 09:29:32] rpm -q rdma-core linux-firmware
rdma-core-37.2-1.el8.x86_64
linux-firmware-20211119-105.gitf5d51956.el8.noarch

+ [22-02-01 09:29:32] tail /sys/class/infiniband/bnxt_re0/fw_ver /sys/class/infiniband/bnxt_re1/fw_ver /sys/class/infiniband/bnxt_re2/fw_ver /sys/class/infiniband/bnxt_re3/fw_ver
==> /sys/class/infiniband/bnxt_re0/fw_ver <==
20.8.30.0

==> /sys/class/infiniband/bnxt_re1/fw_ver <==
20.8.30.0

==> /sys/class/infiniband/bnxt_re2/fw_ver <==
216.0.51.0

==> /sys/class/infiniband/bnxt_re3/fw_ver <==
216.0.51.0

+ [22-02-01 09:29:32] lspci
+ [22-02-01 09:29:32] grep -i -e ethernet -e infiniband -e omni -e ConnectX
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
1a:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
1a:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
5e:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
5e:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)

How reproducible:

100%

Steps to Reproduce:

1. With the above build, run the following fabtests commands.

2. On the server, run fabtests first:

/usr/bin/runfabtests.sh -T 60 -vvv -t quick psm3 172.31.45.125 172.31.45.126 | tee -a fabtests_psm3_quick.log

3. On the client, run fabtests afterwards:

/usr/bin/runfabtests.sh -T 60 -vvv -t quick psm3 172.31.45.125 172.31.45.126 | tee -a fabtests_psm3_quick.log

Actual results:

After "journal -a", the following messages show on both hosts:

[ 1021.052855] qperf[116003]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757f3d18 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1021.065024] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1023.378131] qperf[116011]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757f3d28 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1023.390300] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1025.695144] qperf[116024]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757ffcb8 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1025.707335] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1028.016585] qperf[116033]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757ffd48 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1028.028774] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1030.340159] qperf[116041]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757ffd48 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1030.352355] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1032.662261] qperf[116049]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757f3d28 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1032.674449] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1034.981671] qperf[116057]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757ffcb8 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1034.993840] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00
[ 1037.305999] qperf[116065]: segfault at 0 ip 00007f0926fcc0b4 sp 00007ffe757ffcb8 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]
[ 1037.318172] Code: 01 00 00 85 c0 75 0c 83 e5 01 74 07 41 8b 14 24 89 53 38 5b 5d 41 5c c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 07 48 8b 40 90 ff a0 28 01 00 00 66 66 2e 0f 1f 84 00 00 00

Expected results:

Fabtests run without any such core files.

Additional info:
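
Every segfault line above reports the same fault address (0), the same instruction pointer, and the same libibverbs mapping, so they appear to be one crash site hit repeatedly. As a rough triage aid, the fields of such a line can be pulled apart and the instruction pointer converted to an offset inside the reported mapping. This is only a sketch: the helper names parse_segfault and decode_error are made up for illustration, and the error-code decoding assumes the usual x86 page-fault bit layout (bit 0 protection, bit 1 write, bit 2 user mode, bit 4 instruction fetch).

```python
import re

# Matches the x86 kernel "segfault at ..." message format seen in the log above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error(err):
    """Describe an x86 page-fault error code (assumed bit layout, see lead-in)."""
    cause = "protection violation" if err & 0x1 else "page not present"
    if err & 0x10:
        access = "instruction fetch"
    else:
        access = "write" if err & 0x2 else "read"
    mode = "user" if err & 0x4 else "kernel"
    return f"{mode}-mode {access}, {cause}"


def parse_segfault(line):
    """Return the fields of one segfault log line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "error": int(m["err"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction from the start of the mapping
        # named in brackets (not necessarily the file's load base).
        "offset": ip - base,
        "in_mapping": 0 <= ip - base < int(m["size"], 16),
    }


sample = (
    "[ 1021.052855] qperf[116003]: segfault at 0 ip 00007f0926fcc0b4 "
    "sp 00007ffe757f3d18 error 4 in libibverbs.so.1.14.37.2[7f0926fb4000+1e000]"
)
info = parse_segfault(sample)
print(info["comm"], info["object"], hex(info["offset"]), decode_error(info["error"]))
```

For the first logged line this gives an offset of 0x180b4 into the libibverbs.so.1.14.37.2 mapping and decodes error 4 as a user-mode read of a not-present page, which together with fault address 0 is consistent with a NULL pointer dereference. With matching debuginfo installed, that offset can be a starting point for a symbol lookup, though it may need adjusting because the bracketed address is the start of the mapping containing the instruction pointer rather than the library's load base.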
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.