Bug 2050979 - [RHEL8.6] when qperf tests are run on ALL ROCE & QEDE IW devices, segfault core dumps generated in the server host
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qperf
Version: 8.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: beta
Target Release: ---
Assignee: Nobody
QA Contact: Infiniband QE
URL:
Whiteboard:
Depends On: 1950272
Blocks: 1903942 2049647
Reported: 2022-02-05 13:57 UTC by Brian Chae
Modified: 2023-08-05 07:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1950272
Environment:
Last Closed: 2023-08-05 07:28:19 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
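System                | ID              | Private | Priority | Status | Summary | Last Updated
----------------------+-----------------+---------+----------+--------+---------+------------------------
Red Hat Issue Tracker | RHELPLAN-111229 | 0       | None     | None   | None    | 2022-02-05 14:02:33 UTC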

Comment 1 Brian Chae 2022-02-05 14:00:20 UTC
Additional info on the RHEL8.6.0 qperf test results:
all of the tests passed, unlike in RHEL8.4.

Test results for qperf on rdma-virt-01:
4.18.0-363.el8.x86_64, rdma-core-37.2-1.el8, mlx4, roce.45, & mlx4_1
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | ping server
      PASS |      0 | conf
      PASS |      0 | rc_bi_bw
      PASS |      0 | rc_bw
      PASS |      0 | rc_lat
      PASS |      0 | rc_rdma_read_bw
      PASS |      0 | rc_rdma_read_lat
      PASS |      0 | rc_rdma_write_bw
      PASS |      0 | rc_rdma_write_lat
      PASS |      0 | rc_rdma_write_poll_lat
      PASS |      0 | rc_compare_swap_mr
      PASS |      0 | rc_fetch_add_mr
      PASS |      0 | ver_rc_compare_swap
      PASS |      0 | ver_rc_fetch_add
      PASS |      0 | quit

Checking for failures and known issues:
  no test failures
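
For anyone post-processing summaries like the one above, here is a minimal Python sketch that extracts the rows of a results table in this format and lists any entry that is not a clean PASS. It is not part of the test harness; the function names `parse_results` and `failures` are hypothetical, and the column layout is assumed to match the table shown.

```python
def parse_results(text):
    """Return a list of (result, status, test) tuples from a results table."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Data rows have exactly three fields and a numeric status code;
        # the header and the separator line do not match and are skipped.
        if len(parts) == 3 and parts[1].lstrip("-").isdigit():
            rows.append((parts[0], int(parts[1]), parts[2]))
    return rows


def failures(rows):
    """Rows whose result is not PASS or whose status code is non-zero."""
    return [r for r in rows if r[0] != "PASS" or r[1] != 0]


# Sample input copied from a few rows of the table above.
sample = """\
    Result | Status | Test
  ---------+--------+------------------------------------
      PASS |      0 | ping server
      PASS |      0 | rc_bw
      PASS |      0 | quit
"""
rows = parse_results(sample)
print(len(rows), "rows,", len(failures(rows)), "failures")
```

Note that a clean result here says nothing about the server side: the tests can all pass while qperf still segfaults on the server host.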


Yet the same segfaults still occur on the server host, as stated in the problem description.

Comment 4 Brian Chae 2023-06-05 10:58:57 UTC
As of RHEL-8.9, qperf core dumps are also reported when the qperf test is run on an iRDMA iWARP device.


TIME                            PID   UID   GID SIG COREFILE  EXE
Sun 2023-06-04 13:57:53 EDT  104747     0     0  11 none      /usr/bin/qperf
Sun 2023-06-04 13:57:56 EDT  104750     0     0  11 none      /usr/bin/qperf
Sun 2023-06-04 13:57:58 EDT  104762     0     0  11 none      /usr/bin/qperf
Sun 2023-06-04 13:58:02 EDT  104778     0     0  11 none      /usr/bin/qperf
Sun 2023-06-04 13:58:04 EDT  104781     0     0  11 none      /usr/bin/qperf
total 0
Red Hat Enterprise Linux release 8.9 Beta (Ootpa)
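
The crash listing above can be tallied mechanically. This is a minimal Python sketch, assuming the column layout shown (TIME spanning four whitespace-separated fields, followed by PID, UID, GID, SIG, COREFILE and EXE). The helper name `count_by_signal` is hypothetical. Signal 11 is SIGSEGV on Linux.

```python
from collections import Counter


def count_by_signal(listing):
    """Count crash entries per (executable, signal number)."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Expected: weekday, date, time, tz, PID, UID, GID, SIG, COREFILE, EXE
        if len(fields) != 10 or not fields[7].isdigit():
            continue  # skip the header and any unrelated lines
        counts[(fields[9], int(fields[7]))] += 1
    return counts


# Sample input copied from two rows of the listing above.
sample = """\
TIME                            PID   UID   GID SIG COREFILE  EXE
Sun 2023-06-04 13:57:53 EDT  104747     0     0  11 none      /usr/bin/qperf
Sun 2023-06-04 13:57:56 EDT  104750     0     0  11 none      /usr/bin/qperf
"""
print(count_by_signal(sample))
```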


Clients: rdma-qe-38
Servers: rdma-qe-39

DISTRO=RHEL-8.9.0-20230521.41

+ [23-06-04 13:56:08] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.9 Beta (Ootpa)

+ [23-06-04 13:56:08] uname -a
Linux rdma-qe-39.rdma.lab.eng.rdu2.redhat.com 4.18.0-492.el8.x86_64 #1 SMP Tue May 9 14:50:21 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

+ [23-06-04 13:56:08] cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-492.el8.x86_64 root=UUID=429d9f1d-500d-4006-a3b5-2455ab53eebb ro crashkernel=auto resume=UUID=f4509d89-d3e0-454b-8fe1-f347d8d3cbf9 console=ttyS0,115200n81

+ [23-06-04 13:56:08] rpm -q rdma-core linux-firmware
rdma-core-44.0-2.el8.1.x86_64
linux-firmware-20230515-115.gitd1962891.el8.noarch

+ [23-06-04 13:56:08] tail /sys/class/infiniband/irdma0/fw_ver /sys/class/infiniband/irdma1/fw_ver
==> /sys/class/infiniband/irdma0/fw_ver <==
1.57

==> /sys/class/infiniband/irdma1/fw_ver <==
1.57

+ [23-06-04 13:56:08] lspci
+ [23-06-04 13:56:08] grep -i -e ethernet -e infiniband -e omni -e ConnectX
41:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
41:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
c1:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
c1:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe

Comment 6 RHEL Program Management 2023-08-05 07:28:19 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

