This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker .
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2212070 - [RHEL8.9] tracebacks observed while testing qperf on QEDR RoCE device
Summary: [RHEL8.9] tracebacks observed while testing qperf on QEDR RoCE device
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qperf
Version: 8.9
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Nobody
QA Contact: Infiniband QE
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-06-03 14:40 UTC by Brian Chae
Modified: 2023-09-21 13:43 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-21 13:43:59 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker   RHEL-6081 0 None Migrated None 2023-09-21 13:44:01 UTC
Red Hat Issue Tracker RHELPLAN-158906 0 None None None 2023-06-03 14:41:26 UTC

Description Brian Chae 2023-06-03 14:40:29 UTC
Description of problem:

During "qperf" tests, a lot of tracebacks were observed on the RDMA client host console. Also, there was no indication of any "qperf" test cases ran with data outputs.

This is a regression from  RHEL-8.8.0-20230228.22 build.


Version-Release number of selected component (if applicable):

Clients: rdma-dev-03
Servers: rdma-dev-02

DISTRO=RHEL-8.9.0-20230531.26

+ [23-05-31 18:23:09] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.9 Beta (Ootpa)

+ [23-05-31 18:23:09] uname -a
Linux rdma-dev-03.rdma.lab.eng.rdu2.redhat.com 4.18.0-494.el8.x86_64 #1 SMP Mon May 22 11:16:32 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

+ [23-05-31 18:23:09] cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-494.el8.x86_64 root=UUID=f2550ab5-318d-4fc6-9720-3d753ef5b845 ro console=tty0 rd_NO_PLYMOUTH intel_iommu=on iommu=on crashkernel=auto resume=UUID=b36d9352-dc3a-4059-b5aa-6cb606ad4bce console=ttyS1,115200

+ [23-05-31 18:23:09] rpm -q rdma-core linux-firmware
rdma-core-44.0-2.el8.1.x86_64
linux-firmware-20230515-115.gitd1962891.el8.noarch

+ [23-05-31 18:23:09] tail /sys/class/infiniband/qedr0/fw_ver /sys/class/infiniband/qedr1/fw_ver
==> /sys/class/infiniband/qedr0/fw_ver <==
8. 59. 1. 0

==> /sys/class/infiniband/qedr1/fw_ver <==
8. 59. 1. 0

+ [23-05-31 18:23:09] lspci
+ [23-05-31 18:23:09] grep -i -e ethernet -e infiniband -e omni -e ConnectX
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
08:00.0 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)
08:00.1 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)


How reproducible:
100%

Steps to Reproduce:
1. On the RDMA server host, issue
   qperf
2. On the RDMA client host, issue

qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_bi_bw
qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_bw
qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_lat
qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_rdma_read_bw
qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_compare_swap_mr
qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_fetch_add_mr
qperf -v -i qedr1:1 -cm 1 172.31.50.102 ver_rc_compare_swap
qperf -v -i qedr1:1 -cm 1 172.31.50.102 ver_rc_fetch_add



Actual results:

The actual tracebacks are at with "======= Start of the tracebacks ==============" below.


[ 3398.097186] restraintd[1961]: ** Completed Task : 161077355 
[ 3398.520118] restraintd[1961]: ** Fetching task: 161077356 [/mnt/tests/gerrit-git.engineering.redhat.com/kernel-tests/infiniband/qperf] 
[ 3398.893610] restraintd[1961]: ** Preparing metadata 
[ 3399.143978] restraintd[1961]: ** Refreshing peer role hostnames: Retries 0 
[ 3399.374951] restraintd[1961]: ** Updating env vars 
[ 3399.375063] restraintd[1961]: ** Updating external watchdog: 5400 seconds 
[ 3399.721661] restraintd[1961]: ** Installing dependencies 
[ 3401.751084] restraintd[1961]: ** Running task: 161077356 [/kernel/infiniband/qperf] 
[ 3413.601870] [qed_pglueb_rbc_attn_handler:319(qede_iw-0)]ZLR error - 031ff806 [Address 00000002:7d037000] 
[ 3413.601873] [qed_pglueb_rbc_attn_handler:319(qede_roce-0)]ZLR error - 031ff806 [Address 00000002:7d037000] 
[ 3413.612490] [qed_pglueb_rbc_attn_handler:335(qede_iw-0)]ILT error - Details ff001026 Details2 00e00020 [Address 00000000:00010000] 
[ 3413.623290] [qed_pglueb_rbc_attn_handler:335(qede_roce-0)]ILT error - Details ff001026 Details2 00e00020 [Address 00000000:00010000] 
[ 3413.636424] [qed_dbg_parse_attn:7812(qede_iw-0)]pglue_b (Interrupt) : master_zlr_attn [address 0x00aa0630, bit 10] 
[ 3413.649746] [qed_dbg_parse_attn:7812(qede_roce-0)]pglue_b (Interrupt) : pgl_pcie_attn [address 0x00aa0630, bit 19] [masked] 
[ 3413.661331] [qed_dbg_parse_attn:7812(qede_iw-0)]pglue_b (Interrupt) : pgl_pcie_attn [address 0x00aa0630, bit 19] [masked] 
[ 3413.686063] [qed_dbg_parse_attn:7812(qede_iw-0)]pglue_b (Interrupt) : vf_ilt_err [address 0x00aa0630, bit 22] [masked] 
[ 3430.649853] [qed_hw_err_notify:830(qede_roce-0)]Ramrod is stuck [CID ff000007 ROCE_RAMROD_MODIFY_QP:0e PROTOCOLID_ROCE:02 echo 04fa] 
[ 3430.663190] [qed_hw_error_occurred:2775(qede_roce-0)]HW error occurred [Ramrod Failure] 
[ 3430.672148] [qede_atomic_hw_err_handler:2691(qede_roce)]Generic non-sleepable HW error handling started - err_flags 0x80000007 
[ 3430.684895] [qede_atomic_hw_err_handler:2700(qede_roce)]Generic non-sleepable HW error handling is done 
[ 3430.698248] [qede_generic_hw_err_handler:2707(qede_roce)]Generic sleepable HW error handling started - err_flags 0x80000007 
[ 3430.710701] [qede_generic_hw_err_handler:2710(qede_roce)]Reporting fatal error to devlink 
[ 3430.720730] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["idle_chk"] 
[ 3430.736248]  
[ 3430.736248] fw-version: 8_59_1_1 
[ 3430.736248] fw-image: main 
[ 3430.736248] mfw-version: 8_34_8_0 
[ 3430.736248] chip-revision: b0 
[ 3430.736248] tools-version: 11 
[ 3430.736248] chip: bb 
[ 3430.736248] platform: asic 
[ 3430.736248] pci-func: 0 
[ 3430.736248] epoch: 1685571838 
[ 3430.736248] path: 0 
[ 3430.736248] dump-type: idle-chk 
[ 3430.736248]  
[ 3430.736248] FW_IDLE_CHECK: 
[ 3430.736248] Error if no traffic: PSWRQ2: sr_cnt is not equal to sr_num_cfg. Registers: PSWRQ2_REG_SR_CNT=0x7e PSWRQ2_REG_SR_NUM_CFG=0x7f 
[ 3430.736248] Error if no traffic: PSWRQ2: SR_CNT_per_vq6 is not equal to MAX_SRS_vq. Registers: PSWRQ2_REG_SR_CNT_PER_VQ_6=0x7e PSWRQ2_REG_MAX_SRS_VQ6=0x7f 
[ 3430.736248] Error if no traffic: PSWRQ2: blk_cnt is not equal to blk_num_cfg. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3430.736248] Error if no traffic: PSWRQ2: There are more than eight used blocks. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3430.736257] ing: PBF: Packet/s were sent to BTB with 'drop' indication. Registers: PBF_REG_NUM_PKTS_SENT_WITH_DROP_TO_BTB=0xc2 
[ 3430.736257] Error if no traffic: XSDM: There are outstanding PXP requests. Registers: XSDM_REG_DST_PXP_SRC_PEND_EMPTY=0x0 
[ 3430.736257] Error if no traffic: XSDM: The PXP requests linked-list is not empty. Registers: XSDM_REG_DST_PXP_LINK_EMPTY=0x0 
[ 3430.736257] Error if no traffic: XCM: There are messages in the XX RAM. Registers: XCM_REG_XX_FREE_CNT=0x3f 
[ 3430.736257] Error if no traffic: XCM: XX CAM has active lcids. Registers: XCM_REG_XX_LCID_CAM_FILL_LVL=0x1 
[ 3430.736257] Error if no traffic: XCM: has locked lcids in xxLock. Registers: XCM_REG_XX_LOCK_CNT=0x1 
[ 3430.736257] Error if no traffic: XSEM: Thread not freed. Registers: XSEM_REG_NUM_OF_THREADS=0x17 
[ 3430.736257] Error if no traffic: QM: VOQ LINE credit on LB queues is not equal to init credit. Registers: QM_REG_VOQCRDLINE[16]=0x80d0 QM_REG_VOQINITCRDLINE[16]=0x8124 MISC_REG_PORT_MODE=0x0 
[ 3430.835691] dit. Registers: QM_REG_VOQCRDLINE[0]=0x83aa QM_REG_VOQINITCRDLINE[0]=0x841a MISC_REG_PORT_MODE=0x0 
[ 3430.835691] Error if no traffic: QM: VOQ BYTE credit on LB queues is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[16]=0x815a1c QM_REG_VOQINITCRDBYTE[16]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3430.835691] Error if no traffic: QM: VOQ BYTE credit on NW queues is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[0]=0x815a1c QM_REG_VOQINITCRDBYTE[0]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3430.835691] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[4]=0x2 
[ 3430.835691] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[9]=0x1 
[ 3430.835691]  
[ 3430.835691] LSI_IDLE_CHECK: 
[ 3430.835691] Error if no traffic: PSWRQ2: sr_cnt is not equal to sr_num_cfg. Registers: PSWRQ2_REG_SR_CNT=0x7e PSWRQ2_REG_SR_NUM_CFG=0x7f 
[ 3430.936354] traffic: PSWRQ2: blk_cnt is not equal to blk_num_cfg. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3430.936354] Error if no traffic: PSWRQ2: There are more than eight used blocks. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3430.936354] Error if no traffic: PSWRQ2: blk_cnt_per_vq6 is not equal to max_blks_vq. Registers: PSWRQ2_REG_BLK_CNT_PER_VQ_6=0x120 PSWRQ2_REG_MAX_BLKS_VQ6=0x140 
[ 3430.936354] Error: PGLUE: PCIE error status is set. Registers: PGLUE_B_REG_PCIE_ERR_STATUS=0x4 
[ 3430.936354] Warning: PBF: Packet/s were sent to BTB with 'drop' indication. Registers: PBF_REG_NUM_PKTS_SENT_WITH_DROP_TO_BTB=0xc2 
[ 3430.936354] Error if no traffic: XSDM: XSDM_REG_DST_PXP_SRC_PEND_EMPTY = 0. Registers: XSDM_REG_DST_PXP_SRC_PEND_EMPTY=0x0 
[ 3430.936354] Error if no traffic: XSDM: XSDM_REG_DST_PXP_LINK_EMPTY = 0. Registers: XSDM_REG_DST_PXP_LINK_EMPTY=0x0 
[ 3431.029799] n CAM is not empty. Registers: XCM_REG_XX_LCID_CAM_FILL_LVL=0x1 
[ 3431.029799] Error if no traffic: XCM: there is locked LCID in XX. Registers: XCM_REG_XX_LOCK_CNT=0x1 
[ 3431.029799] Error if no traffic: XSEM: Thread not freed. Registers: XSEM_REG_NUM_OF_THREADS=0x17 
[ 3431.029799] Error if no traffic: QM: VOQ LINE credit is not equal to init credit. Registers: QM_REG_VOQCRDLINE[16]=0x80d0 QM_REG_VOQINITCRDLINE[16]=0x8124 MISC_REG_PORT_MODE=0x0 
[ 3431.029799] Error if no traffic: QM: VOQ LINE credit is not equal to init credit. Registers: QM_REG_VOQCRDLINE[0]=0x83aa QM_REG_VOQINITCRDLINE[0]=0x841a MISC_REG_PORT_MODE=0x0 
[ 3431.029799] Error if no traffic: QM: VOQ BYTE credit is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[16]=0x815a1c QM_REG_VOQINITCRDBYTE[16]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3431.029799] Error if no traffic: QM: VOQ BYTE credit is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[0]=0x815a1c QM_REG_VOQINITCRDBYTE[0]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3431.123025] ot 0. Registers: XCM_REG_QM_ACT_ST_CNT[4]=0x2 
[ 3431.123025] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[9]=0x1 
[ 3431.123025]  
[ 3431.123025] Idle Check failed!!! (with 18 errors and 1 warnings) 
[ 3431.123025]  
[ 3431.123025]  
[ 3431.123025]  
[ 3431.123025]  
[ 3431.260135]  
[ 3431.262119] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["idle_chk"] 
[ 3431.277930]  
[ 3431.277930] fw-version: 8_59_1_1 
[ 3431.277930] fw-image: main 
[ 3431.277930] mfw-version: 8_34_8_0 
[ 3431.277930] chip-revision: b0 
[ 3431.277930] tools-version: 11 
[ 3431.277930] chip: bb 
[ 3431.277930] platform: asic 
[ 3431.277930] pci-func: 0 
[ 3431.277930] epoch: 1685571839 
[ 3431.277930] path: 0 
[ 3431.277930] dump-type: idle-chk 
[ 3431.277930]  
[ 3431.277930] FW_IDLE_CHECK: 
[ 3431.277930] Error if no traffic: PSWRQ2: sr_cnt is not equal to sr_num_cfg. Registers: PSWRQ2_REG_SR_CNT=0x7e PSWRQ2_REG_SR_NUM_CFG=0x7f 
[ 3431.277930] Error if no traffic: PSWRQ2: SR_CNT_per_vq6 is not equal to MAX_SRS_vq. Registers: PSWRQ2_REG_SR_CNT_PER_VQ_6=0x7e PSWRQ2_REG_MAX_SRS_VQ6=0x7f 
[ 3431.277930] Error if no traffic: PSWRQ2: blk_cnt is not equal to blk_num_cfg. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3431.277930] Error if no traffic: PSWRQ2: There are more than eight used blocks. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3431.277938] r if no traffic: XSDM: There are outstanding PXP requests. Registers: XSDM_REG_DST_PXP_SRC_PEND_EMPTY=0x0 
[ 3431.277938] Error if no traffic: XSDM: The PXP requests linked-list is not empty. Registers: XSDM_REG_DST_PXP_LINK_EMPTY=0x0 
[ 3431.277938] Error if no traffic: XCM: There are messages in the XX RAM. Registers: XCM_REG_XX_FREE_CNT=0x3f 
[ 3431.277938] Error if no traffic: XCM: XX CAM has active lcids. Registers: XCM_REG_XX_LCID_CAM_FILL_LVL=0x1 
[ 3431.277938] Error if no traffic: XCM: has locked lcids in xxLock. Registers: XCM_REG_XX_LOCK_CNT=0x1 
[ 3431.277938] Error if no traffic: XSEM: Thread not freed. Registers: XSEM_REG_NUM_OF_THREADS=0x17 
[ 3431.277938] Error if no traffic: QM: VOQ LINE credit on LB queues is not equal to init credit. Registers: QM_REG_VOQCRDLINE[16]=0x80d0 QM_REG_VOQINITCRDLINE[16]=0x8124 MISC_REG_PORT_MODE=0x0 
[ 3431.381855]  QM: VOQ BYTE credit on LB queues is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[16]=0x815a1c QM_REG_VOQINITCRDBYTE[16]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3431.381855] Error if no traffic: QM: VOQ BYTE credit on NW queues is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[0]=0x815a1c QM_REG_VOQINITCRDBYTE[0]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3431.381855] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[4]=0x2 
[ 3431.381855] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[9]=0x1 
[ 3431.381855]  
[ 3431.381855] LSI_IDLE_CHECK: 
[ 3431.381855] Error if no traffic: PSWRQ2: sr_cnt is not equal to sr_num_cfg. Registers: PSWRQ2_REG_SR_CNT=0x7e PSWRQ2_REG_SR_NUM_CFG=0x7f 
[ 3431.381855] Error if no traffic: PSWRQ2: SR_CNT_per_vq6 is not equal to MAX_SRS_vq. Registers: PSWRQ2_REG_SR_CNT_PER_VQ_6=0x7e PSWRQ2_REG_MAX_SRS_VQ6=0x7f 
[ 3431.564463] Error if no traffic: PSWRQ2: There are more than eight used blocks. Registers: PSWRQ2_REG_BLK_CNT=0x120 PSWRQ2_REG_BLK_NUM_CFG=0x140 
[ 3431.564463] Error if no traffic: PSWRQ2: blk_cnt_per_vq6 is not equal to max_blks_vq. Registers: PSWRQ2_REG_BLK_CNT_PER_VQ_6=0x120 PSWRQ2_REG_MAX_BLKS_VQ6=0x140 
[ 3431.564463] Error: PGLUE: PCIE error status is set. Registers: PGLUE_B_REG_PCIE_ERR_STATUS=0x4 
[ 3431.564463] Error if no traffic: XSDM: XSDM_REG_DST_PXP_SRC_PEND_EMPTY = 0. Registers: XSDM_REG_DST_PXP_SRC_PEND_EMPTY=0x0 
[ 3431.564463] Error if no traffic: XSDM: XSDM_REG_DST_PXP_LINK_EMPTY = 0. Registers: XSDM_REG_DST_PXP_LINK_EMPTY=0x0 
[ 3431.564463] Error if no traffic: XCM: XX_FREE_CNT is different than reset value. Registers: XCM_REG_XX_FREE_CNT=0x3f 
[ 3431.564463] Error if no traffic: XCM: XX protection CAM is not empty. Registers: XCM_REG_XX_LCID_CAM_FILL_LVL=0x1 
[ 3431.564463] Error if no traffic: XCM: there is locked LCID in XX. Registers: XCM_REG_XX_LOCK_CNT=0x1 
[ 3431.664495] Error if no traffic: QM: VOQ LINE credit is not equal to init credit. Registers: QM_REG_VOQCRDLINE[16]=0x80d0 QM_REG_VOQINITCRDLINE[16]=0x8124 MISC_REG_PORT_MODE=0x0 
[ 3431.664495] Error if no traffic: QM: VOQ LINE credit is not equal to init credit. Registers: QM_REG_VOQCRDLINE[0]=0x83aa QM_REG_VOQINITCRDLINE[0]=0x841a MISC_REG_PORT_MODE=0x0 
[ 3431.664495] Error if no traffic: QM: VOQ BYTE credit is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[16]=0x815a1c QM_REG_VOQINITCRDBYTE[16]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3431.664495] Error if no traffic: QM: VOQ BYTE credit is not equal to init credit. Registers: QM_REG_VOQCRDBYTE[0]=0x815a1c QM_REG_VOQINITCRDBYTE[0]=0x818000 MISC_REG_PORT_MODE=0x0 
[ 3431.664495] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[4]=0x2 
[ 3431.664495] Error if no traffic: XCM: QM Active State Counter is not 0. Registers: XCM_REG_QM_ACT_ST_CNT[9]=0x1 
[ 3431.664495]  
[ 3431.763194]  
[ 3431.765204] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["reg_fifo"] 
[ 3431.773935]  
[ 3431.773935] fw-version: 8_59_1_1 
[ 3431.773935] fw-image: main 
[ 3431.773935] mfw-version: 8_34_8_0 
[ 3431.773935] chip-revision: b0 
[ 3431.773935] tools-version: 11 
[ 3431.773935] chip: bb 
[ 3431.773935] platform: asic 
[ 3431.773935] pci-func: 0 
[ 3431.773935] epoch: 1685571839 
[ 3431.773935] path: 0 
[ 3431.773935] dump-type: reg-fifo 
[ 3431.773935]  
[ 3431.773935] fifo contained 0 elements 
[ 3431.773935]  
[ 3431.821123]  
[ 3431.823033] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["igu_fifo"] 
[ 3431.831668]  
[ 3431.831668] fw-version: 8_59_1_1 
[ 3431.831668] fw-image: main 
[ 3431.831668] mfw-version: 8_34_8_0 
[ 3431.831668] chip-revision: b0 
[ 3431.831668] tools-version: 11 
[ 3431.831668] chip: bb 
[ 3431.831668] platform: asic 
[ 3431.831668] pci-func: 0 
[ 3431.831668] epoch: 1685571839 
[ 3431.831668] path: 0 
[ 3431.831668] dump-type: igu-fifo 
[ 3431.831668]  
[ 3431.831668] fifo contained 0 elements 
[ 3431.831668]  
[ 3431.877837]  
[ 3431.879656] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["protection_override"] 
[ 3431.889332]  
[ 3431.889332] fw-version: 8_59_1_1 
[ 3431.889332] fw-image: main 
[ 3431.889332] mfw-version: 8_34_8_0 
[ 3431.889332] chip-revision: b0 
[ 3431.889332] tools-version: 11 
[ 3431.889332] chip: bb 
[ 3431.889332] platform: asic 
[ 3431.889332] pci-func: 0 
[ 3431.889332] epoch: 1685571840 
[ 3431.889332] path: 0 
[ 3431.889332] dump-type: protection-override 
[ 3431.889332]  
[ 3431.889332] window  0, address: 0x0009060, size:       1 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944158] window  1, address: 0x0e00098, size:       2 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944158] window  2, address: 0x00500a4, size:       5 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944158] window  3, address: 0x0540a7c, size:       5 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944158] window  4, address: 0x02a8400, size:       5 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944158] window  5, address: 0x0040474, size:       5 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944158] window  6, address: 0x0e2a0b0, size:       3 regs, read: 1, write: 1, read protection: override VF , write protection: override VF  
[ 3431.944166]       88 regs, read: 1, write: 1, read protection: override VF , write protection: override VF  
[ 3431.944166] window  9, address: 0x0e2a100, size:     800 regs, read: 1, write: 1, read protection: override VF , write protection: override VF  
[ 3431.944166] window 10, address: 0x02aa3bc, size:       1 regs, read: 1, write: 1, read protection: override PDA, write protection: override PDA 
[ 3431.944166] window 11, address: 0x0000000, size: 8388607 regs, read: 1, write: 0, read protection: override PDA, write protection: override HV  
[ 3431.944166] window 12, address: 0x0008d1c, size:       1 regs, read: 1, write: 1, read protection: override VF , write protection: override VF  
[ 3431.944166] protection override contained 13 elements 
[ 3431.944166]  
[ 3431.944166]  
[ 3432.112459]  
[ 3432.114385] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["fw_asserts"] 
[ 3432.123438]  
[ 3432.123438] fw-version: 8_59_1_1 
[ 3432.123438] fw-image: main 
[ 3432.123438] mfw-version: 8_34_8_0 
[ 3432.123438] chip-revision: b0 
[ 3432.123438] tools-version: 11 
[ 3432.123438] chip: bb 
[ 3432.123438] platform: asic 
[ 3432.123438] pci-func: 0 
[ 3432.123438] epoch: 1685571840 
[ 3432.123438] path: 0 
[ 3432.123438] dump-type: fw-asserts 
[ 3432.123438]  
[ 3432.123438]  
[ 3432.123438] TSTORM_ASSERT: size=8 
[ 3432.123438] 00030004 
[ 3432.123438] 87d252a9 
[ 3432.123438] 4dcf5854 
[ 3432.123438] ff000006 
[ 3432.123438] 89000035 
[ 3432.123438] ff000006 
[ 3432.123438] 04f9020e 
[ 3432.123438] 020851df 
[ 3432.123438]  
[ 3432.123438] MSTORM_ASSERT: size=8 
[ 3432.123438] 01030004 
[ 3432.123438] 87d252a9 
[ 3432.123438] 4703040b 
[ 3432.123438] ff000006 
[ 3432.123438] 89000035 
[ 3432.123438] ff000006 
[ 3432.123438] 04f9020e 
[ 3432.123438] 020851df 
[ 3432.123438]  
[ 3432.123438] USTORM_ASSERT: size=8 
[ 3432.123438] 02030004 
[ 3432.123438] 87d252a9 
[ 3432.123438] cd34d58d 
[ 3432.123438] ff000006 
[ 3432.234059] YSTORM_ASSERT: size=8 
[ 3432.234059] 04030004 
[ 3432.234059] 87d252a9 
[ 3432.234059] 00076fb8 
[ 3432.234059] ff000006 
[ 3432.234059] 89000035 
[ 3432.234059] ff000006 
[ 3432.234059] 04f9020e 
[ 3432.234059] 020851da 
[ 3432.234059]  
[ 3432.234059] PSTORM_ASSERT: size=8 
[ 3432.234059] 05030004 
[ 3432.234059] 87d252a9 
[ 3432.234059] b69a1327 
[ 3432.234059] ff000006 
[ 3432.234059] 89000035 
[ 3432.234059] ff000006 
[ 3432.234059] 04f9020e 
[ 3432.234059] 020851da 
[ 3432.234059]  
[ 3432.234059]  
[ 3432.234059]  
[ 3432.288633]  
[ 3432.290375] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["ilt"] 
[ 3432.299667] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["grc"] 
[ 3432.594054] [qed_dbg_dump:8021(qede_roce)]Collecting a debug feature ["mcp_trace"] 
[ 3433.174928]  
[ 3433.174928] fw-version: 8_59_1_1 
[ 3433.174928] fw-image: main 
[ 3433.174928] mfw-version: 8_34_8_0 
[ 3433.174928] chip-revision: b0 
[ 3433.174928] tools-version: 11 
[ 3433.174928] chip: bb 
[ 3433.174928] platform: asic 
[ 3433.174928] pci-func: 0 
[ 3433.174928] epoch: 1685571840 
[ 3433.174928] path: 0 
[ 3433.174928] dump-type: mcp-trace 
[ 3433.174928]  
[ 3433.174928] TRACE DRV_HSI : [F1] Updating PF driver with msg 0x5 
[ 3433.174928] TRACE DRV_HSI : [F1] DRV_MSG_CODE_GET_OEM_UPDATES 0x1 
[ 3433.174928] TRACE DRV_HSI : [F1] Updating PF driver with msg 0x7 
[ 3433.174928] TRACE DRV_HSI : [F1] DRV_MSG_CODE_BW_UPDATE_ACK 0x0 
[ 3433.174928] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_STORM_FW_VER 0x83b0100 
[ 3433.174928] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_MTU 0x5dc 
[ 3433.174928] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_DRIVER_STATE 0x4 
[ 3433.174928] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_ESWITCH_MODE 0x0 
[ 3433.174928] TRACE DRV_HSI : [F1] INIT_PHY 0x0 
[ 3433.174938]  on port 1 
[ 3433.174938] TRACE ETH     : [P1] Port reset: value 0x0 
[ 3433.174938] TRACE DRV_HSI : [P1] Updating all pfs on with msg 0x0 
[ 3433.174938] TRACE ETH     : [P1] Increment link-flap counter to 2 
[ 3433.174938] TRACE ETH     : [P1] Updating LFA status to 0x20001 
[ 3433.174938] TRACE ETH     : SmartAN enabled set to 0 
[ 3433.174938] TRACE ETH     : [P1] Transceiver card 
[ 3433.174938] TRACE ETH     : Init now 
[ 3433.174938] TRACE ETH     : [P1] Disable Firecode-FEC lane 2 
[ 3433.174938] TRACE ETH     : [P1] Disable RS-FEC 
[ 3433.174938] TRACE ETH     : [P1] Force speed 0x61a8 
[ 3433.174938] TRACE ETH     : [P1] eth_port_init: req_speed: 25000Mbps adv_speed 0xa 
[ 3433.174938] TRACE ETH     :      eth_port_init: flow_ctrl 0x7 loopback 0x0 
[ 3433.174938] TRACE ETH     : Transceiver type is 0x16 
[ 3433.174938] TRACE ETH     : Transceiver init done on port 1 
[ 3433.174938] TRACE ETH     : [P1] phy->lane_map = 0x00000004, num_lanes = 1 
[ 3433.275853] ] Setting 25G/50G OUI to 0x6a737d 
[ 3433.275853] ERROR ERROR   : Autoneg Pause setting ignored while speed is not set to autoneg 
[ 3433.275853] TRACE ETH     : [1]: POWER UP PHY. 
[ 3433.275853] TRACE ETH     : [P1] Power ON PHY 
[ 3433.275853] TRACE ETH     : [1]: ETH stat address: 0xe2ae28 
[ 3433.275853] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_DRIVER_STATE 0x5 
[ 3433.275853] TRACE ETH     : [P1] MAC fault change: 0x0->0x1 
[ 3433.275853] TRACE ETH     : physical_link_update: status 0x1800000 on port 1 
[ 3433.275853] TRACE ETH     : eth_link_update. new_link_status = 0x1800000 link_speed = 0 
[ 3433.275853] TRACE ETH     : [P1] MAC fault change: 0x1->0x3 
[ 3433.275853] TRACE ETH     : physical_link_update: status 0x3800000 on port 1 
[ 3433.275853] TRACE TRACE   : Repeat the last trace for 1 times 
[ 3433.275853] TRACE ETH     : eth_link_update. new_link_status = 0x3800000 link_speed = 25000 
[ 3433.275853] TRACE ETH     : [P1] MAC fault change: 0x3->0x2 
[ 3433.373204]  25000 
[ 3433.373204] TRACE DRV_HSI : [F0] INIT_PHY 0x0 
[ 3433.373204] TRACE ETH     : [P0] Check LFA - First time link init 
[ 3433.373204] TRACE ETH     : [P0] CLPORT: port disabled 
[ 3433.373204] TRACE ETH     : [P0] Power OFF PHY 
[ 3433.373204] TRACE ETH     : [P0] Reset Tx_Credits and soft reset MAC 
[ 3433.373204] TRACE ETH     : [P0] Closing data path and reset serdes 
[ 3433.373204] TRACE ETH     : physical_link_update: status 0x0 on port 0 
[ 3433.373204] TRACE ETH     : [P0] Port reset: value 0x0 
[ 3433.373204] TRACE DRV_HSI : [P0] Updating all pfs on with msg 0x0 
[ 3433.373204] TRACE ETH     : [P0] Increment link-flap counter to 2 
[ 3433.373204] TRACE ETH     : [P0] Updating LFA status to 0x20001 
[ 3433.373204] TRACE ETH     : SmartAN enabled set to 0 
[ 3433.373204] TRACE ETH     : [P0] Transceiver card 
[ 3433.373204] TRACE ETH     : Init now 
[ 3433.373204] TRACE ETH     : [P0] Disable Firecode-FEC lane 0 
[ 3433.373204] TRACE ETH     : [P0] Disable RS-FEC 
[ 3433.471850] H     : Transceiver init done on port 0 
[ 3433.471850] TRACE ETH     : [P0] phy->lane_map = 0x00000001, num_lanes = 1 
[ 3433.471850] TRACE ETH     : [P0] Serdes core_revision: 0x6, serdes id is 0x4ad4 
[ 3433.471850] TRACE ETH     : [P0] Autoneg is disabled 
[ 3433.471850] TRACE ETH     : [P0] Clear ALLOW_PLL_CHANGE 
[ 3433.471850] TRACE ETH     : [P0] Setting an.enable to 0 an_mode to 0x3 
[ 3433.471850] TRACE ETH     : [P0] Setting 25G/50G OUI to 0x6a737d 
[ 3433.471850] ERROR ERROR   : Autoneg Pause setting ignored while speed is not set to autoneg 
[ 3433.471850] TRACE ETH     : [0]: POWER UP PHY. 
[ 3433.471850] TRACE ETH     : [P0] Power ON PHY 
[ 3433.471850] TRACE ETH     : [0]: ETH stat address: 0xe2a1a8 
[ 3433.471850] TRACE DRV_HSI : [F0] DRV_MSG_CODE_OV_UPDATE_DRIVER_STATE 0x5 
[ 3433.471850] TRACE ETH     : [P0] MAC fault change: 0x0->0x1 
[ 3433.471850] TRACE ETH     : physical_link_update: status 0x1800000 on port 0 
[ 3433.572516] new_link_status = 0xe00011 link_speed = 25000 
[ 3433.572516] TRACE ETH     : [P1] eth_link_update: link_state 0x800000->0xe00011 
[ 3433.572516] TRACE ETH     : [P1] Link is Up in 25000Mbps (change count 1) 
[ 3433.572516] TRACE ETH     : [P1] Reset Tx_Credits and soft reset MAC 
[ 3433.572516] TRACE ETH     : [P1] CLMAC configured. Pause RX/TX enabled 
[ 3433.572516] TRACE ETH     : physical_link_update: status 0xe00011 on port 1 
[ 3433.572516] TRACE DBG     : port 1: DCBX config 0x10 
[ 3433.572516] TRACE DBG     : Port 1: DCB mode set to 0. Updating the driver for operational MIB 
[ 3433.728827] TRACE DRV_HSI : [P1] Updating all pfs on with msg 0x5 
[ 3433.728827] TRACE DRV_HSI : [F1] Updating PF driver with msg 0x5 
[ 3433.728827] TRACE DBG     : Configuring NIG for PFC and ETS for port 1 
[ 3433.728827] TRACE ETH     : Disabling PFC 
[ 3433.728827] TRACE ETH     : [P1] CLMAC configured. Pause RX/TX enabled 
[ 3433.728827] TRACE DRV_HSI : [F1] LINK_RESET 0x0 
[ 3433.728827] TRACE INIT    : g_port 1 chan->config = 0x4b94adc6 chan->nw2bmc = 0x0 
[ 3433.728827] TRACE INIT    : g_spad.nvm_cfg.cfg1.port[1].features= 0x0 
[ 3433.728827] TRACE ETH     : [P1] CLPORT: port disabled 
[ 3433.728827] TRACE ETH     : [P1] Power OFF PHY 
[ 3433.728827] TRACE ETH     : [P1] Reset Tx_Credits and soft reset MAC 
[ 3433.728827] TRACE ETH     : [P1] Closing data path and reset serdes 
[ 3433.728827] TRACE ETH     : physical_link_update: status 0x0 on port 1 
[ 3433.728827] TRACE ETH     : [P1] Port reset: value 0x0 
[ 3433.728827] TRACE DRV_HSI : [F1] INIT_PHY 0x0 
[ 3433.827130] TRACE ETH     : [P1] CLPORT: port disabled 
[ 3433.827130] TRACE ETH     : [P1] Power OFF PHY 
[ 3433.827130] TRACE ETH     : [P1] Reset Tx_Credits and soft reset MAC 
[ 3433.827130] TRACE ETH     : [P1] Closing data path and reset serdes 
[ 3433.827130] TRACE ETH     : physical_link_update: status 0x0 on port 1 
[ 3433.827130] TRACE ETH     : [P1] Port reset: value 0x0 
[ 3433.827130] TRACE DRV_HSI : [P1] Updating all pfs on with msg 0x0 
[ 3433.827130] TRACE ETH     : [P1] Increment link-flap counter to 3 
[ 3433.827130] TRACE ETH     : [P1] Updating LFA status to 0x30001 
[ 3433.827130] TRACE ETH     : SmartAN enabled set to 0 
[ 3433.827130] TRACE ETH     : [P1] Transceiver card 
[ 3433.827130] TRACE ETH     : Init now 
[ 3433.827130] TRACE ETH     : [P1] Disable Firecode-FEC lane 2 
[ 3433.827130] TRACE ETH     : [P1] Disable RS-FEC 
[ 3433.827130] TRACE ETH     : [P1] Force speed 0x61a8 
[ 3433.827130] TRACE ETH     : [P1] eth_port_init: req_speed: 25000Mbps adv_speed 0xa 
[ 3433.827138] nes = 1 
[ 3433.827138] TRACE ETH     : [P1] Serdes core_revision: 0x6, serdes id is 0x4ad4 
[ 3433.827138] TRACE ETH     : [P1] Autoneg is disabled 
[ 3433.827138] TRACE ETH     : [P1] Clear ALLOW_PLL_CHANGE 
[ 3433.827138] TRACE ETH     : [P1] Setting an.enable to 0 an_mode to 0x3 
[ 3433.827138] TRACE ETH     : [P1] Setting 25G/50G OUI to 0x6a737d 
[ 3433.827138] ERROR ERROR   : Autoneg Pause setting ignored while speed is not set to autoneg 
[ 3433.827138] TRACE ETH     : [1]: POWER UP PHY. 
[ 3433.827138] TRACE ETH     : [P1] Power ON PHY 
[ 3433.827138] TRACE ETH     : [1]: ETH stat address: 0xe2ae28 
[ 3433.827138] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_MTU 0x2328 
[ 3433.827138] TRACE ETH     : [P1] MAC fault change: 0x0->0x1 
[ 3433.827138] TRACE ETH     : physical_link_update: status 0x1800000 on port 1 
[ 3433.827138] TRACE ETH     : eth_link_update. new_link_status = 0x1800000 link_speed = 0 
[ 3433.827138] TRACE DRV_HSI : [F1] DRV_MSG_CODE_OV_UPDATE_CURR_CFG 0x1 
[ 3434.028467] TRACE ETH     : [P0] MAC fault change: 0x3->0x2 
[ 3434.028467] TRACE ETH     : physical_link_update: status 0x2800000 on port 0 
[ 3434.028467] TRACE TRACE   : Repeat the last trace for 1 times 
[ 3434.028467] TRACE ETH     : eth_link_update. new_link_status = 0x2800000 link_speed = 25000 
[ 3434.028467] TRACE ETH     : [P0] MAC fault change: 0x2->0x0 
[ 3434.028467] TRACE ETH     : physical_link_update: status 0x800000 on port 0 
[ 3434.028467] TRACE ETH     : eth_link_update. new_link_status = 0xe00011 link_speed = 25000 
[ 3434.028467] TRACE ETH     : [P0] eth_link_update: link_state 0x800000->0xe00011 
[ 3434.028467] TRACE ETH     : [P0] Link is Up in 25000Mbps (change count 1) 
[ 3434.028467] TRACE ETH     : [P0] Reset Tx_Credits and soft reset MAC 
[ 3434.028467] TRACE ETH     : [P0] CLMAC configured. Pause RX/TX enabled 
[ 3434.028467] TRACE ETH     : physical_link_update: status 0xe00011 on port 0 
[ 3434.028467] TRACE DBG     : port 0: DCBX config 0x10 
[ 3434.028488] r with msg 0x5 
[ 3434.028488] TRACE DBG     : Configuring NIG for PFC and ETS for port 0 
[ 3434.028488] TRACE ETH     : Disabling PFC 
[ 3434.028488] TRACE ETH     : [P0] CLMAC configured. Pause RX/TX enabled 
[ 3434.028488] TRACE DRV_HSI : [F0] LINK_RESET 0x0 
[ 3434.028488] TRACE INIT    : g_port 0 chan->config = 0x58062e48 chan->nw2bmc = 0x1 
[ 3434.028488] TRACE INIT    : g_spad.nvm_cfg.cfg1.port[0].features= 0x0 
[ 3434.028488] TRACE ETH     : [P0] CLPORT: port disabled 
[ 3434.028488] TRACE ETH     : [P0] Power OFF PHY 
[ 3434.028488] TRACE ETH     : [P0] Reset Tx_Credits and soft reset MAC 
[ 3434.028488] TRACE ETH     : [P0] Closing data path and reset serdes 
[ 3434.028488] TRACE ETH     : physical_link_update: status 0x0 on port 0 
[ 3434.028488] TRACE ETH     : [P0] Port reset: value 0x0 
[ 3434.028488] TRACE DRV_HSI : [F0] INIT_PHY 0x0 
[ 3434.028488] TRACE ETH     : [P0] Check LFA - First time link init 
[ 3434.028488] TRACE ETH     : [P0] CLPORT: port disabled 
[ 3434.127153] ACE ETH     : [P0] Port reset: value 0x0 
[ 3434.127153] TRACE DRV_HSI : [P0] Updating all pfs on with msg 0x0 
[ 3434.127153] TRACE ETH     : [P0] Increment link-flap counter to 3 
[ 3434.127153] TRACE ETH     : [P0] Updating LFA status to 0x30001 
[ 3434.127153] TRACE ETH     : SmartAN enabled set to 0 
[ 3434.127153] TRACE ETH     : [P0] Transceiver card 
[ 3434.127153] TRACE ETH     : Init now 
[ 3434.127153] TRACE ETH     : [P0] Disable Firecode-FEC lane 0 
[ 3434.127153] TRACE ETH     : [P0] Disable RS-FEC 
[ 3434.127153] TRACE ETH     : [P0] Force speed 0x61a8 
[ 3434.127153] TRACE ETH     : [P0] eth_port_init: req_speed: 25000Mbps adv_speed 0xa 
[ 3434.127153] TRACE ETH     :      eth_port_init: flow_ctrl 0x7 loopback 0x0 
[ 3434.127153] TRACE ETH     : Transceiver type is 0x16 
[ 3434.127153] TRACE ETH     : Transceiver init done on port 0 
[ 3434.127153] TRACE ETH     : [P0] phy->lane_map = 0x00000001, num_lanes = 1 
[ 3434.227772] /50G OUI to 0x6a737d 
[ 3434.227772] ERROR ERROR   : Autoneg Pause setting ignored while speed is not set to autoneg 
[ 3434.227772] TRACE ETH     : [0]: POWER UP PHY. 
[ 3434.227772] TRACE ETH     : [P0] Power ON PHY 
[ 3434.227772] TRACE ETH     : [0]: ETH stat address: 0xe2a1a8 
[ 3434.227772] TRACE DRV_HSI : [F0] DRV_MSG_CODE_OV_UPDATE_MTU 0x2328 
[ 3434.227772] TRACE ETH     : [P0] MAC fault change: 0x0->0x1 
[ 3434.227772] TRACE ETH     : physical_link_update: status 0x1800000 on port 0 
[ 3434.227772] TRACE ETH     : eth_link_update. new_link_status = 0x1800000 link_speed = 0 
[ 3434.227772] TRACE DRV_HSI : [F0] DRV_MSG_CODE_OV_UPDATE_CURR_CFG 0x1 
[ 3434.227772] TRACE ETH     : [P1] MAC fault change: 0x1->0x3 
[ 3434.227772] TRACE ETH     : physical_link_update: status 0x3800000 on port 1 
[ 3434.227772] TRACE TRACE   : Repeat the last trace for 1 times 
[ 3434.227772] TRACE ETH     : eth_link_update. new_link_status = 0x3800000 link_speed = 25000 
[ 3434.323252] w_link_status = 0x2800000 link_speed = 25000 
[ 3434.323252] TRACE ETH     : [P1] MAC fault change: 0x2->0x0 
[ 3434.323252] TRACE ETH     : physical_link_update: status 0x800000 on port 1 
[ 3434.323252] TRACE ETH     : eth_link_update. new_link_status = 0xe00011 link_speed = 25000 
[ 3434.323252] TRACE ETH     : [P1] eth_link_update: link_state 0x800000->0xe00011 
[ 3434.323252] TRACE ETH     : [P1] Link is Up in 25000Mbps (change count 1) 
[ 3434.323252] TRACE ETH     : [P1] Reset Tx_Credits and soft reset MAC 
[ 3434.323252] TRACE ETH     : [P1] CLMAC configured. Pause RX/TX enabled 
[ 3434.323252] TRACE ETH     : physical_link_update: status 0xe00011 on port 1 
[ 3434.323252] TRACE DBG     : port 1: DCBX config 0x10 
[ 3434.323252] TRACE DBG     : Port 1: DCB mode set to 0. Updating the driver for operational MIB 
[ 3434.323252] TRACE DRV_HSI : [P1] Updating all pfs on with msg 0x5 
[ 3434.323252] TRACE DRV_HSI : [F1] Updating PF driver with msg 0x5 
[ 3434.421412] ge: 0x1->0x3 
[ 3434.421412] TRACE ETH     : physical_link_update: status 0x3800000 on port 0 
[ 3434.421412] TRACE TRACE   : Repeat the last trace for 1 times 
[ 3434.421412] TRACE ETH     : eth_link_update. new_link_status = 0x3800000 link_speed = 25000 
[ 3434.421412] TRACE ETH     : [P0] MAC fault change: 0x3->0x2 
[ 3434.421412] TRACE ETH     : physical_link_update: status 0x2800000 on port 0 
[ 3434.421412] TRACE TRACE   : Repeat the last trace for 1 times 
[ 3434.421412] TRACE ETH     : eth_link_update. new_link_status = 0x2800000 link_speed = 25000 
[ 3434.421412] TRACE ETH     : [P0] MAC fault change: 0x2->0x0 
[ 3434.421412] TRACE ETH     : physical_link_update: status 0x800000 on port 0 
[ 3434.421412] TRACE ETH     : eth_link_update. new_link_status = 0xe00011 link_speed = 25000 
[ 3434.421412] TRACE ETH     : [P0] eth_link_update: link_state 0x800000->0xe00011 
[ 3434.421412] TRACE ETH     : [P0] Link is Up in 25000Mbps (change count 1) 
[ 3434.519309] G     : port 0: DCBX config 0x10 
[ 3434.519309] TRACE DBG     : Port 0: DCB mode set to 0. Updating the driver for operational MIB 
[ 3434.519309] TRACE DRV_HSI : [P0] Updating all pfs on with msg 0x5 
[ 3434.519309] TRACE DRV_HSI : [F0] Updating PF driver with msg 0x5 
[ 3434.519309] TRACE DBG     : Configuring NIG for PFC and ETS for port 0 
[ 3434.519309] TRACE ETH     : Disabling PFC 
[ 3434.519309] TRACE ETH     : [P0] CLMAC configured. Pause RX/TX enabled 
[ 3434.519309] TRACE DRV_HSI : [F0] DRV_MSG_CODE_NIG_DRAIN 0x3e8 
[ 3434.519309] TRACE DBG     : [P0] Start NIG drain time=87000000 
[ 3434.519309] TRACE DBG     : [P0] Start NIG drain, time_to_end =1467159835 
[ 3434.519309] TRACE DBG     : [P0] Start TC NIG drain for TC=0. time_to_end = 1467160733 
[ 3434.519309] TRACE DBG     : [P0] Start TC NIG drain for TC=1. time_to_end = 1467161633 
[ 3434.519309] TRACE DBG     : [P0] Start TC NIG drain for TC=2. time_to_end = 1467162625 
[ 3434.616868] ime_to_end = 1467166003 
[ 3434.616868] TRACE DBG     : [P0] Start TC NIG drain for TC=6. time_to_end = 1467167157 
[ 3434.616868] TRACE DBG     : [P0] Start TC NIG drain for TC=7. time_to_end = 1467168333 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain, diff =59736 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=1, diff =60635 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=3, diff =61574 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=7, diff =62571 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=15, diff =63682 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=31, diff =64911 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=63, diff =66119 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=127, diff =67320 
[ 3434.616868] TRACE DBG     : [P0] Stop NIG drain on TC_bit=255, diff =68502 
[ 3434.616868] ERROR ERROR   : [F0]Invalid command 0xc0040000 
[ 3434.616868]  
[ 3434.616868]  
[ 3434.616868]  
[ 3434.815735]  
[ 3434.817593] [qed_dbg_nvm_image:8155(qede_roce)]Collecting a debug feature ["nvram image 3"] 
[ 3434.875168] [qed_dbg_nvm_image:8155(qede_roce)]Collecting a debug feature ["nvram image 4"] 
[ 3434.932962] [qed_dbg_nvm_image:8155(qede_roce)]Collecting a debug feature ["nvram image 5"] 
[ 3435.404834] [qed_dbg_nvm_image:8155(qede_roce)]Collecting a debug feature ["nvram image 2"] 
[ 3435.964490] [qed_start_recovery_process:2659(qede_roce-0)]Triggering a recovery process 
[ 3435.973627] [qede_generic_hw_err_handler:2716(qede_roce)]Generic sleepable HW error handling is done 
[ 3435.973663] [qed_mcp_handle_process_kill:1699(qede_roce-0)]Received a process kill indication 
[ 3435.993752] [qed_mcp_handle_process_kill:1716(qede_roce-0)]Process kill counter: 1 
[ 3436.002467] [qede_recovery_handler:2645(qede_roce)]Starting a recovery process 
[ 3445.817859] [qed_hw_err_notify:830(qede_roce-0)]Ramrod is stuck [CID ff000000 RDMA_RAMROD_DEREGISTER_MR:04 PROTOCOLID_ROCE:02 echo 04fb] 
[ 3462.026215] restraintd[1961]: *** Current Time: Wed May 31 18:24:30 2023  Localwatchdog at: Wed May 31 19:23:29 2023 
[-- MARK -- Wed May 31 22:25:00 2023] 
[ 3522.034296] restraintd[1961]: *** Current Time: Wed May 31 18:25:30 2023  Localwatchdog at: Wed May 31 19:23:29 2023 


======= Start of the tracebacks ==============


[ 3564.513986] INFO: task kworker/u96:7:91 blocked for more than 120 seconds. 
[ 3564.521918]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3564.527832] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3564.536754] task:kworker/u96:7   state:D stack:    0 pid:   91 ppid:     2 flags:0x80004000 
[ 3564.546278] Workqueue: events_unbound ucma_close_id [rdma_ucm] 
[ 3564.552997] Call Trace: 
[ 3564.555909]  __schedule+0x2d1/0x870 
[ 3564.560003]  schedule+0x55/0xf0 
[ 3564.563673]  schedule_timeout+0x281/0x320 
[ 3564.568354]  ? xas_store+0x303/0x5a0 
[ 3564.572519]  wait_for_completion+0x96/0x100 
[ 3564.577369]  cm_destroy_id+0x21c/0x490 [ib_cm] 
[ 3564.582509]  ? _destroy_id+0x2f3/0x300 [rdma_cm] 
[ 3564.587840]  ? kfree+0x22e/0x250 
[ 3564.591600]  _destroy_id+0xe8/0x300 [rdma_cm] 
[ 3564.596639]  ucma_close_id+0x2b/0x40 [rdma_ucm] 
[ 3564.601859]  process_one_work+0x1d3/0x390 
[ 3564.606507]  worker_thread+0x30/0x390 
[ 3564.610766]  ? process_one_work+0x390/0x390 
[ 3564.615599]  kthread+0x134/0x150 
[ 3564.619374]  ? set_kthread_struct+0x50/0x50 
[ 3564.624238]  ret_from_fork+0x35/0x40 
[ 3564.628440] INFO: task kworker/2:1:100 blocked for more than 120 seconds. 
[ 3564.636254]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3564.642202] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3564.651191] task:kworker/2:1     state:D stack:    0 pid:  100 ppid:     2 flags:0x80004000 
[ 3564.660738] Workqueue: events qede_sp_task [qede] 
[ 3564.666271] Call Trace: 
[ 3564.669244]  __schedule+0x2d1/0x870 
[ 3564.673390]  schedule+0x55/0xf0 
[ 3564.677132]  schedule_timeout+0x281/0x320 
[ 3564.681827]  ? __queue_work+0x13d/0x3f0 
[ 3564.686358]  ? xas_load+0x8/0x80 
[ 3564.690188]  wait_for_completion+0x96/0x100 
[ 3564.695100]  cma_process_remove+0x227/0x250 [rdma_cm] 
[ 3564.701018]  cma_remove_one+0x62/0xd0 [rdma_cm] 
[ 3564.706317]  remove_client_context+0x8b/0xd0 [ib_core] 
[ 3564.712321]  disable_device+0x8c/0x130 [ib_core] 
[ 3564.717746]  __ib_unregister_device+0x40/0xa0 [ib_core] 
[ 3564.723827]  ib_unregister_device+0x21/0x30 [ib_core] 
[ 3564.729714]  qedr_remove+0xe/0xa0 [qedr] 
[ 3564.734346]  qede_rdma_dev_remove+0x1d9/0x200 [qede] 
[ 3564.740171]  __qede_remove+0x4d/0x1a0 [qede] 
[ 3564.745163]  qede_sp_task+0x2ca/0x350 [qede] 
[ 3564.750153]  process_one_work+0x1d3/0x390 
[ 3564.754877]  worker_thread+0x30/0x390 
[ 3564.759189]  ? process_one_work+0x390/0x390 
[ 3564.764083]  kthread+0x134/0x150 
[ 3564.767882]  ? set_kthread_struct+0x50/0x50 
[ 3564.772820]  ret_from_fork+0x35/0x40 
[ 3564.777037] INFO: task kworker/2:2:101 blocked for more than 120 seconds. 
[ 3564.784817]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3564.790784] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3564.799738] task:kworker/2:2     state:D stack:    0 pid:  101 ppid:     2 flags:0x80004000 
[ 3564.809345] Workqueue: events linkwatch_event 
[ 3564.814425] Call Trace: 
[ 3564.817380]  __schedule+0x2d1/0x870 
[ 3564.821490]  schedule+0x55/0xf0 
[ 3564.825256]  schedule_preempt_disabled+0xa/0x10 
[ 3564.830522]  __mutex_lock.isra.7+0x349/0x420 
[ 3564.835508]  ? lock_timer_base+0x67/0x90 
[ 3564.840112]  linkwatch_event+0xa/0x30 
[ 3564.844446]  process_one_work+0x1d3/0x390 
[ 3564.849151]  worker_thread+0x30/0x390 
[ 3564.853443]  ? process_one_work+0x390/0x390 
[ 3564.858338]  kthread+0x134/0x150 
[ 3564.862184]  ? set_kthread_struct+0x50/0x50 
[ 3564.867072]  ret_from_fork+0x35/0x40 
[ 3564.871320] INFO: task NetworkManager:919 blocked for more than 120 seconds. 
[ 3564.879449]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3564.885412] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3564.894391] task:NetworkManager  state:D stack:    0 pid:  919 ppid:     1 flags:0x00000080 
[ 3564.903988] Call Trace: 
[ 3564.906939]  __schedule+0x2d1/0x870 
[ 3564.911051]  schedule+0x55/0xf0 
[ 3564.914786]  schedule_preempt_disabled+0xa/0x10 
[ 3564.920080]  __mutex_lock.isra.7+0x349/0x420 
[ 3564.925073]  ? bpf_lsm_capset+0x10/0x10 
[ 3564.929550]  rtnetlink_rcv_msg+0x268/0x370 
[ 3564.934386]  ? sock_has_perm+0x80/0xa0 
[ 3564.938795]  ? rtnl_calcit.isra.43+0x120/0x120 
[ 3564.943991]  netlink_rcv_skb+0x4c/0x130 
[ 3564.948510]  netlink_unicast+0x19a/0x230 
[ 3564.953113]  netlink_sendmsg+0x204/0x3d0 
[ 3564.957695]  sock_sendmsg+0x50/0x60 
[ 3564.961796]  ____sys_sendmsg+0x22a/0x250 
[ 3564.966429]  ? copy_msghdr_from_user+0x5c/0x90 
[ 3564.971597]  ? ____sys_recvmsg+0x91/0x150 
[ 3564.976302]  ___sys_sendmsg+0x7c/0xc0 
[ 3564.980596]  ? copy_msghdr_from_user+0x5c/0x90 
[ 3564.985794]  ? ___sys_recvmsg+0x89/0xc0 
[ 3564.990310]  ? __wake_up_common+0x7a/0x190 
[ 3564.995110]  __sys_sendmsg+0x57/0xa0 
[ 3564.999321]  do_syscall_64+0x5b/0x1b0 
[ 3565.003652]  entry_SYSCALL_64_after_hwframe+0x61/0xc6 
[ 3565.009514] RIP: 0033:0x7f152cc29a27 
[ 3565.013726] Code: Unable to access opcode bytes at RIP 0x7f152cc299fd. 
[ 3565.021268] RSP: 002b:00007ffdb8ad09d0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e 
[ 3565.029953] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f152cc29a27 
[ 3565.038166] RDX: 0000000000000000 RSI: 00007ffdb8ad0a20 RDI: 000000000000000d 
[ 3565.046343] RBP: 00007ffdb8ad0a20 R08: 0000000000000000 R09: 0000000000000000 
[ 3565.054520] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 
[ 3565.062704] R13: 0000000000000000 R14: 00007ffdb8ad0bb8 R15: 00007ffdb8ad0bb4 
[ 3582.033955] restraintd[1961]: *** Current Time: Wed May 31 18:26:30 2023  Localwatchdog at: Wed May 31 19:23:29 2023 
[ 3642.033659] restraintd[1961]: *** Current Time: Wed May 31 18:27:30 2023  Localwatchdog at: Wed May 31 19:23:29 2023 
[ 3687.394122] INFO: task kworker/u96:7:91 blocked for more than 120 seconds. 
[ 3687.402026]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3687.407930] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3687.416846] task:kworker/u96:7   state:D stack:    0 pid:   91 ppid:     2 flags:0x80004000 
[ 3687.426376] Workqueue: events_unbound ucma_close_id [rdma_ucm] 
[ 3687.433080] Call Trace: 
[ 3687.435978]  __schedule+0x2d1/0x870 
[ 3687.440070]  schedule+0x55/0xf0 
[ 3687.443745]  schedule_timeout+0x281/0x320 
[ 3687.448427]  ? xas_store+0x303/0x5a0 
[ 3687.452595]  wait_for_completion+0x96/0x100 
[ 3687.457461]  cm_destroy_id+0x21c/0x490 [ib_cm] 
[ 3687.462611]  ? _destroy_id+0x2f3/0x300 [rdma_cm] 
[ 3687.467951]  ? kfree+0x22e/0x250 
[ 3687.471719]  _destroy_id+0xe8/0x300 [rdma_cm] 
[ 3687.476760]  ucma_close_id+0x2b/0x40 [rdma_ucm] 
[ 3687.481991]  process_one_work+0x1d3/0x390 
[ 3687.486636]  worker_thread+0x30/0x390 
[ 3687.490882]  ? process_one_work+0x390/0x390 
[ 3687.495710]  kthread+0x134/0x150 
[ 3687.499497]  ? set_kthread_struct+0x50/0x50 
[ 3687.504337]  ret_from_fork+0x35/0x40 
[ 3687.508555] INFO: task kworker/2:1:100 blocked for more than 120 seconds. 
[ 3687.516377]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3687.522319] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3687.531314] task:kworker/2:1     state:D stack:    0 pid:  100 ppid:     2 flags:0x80004000 
[ 3687.540858] Workqueue: events qede_sp_task [qede] 
[ 3687.546395] Call Trace: 
[ 3687.549341]  __schedule+0x2d1/0x870 
[ 3687.553470]  schedule+0x55/0xf0 
[ 3687.557221]  schedule_timeout+0x281/0x320 
[ 3687.561922]  ? __queue_work+0x13d/0x3f0 
[ 3687.566453]  ? xas_load+0x8/0x80 
[ 3687.570286]  wait_for_completion+0x96/0x100 
[ 3687.575192]  cma_process_remove+0x227/0x250 [rdma_cm] 
[ 3687.581116]  cma_remove_one+0x62/0xd0 [rdma_cm] 
[ 3687.586413]  remove_client_context+0x8b/0xd0 [ib_core] 
[ 3687.592421]  disable_device+0x8c/0x130 [ib_core] 
[ 3687.597839]  __ib_unregister_device+0x40/0xa0 [ib_core] 
[ 3687.603917]  ib_unregister_device+0x21/0x30 [ib_core] 
[ 3687.609801]  qedr_remove+0xe/0xa0 [qedr] 
[ 3687.614426]  qede_rdma_dev_remove+0x1d9/0x200 [qede] 
[ 3687.620252]  __qede_remove+0x4d/0x1a0 [qede] 
[ 3687.625249]  qede_sp_task+0x2ca/0x350 [qede] 
[ 3687.630250]  process_one_work+0x1d3/0x390 
[ 3687.634967]  worker_thread+0x30/0x390 
[ 3687.639285]  ? process_one_work+0x390/0x390 
[ 3687.644176]  kthread+0x134/0x150 
[ 3687.647977]  ? set_kthread_struct+0x50/0x50 
[ 3687.652924]  ret_from_fork+0x35/0x40 
[ 3687.657137] INFO: task kworker/2:2:101 blocked for more than 120 seconds. 
[ 3687.664930]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3687.670898] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3687.679848] task:kworker/2:2     state:D stack:    0 pid:  101 ppid:     2 flags:0x80004000 
[ 3687.689451] Workqueue: events linkwatch_event 
[ 3687.694550] Call Trace: 
[ 3687.697499]  __schedule+0x2d1/0x870 
[ 3687.701615]  schedule+0x55/0xf0 
[ 3687.705389]  schedule_preempt_disabled+0xa/0x10 
[ 3687.710649]  __mutex_lock.isra.7+0x349/0x420 
[ 3687.715641]  ? lock_timer_base+0x67/0x90 
[ 3687.720254]  linkwatch_event+0xa/0x30 
[ 3687.724586]  process_one_work+0x1d3/0x390 
[ 3687.729292]  worker_thread+0x30/0x390 
[ 3687.733582]  ? process_one_work+0x390/0x390 
[ 3687.738491]  kthread+0x134/0x150 
[ 3687.742339]  ? set_kthread_struct+0x50/0x50 
[ 3687.747238]  ret_from_fork+0x35/0x40 
[ 3687.751481] INFO: task NetworkManager:919 blocked for more than 120 seconds. 
[ 3687.759595]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3687.765565] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3687.774546] task:NetworkManager  state:D stack:    0 pid:  919 ppid:     1 flags:0x00000080 
[ 3687.784157] Call Trace: 
[ 3687.787093]  __schedule+0x2d1/0x870 
[ 3687.791216]  schedule+0x55/0xf0 
[ 3687.794959]  schedule_preempt_disabled+0xa/0x10 
[ 3687.800247]  __mutex_lock.isra.7+0x349/0x420 
[ 3687.805244]  ? bpf_lsm_capset+0x10/0x10 
[ 3687.809729]  rtnetlink_rcv_msg+0x268/0x370 
[ 3687.814567]  ? sock_has_perm+0x80/0xa0 
[ 3687.818964]  ? rtnl_calcit.isra.43+0x120/0x120 
[ 3687.824156]  netlink_rcv_skb+0x4c/0x130 
[ 3687.828674]  netlink_unicast+0x19a/0x230 
[ 3687.833275]  netlink_sendmsg+0x204/0x3d0 
[ 3687.837859]  sock_sendmsg+0x50/0x60 
[ 3687.841975]  ____sys_sendmsg+0x22a/0x250 
[ 3687.846610]  ? copy_msghdr_from_user+0x5c/0x90 
[ 3687.851775]  ? ____sys_recvmsg+0x91/0x150 
[ 3687.856488]  ___sys_sendmsg+0x7c/0xc0 
[ 3687.860794]  ? copy_msghdr_from_user+0x5c/0x90 
[ 3687.865997]  ? ___sys_recvmsg+0x89/0xc0 
[ 3687.870512]  ? __wake_up_common+0x7a/0x190 
[ 3687.875305]  __sys_sendmsg+0x57/0xa0 
[ 3687.879510]  do_syscall_64+0x5b/0x1b0 
[ 3687.883841]  entry_SYSCALL_64_after_hwframe+0x61/0xc6 
[ 3687.889690] RIP: 0033:0x7f152cc29a27 
[ 3687.893903] Code: Unable to access opcode bytes at RIP 0x7f152cc299fd. 
[ 3687.901465] RSP: 002b:00007ffdb8ad09d0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e 
[ 3687.910165] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f152cc29a27 
[ 3687.918390] RDX: 0000000000000000 RSI: 00007ffdb8ad0a20 RDI: 000000000000000d 
[ 3687.926576] RBP: 00007ffdb8ad0a20 R08: 0000000000000000 R09: 0000000000000000 
[ 3687.934786] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 
[ 3687.942952] R13: 0000000000000000 R14: 00007ffdb8ad0bb8 R15: 00007ffdb8ad0bb4 
[ 3687.951155] INFO: task pool-restraintd:138302 blocked for more than 120 seconds. 
[ 3687.959648]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3687.965601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3687.974593] task:pool-restraintd state:D stack:    0 pid:138302 ppid:     1 flags:0x00000080 
[ 3687.984262] Call Trace: 
[ 3687.987222]  __schedule+0x2d1/0x870 
[ 3687.991383]  schedule+0x55/0xf0 
[ 3687.995126]  schedule_preempt_disabled+0xa/0x10 
[ 3688.000413]  __mutex_lock.isra.7+0x349/0x420 
[ 3688.005440]  __netlink_dump_start+0x54/0x1f0 
[ 3688.010457]  ? rtnl_xdp_prog_skb+0x70/0x70 
[ 3688.015269]  rtnetlink_rcv_msg+0x230/0x370 
[ 3688.020042]  ? rtnl_xdp_prog_skb+0x70/0x70 
[ 3688.024857]  ? rtnl_calcit.isra.43+0x120/0x120 
[ 3688.030033]  netlink_rcv_skb+0x4c/0x130 
[ 3688.034541]  netlink_unicast+0x19a/0x230 
[ 3688.039162]  netlink_sendmsg+0x204/0x3d0 
[ 3688.043765]  sock_sendmsg+0x50/0x60 
[ 3688.047884]  __sys_sendto+0xee/0x160 
[ 3688.052094]  ? syscall_trace_enter+0x1ff/0x2d0 
[ 3688.057260]  ? __sys_socket+0x93/0xe0 
[ 3688.061580]  __x64_sys_sendto+0x24/0x30 
[ 3688.066096]  do_syscall_64+0x5b/0x1b0 
[ 3688.070404]  entry_SYSCALL_64_after_hwframe+0x61/0xc6 
[ 3688.076270] RIP: 0033:0x7f38f2ef388a 
[ 3688.080479] Code: Unable to access opcode bytes at RIP 0x7f38f2ef3860. 
[ 3688.087961] RSP: 002b:00007f38f19a5700 EFLAGS: 00000293 ORIG_RAX: 000000000000002c 
[ 3688.096652] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f38f2ef388a 
[ 3688.104851] RDX: 0000000000000014 RSI: 00007f38f19a6830 RDI: 0000000000000007 
[ 3688.113090] RBP: 0000000000000000 R08: 00007f38f19a67d4 R09: 000000000000000c 
[ 3688.121276] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f38f19a6830 
[ 3688.129487] R13: 0000000000000014 R14: 0000000000000000 R15: 00007f38f19a67d4 
[ 3688.137701] INFO: task kworker/4:0:44833 blocked for more than 120 seconds. 
[ 3688.145697]       Not tainted 4.18.0-494.el8.x86_64 #1 
[ 3688.151674] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3688.160657] task:kworker/4:0     state:D stack:    0 pid:44833 ppid:     2 flags:0x80004080 
[ 3688.170265] Workqueue: ipv6_addrconf addrconf_verify_work 
[ 3688.176531] Call Trace: 
[ 3688.179516]  __schedule+0x2d1/0x870 
[ 3688.183650]  schedule+0x55/0xf0 
[ 3688.187403]  schedule_preempt_disabled+0xa/0x10 
[ 3688.192683]  __mutex_lock.isra.7+0x349/0x420 
[ 3688.197681]  addrconf_verify_work+0xa/0x20 
[ 3688.202527]  process_one_work+0x1d3/0x390 
[ 3688.207248]  ? process_one_work+0x390/0x390 
[ 3688.212150]  worker_thread+0x30/0x390 
[ 3688.216468]  ? process_one_work+0x390/0x390 
[ 3688.221410]  kthread+0x134/0x150 
[ 3688.225243]  ? set_kthread_struct+0x50/0x50 
[ 3688.230122]  ret_from_fork+0x35/0x40 
[ 3702.033492] restraintd[1961]: *** Current Time: Wed May 31 18:28:30 2023  Localwatchdog at: Wed May 31 19:23:29 2023 
[ 3762.033219] restraintd[1961]: *** Current Time: Wed May 31 18:29:30 2023  Localwatchdog at: Wed May 31 19:23:29 2023 

Refer to the RDMA client console log:

https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/79158/7915807/14006698/console.log



Expected results:

In the client log, the following output is expected without ANY TRACEBACKS or DEBUG messages from qperf :

From qperf test with  RHEL-8.8.0-20230228.22:


[ 2808.949731] restraintd[1949]: ** Fetching task: 157018294 [/mnt/tests/gerrit-git.engineering.redhat.com/kernel-tests/infiniband/qperf] 
[ 2809.322187] restraintd[1949]: ** Preparing metadata 
[ 2809.592154] restraintd[1949]: ** Refreshing peer role hostnames: Retries 0 
[ 2809.831057] restraintd[1949]: ** Updating env vars 
[ 2809.831227] restraintd[1949]: ** Updating external watchdog: 5400 seconds 
[ 2810.234395] restraintd[1949]: ** Installing dependencies 
[ 2811.922524] restraintd[1949]: ** Running task: 157018294 [/kernel/infiniband/qperf] 
[ 2861.542149] restraintd[1949]: ** Completed Task : 157018294 
[ 2861.983806] restraintd[1949]: ** Fetching task: 157018295 [/mnt/tests/gerrit-git.engineering.redhat.com/kernel-tests/infiniband/iSER/iser_basic] 


Additional info:

Comment 1 RHEL Program Management 2023-09-21 13:40:04 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 2 RHEL Program Management 2023-09-21 13:43:59 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.


Note You need to log in before you can comment on or make changes to this bug.