Bug 1932841

Summary: qede NIC: system panics in qed_ilt_blk_alloc
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: liting <tli>
Component: openvswitch2.11
Assignee: Open vSwitch development team <ovs-team>
Status: CLOSED DUPLICATE
QA Contact: liting <tli>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: FDP 21.B
CC: ctrautma, jhsiao, mpattric, ralongi
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-15 20:07:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description liting 2021-02-25 12:04:11 UTC
Description of problem:


Version-Release number of selected component (if applicable):
[root@dell-per730-52 ~]# rpm -qa|grep openvs
openvswitch-selinux-extra-policy-1.0-18.el7fdp.noarch
openvswitch2.11-2.11.3-86.el7fdp.x86_64
[root@dell-per730-52 ~]# rpm -qa|grep dpdk
dpdk-18.11.5-1.el7_8.x86_64
dpdk-tools-18.11.5-1.el7_8.x86_64
[root@dell-per730-52 ~]# uname -a
Linux dell-per730-52.rhts.eng.pek2.redhat.com 3.10.0-1160.21.1.el7.x86_64 #1 SMP Mon Feb 22 18:03:13 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@dell-per730-52 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.9 (Maipo)

[root@dell-per730-52 crash]# ethtool -i p4p1
driver: qede
version: 8.37.0.20
firmware-version: mfw 8.40.24.0 storm 8.37.7.0
expansion-rom-version: 
bus-info: 0000:82:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: yes

[root@dell-per730-52 crash]# lspci -s 0000:82:00.0
82:00.0 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)

How reproducible:


Steps to Reproduce:
1. Run the OVS-DPDK PVP Ansible performance test case on a qede NIC.


Actual results:
The system panics with the following call trace:
[ 802.227216] ------------[ cut here ]------------
[  802.232356] kernel BUG at drivers/iommu/intel-iommu.c:667!
[  802.238476] invalid opcode: 0000 [#1] SMP 
[  802.243061] Modules linked in: vhost_net vhost macvtap macvlan vfio_pci vfio_iommu_type1 vfio openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad i40iw rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi iTCO_wdt iTCO_vendor_support mxm_wmi dcdbas sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul qedr ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
[  802.322863]  ablk_helper cryptd bnxt_re pcspkr ib_core lpc_ich ipmi_ssif mei_me mei sg ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic qede crct10dif_pclmul crct10dif_common mgag200 qed i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops ttm crc32c_intel i40e drm crc8 tg3 libahci libata drm_panel_orientation_quirks ptp pps_core bnxt_en nfp megaraid_sas devlink dm_mirror dm_region_hash dm_log dm_mod
[  802.371944] CPU: 1 PID: 10199 Comm: driverctl Kdump: loaded Not tainted 3.10.0-1160.21.1.el7.x86_64 #1
[  802.382329] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.5.5 08/16/2017
[  802.390678] task: ffff8c4575bc5280 ti: ffff8c4574ff8000 task.ti: ffff8c4574ff8000
[  802.399025] RIP: 0010:[<ffffffff9de07f15>]  [<ffffffff9de07f15>] domain_get_iommu+0x55/0x70
[  802.408352] RSP: 0018:ffff8c4574ffb958  EFLAGS: 00010202
[  802.414277] RAX: 0000000000000000 RBX: ffff8c457eb5a098 RCX: 0000000000000000
[  802.422237] RDX: 0000000000000000 RSI: ffff8c3dc8b98ac0 RDI: ffff8c45553d0b00
[  802.430197] RBP: ffff8c4574ffb958 R08: 000000000001f0a0 R09: ffffffff9de0b2de
[  802.438157] R10: ffff8c4cdee1f0a0 R11: ffffefd20522e600 R12: 0000000000000000
[  802.446117] R13: 00000008c5780000 R14: ffff8c45553d0b00 R15: 0000000000010000
[  802.454078] FS:  00007fe400a0e740(0000) GS:ffff8c4cdee00000(0000) knlGS:0000000000000000
[  802.463107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  802.469516] CR2: 00007f570e71b000 CR3: 00000008f4bc4000 CR4: 00000000003607e0
[  802.477476] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  802.485436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  802.493396] Call Trace:
[  802.496123]  [<ffffffff9de0be48>] __intel_map_single+0x68/0x160
[  802.502730]  [<ffffffff9da18b58>] ? alloc_pages_current+0x98/0x110
[  802.509625]  [<ffffffff9de0c035>] intel_alloc_coherent+0xb5/0x150
[  802.516435]  [<ffffffffc061aa64>] qed_ilt_blk_alloc+0x114/0x210 [qed]
[  802.523622]  [<ffffffff9da24526>] ? kmalloc_order_trace+0x26/0xa0
[  802.530426]  [<ffffffffc061bee9>] qed_cxt_tables_alloc+0xd9/0x5c0 [qed]
[  802.537811]  [<ffffffffc0622415>] qed_resc_alloc+0x2d5/0x710 [qed]
[  802.544704]  [<ffffffffc0631f3b>] qed_slowpath_start+0x2eb/0xb40 [qed]
[  802.551992]  [<ffffffffc06bf572>] __qede_probe+0x142/0x8f0 [qede]
[  802.558793]  [<ffffffff9dad8618>] ? kernfs_next_descendant_post+0x48/0x60
[  802.566369]  [<ffffffffc06bfd5f>] qede_probe+0x3f/0xb0 [qede]
[  802.572781]  [<ffffffff9dbd6a8a>] local_pci_probe+0x4a/0xb0
[  802.578997]  [<ffffffff9dbd81d9>] pci_device_probe+0x109/0x160
[  802.585509]  [<ffffffff9dcbb6a5>] driver_probe_device+0xc5/0x3e0
[  802.592210]  [<ffffffff9dcbb9c0>] ? driver_probe_device+0x3e0/0x3e0
[  802.599201]  [<ffffffff9dcbba03>] __device_attach+0x43/0x50
[  802.605418]  [<ffffffff9dcb9325>] bus_for_each_drv+0x75/0xc0
[  802.611731]  [<ffffffff9dcbb4e0>] device_attach+0x90/0xb0
[  802.617753]  [<ffffffff9dcb9679>] bus_rescan_devices_helper+0x39/0x60
[  802.624939]  [<ffffffff9dcb9a82>] store_drivers_probe+0x32/0x70
[  802.631543]  [<ffffffff9dcb8fa9>] bus_attr_store+0x29/0x30
[  802.637663]  [<ffffffff9dadb3c2>] sysfs_kf_write+0x42/0x50
[  802.643783]  [<ffffffff9dada9ab>] kernfs_fop_write+0xeb/0x160
[  802.650195]  [<ffffffff9da4dcd0>] vfs_write+0xc0/0x1f0
[  802.655928]  [<ffffffff9da4eaaf>] SyS_write+0x7f/0xf0
[  802.661564]  [<ffffffff9df96226>] tracesys+0xa6/0xcc
[  802.667102] Code: 10 0f 1f 44 00 00 48 83 c7 04 8b 4f fc 85 c9 75 25 83 c0 01 39 d0 75 ee 31 c0 5d c3 31 d2 48 8b 05 99 b2 c5 00 5d 48 8b 04 10 c3 <0f> 0b 66 0f 1f 84 00 00 00 00 00 85 c0 78 de 48 98 48 8d 14 c5 
[  802.688748] RIP  [<ffffffff9de07f15>] domain_get_iommu+0x55/0x70
[  802.695460]  RSP <ffff8c4574ffb958>
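
For triage, the trace above can be reduced to an ordered list of symbols and owning modules. The following is a minimal sketch, not part of the original report: the helper names `parse_frames` and `is_ud2_at_rip` are hypothetical, and the sample text is a shortened excerpt of the log above (the `Code:` line is truncated for brevity). The sketch assumes the RHEL 7 style frame format `[<address>] symbol+0xoff/0xsize [module]`. On x86, `BUG()` executes the `ud2` instruction (bytes `0f 0b`), which is consistent with the "invalid opcode" line and the `<0f> 0b` marker in the `Code:` dump.

```python
import re

# Matches kernel stack-frame lines such as:
#   [  802.516435]  [<ffffffffc061aa64>] qed_ilt_blk_alloc+0x114/0x210 [qed]
# A leading "?" marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack frame found in an oops log (hypothetical helper)."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue
        frames.append({
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            # Frames without a trailing [module] belong to the core kernel.
            "module": m.group("module") or "vmlinux",
            "reliable": m.group("unreliable") is None,
        })
    return frames


def is_ud2_at_rip(code_line):
    """True if the byte marked <..> in a 'Code:' line starts a ud2 (0f 0b)."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return [tok.strip("<>")] + tokens[i + 1:i + 2] == ["0f", "0b"]
    return False


# Shortened excerpt of the log above; the Code: line is truncated.
SAMPLE = """\
[  802.399025] RIP: 0010:[<ffffffff9de07f15>]  [<ffffffff9de07f15>] domain_get_iommu+0x55/0x70
[  802.496123]  [<ffffffff9de0be48>] __intel_map_single+0x68/0x160
[  802.502730]  [<ffffffff9da18b58>] ? alloc_pages_current+0x98/0x110
[  802.516435]  [<ffffffffc061aa64>] qed_ilt_blk_alloc+0x114/0x210 [qed]
[  802.667102] Code: 5d 48 8b 04 10 c3 <0f> 0b 66 0f 1f 84
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        flag = "" if frame["reliable"] else " (unreliable)"
        print(f'{frame["module"]}: {frame["symbol"]}+{frame["offset"]:#x}{flag}')
    print("ud2 at RIP:", is_ud2_at_rip(SAMPLE.splitlines()[-1]))
```

Read this way, the full trace shows driverctl writing to the bus `drivers_probe` attribute, which probes qede; the qed resource allocation then calls `intel_alloc_coherent`, and the BUG fires in `domain_get_iommu`.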


Expected results:
The system does not panic.

Additional info:
The panic occurred twice. The vmcore and vmcore-dmesg.txt files for both occurrences are:
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-03:28:55/vmcore-dmesg.txt
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-03:28:55/vmcore
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-04:09:22/vmcore-dmesg.txt
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-04:09:22/vmcore

Comment 1 Mike Pattrick 2022-11-15 20:07:43 UTC
Closing as duplicate of bz1786215

*** This bug has been marked as a duplicate of bug 1786215 ***