Bug 1932841 - qede NIC: system panics in qed_ilt_blk_alloc
Summary: qede NIC: system panics in qed_ilt_blk_alloc
Keywords:
Status: CLOSED DUPLICATE of bug 1786215
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.11
Version: FDP 21.B
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Open vSwitch development team
QA Contact: liting
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-02-25 12:04 UTC by liting
Modified: 2022-11-15 20:10 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-15 20:07:43 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Issue Tracker
ID: FD-1111
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2022-11-15 20:10:58 UTC

Description liting 2021-02-25 12:04:11 UTC
Description of problem:


Version-Release number of selected component (if applicable):
[root@dell-per730-52 ~]# rpm -qa|grep openvs
openvswitch-selinux-extra-policy-1.0-18.el7fdp.noarch
openvswitch2.11-2.11.3-86.el7fdp.x86_64
[root@dell-per730-52 ~]# rpm -qa|grep dpdk
dpdk-18.11.5-1.el7_8.x86_64
dpdk-tools-18.11.5-1.el7_8.x86_64
[root@dell-per730-52 ~]# uname -a
Linux dell-per730-52.rhts.eng.pek2.redhat.com 3.10.0-1160.21.1.el7.x86_64 #1 SMP Mon Feb 22 18:03:13 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@dell-per730-52 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.9 (Maipo)

[root@dell-per730-52 crash]# ethtool -i p4p1
driver: qede
version: 8.37.0.20
firmware-version: mfw 8.40.24.0 storm 8.37.7.0
expansion-rom-version: 
bus-info: 0000:82:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: yes

[root@dell-per730-52 crash]# lspci -s 0000:82:00.0
82:00.0 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)

How reproducible:


Steps to Reproduce:
1. Run the OVS-DPDK PVP Ansible performance test case on a qede NIC.


Actual results:
The system panics. The call trace follows:
[ 802.227216] ------------[ cut here ]------------
[  802.232356] kernel BUG at drivers/iommu/intel-iommu.c:667!
[  802.238476] invalid opcode: 0000 [#1] SMP 
[  802.243061] Modules linked in: vhost_net vhost macvtap macvlan vfio_pci vfio_iommu_type1 vfio openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad i40iw rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi iTCO_wdt iTCO_vendor_support mxm_wmi dcdbas sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul qedr ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
[  802.322863]  ablk_helper cryptd bnxt_re pcspkr ib_core lpc_ich ipmi_ssif mei_me mei sg ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic qede crct10dif_pclmul crct10dif_common mgag200 qed i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops ttm crc32c_intel i40e drm crc8 tg3 libahci libata drm_panel_orientation_quirks ptp pps_core bnxt_en nfp megaraid_sas devlink dm_mirror dm_region_hash dm_log dm_mod
[  802.371944] CPU: 1 PID: 10199 Comm: driverctl Kdump: loaded Not tainted 3.10.0-1160.21.1.el7.x86_64 #1
[  802.382329] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.5.5 08/16/2017
[  802.390678] task: ffff8c4575bc5280 ti: ffff8c4574ff8000 task.ti: ffff8c4574ff8000
[  802.399025] RIP: 0010:[<ffffffff9de07f15>]  [<ffffffff9de07f15>] domain_get_iommu+0x55/0x70
[  802.408352] RSP: 0018:ffff8c4574ffb958  EFLAGS: 00010202
[  802.414277] RAX: 0000000000000000 RBX: ffff8c457eb5a098 RCX: 0000000000000000
[  802.422237] RDX: 0000000000000000 RSI: ffff8c3dc8b98ac0 RDI: ffff8c45553d0b00
[  802.430197] RBP: ffff8c4574ffb958 R08: 000000000001f0a0 R09: ffffffff9de0b2de
[  802.438157] R10: ffff8c4cdee1f0a0 R11: ffffefd20522e600 R12: 0000000000000000
[  802.446117] R13: 00000008c5780000 R14: ffff8c45553d0b00 R15: 0000000000010000
[  802.454078] FS:  00007fe400a0e740(0000) GS:ffff8c4cdee00000(0000) knlGS:0000000000000000
[  802.463107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  802.469516] CR2: 00007f570e71b000 CR3: 00000008f4bc4000 CR4: 00000000003607e0
[  802.477476] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  802.485436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  802.493396] Call Trace:
[  802.496123]  [<ffffffff9de0be48>] __intel_map_single+0x68/0x160
[  802.502730]  [<ffffffff9da18b58>] ? alloc_pages_current+0x98/0x110
[  802.509625]  [<ffffffff9de0c035>] intel_alloc_coherent+0xb5/0x150
[  802.516435]  [<ffffffffc061aa64>] qed_ilt_blk_alloc+0x114/0x210 [qed]
[  802.523622]  [<ffffffff9da24526>] ? kmalloc_order_trace+0x26/0xa0
[  802.530426]  [<ffffffffc061bee9>] qed_cxt_tables_alloc+0xd9/0x5c0 [qed]
[  802.537811]  [<ffffffffc0622415>] qed_resc_alloc+0x2d5/0x710 [qed]
[  802.544704]  [<ffffffffc0631f3b>] qed_slowpath_start+0x2eb/0xb40 [qed]
[  802.551992]  [<ffffffffc06bf572>] __qede_probe+0x142/0x8f0 [qede]
[  802.558793]  [<ffffffff9dad8618>] ? kernfs_next_descendant_post+0x48/0x60
[  802.566369]  [<ffffffffc06bfd5f>] qede_probe+0x3f/0xb0 [qede]
[  802.572781]  [<ffffffff9dbd6a8a>] local_pci_probe+0x4a/0xb0
[  802.578997]  [<ffffffff9dbd81d9>] pci_device_probe+0x109/0x160
[  802.585509]  [<ffffffff9dcbb6a5>] driver_probe_device+0xc5/0x3e0
[  802.592210]  [<ffffffff9dcbb9c0>] ? driver_probe_device+0x3e0/0x3e0
[  802.599201]  [<ffffffff9dcbba03>] __device_attach+0x43/0x50
[  802.605418]  [<ffffffff9dcb9325>] bus_for_each_drv+0x75/0xc0
[  802.611731]  [<ffffffff9dcbb4e0>] device_attach+0x90/0xb0
[  802.617753]  [<ffffffff9dcb9679>] bus_rescan_devices_helper+0x39/0x60
[  802.624939]  [<ffffffff9dcb9a82>] store_drivers_probe+0x32/0x70
[  802.631543]  [<ffffffff9dcb8fa9>] bus_attr_store+0x29/0x30
[  802.637663]  [<ffffffff9dadb3c2>] sysfs_kf_write+0x42/0x50
[  802.643783]  [<ffffffff9dada9ab>] kernfs_fop_write+0xeb/0x160
[  802.650195]  [<ffffffff9da4dcd0>] vfs_write+0xc0/0x1f0
[  802.655928]  [<ffffffff9da4eaaf>] SyS_write+0x7f/0xf0
[  802.661564]  [<ffffffff9df96226>] tracesys+0xa6/0xcc
[  802.667102] Code: 10 0f 1f 44 00 00 48 83 c7 04 8b 4f fc 85 c9 75 25 83 c0 01 39 d0 75 ee 31 c0 5d c3 31 d2 48 8b 05 99 b2 c5 00 5d 48 8b 04 10 c3 <0f> 0b 66 0f 1f 84 00 00 00 00 00 85 c0 78 de 48 98 48 8d 14 c5 
[  802.688748] RIP  [<ffffffff9de07f15>] domain_get_iommu+0x55/0x70
[  802.695460]  RSP <ffff8c4574ffb958>


Expected results:
The system does not panic.

Additional info:
The panic occurred twice. The vmcore and vmcore-dmesg.txt log for each occurrence follow:
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-03:28:55/vmcore-dmesg.txt
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-03:28:55/vmcore
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-04:09:22/vmcore-dmesg.txt
http://netqe-bj.usersys.redhat.com/share/tli/vm_core/127.0.0.1-2021-02-25-04:09:22/vmcore

Comment 1 Mike Pattrick 2022-11-15 20:07:43 UTC
Closing as duplicate of bz1786215

*** This bug has been marked as a duplicate of bug 1786215 ***

