
Bug 1786215

Summary: kernel panic with message intel-iommu.c:667 with broadcom and openvswitch 2.11 and 2.12 on rhel7.7
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Hekai Wang <hewang>
Component: openvswitch2.11
Assignee: Timothy Redaelli <tredaelli>
Status: CLOSED WONTFIX
QA Contact: Hekai Wang <hewang>
Severity: high
Docs Contact:
Priority: high
Version: FDP 20.A
CC: anantha.subramanyam, ctrautma, fhallal, jhsiao, kernel-qe, kzhang, network-qe, qding, ralongi, tli, vasundhara-v.volam
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-12-10 17:38:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Hekai Wang 2019-12-24 02:29:05 UTC
Description of problem:

The kernel panics with the error message below when testing OVS-DPDK bonding.

[  921.497770] device ovs-netdev left promiscuous mode 
[  923.249255] vfio-pci 0000:af:00.0: No device request channel registered, blocked until released by user 
[  924.302651] bnxt_en 0000:af:00.0: enabling device (0400 -> 0402) 
[  924.309058] ------------[ cut here ]------------ 
[  924.313667] kernel BUG at drivers/iommu/intel-iommu.c:667! 
[  924.319139] invalid opcode: 0000 [#1] SMP  
[  924.323279] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc tun sch_ingress openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack vfio_pci vfio_iommu_type1 vfio sctp mlx4_ib mlx4_en mlx4_core ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt i40iw rpcrdma sunrpc rdma_ucm ib_iser ib_umad rdma_cm iw_cm ib_ipoib libiscsi ib_cm scsi_transport_iscsi mlx5_ib ib_uverbs dell_smbios iTCO_wdt iTCO_vendor_support dell_wmi_descriptor dcdbas skx_edac intel_powerclamp coretemp bnxt_re ib_core intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ipmi_ssif sg wmi ipmi_si ipmi_devintf ipmi_msghandler mei_me mei i2c_i801 lpc_ich acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea mlx5_core sysfillrect sysimgblt fb_sys_fops i40e tg3 mlxfw ttm ahci crct10dif_pclmul crct10dif_common drm bnxt_en libahci crc32c_intel ptp devlink libata megaraid_sas pps_core drm_panel_orientation_quirks nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod 
[  924.454409] CPU: 1 PID: 287 Comm: kworker/1:1 Kdump: loaded Tainted: G               ------------ T 3.10.0-1062.4.3.el7.x86_64 #1 
[  924.466019] Hardware name: Dell Inc. PowerEdge R740/0JMK61, BIOS 1.6.11 11/20/2018 
[  924.473573] Workqueue: events work_for_cpu_fn 
[  924.477941] task: ffff9d0a1d9ea0e0 ti: ffff9d0a1bc2c000 task.ti: ffff9d0a1bc2c000 
[  924.485402] RIP: 0010:[<ffffffff91000c15>]  [<ffffffff91000c15>] domain_get_iommu+0x55/0x70 
[  924.493754] RSP: 0018:ffff9d0a1bc2fc48  EFLAGS: 00010202 
[  924.499053] RAX: 0000000000000000 RBX: ffff9d021fffb098 RCX: 0000000000000000 
[  924.506171] RDX: 0000000000000000 RSI: ffff9d0a0c42ba80 RDI: ffff9cfeaafd8b00 
[  924.513284] RBP: ffff9d0a1bc2fc48 R08: 000000000001f0a0 R09: ffffffff91003fde 
[  924.520402] R10: ffff9d0a1d41f0a0 R11: fffffd09c1310ac0 R12: 0000000000000000 
[  924.527517] R13: 0000001051113000 R14: ffff9cfeaafd8b00 R15: 0000000000001000 
[  924.534630] FS:  0000000000000000(0000) GS:ffff9d0a1d400000(0000) knlGS:0000000000000000 
[  924.542698] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[  924.548430] CR2: 00007fbcf514322c CR3: 0000000b82610000 CR4: 00000000007607e0 
[  924.555544] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 
[  924.562661] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 
[  924.569776] PKRU: 00000000 
[  924.572483] Call Trace: 
[  924.574931]  [<ffffffff91004b48>] __intel_map_single+0x68/0x160 
[  924.580838]  [<ffffffff90c16298>] ? alloc_pages_current+0x98/0x110 
[  924.586999]  [<ffffffff91004d35>] intel_alloc_coherent+0xb5/0x150 
[  924.593082]  [<ffffffffc060f7c4>] bnxt_init_one+0x214/0x13f0 [bnxt_en] 
[  924.599590]  [<ffffffff90a35c19>] ? sched_clock+0x9/0x10 
[  924.604889]  [<ffffffff90ade545>] ? sched_clock_cpu+0x85/0xc0 
[  924.610623]  [<ffffffff90dd122a>] local_pci_probe+0x4a/0xb0 
[  924.616176]  [<ffffffff90abadfa>] work_for_cpu_fn+0x1a/0x30 
[  924.621738]  [<ffffffff90abe21f>] process_one_work+0x17f/0x440 
[  924.627556]  [<ffffffff90abf488>] worker_thread+0x278/0x3c0 
[  924.633115]  [<ffffffff90abf210>] ? manage_workers.isra.26+0x2a0/0x2a0 
[  924.639626]  [<ffffffff90ac61f1>] kthread+0xd1/0xe0 
[  924.644493]  [<ffffffff90ac6120>] ? insert_kthread_work+0x40/0x40 
[  924.650574]  [<ffffffff9118cd37>] ret_from_fork_nospec_begin+0x21/0x21 
[  924.657080]  [<ffffffff90ac6120>] ? insert_kthread_work+0x40/0x40 
[  924.663158] Code: 10 0f 1f 44 00 00 48 83 c7 04 8b 4f fc 85 c9 75 25 83 c0 01 39 d0 75 ee 31 c0 5d c3 31 d2 48 8b 05 11 ed c5 00 5d 48 8b 04 10 c3 <0f> 0b 66 0f 1f 84 00 00 00 00 00 85 c0 78 de 48 98 48 8d 14 c5  
[  924.683474] RIP  [<ffffffff91000c15>] domain_get_iommu+0x55/0x70 
[  924.689492]  RSP <ffff9d0a1bc2fc48> 
[    0.000000] Initializing cgroup subsys cpuset 
[    0.000000] Initializing cgroup subsys cpu 
[    0.000000] Initializing cgroup subsys cpuacct 
[    0.000000] Linux version 3.10.0-1062.4.3.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Tue Nov 12 10:42:40 EST 2019 
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.10.0-1062.4.3.el7.x86_64 ro kpti spectre_v2=retpoline console=ttyS0,115200n81 LANG=en_US.UTF-8 skew_tick=1 nohz=on nohz_full=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47 rcu_nocbs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47 tuned.non_isolcpus=00005555,55555555 intel_pstate=disable nosoftlockup nohz=on default_hugepagesz=1G intel_iommu=on iommu=pt modprobe.blacklist=qedi modprobe.blacklist=qedf modprobe.blacklist=qedr irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr novmcoredd disable_cpu_apicid=0 elfcorehdr=871812K 
[    0.000000] e820: BIOS-provided physical RAM map: 
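
For triage, the two lines that matter most in a trace like the one above are the "kernel BUG at" location and the final RIP symbol. The following is a minimal sketch, not part of the original report, of a helper that pulls those lines out of console output; the function name panic_summary is hypothetical, and the demo input is three lines copied from the log above.

```shell
#!/bin/sh
# Sketch (not from the report): summarize a kernel BUG from console output on stdin.
# Keeps the "kernel BUG at" line and the final "RIP  [<addr>] symbol" line,
# then strips the leading "[  timestamp]" prefix and any trailing spaces.
panic_summary() {
    grep -E 'kernel BUG at|RIP  \[<' | sed -E 's/^\[ *[0-9.]+\] *//; s/ +$//'
}

# Demo on three lines copied from the log above.
panic_summary <<'EOF'
[  924.309058] ------------[ cut here ]------------
[  924.313667] kernel BUG at drivers/iommu/intel-iommu.c:667!
[  924.683474] RIP  [<ffffffff91000c15>] domain_get_iommu+0x55/0x70
EOF
```

On a saved console log the same function could be used as `panic_summary < console.log`, where console.log is an assumed file name.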


Version-Release number of selected component (if applicable):

Linux version 3.10.0-1062.4.3.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Tue Nov 12 10:42:40 EST 2019 

RHEL-7.7-updates-20191119.2 

ovs version openvswitch2.12-2.12.0-12.el7fdp.x86_64.rpm 

dpdk version dpdk-18.11.2-1.el7.x86_64.rpm


How reproducible:
Always

Steps to Reproduce:
Install dpdk and openvswitch.

Select a NIC whose kernel driver is bnxt_en and bind it to the vfio-pci driver.

Create the Open vSwitch bridge ovsbr0:

modprobe openvswitch
systemctl stop openvswitch
sleep 3
systemctl start openvswitch
sleep 3

ovs-vsctl --if-exists del-br ovsbr0
sleep 5

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,4096"
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="$pmd_cpu_mask"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
systemctl restart openvswitch
sleep 3
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev

ovs-vsctl add-bond ovsbr0 dpdkbond dpdk0 dpdk1 "bond_mode=${bond_mode}" \
-- set Interface dpdk0 type=dpdk options:dpdk-devargs=class=eth,mac=${nic1_mac} mtu_request=${mtu_val} \
-- set Interface dpdk1 type=dpdk options:dpdk-devargs=class=eth,mac=${nic2_mac} mtu_request=${mtu_val}

ovs-vsctl set Port dpdkbond vlan_mode=trunk
ovs-vsctl list Port dpdkbond

ovs-vsctl set Port dpdkbond bond_updelay=5
ovs-vsctl set Port dpdkbond bond_downdelay=5

ovs-vsctl list Port dpdkbond

ovs-vsctl add-port ovsbr0 vhost0 -- set interface vhost0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhost0

ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 actions=NORMAL

sleep 2
ovs-vsctl show
sleep 5
echo "after bonding nic, check the bond status"
ovs-appctl bond/show
sleep 30
ovs-appctl bond/show
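
The script above references pmd_cpu_mask, bond_mode, nic1_mac, nic2_mac and mtu_val without defining them, and the report does not state the values that were used. The following is a sketch of how they might be set before running the script. Every value is a placeholder chosen for illustration, and the helper cpus_to_mask is hypothetical; it only shows how a hex mask for pmd-cpu-mask can be derived from a list of CPU ids.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from the report.

# Build a hex CPU mask from a list of CPU ids (ids must be below 63).
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%x\n' "$mask"
}

pmd_cpu_mask=$(cpus_to_mask 2 4)   # assumed PMD CPUs 2 and 4, giving 0x14
bond_mode=active-backup            # assumed; the report does not name the bond mode
nic1_mac=00:00:00:00:00:01         # placeholder MAC for the first bnxt_en port
nic2_mac=00:00:00:00:00:02         # placeholder MAC for the second bnxt_en port
mtu_val=1500                       # assumed MTU

echo "pmd_cpu_mask=$pmd_cpu_mask"  # prints pmd_cpu_mask=0x14
```

The PMD CPUs would normally be picked from the isolated cores on the NUMA node of the NIC; the ids 2 and 4 here are arbitrary.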


Define the VM guest with the XML below:

<domain type='kvm'>
  <name>guest30032</name>
  <uuid>37425e76-af6a-44a6-aba0-73434afe34c0</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel7.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pmu state='off'/>
    <vmport state='off'/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-2' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/root/rhel.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='none'/>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:11:8f:ea'/>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:bb:63:7b'/>
      <source bridge='virbr0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </memballoon>
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'/>
</domain>


Start guest30032.

Then clean up the environment:

virsh list --all --name | xargs -I {} virsh destroy {}
virsh list --all --name | xargs -I {} virsh undefine {}
systemctl start openvswitch
ovs-vsctl --if-exists del-br ovsbr0
systemctl stop openvswitch

bus_list=$(dpdk-devbind -s | grep -E 'drv=vfio-pci|drv=igb' | awk '{print $1}')
for i in $bus_list
do
    kernel_driver=$(lspci -s "$i" -v | grep Kernel | grep modules | awk '{print $NF}')
    dpdk-devbind -b "$kernel_driver" "$i"
done
dpdk-devbind -s

At this point, the kernel panics.


Summary
With the Broadcom driver (bnxt_en), the results are as follows:

OVS 2.9 on RHEL 7.7: works fine

OVS 2.11 on RHEL 7.7: kernel panic

OVS 2.12 on RHEL 7.7: kernel panic

OVS 2.11 on RHEL 8.1: works fine

OVS 2.12 on RHEL 8.1: works fine

qede shows similar behaviour to the Broadcom driver.

ixgbe, i40e, mlx5_core, and nfp all work fine.

Failed job links:

ovs2.12 on rhel7.7
https://beaker.engineering.redhat.com/jobs/3981494

ovs2.11 on rhel7.7
https://beaker.engineering.redhat.com/jobs/3981491


Actual results:


Expected results:


Additional info:

Comment 8 Mike Pattrick 2022-11-15 20:07:43 UTC
*** Bug 1932841 has been marked as a duplicate of this bug. ***