Bug 2150488

Summary: dpdk-testpmd startup failed and hangs in guest
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Hekai Wang <hewang>
Component: openvswitch
Sub Component: ovs-dpdk
Assignee: Timothy Redaelli <tredaelli>
QA Contact: qding
Status: NEW
Severity: unspecified
Priority: unspecified
CC: ctrautma, jhsiao, ktraynor
Version: FDP 22.K
Keywords: Regression
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Hekai Wang 2022-12-03 11:50:07 UTC
Description of problem:
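dpdk-testpmd fails to start in the guest: it hangs while configuring port 0, right after logging "EAL: Error disabling MSI-X interrupts for fd 48", and stays hung until the test harness times out about an hour later.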

Version-Release number of selected component (if applicable):
openvswitch2.13-2.13.0-208.el8fdp.x86_64.rpm
openvswitch2.16-2.16.0-106.el8fdp.x86_64.rpm
openvswitch2.15-2.15.0-128.el8fdp.x86_64.rpm

How reproducible:
Almost 100%

Steps to Reproduce:

Guest XML:
<domain type='kvm'>
  <name>guest30032</name>
  <uuid>37425e76-af6a-44a6-aba0-73434afe34c0</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>9</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='9'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='5'/>
    <vcpupin vcpu='4' cpuset='7'/>
    <vcpupin vcpu='5' cpuset='25'/>
    <vcpupin vcpu='6' cpuset='27'/>
    <vcpupin vcpu='7' cpuset='29'/>
    <vcpupin vcpu='8' cpuset='31'/>
    <emulatorpin cpuset='9'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.4.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pmu state='off'/>
    <vmport state='off'/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-8' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/root/rhel.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='none'/>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:11:8f:e8'/>
      <source type='unix' path='/tmp/vhost0' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' queues='4' rx_queue_size='1024' tx_queue_size='1024'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:bb:63:7b'/>
      <source bridge='virbr0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </memballoon>
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'/>
</domain>

Guest config 
:: [ 09:21:55 ] :: [  BEGIN   ] :: Running 'guest_start_testpmd 1500 4'
spawn virsh console guest30032
Connected to domain 'guest30032'
Escape character is ^] (Ctrl + ])


Red Hat Enterprise Linux 8.4 (Ootpa)
Kernel 4.18.0-305.33.1.el8_4.x86_64 on an x86_64

Activate the web console with: systemctl enable --now cockpit.socket

localhost login: root

Password: 
[root@localhost ~]# grubby --remove-args=skew_tick --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --remove-args=nohz --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --remove-args=nohz_full --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --remove-args=rcu_nocbs --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --remove-args=tuned.non_isolcpus --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --remove-args=intel_pstate --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --remove-args=nosoftlockup --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --args=default_hugepagesz=1G hugepagesz=1G intel_iommu=on iommu=pt --update-kernel `grubby --default-kernel`
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --info=ALL
index=0
kernel="/boot/vmlinuz-4.18.0-305.33.1.el8_4.x86_64"
args="ro rhgb quiet crashkernel=auto resume=UUID=e897ba23-01db-4954-8508-fd466c6dcb39 console=ttyS0,115200 hugepagesz=1G intel_iommu=on iommu=pt default_hugepagesz=1G"
root="UUID=f3cc273a-369a-4413-8bff-cc5a43697b1a"
initrd="/boot/initramfs-4.18.0-305.33.1.el8_4.x86_64.img $tuned_initrd"
title="Red Hat Enterprise Linux (4.18.0-305.33.1.el8_4.x86_64) 8.4 (Ootpa)"
id="f0206a69c89144c680f1efb78c4dfe8c-4.18.0-305.33.1.el8_4.x86_64"
index=1
kernel="/boot/vmlinuz-0-rescue-f0206a69c89144c680f1efb78c4dfe8c"
args="ro rhgb quiet crashkernel=auto resume=UUID=e897ba23-01db-4954-8508-fd466c6dcb39 console=ttyS0,115200"
root="UUID=f3cc273a-369a-4413-8bff-cc5a43697b1a"
initrd="/boot/initramfs-0-rescue-f0206a69c89144c680f1efb78c4dfe8c.img"
title="Red Hat Enterprise Linux (0-rescue-f0206a69c89144c680f1efb78c4dfe8c) 8.4 (Ootpa)"
id="f0206a69c89144c680f1efb78c4dfe8c-0-rescue"
[root@localhost ~]# echo $?
0
[root@localhost ~]# echo isolated_cores=1,2,3,4,5,6,7,8 > /etc/tuned/cpu-partitioning-variables.conf
[root@localhost ~]# echo $?
0
[root@localhost ~]# tuned-adm profile cpu-partitioning
CONSOLE  tuned.plugins.plugin_systemd: you may need to manualy run 'dracut -f' to update the systemd configuration in initrd image
[root@localhost ~]# echo $?
0
[root@localhost ~]# grubby --info=ALL
index=0
kernel="/boot/vmlinuz-4.18.0-305.33.1.el8_4.x86_64"
args="ro rhgb quiet crashkernel=auto resume=UUID=e897ba23-01db-4954-8508-fd466c6dcb39 console=ttyS0,115200 hugepagesz=1G intel_iommu=on iommu=pt default_hugepagesz=1G skew_tick=1 nohz=on nohz_full=1,2,3,4,5,6,7,8 rcu_nocbs=1,2,3,4,5,6,7,8 tuned.non_isolcpus=00000001 intel_pstate=disable nosoftlockup"
root="UUID=f3cc273a-369a-4413-8bff-cc5a43697b1a"
initrd="/boot/initramfs-4.18.0-305.33.1.el8_4.x86_64.img $tuned_initrd"
title="Red Hat Enterprise Linux (4.18.0-305.33.1.el8_4.x86_64) 8.4 (Ootpa)"
id="f0206a69c89144c680f1efb78c4dfe8c-4.18.0-305.33.1.el8_4.x86_64"
index=1
kernel="/boot/vmlinuz-0-rescue-f0206a69c89144c680f1efb78c4dfe8c"
args="ro rhgb quiet crashkernel=auto resume=UUID=e897ba23-01db-4954-8508-fd466c6dcb39 console=ttyS0,115200"
root="UUID=f3cc273a-369a-4413-8bff-cc5a43697b1a"
initrd="/boot/initramfs-0-rescue-f0206a69c89144c680f1efb78c4dfe8c.img"
title="Red Hat Enterprise Linux (0-rescue-f0206a69c89144c680f1efb78c4dfe8c) 8.4 (Ootpa)"
id="f0206a69c89144c680f1efb78c4dfe8c-0-rescue"
[root@localhost ~]# echo $?
0
[root@localhost ~]# logout

[root@localhost ~]# modprobe -r vfio_iommu_type1
[root@localhost ~]# echo $?
0
[root@localhost ~]# modprobe -r vfio
[root@localhost ~]# echo $?
0
[root@localhost ~]# modprobe vfio 
[root@localhost ~]# echo $?
0
[root@localhost ~]# modprobe vfio-pci
[root@localhost ~]# echo $?
0
[root@localhost ~]# ip link set eth1 down
[root@localhost ~]# echo $?
0
[root@localhost ~]# /usr/bin/dpdk-devbind.py -b vfio-pci 0000:03:00.0
[root@localhost ~]# echo $?
0
[root@localhost ~]# /usr/bin/dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:03:00.0 'Virtio network device 1041' drv=vfio-pci unused=

Network devices using kernel driver
===================================
0000:02:00.0 'Virtio network device 1041' if=eth0 drv=virtio-pci unused=vfio-pci *Active*

No 'Baseband' devices detected
==============================

No 'Crypto' devices detected
============================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================

No 'Regex' devices detected
===========================
[root@localhost ~]# echo $?
0
[root@localhost ~]# logout


Red Hat Enterprise Linux 8.4 (Ootpa)
Kernel 4.18.0-305.33.1.el8_4.x86_64 on an x86_64

Activate the web console with: systemctl enable --now cockpit.socket

localhost login: 
:: [ 09:22:33 ] :: [   LOG    ] :: dpdk-testpmd -l 0,1,2,3,4,5,6,7,8     --socket-mem 1024  --legacy-mem  -n 4 -- --forward-mode=macswap --port-topology=chained      --disable-rss -i --rxq=4 --txq=4 --rxd=1024 --txd=1024 --nb-cores=8 --max-pkt-len=1500 --auto-start
spawn virsh console guest30032
Connected to domain 'guest30032'
Escape character is ^] (Ctrl + ])


Red Hat Enterprise Linux 8.4 (Ootpa)
Kernel 4.18.0-305.33.1.el8_4.x86_64 on an x86_64

Activate the web console with: systemctl enable --now cockpit.socket

localhost login: root

Password: 
Last login: Thu Dec  1 09:22:25 on ttyS0
[root@localhost ~]# dpdk-testpmd -l 0,1,2,3,4,5,6,7,8     --socket-mem 1024  --legacy-mem  -n 4 -- --forward-mode=macswap --port-topology=chained      --disable-rss -i --rxq=4 --txq=4 --rxd=1024 --txd=1024 --nb-cores=8 --max-pkt-len=1500 --auto-start
EAL: Detected 9 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   Invalid NUMA socket, default to 0
EAL: Probe PCI driver: net_virtio (1af4:1041) device: 0000:02:00.0 (socket 0)
EAL:   Invalid NUMA socket, default to 0
EAL: Probe PCI driver: net_virtio (1af4:1041) device: 0000:03:00.0 (socket 0)
EAL:   using IOMMU type 1 (Type 1)
EAL: No legacy callbacks, legacy socket not created
Set macswap packet forwarding mode
Interactive-mode selected
Auto-start selected
testpmd: create a new mbuf pool <mb_pool_0>: n=211456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
EAL: Error disabling MSI-X interrupts for fd 48
Setting up watches.
Watches established.
Setting up watches.
Watches established.
FAIL(1):TIMEOUT to run_cmd "dpdk-testpmd -l 0,1,2,3,4,5,6,7,8     --socket-mem 1024  --legacy-mem  -n 4 -- --forward-mode=macswap --port-topology=chained      --disable-rss -i --rxq=4 --txq=4 --rxd=1024 --txd=1024 --nb-cores=8 --max-pkt-len=1500 --auto-start"
:: [ 10:22:34 ] :: [   FAIL   ] :: Command 'guest_start_testpmd 1500 4' (Expected 0, got 1)
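
The harness only reported the hang after its one-hour timeout (09:22:33 to 10:22:34). A minimal sketch of a faster check, assuming the guest console output is also captured to a file: poll that file for a line that a healthy start is expected to print after "Configuring Port 0". The helper name `wait_for_pattern`, the pattern `^Port 0: `, and the 120 second limit are assumptions for illustration and are not part of the existing harness.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the test harness.
# Poll FILE once per second until a line matching PATTERN appears
# or LIMIT seconds have elapsed.
# Returns 0 if the pattern was seen, 1 if the limit was reached first.
wait_for_pattern() {
    local file="$1" pattern="$2" limit="$3" waited=0
    while :; do
        # The file may not exist yet, so silence grep's error output.
        if grep -q -- "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        if [ "$waited" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

For example, `wait_for_pattern /tmp/testpmd.log '^Port 0: ' 120 || echo "testpmd did not finish port setup"` would flag this failure after two minutes instead of an hour. The log path here is a placeholder.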


Actual results:
dpdk-testpmd fails to start and hangs in the guest.
Here are all the jobs I have tried:

https://beaker.engineering.redhat.com/jobs/7294840
https://beaker.engineering.redhat.com/jobs/7294842
https://beaker.engineering.redhat.com/jobs/7294845
https://beaker.engineering.redhat.com/jobs/7294868
https://beaker.engineering.redhat.com/jobs/7300769
https://beaker.engineering.redhat.com/jobs/7300770
https://beaker.engineering.redhat.com/jobs/7300772

All 22.K versions work fine.
Here are some jobs that work fine:
#ixgbe https://beaker.engineering.redhat.com/jobs/7206890
#i40e https://beaker.engineering.redhat.com/jobs/7206892
#mlx5_core https://beaker.engineering.redhat.com/jobs/7206893

#i40e https://beaker.engineering.redhat.com/jobs/7206913
#ice https://beaker.engineering.redhat.com/jobs/7206915
#mlx5_core https://beaker.engineering.redhat.com/jobs/7206916
#mlx5_cx6 https://beaker.engineering.redhat.com/jobs/7206917


Expected results:
dpdk-testpmd starts in the guest without hanging.

Additional info:
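To rule out two basic setup problems before launching dpdk-testpmd in the guest, the following pre-flight sketch reads sysfs directly: it checks that at least one free 1G hugepage exists (the command requests `--socket-mem 1024`) and that 0000:03:00.0 is bound to vfio-pci. The function names and the `SYSFS_ROOT` override are hypothetical and only here for illustration. This is a sanity check under those assumptions, not a diagnosis of the hang.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of the test harness or DPDK.
# SYSFS_ROOT can be overridden so the functions can be tried without hardware.

# Print the number of free hugepages of the given size (in kB).
# Prints 0 if the kernel exposes no pool of that size.
free_hugepages() {
    local size_kb="$1" root="${SYSFS_ROOT:-/sys}"
    local f="$root/kernel/mm/hugepages/hugepages-${size_kb}kB/free_hugepages"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo 0
    fi
}

# Print the name of the driver bound to a PCI device, or "none".
pci_driver() {
    local bdf="$1" root="${SYSFS_ROOT:-/sys}"
    local link="$root/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Return 0 only if a free 1G page exists and the device is bound to vfio-pci.
preflight() {
    local bdf="$1"
    if [ "$(free_hugepages 1048576)" -lt 1 ]; then
        echo "no free 1G hugepages" >&2
        return 1
    fi
    if [ "$(pci_driver "$bdf")" != "vfio-pci" ]; then
        echo "$bdf is not bound to vfio-pci" >&2
        return 1
    fi
}
```

Running `preflight 0000:03:00.0` in the guest right before the dpdk-testpmd command would show whether either precondition is missing.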