Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2317182

Summary:          SRIOV VF performance degradation with testpmd in a RHEL 9.2 VM
Product:          Red Hat OpenStack
Component:        tripleo-ansible
Version:          17.1 (Wallaby)
Target Milestone: z4
Target Release:   17.1
Hardware:         Unspecified
OS:               Unspecified
Status:           CLOSED ERRATA
Severity:         urgent
Priority:         urgent
Keywords:         Regression, Triaged
Type:             Bug
Reporter:         Robin Jarry <rjarry>
Assignee:         Robin Jarry <rjarry>
QA Contact:       Nate Johnston <njohnston>
CC:               astupnik, bfournie, cfontain, dhughes, ekuris, gurpsing, jslagle, mburns, mnietoji, njohnston
Fixed In Version: tripleo-ansible-3.3.1-17.1.20240920151435.el9ost
Doc Type:         No Doc Update
Last Closed:      2024-11-21 09:42:52 UTC
Bug Depends On:   2179366, 2293368
Bug Blocks:       2276671

Description Robin Jarry 2024-10-08 08:51:13 UTC
testpmd forwarding performance dropped between puddles:

RHOS-17.1-RHEL-9-20240909.n.1 --> 28Mpps
RHOS-17.1-RHEL-9-20240927.n.1 --> 3.5Mpps

Setup
=====

host
----

BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-284.11.1.el9_2.x86_64 root=UUID=b9819149-d15b-42cf-9199-3628c6b51bac ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 isolcpus=1,2,3 hugepagesz=1GB default_hugepagesz=1GB transparent_hugepage=never hugepages=4 nohz=on nohz_full=1,2,3 rcu_nocbs=1,2,3

PID    CPUs       NUMAs  NONVOL_CTX_SW  VOL_CTX_SW  COMM
35275  0,1,20,21  0               1.7K        447K  qemu-kvm
35298  0,1,20,21  0                 24        1.5K  qemu-kvm
35299  0,20       0                  0           9  TC tc-ram-node0
35359  0,1,20,21  0                177         413  IO mon_iothread
35360  16         0                  2       56.2M  CPU 0/KVM
35361  36         0                  0       52.8K  CPU 1/KVM
35362  24         0                 24       24.8K  CPU 2/KVM     <----- testpmd lcore 2
35363  4          0                 24       23.4K  CPU 3/KVM     <----- testpmd lcore 3
35364  10         0                  0        248K  CPU 4/KVM
35365  30         0                  0        212K  CPU 5/KVM
35366  18         0                  0        1.7M  CPU 6/KVM
35367  38         0                 10        146K  CPU 7/KVM
35368  26         0                  2        389K  CPU 8/KVM
35369  6          0                 22        388K  CPU 9/KVM
35370  8          0                  4        191K  CPU 10/KVM
35371  28         0                 17        179K  CPU 11/KVM
35372  32         0                  0        375K  CPU 12/KVM
35373  12         0                  0        183K  CPU 13/KVM
35374  34         0                  0        203K  CPU 14/KVM
35375  14         0                  0        109K  CPU 15/KVM
35413  0,1,20,21  0                  0           1  vnc_worker
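
Note: procstat (and irqstat, used later in this report) appear to be local helper scripts. A rough equivalent of the thread table above using only standard tools might be:

# pid=$(pgrep -o qemu-kvm)
# grep -H . /proc/$pid/task/*/comm
# awk '/_ctxt_switches/ {print FILENAME": "$0}' /proc/$pid/task/*/status
# for t in /proc/$pid/task/*; do taskset -cp ${t##*/}; done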

8: enp6s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:6c:a4 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether fa:16:3e:7c:91:96 brd ff:ff:ff:ff:ff:ff, vlan 128, spoof checking off, link-state enable, trust off <-------- vm port 0
    vf 1     link/ether 12:f4:71:a5:15:21 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2     link/ether 3e:4c:b1:6e:83:b5 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3     link/ether 3a:10:98:88:e3:21 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
9: enp6s0f3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether f8:f2:1e:03:6c:a6 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 7a:1c:53:09:85:29 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether fa:16:3e:9e:86:00 brd ff:ff:ff:ff:ff:ff, vlan 129, spoof checking off, link-state enable, trust off <-------- vm port 1
    vf 2     link/ether 12:25:29:df:e4:ad brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3     link/ether c6:21:02:2e:5d:bd brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
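
For reference, the VF MAC/VLAN state above is programmed by libvirt from the hostdev interface definitions in the domain XML below; setting vm port 0 by hand would look something like this standard iproute2 command:

# ip link set enp6s0f2 vf 0 mac fa:16:3e:7c:91:96 vlan 128 spoofchk off state enable trust off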

libvirt domain definition
-------------------------

<domain type='kvm'>
  <name>instance-00000008</name>
  <uuid>acd01630-dba5-4a34-b337-e30bf1d89b49</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.1">
      <nova:package version="23.2.3-17.1.20240919170757.2ace99d.el9ost"/>
      <nova:name>testpmd-sriov-vf-dut</nova:name>
      <nova:creationTime>2024-10-07 17:01:32</nova:creationTime>
      <nova:flavor name="perf_numa_0_sriov_dut">
        <nova:memory>8192</nova:memory>
        <nova:disk>20</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>16</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="74ce7934f7f1426a9a9a42bf319e646f">admin</nova:user>
        <nova:project uuid="6a6c20a9945b40a894f06c295ae730dc">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="7ab2e961-9c20-4913-b87d-0210608a643a"/>
      <nova:ports>
        <nova:port uuid="dc56e2f5-92f2-44c4-afca-0107912f26b5">
          <nova:ip type="fixed" address="10.10.107.151" ipVersion="4"/>
        </nova:port>
        <nova:port uuid="cab51037-063b-43d0-89a3-ed08434df717">
          <nova:ip type="fixed" address="10.10.128.198" ipVersion="4"/>
        </nova:port>
        <nova:port uuid="a5e7de51-0ba2-4783-894b-bd3a9d04c6dd">
          <nova:ip type="fixed" address="10.10.129.134" ipVersion="4"/>
        </nova:port>
      </nova:ports>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='16'/>
    <vcpupin vcpu='1' cpuset='36'/>
    <vcpupin vcpu='2' cpuset='24'/>
    <vcpupin vcpu='3' cpuset='4'/>
    <vcpupin vcpu='4' cpuset='10'/>
    <vcpupin vcpu='5' cpuset='30'/>
    <vcpupin vcpu='6' cpuset='18'/>
    <vcpupin vcpu='7' cpuset='38'/>
    <vcpupin vcpu='8' cpuset='26'/>
    <vcpupin vcpu='9' cpuset='6'/>
    <vcpupin vcpu='10' cpuset='8'/>
    <vcpupin vcpu='11' cpuset='28'/>
    <vcpupin vcpu='12' cpuset='32'/>
    <vcpupin vcpu='13' cpuset='12'/>
    <vcpupin vcpu='14' cpuset='34'/>
    <vcpupin vcpu='15' cpuset='14'/>
    <emulatorpin cpuset='0-1,20-21'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>Red Hat</entry>
      <entry name='product'>OpenStack Compute</entry>
      <entry name='version'>23.2.3-17.1.20240919170757.2ace99d.el9ost</entry>
      <entry name='serial'>acd01630-dba5-4a34-b337-e30bf1d89b49</entry>
      <entry name='uuid'>acd01630-dba5-4a34-b337-e30bf1d89b49</entry>
      <entry name='family'>Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel9.0.0'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model' check='partial'>
    <topology sockets='16' dies='1' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0-15' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/nova/instances/acd01630-dba5-4a34-b337-e30bf1d89b49/disk'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/nova/instances/acd01630-dba5-4a34-b337-e30bf1d89b49/disk.config'/>
      <target dev='sda' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x17'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x18'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0x19'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </controller>
    <controller type='pci' index='11' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='11' port='0x1a'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/>
    </controller>
    <controller type='pci' index='12' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='12' port='0x1b'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x3'/>
    </controller>
    <controller type='pci' index='13' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='13' port='0x1c'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x4'/>
    </controller>
    <controller type='pci' index='14' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='14' port='0x1d'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x5'/>
    </controller>
    <controller type='pci' index='15' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='15' port='0x1e'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x6'/>
    </controller>
    <controller type='pci' index='16' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='16' port='0x1f'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='piix3-uhci'>
      <address type='pci' domain='0x0000' bus='0x12' slot='0x01' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='17' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='17' port='0x20'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='pci' index='18' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='fa:16:3e:35:0a:b5'/>
      <source type='unix' path='/var/lib/vhost_sockets/vhudc56e2f5-92' mode='server'/>
      <target dev='vhudc56e2f5-92'/>
      <model type='virtio'/>
      <driver rx_queue_size='1024' tx_queue_size='1024'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>
    <interface type='hostdev' managed='yes'>
      <mac address='fa:16:3e:7c:91:96'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x06' slot='0x0a' function='0x0'/>
      </source>
      <vlan>
        <tag id='128'/>
      </vlan>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </interface>
    <interface type='hostdev' managed='yes'>
      <mac address='fa:16:3e:9e:86:00'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x06' slot='0x0e' function='0x1'/>
      </source>
      <vlan>
        <tag id='129'/>
      </vlan>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <log file='/var/lib/nova/instances/acd01630-dba5-4a34-b337-e30bf1d89b49/console.log' append='off'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <log file='/var/lib/nova/instances/acd01630-dba5-4a34-b337-e30bf1d89b49/console.log' append='off'/>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='10.10.120.189'>
      <listen type='address' address='10.10.120.189'/>
    </graphics>
    <audio id='1' type='none'/>
    <video>
      <model type='virtio' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='10'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </rng>
  </devices>
</domain>

# perf record -C 4,24 sleep 10
# perf report
# Overhead  Command       Shared Object      Symbol                                           
# ........  ............  .................  .................................................
#
     9.12%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_l1d_flush
     8.01%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_l1d_flush
     2.89%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_vmexit
     2.68%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_vmexit
     1.58%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_vcpu_run
     1.17%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_vcpu_run
     1.11%  CPU 3/KVM     [kernel.kallsyms]  [k] vcpu_enter_guest
     0.66%  CPU 2/KVM     [kernel.kallsyms]  [k] vcpu_enter_guest
     0.63%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_vcpu_enter_exit
     0.60%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_vcpu_enter_exit
     0.58%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_handle_exit_irqoff
     0.52%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_eqs_exit.constprop.0
     0.49%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_handle_exit_irqoff
     0.37%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_spec_ctrl_restore_host
     0.37%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_eqs_exit.constprop.0
     0.33%  CPU 3/KVM     [kernel.kallsyms]  [k] __get_current_cr3_fast
     0.33%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_dynticks_inc
     0.30%  CPU 3/KVM     [kernel.kallsyms]  [k] apic_has_pending_timer
     0.30%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_lapic_find_highest_irr
     0.28%  CPU 2/KVM     [kernel.kallsyms]  [k] __context_tracking_exit
     0.28%  CPU 2/KVM     [kernel.kallsyms]  [k] apic_has_pending_timer
     0.28%  CPU 3/KVM     [kernel.kallsyms]  [k] native_sched_clock
     0.27%  CPU 2/KVM     [kernel.kallsyms]  [k] context_tracking_recursion_enter
     0.23%  CPU 3/KVM     [kernel.kallsyms]  [k] vcpu_run
     0.22%  CPU 2/KVM     [kernel.kallsyms]  [k] native_sched_clock
     0.21%  CPU 2/KVM     [kernel.kallsyms]  [k] __get_current_cr3_fast
     0.21%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_lapic_find_highest_irr
     0.20%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_dynticks_inc
     0.20%  CPU 3/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr_special
     0.18%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_spec_ctrl_restore_host
     0.18%  CPU 2/KVM     [kernel.kallsyms]  [k] __srcu_read_lock
     0.18%  CPU 2/KVM     [kernel.kallsyms]  [k] vcpu_run
     0.18%  CPU 3/KVM     [kernel.kallsyms]  [k] __srcu_read_unlock
     0.17%  CPU 3/KVM     [kernel.kallsyms]  [k] __srcu_read_lock
     0.17%  CPU 3/KVM     [kernel.kallsyms]  [k] context_tracking_recursion_enter
     0.16%  CPU 3/KVM     [kernel.kallsyms]  [k] vtime_guest_exit
     0.16%  CPU 3/KVM     [kernel.kallsyms]  [k] __context_tracking_exit
     0.16%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_preempt_deferred_qs
     0.15%  CPU 3/KVM     [kernel.kallsyms]  [k] intel_guest_get_msrs
     0.14%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run
     0.12%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_handle_exit
     0.12%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_nocb_flush_deferred_wakeup
     0.12%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state.part.0
     0.12%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run_flags
     0.11%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_prepare_switch_to_guest
     0.11%  CPU 3/KVM     [kernel.kallsyms]  [k] __context_tracking_enter
     0.09%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_wait_lapic_expire
     0.09%  CPU 2/KVM     [kernel.kallsyms]  [k] vtime_guest_exit
     0.09%  CPU 2/KVM     [kernel.kallsyms]  [k] intel_guest_get_msrs
     0.08%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run
     0.08%  migration/24  [kernel.kallsyms]  [k] enqueue_task_fair
     0.08%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_eqs_enter.constprop.0
     0.08%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state.part.0
     0.08%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_recover_nmi_blocking
     0.08%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_preempt_deferred_qs
     0.08%  CPU 2/KVM     [kernel.kallsyms]  [k] vtime_guest_enter
     0.08%  CPU 2/KVM     [kernel.kallsyms]  [k] __context_tracking_enter
     0.08%  migration/4   [kernel.kallsyms]  [k] smpboot_thread_fn
     0.07%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_flush_pml_buffer
     0.07%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_prepare_switch_to_guest
     0.07%  CPU 3/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr.constprop.0
     0.07%  CPU 2/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr_special
     0.07%  CPU 3/KVM     [kernel.kallsyms]  [k] account_guest_time
     0.06%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_update_host_rsp
     0.06%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_recover_nmi_blocking
     0.06%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_update_host_rsp
     0.06%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run_flags
     0.06%  CPU 2/KVM     [kernel.kallsyms]  [k] account_guest_time
     0.06%  CPU 2/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr.constprop.0
     0.05%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_user_enter
     0.05%  CPU 3/KVM     [kernel.kallsyms]  [k] perf_guest_get_msrs
     0.05%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_update_hv_timer
     0.05%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_handle_exit
     0.05%  CPU 2/KVM     [kernel.kallsyms]  [k] __srcu_read_unlock
     0.04%  CPU 3/KVM     [kernel.kallsyms]  [k] vtime_guest_enter
     0.04%  CPU 3/KVM     [kernel.kallsyms]  [k] __cgroup_account_cputime_field
     0.04%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_wait_lapic_expire
     0.04%  CPU 2/KVM     [kernel.kallsyms]  [k] get_vtime_delta
     0.03%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_nocb_flush_deferred_wakeup
     0.03%  CPU 3/KVM     [kernel.kallsyms]  [k] sched_clock
     0.03%  CPU 3/KVM     [kernel.kallsyms]  [k] get_vtime_delta
     0.03%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_complete_interrupts
     0.03%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_eqs_enter.constprop.0
     0.02%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_sync_pir_to_irr
     0.02%  CPU 2/KVM     [kernel.kallsyms]  [k] vtime_account_system
     0.02%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_user_exit
     0.02%  CPU 3/KVM     [kernel.kallsyms]  [k] vtime_account_system
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_set_rvi
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_complete_interrupts
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] sched_clock
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_load_guest_xsave_state.part.0
     0.01%  perf          [kernel.kallsyms]  [k] exit_to_user_mode_prepare
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_flush_pml_buffer
     0.01%  perf          [kernel.kallsyms]  [k] kvm_on_user_return
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state
     0.01%  perf          [kernel.kallsyms]  [k] smp_call_function_single
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_load_guest_xsave_state.part.0
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_dynticks_eqs_exit
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] cr4_read_shadow
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_user_enter
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] perf_guest_get_msrs
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_user_exit
     0.00%  CPU 3/KVM     [kernel.kallsyms]  [k] cgroup_rstat_updated
     0.00%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_update_hv_timer
     0.00%  CPU 2/KVM     [kernel.kallsyms]  [k] __cgroup_account_cputime_field
     0.00%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_sync_pir_to_irr

virtual machine
---------------

BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-284.11.1.el9_2.x86_64 root=UUID=b9819149-d15b-42cf-9199-3628c6b51bac ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 isolcpus=1,2,3 hugepagesz=1GB default_hugepagesz=1GB transparent_hugepage=never hugepages=4 nohz=on nohz_full=1,2,3 rcu_nocbs=1,2,3

/root/dpdk/build/app/dpdk-testpmd -l 1,2,3 -n 4 --socket-mem 1024 -- -i --nb-cores=2 --auto-start --forward-mode=io --rxd=1024 --txd=1024 --rxq=1 --txq=1

PID    CPUs    NUMAs  NONVOL_CTX_SW  VOL_CTX_SW  COMM
17992  1       0                  4        2343  dpdk-testpmd
17993  0,4-15  0                  2       45558  eal-intr-thread
17994  0,4-15  0                  0           1  rte_mp_handle
17995  2       0                 10           2  rte-worker-2
17996  3       0                 10           1  rte-worker-3
17997  0,4-15  0                  3           7  iavf-event-thre
17998  0,4-15  0                  0           2  telemetry-v2


# perf record -t 17993 sleep 10
# perf report
# Overhead  Command          Shared Object      Symbol                       
# ........  ...............  .................  .............................
#
    21.43%  eal-intr-thread  [kernel.kallsyms]  [k] finish_task_switch.isra.0
    21.43%  eal-intr-thread  dpdk-testpmd       [.] iavf_dev_alarm_handler
    21.43%  eal-intr-thread  libc.so.6          [.] timerfd_settime
    14.29%  eal-intr-thread  libc.so.6          [.] read
     7.14%  eal-intr-thread  [kernel.kallsyms]  [k] _raw_spin_unlock_irq
     7.14%  eal-intr-thread  [kernel.kallsyms]  [k] syscall_exit_work
     7.14%  eal-intr-thread  libc.so.6          [.] epoll_wait


# perf record -C 2,3 sleep 10
# perf report
# Overhead  Command       Shared Object  Symbol                            
# ........  ............  .............  ..................................
#
    27.39%  rte-worker-3  dpdk-testpmd   [.] _iavf_recv_raw_pkts_vec_avx2
    26.82%  rte-worker-2  dpdk-testpmd   [.] _iavf_recv_raw_pkts_vec_avx2
    11.04%  rte-worker-2  dpdk-testpmd   [.] iavf_xmit_fixed_burst_vec_avx2
    11.03%  rte-worker-3  dpdk-testpmd   [.] iavf_xmit_fixed_burst_vec_avx2
     8.35%  rte-worker-2  dpdk-testpmd   [.] pkt_burst_io_forward
     7.86%  rte-worker-3  dpdk-testpmd   [.] pkt_burst_io_forward
     2.67%  rte-worker-2  dpdk-testpmd   [.] run_pkt_fwd_on_lcore
     2.35%  rte-worker-3  dpdk-testpmd   [.] run_pkt_fwd_on_lcore
     0.88%  rte-worker-3  dpdk-testpmd   [.] iavf_xmit_pkts_vec_avx2
     0.76%  rte-worker-2  dpdk-testpmd   [.] iavf_xmit_pkts_vec_avx2
     0.49%  rte-worker-3  dpdk-testpmd   [.] iavf_recv_pkts_vec_avx2
     0.35%  rte-worker-2  dpdk-testpmd   [.] iavf_recv_pkts_vec_avx2

Comment 2 Robin Jarry 2024-10-08 09:05:44 UTC
Correction: the host kernel command line given in the description is incorrect. The actual one is:

BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-284.85.1.el9_2.x86_64 root=LABEL=img-rootfs ro no_timer_check crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M default_hugepagesz=1GB hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on tsx=off isolcpus=2-19,22-39 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on nohz_full=2-19,22-39 rcu_nocbs=2-19,22-39 tuned.non_isolcpus=00300003 intel_pstate=passive nosoftlockup

Comment 4 Robin Jarry 2024-10-08 10:37:28 UTC
Most likely related: https://bugs.dpdk.org/show_bug.cgi?id=1337

It seems that the DPDK used in the guest has been compiled from source:

[root@testpmd-sriov-vf-dut dpdk]# pwd
/root/dpdk
[root@testpmd-sriov-vf-dut dpdk]# git describe
v22.11

This version is affected by the bug linked above.

A temporary workaround would be to update the DPDK sources to at least 22.11.5 (ideally the HEAD of the 22.11 branch).
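
A minimal sketch of that workaround, assuming /root/dpdk is a clone of the upstream tree (the dpdk-stable URL is the standard upstream one, adjust as needed):

[root@testpmd-sriov-vf-dut dpdk]# git fetch --tags https://dpdk.org/git/dpdk-stable 22.11
[root@testpmd-sriov-vf-dut dpdk]# git checkout v22.11.5   # or FETCH_HEAD for the branch tip
[root@testpmd-sriov-vf-dut dpdk]# meson setup --wipe build
[root@testpmd-sriov-vf-dut dpdk]# ninja -C build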

Comment 6 Miguel Angel Nieto 2024-10-09 07:55:00 UTC
We are seeing a similar performance degradation in our ovs-hwoffload scenario. It looks like the maximum achievable throughput for both SRIOV (Intel) and ovs-hwoffload (Mellanox) is around 3-3.5 Mpps.
I think this is happening because the VM is not running continuously (as we saw with the vmexits), and that it is not related to the NIC.

Comment 7 Robin Jarry 2024-10-09 09:36:05 UTC
> We are seeing a similar performance degradation in our ovs-hwoffload scenario. It looks like the maximum achievable throughput for both SRIOV (Intel) and ovs-hwoffload (Mellanox) is around 3-3.5 Mpps.
> I think this is happening because the VM is not running continuously (as we saw with the vmexits), and that it is not related to the NIC.

It would be best to reproduce this without any hardware offloading on a different NIC model. We cannot assert that the problem is the same.

I dug a bit into the differences between RHOS-17.1-RHEL-9-20240909.n.1 (good) and RHOS-17.1-RHEL-9-20240927.n.1 (bad).

The usual suspect is the kernel. Here is the list of patches that were backported to RHEL 9.2. I have highlighted those which are worth exploring.

$ git ll --no-merges kernel-5.14.0-284.82.1.el9_2..kernel-5.14.0-284.85.1.el9_2   # "git ll" is a local alias, roughly "git log --oneline"
cee6fd6a53cf (tag: kernel-5.14.0-284.85.1.el9_2) [redhat] kernel-5.14.0-284.85.1.el9_2
bb0aa0bffc61 (tag: kernel-5.14.0-284.84.1.el9_2) [redhat] kernel-5.14.0-284.84.1.el9_2
fb936ae968ac wifi: mac80211: Avoid address calculations via out of bounds array indexing
afbf165737fd ice: Add netif_device_attach/detach into PF reset flow                               <================
df0e681878b0 (tag: kernel-5.14.0-284.83.1.el9_2) [redhat] kernel-5.14.0-284.83.1.el9_2
641a45ce9003 iommu/amd: Fix panic accessing amd_iommu_enable_faulting                             <================
db422e8bfd04 iommu/vt-d: Allocate DMAR fault interrupts locally                                   <================
625b919aa1e7 blk-mq: fix race condition in active queue accounting
5a9b09aeab3e ceph: force sending a cap update msg back to MDS for revoke op
88174c3ed0e8 ceph: periodically flush the cap releases
ace09f15efbb kernfs: change kernfs_rename_lock into a read-write lock
b9d35b4eb8be kernfs: Separate kernfs_pr_cont_buf and rename_lock
f940d4a33e64 kernfs: fix missing kernfs_iattr_rwsem locking
ade4b5b805c2 kernfs: Use a per-fs rwsem to protect per-fs list of kernfs_super_info
7f6ca30a78e7 kernfs: Introduce separate rwsem to protect inode attributes
5104c5a78a91 kernfs: dont take d_lock on revalidate
3a792c917f7b kernfs: dont take i_lock on inode attr read
15dcc74243a4 xfs: allow SECURE namespace xattrs to use reserved block pool
6fa969b8be61 ice: Fix improper extts handling
51a18c3abef5 ice: Add GPIO pin support for E823 products
08285e6d3333 net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
d51d18c1c717 mm/shmem: disable PMD-sized page cache if needed
ec92e0b6a230 mm/filemap: skip to create PMD-sized page cache if needed
f64b9b2c80b6 mm/readahead: limit page cache size in page_cache_ra_order()
fdb04f54572b readahead: use ilog2 instead of a while loop in page_cache_ra_order()
3e83e1610e63 mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray
c1eda8b9b8d2 mm: fix khugepaged with shmem_enabled=advise                                         <================
d47f3fb1c81a Bluetooth: af_bluetooth: Fix deadlock
dd9fae805dc3 sched/deadline: Fix task_struct reference leak
623e03d35a6d crypto: qat - Fix ADF_DEV_RESET_SYNC memory leak
98039f2ce515 crypto: qat - resolve race condition during AER recovery
dfd5d7eb4d96 crypto: qat - fix double free during reset
5cd880c254e7 crypto: qat - ignore subsequent state up commands
500a45686f3e crypto: qat - do not shadow error code
2704209db7c1 crypto: qat - fix state machines cleanup paths
43834a60e680 crypto: qat - refactor device restart logic
df37c216c5df crypto: qat - replace state machine calls
95271c3b6329 crypto: qat - fix concurrency issue when device state changes
050ba3ceaea0 crypto: qat - delay sysfs initialization
70a8b1900a82 ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."
eebdec48293f KVM: s390: fix LPSWEY handling                                                       <================
8fa1de7a6124 tty: n_gsm: require CAP_NET_ADMIN to attach N_GSM0710 ldisc
ade355b72101 gpiolib: cdev: Fix use after free in lineinfo_changed_notify
037b41092760 sched/topology: Optimize topology_span_sane()
3f0c7a9c736e cpumask: Add for_each_cpu_from()
1be141e8a3ea cpumask: Fix invalid uniprocessor mask assumption
3cecbbd5b7c1 scsi: qedf: Ensure the copied buf is NUL terminated
528b2bb35a95 cppc_cpufreq: Fix possible null pointer dereference
5517bc5c68de cpufreq: exit() callback is optional
ac049b75494e powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

It would help a lot if we could get attention from the KVM and netdev folks.

Comment 8 Robin Jarry 2024-10-09 11:25:55 UTC
Out of curiosity, I checked the changes in qemu as well. It seems that the openstack-nova-libvirt container images from both puddle versions ship the same qemu-kvm-7.2.0-14.el9_2.11 rpm.

Comment 9 Robin Jarry 2024-10-09 16:33:53 UTC
Just for reference, I have collected these traces on a different system running OSP 17.1_20240906.2.

The symptoms seem to be exactly the same: the same suspicious vmexits on vCPUs that are supposed to run only userspace code.

We don't observe rx_missed_errors in testpmd running in the VM, but that does not mean there is no issue. The traffic generator says testpmd is not able to forward all traffic:

Total-Tx        :      10.22 Gbps
Total-Rx        :       6.66 Gbps
drop-rate       :       3.55 Gbps

This could be a switch configuration issue.
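
To tell whether the drops happen at the VF rings or upstream in the switch, the extended per-port counters (this is where rx_missed_errors shows up) can be dumped from the testpmd prompt:

testpmd> show port xstats all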

==== GUEST ====================================================================

[root@testpmd-sriov-vf-dut ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-284.11.1.el9_2.x86_64 root=UUID=b9819149-d15b-42cf-9199-3628c6b51bac ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 isolcpus=1,2,3 hugepagesz=1GB default_hugepagesz=1GB transparent_hugepage=never hugepages=4 nohz=on nohz_full=1,2,3 rcu_nocbs=1,2,3

[root@testpmd-sriov-vf-dut ~]# procstat $(pidof dpdk-testpmd)
PID   CPUs    NUMAs  NONVOL_CTX_SW  VOL_CTX_SW  COMM
1688  1       0                  5        1.5K  dpdk-testpmd
1689  0,4-15  0                  6       56.1K  eal-intr-thread
1690  0,4-15  0                  0           1  rte_mp_handle
1691  2       0                  2           2  rte-worker-2
1692  3       0                  2           1  rte-worker-3
1693  0,4-15  0                  3           7  iavf-event-thre
1694  0,4-15  0                  0           1  telemetry-v2

[root@testpmd-sriov-vf-dut ~]# irqstat -sec 2,3
IRQ  CPU-2  CPU-3  DESCRIPTION
12       0     15  IO-APIC 12-edge i8042
22      39      0  IO-APIC 22-fasteoi virtio4, uhci_hcd:usb1, virtio3

[root@testpmd-sriov-vf-dut ~]# perf record -t 1689 sleep 10
[root@testpmd-sriov-vf-dut ~]# perf report
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  .............................
#
    33.33%  eal-intr-thread  dpdk-testpmd       [.] iavf_dev_alarm_handler
    11.11%  eal-intr-thread  [kernel.kallsyms]  [k] __fget_light
    11.11%  eal-intr-thread  [kernel.kallsyms]  [k] _raw_spin_unlock_irq
    11.11%  eal-intr-thread  [kernel.kallsyms]  [k] finish_task_switch.isra.0
    11.11%  eal-intr-thread  [kernel.kallsyms]  [k] syscall_exit_to_user_mode
    11.11%  eal-intr-thread  dpdk-testpmd       [.] rte_eal_alarm_set
    11.11%  eal-intr-thread  libc.so.6          [.] epoll_wait

[root@testpmd-sriov-vf-dut ~]# perf record -C 2,3 sleep 10
[root@testpmd-sriov-vf-dut ~]# perf report
# Overhead  Command       Shared Object      Symbol
# ........  ............  .................  ..................................
#
    25.93%  rte-worker-3  dpdk-testpmd       [.] _iavf_recv_raw_pkts_vec_avx2
    25.34%  rte-worker-2  dpdk-testpmd       [.] _iavf_recv_raw_pkts_vec_avx2
    11.51%  rte-worker-3  dpdk-testpmd       [.] pkt_burst_io_forward
    11.03%  rte-worker-2  dpdk-testpmd       [.] pkt_burst_io_forward
     8.63%  rte-worker-2  dpdk-testpmd       [.] iavf_xmit_fixed_burst_vec_avx2
     7.46%  rte-worker-3  dpdk-testpmd       [.] iavf_xmit_fixed_burst_vec_avx2
     3.64%  rte-worker-2  dpdk-testpmd       [.] run_pkt_fwd_on_lcore
     3.59%  rte-worker-3  dpdk-testpmd       [.] run_pkt_fwd_on_lcore
     0.89%  rte-worker-3  dpdk-testpmd       [.] iavf_recv_pkts_vec_avx2
     0.77%  rte-worker-2  dpdk-testpmd       [.] iavf_xmit_pkts_vec_avx2
     0.62%  rte-worker-3  dpdk-testpmd       [.] iavf_xmit_pkts_vec_avx2
     0.59%  rte-worker-2  dpdk-testpmd       [.] iavf_recv_pkts_vec_avx2
     0.00%  perf          [kernel.kallsyms]  [k] security_perf_event_write

==== HOST =====================================================================

[root@compute-r730 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-284.82.1.el9_2.x86_64 root=LABEL=img-rootfs ro no_timer_check crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M default_hugepagesz=1GB hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on tsx=off isolcpus=2-19,22-39 console=ttyS0,115200 memtest=0 boot=LABEL=mkfs_boot skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on nohz_full=2-19,22-39 rcu_nocbs=2-19,22-39 tuned.non_isolcpus=00300003 intel_pstate=passive nosoftlockup

[root@compute-r730 ~]# procstat 37250
PID    CPUs       NUMAs  NONVOL_CTX_SW  VOL_CTX_SW  COMM
37250  0,1,20,21  0               2.0K        102K  qemu-kvm
37255  0,1,20,21  0                  7        1.2K  qemu-kvm
37256  0,20       0                  1          16  TC tc-ram-node0
37399  0,1,20,21  0                388         282  IO mon_iothread
37400  16         0                  7        5.1M  CPU 0/KVM
37401  36         0                  0       24.6K  CPU 1/KVM
37402  24         0                  7       15.9K  CPU 2/KVM
37403  4          0                 71       16.4K  CPU 3/KVM
37404  10         0                  0       48.0K  CPU 4/KVM
37405  30         0                  0       79.0K  CPU 5/KVM
37406  18         0                  1       62.7K  CPU 6/KVM
37407  38         0                  0       40.2K  CPU 7/KVM
37408  26         0                  0       42.1K  CPU 8/KVM
37409  6          0                 10       62.3K  CPU 9/KVM
37410  8          0                  5       40.6K  CPU 10/KVM
37411  28         0                  7        185K  CPU 11/KVM
37412  32         0                  2       73.6K  CPU 12/KVM
37413  12         0                  2       41.7K  CPU 13/KVM
37414  34         0                 12       42.1K  CPU 14/KVM
37415  14         0                  3       64.2K  CPU 15/KVM
37429  0,1,20,21  0                  0           1  vnc_worker
95601  0,1,20,21  0                  0          15  worker
95602  0,1,20,21  0                  0          14  worker
95603  0,1,20,21  0                  1          15  worker
95606  0,1,20,21  0                  3          10  worker
95608  0,1,20,21  0                  3           6  worker
95612  0,1,20,21  0                  1           6  worker
95616  0,1,20,21  0                  2           6  worker
95617  0,1,20,21  0                  1           5  worker

[root@compute-r730 ~]# irqstat -sec 4,24
IRQ  CPU-4  CPU-24  DESCRIPTION
18       0     121  IR-IO-APIC 18-fasteoi ehci_hcd:usb1, ehci_hcd:usb2
152      2       0  IR-PCI-MSI 2097154-edge mlx5_comp2@pci:0000:04:00.0
162      0       2  IR-PCI-MSI 2097164-edge mlx5_comp12@pci:0000:04:00.0
212      0       1  IR-PCI-MSI 3149845-edge i40e-enp6s0f2-TxRx-20
242      0       1  IR-PCI-MSI 3151873-edge i40e-enp6s0f3-TxRx-0
262      0       1  IR-PCI-MSI 3151893-edge i40e-enp6s0f3-TxRx-20
395      2       0  IR-PCI-MSI 2099202-edge mlx5_comp2@pci:0000:04:00.1
405      0       2  IR-PCI-MSI 2099212-edge mlx5_comp12@pci:0000:04:00.1
537      2       0  IR-PCI-MSI 4194306-edge mlx5_comp2@pci:0000:08:00.0
547      0       2  IR-PCI-MSI 4194316-edge mlx5_comp12@pci:0000:08:00.0
601      2       0  IR-PCI-MSI 4196354-edge mlx5_comp2@pci:0000:08:00.1
611      0       2  IR-PCI-MSI 4196364-edge mlx5_comp12@pci:0000:08:00.1
673      1       0  IR-PCI-MSI 1048576-edge eno3-tx-0
725    102       0  IR-PCI-MSI 2101250-edge mlx5_comp2@pci:0000:04:00.2
737      2       0  IR-PCI-MSI 2103298-edge mlx5_comp2@pci:0000:04:00.3
749      3       0  IR-PCI-MSI 2105346-edge mlx5_comp2@pci:0000:04:00.4
761      4       0  IR-PCI-MSI 2107394-edge mlx5_comp2@pci:0000:04:00.5
773      2       0  IR-PCI-MSI 2134018-edge mlx5_comp2@pci:0000:04:02.2
785      2       0  IR-PCI-MSI 2136066-edge mlx5_comp2@pci:0000:04:02.3
797      2       0  IR-PCI-MSI 2138114-edge mlx5_comp2@pci:0000:04:02.4
809      2       0  IR-PCI-MSI 2140162-edge mlx5_comp2@pci:0000:04:02.5

[root@compute-r730 ~]# perf record -C 4,24 sleep 10
[root@compute-r730 ~]# perf report
# Overhead  Command       Shared Object      Symbol
# ........  ............  .................  .................................................
#
     7.77%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_l1d_flush
     7.43%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_l1d_flush
     2.42%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_vmexit
     2.33%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_vmexit
     1.54%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_vcpu_run
     1.50%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_vcpu_run
     1.27%  CPU 2/KVM     [kernel.kallsyms]  [k] vcpu_enter_guest
     1.14%  CPU 3/KVM     [kernel.kallsyms]  [k] vcpu_enter_guest
     0.87%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_handle_exit_irqoff
     0.80%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_vcpu_enter_exit
     0.73%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_vcpu_enter_exit
     0.72%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_handle_exit_irqoff
     0.57%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_lapic_find_highest_irr
     0.40%  CPU 2/KVM     [kernel.kallsyms]  [k] __get_current_cr3_fast
     0.37%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_eqs_exit.constprop.0
     0.31%  CPU 2/KVM     [kernel.kallsyms]  [k] vcpu_run
     0.31%  CPU 3/KVM     [kernel.kallsyms]  [k] __srcu_read_lock
     0.31%  CPU 3/KVM     [kernel.kallsyms]  [k] apic_has_pending_timer
     0.30%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_lapic_find_highest_irr
     0.29%  CPU 3/KVM     [kernel.kallsyms]  [k] get_vtime_delta
     0.29%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_eqs_exit.constprop.0
     0.29%  CPU 3/KVM     [kernel.kallsyms]  [k] __get_current_cr3_fast
     0.28%  CPU 3/KVM     [kernel.kallsyms]  [k] __context_tracking_exit
     0.28%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_dynticks_inc
     0.28%  CPU 2/KVM     [kernel.kallsyms]  [k] intel_guest_get_msrs
     0.27%  CPU 3/KVM     [kernel.kallsyms]  [k] vtime_guest_exit
     0.25%  CPU 2/KVM     [kernel.kallsyms]  [k] get_vtime_delta
     0.25%  CPU 3/KVM     [kernel.kallsyms]  [k] vcpu_run
     0.24%  CPU 3/KVM     [kernel.kallsyms]  [k] intel_guest_get_msrs
     0.23%  CPU 2/KVM     [kernel.kallsyms]  [k] apic_has_pending_timer
     0.22%  CPU 3/KVM     [kernel.kallsyms]  [k] __srcu_read_unlock
     0.22%  CPU 3/KVM     [kernel.kallsyms]  [k] vtime_guest_enter
     0.22%  CPU 3/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr.constprop.0
     0.22%  CPU 2/KVM     [kernel.kallsyms]  [k] __srcu_read_lock
     0.22%  CPU 3/KVM     [kernel.kallsyms]  [k] context_tracking_recursion_enter
     0.21%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_dynticks_inc
     0.21%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_spec_ctrl_restore_host
     0.21%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_nocb_flush_deferred_wakeup
     0.19%  CPU 2/KVM     [kernel.kallsyms]  [k] context_tracking_recursion_enter
     0.19%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_spec_ctrl_restore_host
     0.17%  CPU 2/KVM     [kernel.kallsyms]  [k] __srcu_read_unlock
     0.17%  CPU 2/KVM     [kernel.kallsyms]  [k] __context_tracking_exit
     0.17%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state.part.0
     0.16%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_update_host_rsp
     0.16%  CPU 2/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr.constprop.0
     0.15%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run_flags
     0.14%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_update_hv_timer
     0.14%  CPU 2/KVM     [kernel.kallsyms]  [k] native_sched_clock
     0.14%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_flush_pml_buffer
     0.14%  CPU 2/KVM     [kernel.kallsyms]  [k] vtime_account_system
     0.13%  CPU 3/KVM     [kernel.kallsyms]  [k] native_sched_clock
     0.13%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run_flags
     0.13%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state.part.0
     0.13%  CPU 2/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr_special
     0.13%  CPU 3/KVM     [kernel.kallsyms]  [k] account_guest_time
     0.12%  CPU 2/KVM     [kernel.kallsyms]  [k] vtime_guest_exit
     0.11%  CPU 2/KVM     [kernel.kallsyms]  [k] __context_tracking_enter
     0.11%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_handle_exit
     0.11%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run
     0.10%  CPU 3/KVM     [kernel.kallsyms]  [k] add_atomic_switch_msr_special
     0.10%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_eqs_enter.constprop.0
     0.10%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_sync_pir_to_irr
     0.10%  CPU 3/KVM     [kernel.kallsyms]  [k] __vmx_vcpu_run
     0.10%  CPU 3/KVM     [kernel.kallsyms]  [k] __context_tracking_enter
     0.10%  migration/24  [kernel.kallsyms]  [k] _raw_spin_lock
     0.09%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_update_host_rsp
     0.07%  CPU 2/KVM     [kernel.kallsyms]  [k] account_guest_time
     0.07%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_nocb_flush_deferred_wakeup
     0.07%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_handle_exit
     0.07%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_eqs_enter.constprop.0
     0.06%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_sync_pir_to_irr
     0.06%  CPU 2/KVM     [kernel.kallsyms]  [k] vtime_guest_enter
     0.06%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_update_hv_timer
     0.06%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_flush_pml_buffer
     0.05%  CPU 3/KVM     [kernel.kallsyms]  [k] __cgroup_account_cputime_field
     0.05%  CPU 3/KVM     [kernel.kallsyms]  [k] vtime_account_system
     0.05%  perf          [kernel.kallsyms]  [k] __bitmap_equal
     0.04%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_prepare_switch_to_guest
     0.03%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_cpu_has_pending_timer
     0.03%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_load_guest_xsave_state.part.0
     0.03%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state
     0.03%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_recover_nmi_blocking
     0.02%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_prepare_switch_to_guest
     0.02%  perf          [kernel.kallsyms]  [k] selinux_perf_event_write
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_wait_lapic_expire
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_user_exit
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_load_guest_xsave_state.part.0
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] cgroup_rstat_updated
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_cpu_has_pending_timer
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] kvm_load_host_xsave_state
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] __cgroup_account_cputime_field
     0.01%  perf          [kernel.kallsyms]  [k] intel_pmu_handle_irq
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] cpuacct_account_field
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] kvm_wait_lapic_expire
     0.01%  perf          [kernel.kallsyms]  [k] kvm_on_user_return
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] __vmx_complete_interrupts
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_preempt_deferred_qs
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_set_rvi
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] rcu_dynticks_eqs_exit
     0.01%  CPU 3/KVM     [kernel.kallsyms]  [k] cgroup_rstat_updated
     0.01%  CPU 2/KVM     [kernel.kallsyms]  [k] rcu_user_exit
     0.01%  perf          [kernel.kallsyms]  [k] native_flush_tlb_one_user
     0.01%  perf          [kernel.kallsyms]  [k] kfree
     0.00%  CPU 2/KVM     [kernel.kallsyms]  [k] vmx_recover_nmi_blocking
     0.00%  CPU 3/KVM     [kernel.kallsyms]  [k] vmx_handle_exit
     0.00%  CPU 2/KVM     [kernel.kallsyms]  [k] perf_guest_get_msrs
     0.00%  perf          [kernel.kallsyms]  [k] native_sched_clock
     0.00%  perf          [kernel.kallsyms]  [k] native_apic_mem_write
     0.00%  perf          [kernel.kallsyms]  [k] intel_bts_enable_local
     0.00%  perf          [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
     0.00%  perf          [kernel.kallsyms]  [k] remote_function

Comment 10 Robin Jarry 2024-10-09 16:35:20 UTC
Adding testpmd live stats for posterity:

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 17671235749 RX-missed: 0          RX-bytes:  1130959087392
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 16173464816 TX-errors: 0          TX-bytes:  1035101748224

  Throughput (since last show)
  Rx-pps:      6958487          Rx-bps:   3562745840
  Tx-pps:      6139042          Tx-bps:   3143189512
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 16173628200 RX-missed: 0          RX-bytes:  1035112204800
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 17671420805 TX-errors: 0          TX-bytes:  1130970931200

  Throughput (since last show)
  Rx-pps:      6138943          Rx-bps:   3143138816
  Tx-pps:      6958375          Tx-bps:   3562688464
  ############################################################################

Comment 12 Robin Jarry 2024-10-10 11:21:51 UTC
This morning, with the help of @cfontain, we checked a 17.1_20240906.2 deployment on the same platform as before. The performance is perfectly fine and as expected.

 Total-Tx        :      13.33 Gbps
 Total-Rx        :      13.33 Gbps
 Total-PPS       :      26.04 Mpps

This performance is stable and we didn't see any rx_missed_errors reported by testpmd.

One important thing to note is that perf shows the same "amount" of vmexits on the host CPUs to which vCPUs 2 & 3 (the testpmd lcores) are pinned, as reported in my previous comment.

After this, we tried multiple operations in the hope of finding the culprit.

1) Update all nova-compute containers to the newer versions:

[root@compute-1 ~]# systemctl reboot
...
[root@compute-1 ~]# podman ps
CONTAINER ID  IMAGE                                                    COMMAND               STATUS                   NAMES
4b4ba91ed53e  ...openstack-ovn-controller:17.1_20240906.2              kolla_start           Up 54 minutes (healthy)  ovn_controller
a18804ff8255  ...openstack-iscsid:17.1_20240906.2                      kolla_start           Up 54 minutes (healthy)  iscsid
aeff84e50e58  ...openstack-cron:17.1_20240906.2                        kolla_start           Up 54 minutes (healthy)  logrotate_crond
443139e55aef  ...openstack-neutron-metadata-agent-ovn:17.1_20240906.2  kolla_start           Up 54 minutes (healthy)  ovn_metadata_agent
aa5f0c8d2770  ...openstack-neutron-sriov-agent:17.1_20240906.2         kolla_start           Up 54 minutes (healthy)  neutron_sriov_agent
47a712e1c87c  ...openstack-nova-libvirt:17.1_20240927.1                kolla_start           Up 54 minutes            nova_virtlogd_wrapper
d03ac84b82b7  ...openstack-nova-libvirt:17.1_20240927.1                kolla_start           Up 54 minutes            nova_virtsecretd
fa28d87cba68  ...openstack-nova-libvirt:17.1_20240927.1                kolla_start           Up 54 minutes            nova_virtnodedevd
99a5dae19df5  ...openstack-nova-libvirt:17.1_20240927.1                kolla_start           Up 54 minutes            nova_virtstoraged
c0e1e7d54891  ...openstack-nova-libvirt:17.1_20240927.1                kolla_start           Up 54 minutes            nova_virtqemud
9f28d80bb6eb  ...openstack-nova-libvirt:17.1_20240927.1                kolla_start           Up 54 minutes            nova_virtproxyd
a3222ba23208  ...openstack-nova-compute:17.1_20240927.1                kolla_start           Up 54 minutes (healthy)  nova_migration_target
2a4fe03d09e2  ...openstack-nova-compute:17.1_20240927.1                kolla_start           Up 54 minutes (healthy)  nova_compute
caa5107cdf38  ...openstack-nova-libvirt:17.1_20240927.1                /usr/sbin/virtlog...  Up 54 minutes            nova_virtlogd
1ec2008475a6  ...openstack-neutron-metadata-agent-ovn:17.1_20240906.2  /bin/bash -c HAPR...  Up 53 minutes            neutron-haproxy-ovnmeta

No change. Perf still good.

2) Update the kernel to the newer version:

[root@compute-1 ~]# dnf install kernel-5.14.0-284.85.1.el9_2 kernel-modules-extra-5.14.0-284.85.1.el9_2
[root@compute-1 ~]# systemctl reboot

No change. Perf still good.

3) Update tuned to the newer version:

[root@compute-1 ~]# dnf install tuned-2.24.0-1.2.20240819gitc082797f.el9fdp.noarch
...
[root@compute-1 ~]# systemctl restart tuned.service
[root@compute-1 ~]# tuned-adm profile
cpu-partitioning-powersave
[root@compute-1 ~]# tuned-adm profile cpu-partitioning-powersave
[root@compute-1 ~]# systemctl reboot

No change. Perf still good.

I have collected a sosreport on this particular system and will wait for a fresh deployment with 17.1_20240927.1 where we can observe the issue. Then we'll collect another sosreport and hopefully find "some" differences that may explain the perf drop.

Comment 13 Robin Jarry 2024-10-10 15:54:33 UTC
I have spent some time on the 17.1_20240927.1 deployment. Watching testpmd
rx_missed_errors, I noticed that they arrived in short bursts. This reminded me
of https://access.redhat.com/solutions/7007609.

I double checked and nomodeset is *NOT* present in the boot command line.
However, there is still console redirection.

I ran this script in the vm:
https://git.sr.ht/~rjarry/dotfiles/tree/main/item/bin/dpdk-port-stats.py

[root@testpmd-sriov-vf-dut ~]# python dpdk-port-stats.py -s /run/dpdk/rte/dpdk_telemetry.v2 | while read -r line; do printf '[testpmd-sriov-vf-dut %s] %s\n' "$(date)" "$line"; done

And in parallel, I watched the compute node kernel logs:

[root@compute-1 ~]# dmesg -WT | sed 's/^/[compute-1]/'

[testpmd-sriov-vf-dut Thu Oct 10 15:49:35] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:35] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:35] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:36] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:36] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:36] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:37] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:37] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:37] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:38] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:38] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:38] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[compute-1][Thu Oct 10 15:49:38 2024] IPv4: martian source 10.46.174.3 from 10.46.174.126, on dev eno1
[compute-1][Thu Oct 10 15:49:38 2024] ll header: 00000000: ff ff ff ff ff ff 00 00 5e 00 01 01 08 06
[testpmd-sriov-vf-dut Thu Oct 10 15:49:39] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:39] 0: RX=12.4M pkt/s DROP=50.1K pkt/s TX=12.4M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:39] 1: RX=12.4M pkt/s DROP=48.4K pkt/s TX=12.4M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:40] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:40] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:40] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:42] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:42] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:42] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:43] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:43] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:43] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:44] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:44] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:44] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:45] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:45] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:45] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:46] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:46] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:46] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:47] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:47] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:47] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:48] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:48] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:48] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[compute-1][Thu Oct 10 15:49:49 2024] IPv4: martian source 10.46.174.6 from 10.46.174.126, on dev eno1
[compute-1][Thu Oct 10 15:49:49 2024] ll header: 00000000: ff ff ff ff ff ff 00 00 5e 00 01 01 08 06
[testpmd-sriov-vf-dut Thu Oct 10 15:49:49] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:49] 0: RX=12.4M pkt/s DROP=49.1K pkt/s TX=12.4M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:49] 1: RX=12.4M pkt/s DROP=46.0K pkt/s TX=12.4M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:50] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:50] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:50] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:51] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:51] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:51] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:52] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:52] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:52] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[compute-1][Thu Oct 10 15:49:52 2024] IPv4: martian source 10.46.174.2 from 10.46.174.126, on dev eno1
[compute-1][Thu Oct 10 15:49:52 2024] ll header: 00000000: ff ff ff ff ff ff 00 00 5e 00 01 01 08 06
[testpmd-sriov-vf-dut Thu Oct 10 15:49:53] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:53] 0: RX=12.4M pkt/s DROP=49.4K pkt/s TX=12.4M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:53] 1: RX=12.4M pkt/s DROP=48.4K pkt/s TX=12.4M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:54] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:54] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:54] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:55] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:55] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:55] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:56] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:56] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:56] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:57] ---
[testpmd-sriov-vf-dut Thu Oct 10 15:49:57] 0: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s
[testpmd-sriov-vf-dut Thu Oct 10 15:49:57] 1: RX=12.6M pkt/s DROP=0.0 pkt/s TX=12.6M pkt/s

The rx_missed_errors are directly correlated with writes to the kernel console.

Disabling writes to the kernel console (dmesg -n1) fixes the problem entirely.

I have a lead for a fix. I will submit more details later on.

Comment 14 Robin Jarry 2024-10-11 07:22:31 UTC
This is a side effect of bz 2179366 and bz 2293368. I will submit a patch that forces kernel.printk="0 4 0 0" in sysctl. This is critical. We already have many customer cases (including SEV1 escalations) in 17.1 because of this.
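
Note that dmesg -n1 only changes the console loglevel (the first kernel.printk field) until the next reboot. A persistent form of the planned setting would look something like this (the sysctl.d file name is illustrative, not the actual patch):

# echo 'kernel.printk = 0 4 0 0' > /etc/sysctl.d/99-printk.conf
# sysctl -p /etc/sysctl.d/99-printk.conf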

Comment 19 Miguel Angel Nieto 2024-10-23 12:49:07 UTC
Performance is OK again in the next puddle:

(undercloud) [stack@undercloud-0 ~]$ rpm -qa | grep tripleo-ansible
tripleo-ansible-3.3.1-17.1.20240920151435.el9ost.noarch
(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
RHOS-17.1-RHEL-9-20241014.n.1

[tripleo-admin@compute-0 ~]$ sudo sysctl -a | grep "kernel.printk ="
kernel.printk = 1	4	1	1

2024-10-21 17:55:58.527 | The following entry was added to row 1635 in spreadsheet:
2024-10-21 17:55:58.530 | +--------------+-------------------------------+-----------------+-------------+-------------------+---------------+---------------------------------+----------------------------------+----------------------------+------------+
2024-10-21 17:55:58.534 | | RHOS Version |             Puddle            |   OVS Version   | OVN Version |  Deployment Type  |    Scenario   | Million Packets Per Second/Port | Million Packets Per Second Total |         Timestamp          | Extra Info |
2024-10-21 17:55:58.538 | +--------------+-------------------------------+-----------------+-------------+-------------------+---------------+---------------------------------+----------------------------------+----------------------------+------------+
2024-10-21 17:55:58.542 | |     17.1     | RHOS-17.1-RHEL-9-20241014.n.1 | 3.3.2-40.el9fdp |   24.03.4   | OVN-DPDK + SR-IOV | SRIOV VF VLAN |        14.181921767950033       |        28.363843535900067        | 2024-10-21T17:55:57.502126 |    None    |
2024-10-21 17:55:58.551 | +--------------+-------------------------------+-----------------+-------------+-------------------+---------------+---------------------------------+----------------------------------+----------------------------+------------+

Comment 24 errata-xmlrpc 2024-11-21 09:42:52 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (RHOSP 17.1.4 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9974