Bug 2213416

Summary: The domain with a vhost-user interface + IOMMU throws a "call trace" when running the netperf tests

Product: Red Hat Enterprise Linux 9
Component: qemu-kvm
Sub component: Networking
Version: 9.3
Hardware: x86_64
OS: Linux
Status: CLOSED MIGRATED
Severity: medium
Priority: medium
Reporter: Yanghang Liu <yanghliu>
Assignee: Laurent Vivier <lvivier>
QA Contact: Yanghang Liu <yanghliu>
CC: chayang, coli, jinzhao, jiqiu, juzhang, kzhang, lvivier, mhou, qren, virt-maint, yanghliu, ymankad
Keywords: MigratedToJIRA, Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Last Closed: 2023-09-22 16:07:38 UTC

Description Yanghang Liu 2023-06-08 06:04:39 UTC
Description of problem:
The domain with a vhost-user interface + IOMMU throws a "call trace" when running the netperf tests.

Version-Release number of selected component (if applicable):
qemu-kvm-8.0.0-4.el9.x86_64
5.14.0-323.el9.x86_64
dpdk-22.11-3.el9_2.x86_64
openvswitch3.1-3.1.0-28.el9fdp.x86_64

How reproducible:
100%


Steps to Reproduce:
1. Set up the host kernel options (CPU isolation, huge pages, IOMMU):
# grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel`
# echo "isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,31,29,27,25,23,21,19,17,15,13,11" >> /etc/tuned/cpu-partitioning-variables.conf
# tuned-adm profile cpu-partitioning
# reboot
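
After the reboot, a quick way to confirm the settings took effect (illustrative commands, not part of the original report):

# grep -o "iommu=pt\|intel_iommu=on\|default_hugepagesz=1G" /proc/cmdline
# grep Huge /proc/meminfo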

2. Start OVS-DPDK on the host

# echo 20 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# echo 20 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind.py --bind=vfio-pci 0000:5e:00.0
# dpdk-devbind.py --bind=vfio-pci 0000:5e:00.1
...
# ovs-vsctl get Open_vSwitch . other_config
{dpdk-init="true", dpdk-lcore-mask="0x2", dpdk-socket-mem="1024,1024", pmd-cpu-mask="0x15554", vhost-iommu-support="true"}
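
For reference, these options can be set one by one with ovs-vsctl; a minimal sketch using the values from the output above:

# ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
# ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x2
# ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x15554
# ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true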

# ovs-vsctl show 
1e271d29-308d-4201-be11-d898617cc592
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.0", n_rxq="2", n_txq="2"}
        Port vhost-user0
            Interface vhost-user0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser0.sock"}
    Bridge ovsbr1
        datapath_type: netdev
        Port ovsbr1
            Interface ovsbr1
                type: internal
        Port dpdk1
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.1", n_rxq="2", n_txq="2"}
        Port vhost-user1
            Interface vhost-user1
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
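
A topology like the one shown can be built along these lines (a sketch; the exact commands used in the reproduction were elided above, and ovsbr1/dpdk1/vhost-user1 are set up analogously):

# ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
# ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 options:n_rxq=2 options:n_txq=2
# ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock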


3. Start an NFV domain with an IOMMU device and vhost-user interfaces

   <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:22'/>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
      <target dev='vhost-user0'/>
      <model type='virtio'/>
      <driver name='vhost' queues='2' rx_queue_size='1024' iommu='on' ats='on'/>
      <alias name='net1'/>
    </interface>
    
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
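
Note that vhost-user requires the guest RAM to be shared with the vswitch; the full domain XML (not quoted here) is assumed to also contain hugepage-backed shared memory along these lines:

    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB'/>
      </hugepages>
      <access mode='shared'/>
    </memoryBacking>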

4. Set up the kernel options in the domain:
# grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel` 	
# echo "isolated_cores=1,2,3,4,5"  >> /etc/tuned/cpu-partitioning-variables.conf 
# tuned-adm profile cpu-partitioning
# reboot
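
After the reboot, the vIOMMU being active inside the domain can be confirmed with something like:

# dmesg | grep -i -e DMAR -e IOMMU
# find /sys/kernel/iommu_groups/ -type l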

5. Run the netperf tests between the domain client and the host server.
(5.1) The host is the netperf server:
# ip addr add 192.168.1.3/24 dev ens3f1
# netserver 
Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC

(5.2) The domain is the netperf client:
# ip addr add 192.168.1.2/24 dev enp6s0  <-- the domain can ping 192.168.1.3 successfully, but with some packet loss
# netperf -H 192.168.1.3
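
netperf defaults to a TCP_STREAM test; an explicit invocation with a fixed test length would be, for example:

# netperf -H 192.168.1.3 -t TCP_STREAM -l 60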

6. Check the domain dmesg:
# dmesg
[ 4802.234530] ------------[ cut here ]------------
[ 4802.234532] NETDEV WATCHDOG: enp6s0 (virtio_net): transmit queue 0 timed out
[ 4802.234549] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x1f9/0x200
[ 4802.236690] Modules linked in: intel_rapl_msr intel_rapl_common isst_if_common nfit libnvdimm kvm_intel kvm nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge ip_set stp llc iTCO_wdt rfkill iTCO_vendor_support nf_tables irqbypass nfnetlink rapl virtio_balloon i2c_i801 i2c_smbus lpc_ich qrtr pcspkr vfat fat drm fuse xfs libcrc32c ahci libahci nvme_tcp nvme_fabrics nvme libata nvme_core nvme_common t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel net_failover virtio_blk failover serio_raw sunrpc dm_mirror dm_region_hash dm_log dm_mod
[ 4802.243011] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 5.14.0-323.el9.x86_64 #1
[ 4802.243900] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20230301gitf80f052277c8-5.el9 03/01/2023
[ 4802.244809] RIP: 0010:dev_watchdog+0x1f9/0x200
[ 4802.245284] Code: 00 e9 40 ff ff ff 48 89 ef c6 05 03 af 7a 01 01 e8 3c c5 fa ff 44 89 e9 48 89 ee 48 c7 c7 a0 b1 6d 97 48 89 c2 e8 17 82 77 ff <0f> 0b e9 22 ff ff ff 0f 1f 44 00 00 55 53 48 89 fb 48 8b 6f 18 0f
[ 4802.247210] RSP: 0018:ffffb32980003eb0 EFLAGS: 00010286
[ 4802.247766] RAX: 0000000000000000 RBX: ffff99428b8ff488 RCX: 0000000000000027
[ 4802.248511] RDX: 0000000000000027 RSI: ffffffff97e67460 RDI: ffff994337c1f8c8
[ 4802.249262] RBP: ffff99428b8ff000 R08: ffff994337c1f8c0 R09: 0000000000000000
[ 4802.250016] R10: ffffffffffffffff R11: ffffffff98b6f070 R12: ffff99428b8ff3dc
[ 4802.250766] R13: 0000000000000000 R14: ffffffff96b7e5b0 R15: ffffb32980003f08
[ 4802.251516] FS:  0000000000000000(0000) GS:ffff994337c00000(0000) knlGS:0000000000000000
[ 4802.252364] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4802.252977] CR2: 00007ffe7a749000 CR3: 0000000101d54004 CR4: 0000000000770ef0
[ 4802.253732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4802.254479] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4802.255230] PKRU: 55555554
[ 4802.255532] Call Trace:
[ 4802.255805]  <IRQ>
[ 4802.256031]  ? pfifo_fast_change_tx_queue_len+0x70/0x70
[ 4802.256586]  call_timer_fn+0x24/0x130
[ 4802.256986]  __run_timers.part.0+0x1ee/0x280
[ 4802.257444]  ? enqueue_hrtimer+0x2f/0x80
[ 4802.257870]  ? __hrtimer_run_queues+0x159/0x2c0
[ 4802.258358]  run_timer_softirq+0x26/0x50
[ 4802.258785]  __do_softirq+0xc7/0x2ac
[ 4802.259173]  __irq_exit_rcu+0xb9/0xf0
[ 4802.259573]  sysvec_apic_timer_interrupt+0x72/0x90
[ 4802.260084]  </IRQ>
[ 4802.260318]  <TASK>
[ 4802.260559]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 4802.261103] RIP: 0010:default_idle+0x10/0x20
[ 4802.261571] Code: 8b 04 25 40 ef 01 00 f0 80 60 02 df c3 cc cc cc cc 0f ae 38 eb bb 0f 1f 40 00 0f 1f 44 00 00 66 90 0f 00 2d be da 47 00 fb f4 <c3> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 65
[ 4802.263496] RSP: 0018:ffffffff97e03ea8 EFLAGS: 00000252
[ 4802.264050] RAX: ffffffff96d8d320 RBX: ffffffff97e1a940 RCX: 0000000000000000
[ 4802.264803] RDX: 4000000000000000 RSI: ffff994337c22b20 RDI: 000000000497eebc
[ 4802.265554] RBP: 0000000000000000 R08: 0000045e163d1cbb R09: ffff9941d6202400
[ 4802.266301] R10: 0000000000020604 R11: 0000000000000000 R12: 0000000000000000
[ 4802.267054] R13: 000000006dc53d18 R14: 000000006d3c47a8 R15: 000000006d3c47b0
[ 4802.267810]  ? mwait_idle+0x70/0x70
[ 4802.268189]  default_idle_call+0x33/0xe0
[ 4802.268615]  cpuidle_idle_call+0x125/0x160
[ 4802.269051]  ? kvm_sched_clock_read+0x14/0x30
[ 4802.269519]  do_idle+0x78/0xe0
[ 4802.269891]  cpu_startup_entry+0x19/0x20
[ 4802.270311]  rest_init+0xca/0xd0
[ 4802.270671]  arch_call_rest_init+0xa/0x14
[ 4802.271099]  start_kernel+0x4a3/0x4c2
[ 4802.271495]  secondary_startup_64_no_verify+0xe5/0xeb
[ 4802.272037]  </TASK>
[ 4802.272279] ---[ end trace 87fb221169225dfd ]--

Besides the above call trace, the domain keeps logging messages like "virtio_net virtio3 enp6s0: TX timeout on queue: 0, sq: output.0, vq: 0x1, name: output.0, 7820000 usecs ago".

7. Run the ping tests:
# ping  192.168.1.3
PING 192.168.1.3 (192.168.1.3) 56(84) bytes of data.
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
...
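
The "No buffer space available" errors are consistent with the stuck TX queue from step 6; this can be cross-checked from inside the domain (assuming this kernel exposes the per-queue tx_timeout counter in sysfs):

# ip -s link show dev enp6s0   <-- TX packet counters stop increasing
# cat /sys/class/net/enp6s0/queues/tx-0/tx_timeout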

Actual results:
The domain with a vhost-user interface + IOMMU throws a "call trace" when running the netperf tests.

Expected results:
No call trace in the guest.

Additional info:

Comment 1 RHEL Program Management 2023-09-22 15:51:51 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.