Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets there.

Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against the components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry; the e-mail creates a ServiceNow ticket with Red Hat.

Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", will have a little "two-footprint" icon next to it, and will direct you to the "RHEL project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). The same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.

Bug 2213198

Summary: [vhost-user][dpdk-testpmd] The vhost-user interface's transmit is 0 in the moongen_throughput tests
Product: Red Hat Enterprise Linux 8
Component: qemu-kvm
Sub component: Networking
Version: 8.9
Hardware: x86_64
OS: Linux
Status: CLOSED MIGRATED
Severity: medium
Priority: medium
Keywords: MigratedToJIRA, Triaged
Target Milestone: rc
Target Release: ---
Reporter: Yanghang Liu <yanghliu>
Assignee: Laurent Vivier <lvivier>
QA Contact: Yanghang Liu <yanghliu>
Docs Contact:
CC: chayang, coli, jinzhao, juzhang, lvivier, maxime.coquelin, virt-maint, yanghliu, ymankad
Flags: pm-rhel: mirror+
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-09-22 16:07:28 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Yanghang Liu 2023-06-07 12:33:40 UTC
Description of problem:
The multi-queue vhost-user interface's TX rate is 0 in the moongen_throughput tests.

Version-Release number of selected component (if applicable):
host:
qemu-kvm-6.2.0-33.module+el8.9.0+18724+20190c23.x86_64
libvirt-8.0.0-19.module+el8.8.0+18453+e0bf0d1d.x86_64
4.18.0-494.el8.x86_64
dpdk-21.11-2.el8_6.x86_64
openvswitch2.17-2.17.0-103.el8fdp.x86_64
guest:
dpdk-21.11-2.el8_6.x86_64
4.18.0-494.el8.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Set up the host kernel options (CPU isolation, huge pages, IOMMU); a quick sanity check follows the commands below

# grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel` 
# echo "isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,31,29,27,25,23,21,19,17,15,13,11"  >> /etc/tuned/cpu-partitioning-variables.conf  
# tuned-adm profile cpu-partitioning
# reboot
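
For reference, the settings can be sanity-checked after the reboot with commands along these lines (exact output depends on the host):

# cat /proc/cmdline
# cat /sys/devices/system/cpu/isolated
# tuned-adm active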

2. Start OVS-DPDK on the host; a configuration sketch follows the output below

# echo 20 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# echo 20 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind.py --bind=vfio-pci 0000:5e:00.0
# dpdk-devbind.py --bind=vfio-pci 0000:5e:00.1
...
# ovs-vsctl show 
237ee9dd-96fa-4549-a45a-1ab474b19605
    Bridge ovsbr1
        datapath_type: netdev
        Port ovsbr1
            Interface ovsbr1
                type: internal
        Port vhost-user1
            Interface vhost-user1
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
        Port dpdk1
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.1", n_rxq="2", n_txq="2"}
    Bridge ovsbr0
        datapath_type: netdev
        Port vhost-user0
            Interface vhost-user0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser0.sock"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.0", n_rxq="2", n_txq="2"}

# ovs-vsctl get Open_vSwitch . other_config
{dpdk-init="true", dpdk-lcore-mask="0x2", dpdk-socket-mem="1024,1024", pmd-cpu-mask="0x15554", vhost-iommu-support="true"}
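
The ovs-vsctl commands that build this configuration are elided above; an equivalent setup could be created with commands along these lines (ovsbr0 shown, ovsbr1 is analogous; PCI addresses and socket paths are taken from the output above):

# ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true other_config:dpdk-lcore-mask=0x2 other_config:dpdk-socket-mem="1024,1024" other_config:pmd-cpu-mask=0x15554 other_config:vhost-iommu-support=true
# ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
# ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 options:n_rxq=2 options:n_txq=2
# ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock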


3. Start an NFV virt domain.
The domain XML is in the test log.

4. Hotplug two multi-queue vhost-user interfaces into the domain (a hotplug command sketch follows the XML below)

   <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:22'/>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
      <target dev='vhost-user0'/>
      <model type='virtio'/>
      <driver name='vhost' queues='2' rx_queue_size='1024' iommu='on' ats='on'/>
      <alias name='net1'/>
    </interface>

    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:23'/>
      <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
      <target dev='vhost-user1'/>
      <model type='virtio'/>
      <driver name='vhost' queues='2' rx_queue_size='1024' iommu='on' ats='on'/>
      <alias name='net2'/>
    </interface>
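
With the two interface definitions above saved to files, the hotplug itself would typically be done with virsh; the domain name and file names here are placeholders:

# virsh attach-device <domain> vhost-user0.xml --live
# virsh attach-device <domain> vhost-user1.xml --live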

5. Set up the kernel options in the domain
# grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel` 	
# echo "isolated_cores=1,2,3,4,5"  >> /etc/tuned/cpu-partitioning-variables.conf 
# tuned-adm profile cpu-partitioning
# reboot

6. Start dpdk-testpmd in the domain (a diagnostic note follows the commands below)
# echo 2 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind.py --bind=vfio-pci 0000:06:00.0
# dpdk-devbind.py --bind=vfio-pci 0000:07:00.0
# dpdk-testpmd -l 1,2,3,4,5 -n 4  -d /usr/lib64/librte_net_virtio.so  -- --nb-cores=4 -i --disable-rss --rxd=512 --txd=512 --rxq=2 --txq=2 
  testpmd> start
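
If the TX counters later stay at 0 (as in the final check), the forwarding configuration inside testpmd can be inspected with standard testpmd commands, for example:

  testpmd> show config fwd
  testpmd> show port info all
  testpmd> show fwd stats all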

7. Run MoonGen tests
# ./build/MoonGen examples/opnfv-vsperf.lua > /tmp/throughput.log

8. Check whether the VM with the vhost-user interfaces can send and receive packets
testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 4072457825 RX-missed: 0          RX-bytes:  244347469500
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 514        TX-errors: 0          TX-bytes:  30840

  Throughput (since last show)
  Rx-pps:       436528          Rx-bps:    209533784
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 1802315709 RX-missed: 0          RX-bytes:  108138942540
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 512        TX-errors: 0          TX-bytes:  30720

  Throughput (since last show)
  Rx-pps:       385891          Rx-bps:    185227704
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

Actual results:
The VM cannot send packets.

Expected results:
The VM can send and receive packets normally.

Additional info:
(1) Repeating the above steps with a PF or VF instead, the VM can send and receive packets normally.
The full log is in:
http://10.73.72.41/log/2023-06-06_07:55/nfv_hotplug_hotunplug_PF_performance_1G
http://10.73.72.41/log/2023-06-06_07:55/nfv_hotplug_hotunplug_vhostuser_interface_performance_1G

Comment 1 Laurent Vivier 2023-06-09 14:38:12 UTC
Maxime,

could you have a look to see if the problem is with QEMU or with DPDK?

Comment 2 Maxime Coquelin 2023-06-12 07:58:15 UTC
(In reply to Laurent Vivier from comment #1)
> Maxime,
> 
> could you have a look to see if the problem is with QEMU or with DPDK?

Sure, I would need access to the machine if possible.

Comment 5 Maxime Coquelin 2023-06-26 09:27:07 UTC
Hello,

I just ran through the exact steps you list in Comment 4, and it seems to work as expected:


  ######################## NIC statistics for port 0  ########################
  RX-packets: 2404892183 RX-missed: 0          RX-bytes:  144293530980
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 283770816  TX-errors: 0          TX-bytes:  17026248960

  Throughput (since last show)
  Rx-pps:     10770498          Rx-bps:   5169839424
  Tx-pps:     10770463          Tx-bps:   5169822416
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 283781148  RX-missed: 0          RX-bytes:  17026869000
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 2404902429 TX-errors: 0          TX-bytes:  144294146700

  Throughput (since last show)
  Rx-pps:     10771450          Rx-bps:   5170296608
  Tx-pps:     10771490          Tx-bps:   5170318424
  ############################################################################
testpmd>

And on Moongen side:

[Device: id=0] Sent 542277894 packets, current rate 14.66 Mpps, 7504.66 MBit/s, 9849.87 MBit/s wire rate.
[Device: id=1] Sent 542295001 packets, current rate 14.66 Mpps, 7505.61 MBit/s, 9851.12 MBit/s wire rate.
[Device: id=1] Received 408977910 packets, current rate 10.77 Mpps, 5513.70 MBit/s, 7236.74 MBit/s wire rate.
[Device: id=0] Received 408985748 packets, current rate 10.77 Mpps, 5513.72 MBit/s, 7236.76 MBit/s wire rate.
[Device: id=0] Sent 556935445 packets, current rate 14.66 Mpps, 7504.62 MBit/s, 9849.81 MBit/s wire rate.
[Device: id=1] Sent 556951599 packets, current rate 14.66 Mpps, 7504.10 MBit/s, 9849.14 MBit/s wire rate.
[Device: id=1] Received 419748410 packets, current rate 10.77 Mpps, 5514.50 MBit/s, 7237.78 MBit/s wire rate.
[Device: id=0] Received 419756268 packets, current rate 10.77 Mpps, 5514.50 MBit/s, 7237.79 MBit/s wire rate.
[Device: id=0] Sent 571593111 packets, current rate 14.66 Mpps, 7504.65 MBit/s, 9849.85 MBit/s wire rate.
[Device: id=1] Sent 571609134 packets, current rate 14.66 Mpps, 7504.59 MBit/s, 9849.78 MBit/s wire rate.
[Device: id=1] Received 430521547 packets, current rate 10.77 Mpps, 5515.85 MBit/s, 7239.55 MBit/s wire rate.
[Device: id=0] Received 430529393 packets, current rate 10.77 Mpps, 5515.84 MBit/s, 7239.54 MBit/s wire rate.

Comment 9 Yanghang Liu 2023-07-05 06:29:27 UTC
Hi Maxime,

This issue can always be reproduced via auto tests in RHEL8.9:
****************************************************
Packets_loss Frame_Size(Byte) Run_No Throughput(Mpps)
           0               64      0 0
****************************************************
Detailed Test log: http://10.73.72.41/log/2023-06-28_21:46/nfv_hotplug_hotunplug_vhostuser_interface_performance_1G


But the same auto tests cannot reproduce this issue in RHEL9.3:
****************************************************
Packets_loss Frame_Size(Byte) Run_No Throughput(Mpps)
           0               64      0 21.127296
****************************************************
Detailed Test log: http://10.73.72.41/log/2023-07-04_17:21/nfv_hotplug_hotunplug_vhostuser_interface_performance_1G

Let me check whether I'm missing something in my manual testing once my test environment is free, and then I will provide the reproducer to you again.

Comment 10 RHEL Program Management 2023-09-22 15:51:54 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 11 Red Hat Bugzilla 2024-01-21 04:25:51 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days