Bug 1738768 - Guest fails to recover receiving packets after vhost-user reconnect
Summary: Guest fails to recover receiving packets after vhost-user reconnect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Adrián Moreno
QA Contact: Pei Zhang
URL:
Whiteboard:
Duplicates: 1791904
Depends On:
Blocks: 1780498 1791983
 
Reported: 2019-08-08 06:14 UTC by Pei Zhang
Modified: 2020-02-04 18:29 UTC (History)
CC List: 9 users

Fixed In Version: qemu-kvm-4.1.0-16.module+el8.1.1+4917+752cfd65
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 1780498
Environment:
Last Closed: 2020-02-04 18:28:48 UTC
Type: Bug
Target Upstream Version:
Embargoed:
amorenoz: needinfo-


Links:
Red Hat Product Errata RHBA-2020:0404 (last updated 2020-02-04 18:29:57 UTC)

Description Pei Zhang 2019-08-08 06:14:22 UTC
Description of problem:
Boot a guest with OVS + vhost-user + DPDK. After reconnecting vhost-user by restarting OVS, testpmd in the guest fails to resume receiving packets.

Version-Release number of selected component (if applicable):
4.18.0-128.el8.x86_64
dpdk-18.11-8.el8.x86_64
libvirt-5.6.0-1.module+el8.1.0+3890+4d3d259c.x86_64
openvswitch2.11-2.11.0-18.el8fdp.x86_64
qemu v4.1.0-rc4

upstream qemu: git://git.qemu.org/qemu.git master
# git log -1
commit 864ab314f1d924129d06ac7b571f105a2b76a4b2 (HEAD -> master, tag: v4.1.0-rc4, origin/master, origin/HEAD)
Author: Peter Maydell <peter.maydell>
Date:   Tue Aug 6 17:05:21 2019 +0100

    Update version for v4.1.0-rc4 release
    
    Signed-off-by: Peter Maydell <peter.maydell>


How reproducible:
100%

Steps to Reproduce:
1. Boot OVS, refer to [1]

# sh boot_ovs_client.sh

2. Start the guest, refer to [2]

3. Start testpmd in the guest and start MoonGen on another host; the guest can receive packets, refer to [3]

4. Reconnect vhost-user by restarting OVS

# sh boot_ovs_client.sh

5. Check testpmd in the guest: packet reception does not recover.

Actual results:
DPDK packet reception does not recover after the vhost-user reconnect.

Expected results:
DPDK packet reception should recover after the vhost-user reconnect.

Additional info:
1. This appears to be a regression, as the versions below work well:
qemu-kvm-2.12.0-83.module+el8.1.0+3852+0ba8aef0.x86_64 works well
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.x86_64   works well


Reference:
[1]
# cat boot_ovs_client.sh 
#!/bin/bash

set -e

echo "killing old ovs process"
pkill -f ovs-vswitchd || true
sleep 5
pkill -f ovsdb-server || true

echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch

echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch
rm -f $DB_FILE

echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init

echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

echo "creating bridge and ports"

ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 
ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl --if-exists del-br ovsbr1
ovs-vsctl add-br ovsbr1 -- set bridge ovsbr1 datapath_type=netdev
ovs-vsctl add-port ovsbr1 dpdk1 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=0000:5e:00.1
ovs-vsctl add-port ovsbr1 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser1.sock
ovs-ofctl del-flows ovsbr1
ovs-ofctl add-flow ovsbr1 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr1 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1554
ovs-vsctl set Interface dpdk0 options:n_rxq=1
ovs-vsctl set Interface dpdk1 options:n_rxq=1

echo "all done"


[2]
<domain type='kvm'>
  <name>rhel8.0</name>
  <uuid>c67628f0-b996-11e9-8d0c-a0369fc7bbea</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='31'/>
    <vcpupin vcpu='1' cpuset='29'/>
    <vcpupin vcpu='2' cpuset='30'/>
    <vcpupin vcpu='3' cpuset='28'/>
    <vcpupin vcpu='4' cpuset='26'/>
    <vcpupin vcpu='5' cpuset='24'/>
    <emulatorpin cpuset='1,3,5,7,9,11'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='pc-q35-4.1'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <pmu state='off'/>
    <vmport state='off'/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='tsc-deadline'/>
    <numa>
      <cell id='0' cpus='0-5' memory='8388608' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='threads' iommu='on' ats='on'/>
      <source file='/mnt/nfv//rhel8.0.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='none'/>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='18:66:da:5f:dd:01'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:02'/>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:03'/>
      <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </memballoon>
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
  </devices>
</domain>

[3]
testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so.1 \
-w 0000:06:00.0 -w 0000:07:00.0 \
-- \
--nb-cores=2 \
-i \
--disable-rss \
--rxd=512 --txd=512 \
--rxq=1 --txq=1

Comment 2 Pei Zhang 2019-08-16 02:09:07 UTC
The latest versions below can still hit this issue:

4.18.0-132.el8.x86_64
qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64

Comment 3 Adrián Moreno 2019-09-17 18:42:31 UTC
Reproducing it with testpmd yields:
 > VHOST_CONFIG: negotiated Virtio features: 0x40000000           
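
For context: 0x40000000 has only bit 30 set, which is the vhost-user protocol-features bit, so none of the guest-acked virtio feature bits were re-negotiated. A minimal, illustrative decode (a standalone program, not QEMU code; the macro value matches the vhost-user protocol definition used by QEMU and DPDK):

#include <assert.h>
#include <stdint.h>

/* Bit 30 is the vhost-user protocol-features bit. */
#define VHOST_USER_F_PROTOCOL_FEATURES 30

int main(void)
{
    /* Value printed by VHOST_CONFIG after the reconnect. */
    uint64_t negotiated = 0x40000000ULL;

    /* Only the protocol-features bit is set: no guest-acked virtio feature
     * bits (VIRTIO_NET_F_*, VIRTIO_F_VERSION_1, ...) survived the reconnect. */
    assert(negotiated == (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
    return 0;
}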

So clearly the virtio features are not properly saved / restored across the reconnect. That narrows the possible regression down to the following commit:

commit 6ab79a20af3a7b3bf610ba9aebb446a9f0b05930
Author: Dan Streetman <ddstreet>
Date:   Tue Apr 16 14:46:24 2019 -0400

    do not call vhost_net_cleanup() on running net from char user event
    
    Buglink: https://launchpad.net/bugs/1823458
    
    Currently, a user CHR_EVENT_CLOSED event will cause net_vhost_user_event()
    to call vhost_user_cleanup(), which calls vhost_net_cleanup() for all
    its queues.  However, vhost_net_cleanup() must never be called like
    this for fully-initialized nets; when other code later calls
    vhost_net_stop() - such as from virtio_net_vhost_status() - it will try
    to access the already-cleaned-up fields and fail with assertion errors
    or segfaults.
    
    The vhost_net_cleanup() will eventually be called from
    qemu_cleanup_net_client().
    
    Signed-off-by: Dan Streetman <ddstreet>
    Message-Id: <20190416184624.15397-3-dan.streetman>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 5a26a24708..51921de443 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -236,7 +236,6 @@ static void chr_closed_bh(void *opaque)
     s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
 
     qmp_set_link(name, false, &err);
-    vhost_user_stop(queues, ncs);
 
     qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, net_vhost_user_event,
                              NULL, opaque, NULL, true);




Indeed, vhost_user_stop is no longer called:

static void vhost_user_stop(int queues, NetClientState *ncs[])
{
    NetVhostUserState *s;
    int i;

    for (i = 0; i < queues; i++) {
        assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER);

        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);

        if (s->vhost_net) {
            /* save acked features */
            uint64_t features = vhost_net_get_acked_features(s->vhost_net);
            if (features) {
                s->acked_features = features;
            }
            vhost_net_cleanup(s->vhost_net);
        }
    }
}
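
In line with that analysis, one plausible direction for a fix is to keep snapshotting the acked features in chr_closed_bh() itself, without re-introducing the vhost_net_cleanup() call that the commit above removed. The following is only a sketch of that idea against the QEMU 4.1 net/vhost-user.c context quoted above, not necessarily the exact patch that was later sent upstream (see comment 4):

static void chr_closed_bh(void *opaque)
{
    const char *name = opaque;
    NetClientState *ncs[MAX_QUEUE_NUM];
    NetVhostUserState *s;
    Error *err = NULL;
    int queues, i;

    queues = qemu_find_net_clients_except(name, ncs,
                                          NET_CLIENT_DRIVER_NIC,
                                          MAX_QUEUE_NUM);
    assert(queues < MAX_QUEUE_NUM);

    /* Save the guest-acked features before the link goes down, so that a
     * later reconnect can restore them; vhost_user_stop() used to do this
     * as a side effect before the commit above removed the call. */
    for (i = 0; i < queues; i++) {
        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
        if (s->vhost_net) {
            s->acked_features = vhost_net_get_acked_features(s->vhost_net);
        }
    }

    s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);

    qmp_set_link(name, false, &err);

    qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, net_vhost_user_event,
                             NULL, opaque, NULL, true);

    if (err) {
        error_report_err(err);
    }
}

If the CHR_EVENT_CLOSED handling has another path that does not go through this bottom half, it would presumably need the same treatment.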

Comment 4 Adrián Moreno 2019-09-26 13:14:06 UTC
A patch has been sent upstream [1], but it missed AV8.1.
[1] https://patchew.org/QEMU/20190924162044.11414-1-amorenoz@redhat.com/

Moving it to AV8.1.1

Comment 10 Pei Zhang 2019-12-06 08:53:59 UTC
Hi Adrian,

This issue still exists with qemu-kvm-4.1.0-16.module+el8.1.1+4917+752cfd65.x86_64. The testpmd network still cannot recover. Could you check, please?

Move bug status from 'ON_QA' to 'ASSIGNED'.


Best regards,

Pei

Comment 11 Adrián Moreno 2019-12-10 10:41:48 UTC
This must be related to a new issue.
Can you please attach logs from the qemu and ovs sides?
Thanks

Comment 14 Pei Zhang 2019-12-26 09:55:46 UTC
(In reply to Adrián Moreno from comment #11)
> It must be related to a new issue.
> Can you please attach logs from qemu and ovs side?
> Thanks

(1) qemu-kvm-4.1.0-15.module+el8.1.1+4700+209eec8f.x86_64 (reproducing version)
(2) qemu-kvm-4.1.0-20.module+el8.1.1+5309+6d656f05.x86_64 (version with the fix)

Comparing versions (1) and (2) shows the following differences:

With version (1), testpmd stops both sending (TX) and receiving (RX) packets after the OVS vhost-user reconnect:

testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 1422265    RX-missed: 0          RX-bytes:  85335900
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 1420379    TX-errors: 0          TX-bytes:  85222740

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 1421558    RX-missed: 0          RX-bytes:  85293480
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 1421110    TX-errors: 0          TX-bytes:  85266600

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

With version (2), testpmd RX can receive packets again, but all received packets are counted as RX errors after the OVS vhost-user reconnect:

testpmd> show port stats all 

  ######################## NIC statistics for port 0  ########################
  RX-packets: 1139121    RX-missed: 0          RX-bytes:  68347260
  RX-errors: 1476988
  RX-nombuf:  0         
  TX-packets: 1137417    TX-errors: 0          TX-bytes:  68245020

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 1138563    RX-missed: 0          RX-bytes:  68313780
  RX-errors: 1479063
  RX-nombuf:  0         
  TX-packets: 1137968    TX-errors: 0          TX-bytes:  68278080

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################
 

I consider this bug fixed. Adrián has filed a new bz (Bug 1782528 - qemu-kvm: event flood when vhost-user backed virtio netdev is unexpectedly closed while guest is transmitting) to track the new RX-errors issue.

Moving this bug to 'VERIFIED'.

Hi Adrián, 

Feel free to correct me if you disagree. Thanks. 

(And sorry for my late response)


Best regards,

Pei

Comment 16 Amnon Ilan 2020-01-21 21:49:51 UTC
*** Bug 1791904 has been marked as a duplicate of this bug. ***

Comment 18 errata-xmlrpc 2020-02-04 18:28:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0404

