Bug 1293405 - [fdBeta] [RFE] OVS-DPDK: reconnect vhost-user ports in case of vswitch failure
Status: VERIFIED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openvswitch
Version: 7.3
Hardware: x86_64 Linux
Priority: high  Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Kevin Traynor
QA Contact: Pei Zhang
Keywords: FutureFeature
Depends On: 1335865
Blocks: 1411879
Reported: 2015-12-21 12:07 EST by Flavio Leitner
Modified: 2017-01-17 06:15 EST (History)
15 users

See Also:
Fixed In Version: openvswitch-2.6.1-2.git20161206.el7fdb
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Flavio Leitner 2015-12-21 12:07:00 EST
Description of problem:

If the OVS bridge crashes, it should be possible to reconnect the guests using vhost-user ports.
Comment 2 Rick Alongi 2016-01-13 10:15:14 EST
Reassigning bug to Jean Hsiao as this involves DPDK.
Comment 3 Flavio Leitner 2016-07-28 12:34:35 EDT
Panu,

This is probably only solvable with vHost PMD using client mode where someone else provides the socket for OVS, then it just needs to re-open the socket.

fbl
Comment 4 Panu Matilainen 2016-09-14 09:04:09 EDT
As noted in comment #3, this will at minimum require OVS to use client-mode vhost-user, meaning OVS >= 2.6.
Comment 5 Kevin Traynor 2016-11-02 12:21:21 EDT
As stated above, OVS 2.6 will be needed for this. QEMU 2.7+ will also be needed.

commit c1ff66ac80b51aea49c3d672393a9a35f5164790
Author: Ciara Loftus <ciara.loftus@intel.com>
Date:   Mon Aug 15 16:11:26 2016 +0100

    netdev-dpdk: vHost client mode and reconnect
    
    Until now, vHost ports in OVS have only been able to operate in 'server'
    mode whereby OVS creates and manages the vHost socket and essentially
    acts as the vHost 'server'. With this commit a new mode, 'client' mode,
    is available. In this mode, OVS acts as the vHost 'client' and connects
    to the socket created and managed by QEMU which now acts as the vHost
    'server'. This mode allows for reconnect capability, which allows a
    vHost port to resume normal connectivity in event of switch reset.
    
    By default dpdkvhostuser ports still operate in 'server' mode. That is
    unless a valid 'vhost-server-path' is specified for a device like so:
    
    ovs-vsctl set Interface dpdkvhostuser0
    options:vhost-server-path=/path/to/socket
    
    'vhost-server-path' represents the full path of the vhost user socket
    that has been or will be created by QEMU. Once specified, the port stays
    in 'client' mode for the remainder of its lifetime.
    
    QEMU v2.7.0+ is required when using OVS in vHost client mode and QEMU in
    vHost server mode.
    
    Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
    Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
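The two sides described in the commit message can be assembled into one minimal sketch (the chardev/netdev IDs and /path/to/socket are placeholders, as in the commit message itself): QEMU creates and owns the socket in server mode, while OVS attaches to it in client mode, which is what allows reconnection after a vswitchd restart.

```shell
# QEMU side: create and own the socket. The ",server" suffix on the chardev
# makes QEMU the vhost server:
#
#   -chardev socket,id=char0,path=/path/to/socket,server \
#   -netdev type=vhost-user,id=mynet0,chardev=char0,vhostforce \
#   -device virtio-net-pci,netdev=mynet0
#
# OVS side: point the port at QEMU's socket, per the commit message above.
# Once vhost-server-path is set, the port stays in client mode for the
# remainder of its lifetime and reconnects after a switch reset:

ovs-vsctl set Interface dpdkvhostuser0 \
    options:vhost-server-path=/path/to/socket
```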
Comment 8 Pei Zhang 2017-01-17 04:36:59 EST
Verification:

Versions:
3.10.0-514.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7.x86_64
dpdk-16.11-2.el7fdb.x86_64
openvswitch-2.6.1-3.git20161206.el7fdb.x86_64


Steps:
1. Boot the guest with 2 vhost-user server sockets /tmp/vhostuser$i.sock. For the full qemu command, refer to [1].

-chardev socket,id=char0,path=/tmp/vhostuser$i.sock,server \
-device virtio-net-pci,netdev=mynet0,mac=54:52:00:1a:2c:01 \
-netdev type=vhost-user,id=mynet0,chardev=char0,vhostforce \


2. Start ovs with 2 dpdkvhostuserclient ports which use socket path /tmp/vhostuser$i.sock. For the shell script boot_ovs.sh, refer to [2].

ovs-vsctl add-port ovsbr0 vhost-user$i -- set Interface vhost-user$i type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser$i.sock


3. Start testpmd in the guest; packets can be forwarded. For the full command, refer to [3].

MoonGen receives data:
[Device: id=0] Received 93153529 packets, current rate 0.13 Mpps, 64.00 MBit/s, 84.00 MBit/s wire rate.
[Device: id=0] Sent 219131179 packets, current rate 0.15 Mpps, 76.64 MBit/s, 100.59 MBit/s wire rate.
[Device: id=1] Received 111338211 packets, current rate 0.15 Mpps, 76.64 MBit/s, 100.59 MBit/s wire rate.
[Device: id=1] Sent 202038848 packets, current rate 0.13 Mpps, 64.00 MBit/s, 84.00 MBit/s wire rate.


4. Restart ovs to simulate dpdkvhostuserclient disconnect/reconnect. During this process, testpmd in the guest will stop forwarding for a while. After the restart finishes, testpmd in the guest continues forwarding packets.

# sh boot_ovs.sh


5. Repeat step 4 several times; the network in the guest always recovers.
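The repeated restart in steps 4-5 can be scripted. A minimal sketch, assuming boot_ovs.sh from [2] sits in the current directory and ovs-vswitchd writes its default pidfile under /var/run/openvswitch; wait_for_path is a hypothetical helper, not part of OVS:

```shell
#!/bin/bash
# Hypothetical helper: poll until a path exists, up to a timeout in seconds.
wait_for_path() {
    local path=$1 timeout=${2:-30} waited=0
    until [ -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Restart OVS several times (step 5). After each restart, wait for vswitchd
# to come back up; the client-mode ports then reconnect to QEMU's sockets
# and testpmd in the guest resumes forwarding.
if [ -f boot_ovs.sh ]; then
    for i in 1 2 3 4 5; do
        sh boot_ovs.sh
        wait_for_path /var/run/openvswitch/ovs-vswitchd.pid 30 || {
            echo "vswitchd did not come back after restart $i"
            exit 1
        }
    done
fi
```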



Hi Kevin,

I'd like to confirm with you: are the above steps correct to verify this bug? If not, what is the correct testing scenario? Thanks.


Best Regards,
Pei


More detailed info:
[1] qemu command line
# cat qemu.sh 
/usr/libexec/qemu-kvm -cpu SandyBridge -m 4096 -smp 4 \
-object memory-backend-file,id=mem0,size=4096M,mem-path=/dev/hugepages,share=on \
-numa node,nodeid=0,memdev=mem0 \
-mem-prealloc \
/home/pezhang/rhel7.4.qcow2 \
-chardev socket,id=char0,path=/tmp/vhostuser1.sock,server \
-device virtio-net-pci,netdev=mynet0,mac=54:52:00:1a:2c:01 \
-netdev type=vhost-user,id=mynet0,chardev=char0,vhostforce \
-chardev socket,id=char1,path=/tmp/vhostuser2.sock,server \
-device virtio-net-pci,netdev=mynet1,mac=54:52:00:1a:2c:02 \
-netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce \
-vga std -vnc :10 \
-monitor stdio \
-serial unix:/tmp/monitor,server,nowait

[2]# cat boot_ovs.sh 
#!/bin/bash

set -e

echo "killing old ovs process"
pkill -f ovs- || true
pkill -f ovsdb || true

echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch

rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch

echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -f /var/run/openvswitch/vhost-user*
rm -f $DB_FILE

echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init

echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

echo "creating bridge and ports"

ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port ovsbr0 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser1.sock
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl --if-exists del-br ovsbr1
ovs-vsctl add-br ovsbr1 -- set bridge ovsbr1 datapath_type=netdev
ovs-vsctl add-port ovsbr1 dpdk1 -- set Interface dpdk1 type=dpdk
ovs-vsctl add-port ovsbr1 vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser2.sock
ovs-ofctl del-flows ovsbr1
ovs-ofctl add-flow ovsbr1 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr1 "in_port=2,idle_timeout=0 actions=output:1"

echo "all done"


[3]
# cat testpmd.sh
queues=1
cores=2
testpmd \
-l 0,1,2 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so \
-w 0000:00:03.0 -w 0000:00:04.0 \
-- \
--nb-cores=${cores} \
--disable-hw-vlan -i \
--disable-rss \
--rxq=${queues} --txq=${queues} \
--auto-start \
--rxd=256 --txd=256
Comment 9 Kevin Traynor 2017-01-17 04:56:28 EST
Hi Pei,

Yes, the test described above is a correct test to verify this bug. Your results show the feature working as expected. Thanks for providing the detailed description.

Kevin.
