Bug 1880965 - ovs-vswitchd is killed when restarting it with vhost-user 4 queues + vIOMMU[ovs2.13]
Summary: ovs-vswitchd is killed when restarting it with vhost-user 4 queues + vIOMMU[ovs2.13]
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.13
Version: FDP 20.C
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Maxime Coquelin
QA Contact: Yanghang Liu
URL:
Whiteboard:
Depends On:
Blocks: 1880971
 
Reported: 2020-09-21 09:45 UTC by Pei Zhang
Modified: 2023-07-03 04:50 UTC (History)
CC: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1880971 (view as bug list)
Environment:
Last Closed: 2023-06-16 15:29:24 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-859 0 None None None 2023-06-16 15:30:46 UTC

Description Pei Zhang 2020-09-21 09:45:52 UTC
This bug was initially created as a copy of Bug #1880957

I am copying this bug because: 

This bug was initially created as a copy of Bug #1880299

I am copying this bug because: 



Description of problem:
Boot a VM with vhost-user 4 queues, then start OVS as the vhost-user client. Kill OVS and start it again; this causes both QEMU and ovs-vswitchd to be killed.

Version-Release number of selected component (if applicable):
4.18.0-237.el8.x86_64
qemu-kvm-5.1.0-7.module+el8.3.0+8099+dba2fe3e.x86_64
openvswitch2.13-2.13.0-60.el8fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot VM with vhost-user 4 queues 

    <interface type="vhostuser">
      <mac address="88:66:da:5f:dd:02" />
      <source mode="server" path="/tmp/vhost-user1.sock" type="unix" />
      <model type="virtio" />
      <driver ats="on" iommu="on" name="vhost" queues="4" rx_queue_size="1024" />
      <address bus="0x6" domain="0x0000" function="0x0" slot="0x00" type="pci" />
    </interface>
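For reference, the libvirt interface definition above corresponds roughly to the following qemu-kvm command-line fragment. This is a sketch, not taken from the report: the chardev/netdev ids, the intel-iommu device line, and the shared hugepage memory backend are assumptions needed for vhost-user with a vIOMMU to work at all.

```shell
# Illustrative equivalent of the <interface type="vhostuser"> XML above;
# ids and the vIOMMU/memory-backend lines are assumptions, not from the
# report. vectors=2*queues+2 for a multiqueue virtio-net device.
qemu-kvm \
  -machine q35,kernel-irqchip=split,accel=kvm \
  -device intel-iommu,intr_remap=on,device-iotlb=on,caching-mode=on \
  -object memory-backend-file,id=mem,size=4G,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem \
  -chardev socket,id=charnet1,path=/tmp/vhost-user1.sock,server=on \
  -netdev vhost-user,chardev=charnet1,queues=4,id=hostnet1 \
  -device virtio-net-pci,netdev=hostnet1,mac=88:66:da:5f:dd:02,mq=on,vectors=10,rx_queue_size=1024,iommu_platform=on,ats=on
```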



2. Boot ovs

# cat boot_ovs_client.sh 
#!/bin/bash

set -e

echo "killing old ovs process"
pkill -f ovs-vswitchd || true
sleep 5
pkill -f ovsdb-server || true

echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch

echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch
rm -f $DB_FILE

echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init

echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

echo "creating bridge and ports"

ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 
ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1554
ovs-vsctl set Interface dpdk0 options:n_rxq=4

echo "all done"


# sh boot_ovs_client.sh
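An optional sanity check, not part of the original report: after the script completes, confirm that the vhost-user client port actually reconnected and the PMD rxq assignment took effect before restarting. Port names match the script above.

```shell
# Show the vhost-user client port state (status column reports the
# socket connection) and the per-PMD rx queue distribution.
ovs-vsctl --columns=name,status list Interface vhost-user0
ovs-appctl dpif-netdev/pmd-rxq-show
```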


3. Restart ovs

# sh boot_ovs_client.sh

4. QEMU crashes, and ovs-vswitchd is killed.

# abrt-cli list
id e70fe052ff1620fa88b4cbfa2f43b704710a1504
reason:         vhost_user_iotlb_cache_insert(): ovs-vswitchd killed by SIGSEGV
time:           Mon 21 Sep 2020 05:23:38 AM EDT
cmdline:        ovs-vswitchd unix:/var/run/openvswitch/db.sock --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log
package:        openvswitch2.13-2.13.0-60.el8fdp
uid:            0 (root)
count:          3
Directory:      /var/spool/abrt/ccpp-2020-09-21-05:23:38.988843-8317
Run 'abrt-cli report /var/spool/abrt/ccpp-2020-09-21-05:23:38.988843-8317' for creating a case in Red Hat Customer Portal
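A hypothetical debugging step, not part of the original report: pull a backtrace from the abrt core dump to confirm the crash site in vhost_user_iotlb_cache_insert(). The dump directory is taken from the abrt output above; the debuginfo package name and binary path are assumptions.

```shell
# Install debug symbols (package name assumed) and extract a backtrace
# from the core dump that abrt captured for ovs-vswitchd.
DUMP=/var/spool/abrt/ccpp-2020-09-21-05:23:38.988843-8317
dnf debuginfo-install -y openvswitch2.13
gdb -batch -ex 'bt' /usr/sbin/ovs-vswitchd "$DUMP"/coredump
```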


Actual results:
Restarting OVS causes QEMU to crash and ovs-vswitchd to be killed.

Expected results:
Both QEMU and ovs-vswitchd should keep working.

Additional info:
1. This issue cannot be reproduced with vhost-user 2 queues; 4 queues are needed to trigger it.

Comment 2 Maxime Coquelin 2020-10-21 10:48:24 UTC
A DPDK series fixing this issue has been posted upstream:
http://patches.dpdk.org/user/todo/dpdk/?series=13110

Comment 4 Flavio Leitner 2023-06-16 15:29:24 UTC
The patches were merged in DPDK 20.11.
Closing this.

