This bug was initially created as a copy of Bug #1880957
I am copying this bug because:
This bug was initially created as a copy of Bug #1880299
I am copying this bug because:
Description of problem:
Boot a VM with a 4-queue vhost-user interface, then start OVS as the vhost-user client. Kill OVS and start it again; this causes QEMU to crash and ovs-vswitchd to be killed.
Version-Release number of selected component (if applicable):
4.18.0-237.el8.x86_64
qemu-kvm-5.1.0-7.module+el8.3.0+8099+dba2fe3e.x86_64
openvswitch2.13-2.13.0-60.el8fdp.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Boot VM with vhost-user 4 queues
<interface type="vhostuser">
  <mac address="88:66:da:5f:dd:02" />
  <source mode="server" path="/tmp/vhost-user1.sock" type="unix" />
  <model type="virtio" />
  <driver ats="on" iommu="on" name="vhost" queues="4" rx_queue_size="1024" />
  <address bus="0x6" domain="0x0000" function="0x0" slot="0x00" type="pci" />
</interface>
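For reference, the interface above should correspond roughly to the following qemu-kvm arguments generated by libvirt (a sketch only; the chardev/netdev ids and the pci.6 bus name are assumptions, while the socket path, MAC, queue count and rx ring size come from the XML):

# approximate qemu-kvm arguments for the vhost-user interface (ids are illustrative)
-chardev socket,id=charnet1,path=/tmp/vhost-user1.sock,server=on \
-netdev vhost-user,chardev=charnet1,queues=4,id=hostnet1 \
-device virtio-net-pci,mq=on,vectors=10,rx_queue_size=1024,netdev=hostnet1,mac=88:66:da:5f:dd:02,iommu_platform=on,ats=on,bus=pci.6,addr=0x0

vectors=10 follows the usual 2*queues+2 rule for a 4-queue virtio-net device.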
2. Boot ovs
# cat boot_ovs_client.sh
#!/bin/bash
set -e
echo "killing old ovs process"
pkill -f ovs-vswitchd || true
sleep 5
pkill -f ovsdb-server || true
echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch
echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch
rm -f $DB_FILE
echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init
echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log
echo "creating bridge and ports"
ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0
ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1554
ovs-vsctl set Interface dpdk0 options:n_rxq=4
echo "all done"
# sh boot_ovs_client.sh
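Before restarting, the vhost-user client port and PMD queue assignment can be checked like this (a sketch; the interface name matches the script above, and the exact output fields vary by OVS version):

# confirm vhost-user0 is connected and its rx queues are assigned to PMDs (illustrative)
ovs-vsctl get Interface vhost-user0 status
ovs-appctl dpif-netdev/pmd-rxq-show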
3. Restart OVS by running the same script again
# sh boot_ovs_client.sh
4. QEMU crashes, and ovs-vswitchd is killed.
# abrt-cli list
id e70fe052ff1620fa88b4cbfa2f43b704710a1504
reason: vhost_user_iotlb_cache_insert(): ovs-vswitchd killed by SIGSEGV
time: Mon 21 Sep 2020 05:23:38 AM EDT
cmdline: ovs-vswitchd unix:/var/run/openvswitch/db.sock --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log
package: openvswitch2.13-2.13.0-60.el8fdp
uid: 0 (root)
count: 3
Directory: /var/spool/abrt/ccpp-2020-09-21-05:23:38.988843-8317
Run 'abrt-cli report /var/spool/abrt/ccpp-2020-09-21-05:23:38.988843-8317' for creating a case in Red Hat Customer Portal
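To pull a backtrace out of the dump above, something like the following should work (a sketch; the coredump file name inside the abrt directory and the debuginfo package name are assumptions):

# install debug symbols and print the crashing thread's stack (illustrative)
dnf debuginfo-install -y openvswitch2.13
gdb /usr/sbin/ovs-vswitchd /var/spool/abrt/ccpp-2020-09-21-05:23:38.988843-8317/coredump -batch -ex 'bt'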
Actual results:
Restarting OVS causes QEMU to crash and ovs-vswitchd to be killed (SIGSEGV in vhost_user_iotlb_cache_insert()).
Expected results:
Both QEMU and ovs-vswitchd should continue to work after OVS is restarted.
Additional info:
1. This issue cannot be reproduced with 2 vhost-user queues; 4 queues are required to trigger it.
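To confirm how many queues the guest actually negotiated, the channel count can be checked from inside the VM (a sketch; eth0 is an assumed guest interface name):

# inside the guest: show the channel count and enable all 4 combined channels if needed (illustrative)
ethtool -l eth0
ethtool -L eth0 combined 4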