Bug 1832708

Summary: [ovs-dpdk] [vhost-user] too many destroy_connection calls on restarting ovs-vswitchd without tso
Product: Red Hat OpenStack
Reporter: Gowrishankar Muthukrishnan <gmuthukr>
Component: openvswitch
Assignee: Gowrishankar Muthukrishnan <gmuthukr>
Status: CLOSED DEFERRED
QA Contact: Eran Kuris <ekuris>
Severity: low
Priority: low
Version: 16.0 (Train)
CC: apevec, cfontain, chrisw, rhos-maint, vchundur
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Clone Of:
: 1845488 (view as bug list) Environment:
Last Closed: 2020-06-09 11:47:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Blocks: 1845488
Attachments:
ovs-vswitchd.log during restart (flags: none)

Description Gowrishankar Muthukrishnan 2020-05-07 07:07:12 UTC
Description of problem:
Disabling userspace TSO (userspace-tso-enable) in the OVS-DPDK datapath after it was previously enabled, and then restarting the openvswitch service, leads to an excessive number of attempts to destroy the vhost-user connection (to qemu-kvm).

The following message repeats in ovs-vswitchd.log:
2020-05-07T06:16:01.648Z|02465|netdev_dpdk|INFO|vHost Device '/var/lib/vhost_sockets/vhuc10acbd9-89' connection has been destroyed
2020-05-07T06:16:01.648Z|02466|dpdk|INFO|Dropped 4755 log messages in last 0 seconds (most recently, 0 seconds ago) due to excessive rate
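
To quantify how often the message repeats, the log can be summarized per socket. The following is only a rough sketch: the function names count_destroyed and count_dropped are hypothetical helpers, not part of OVS, and the patterns assume the exact message wording shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OVS) for summarizing ovs-vswitchd.log.

# Print "<count> <socket path>" for every vhost-user socket that logged
# a "connection has been destroyed" message. Usage: count_destroyed <logfile>
count_destroyed() {
    # Pull out the quoted socket path, then tally identical paths.
    sed -n "s/.*vHost Device '\([^']*\)' connection has been destroyed.*/\1/p" "$1" \
        | sort | uniq -c | awk '{ print $1, $2 }'
}

# Print the total number of messages the log rate limiter reported as
# dropped. Usage: count_dropped <logfile>
count_dropped() {
    sed -n 's/.*|Dropped \([0-9][0-9]*\) log messages.*/\1/p' "$1" \
        | awk '{ total += $1 } END { print total + 0 }'
}
```

Because the messages are rate limited, the per-socket counts only cover lines that were actually written; the dropped total gives a lower bound on how many more were suppressed. Pass the path of the ovs-vswitchd.log on the affected host as the argument.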

Version-Release number of selected component (if applicable):
openvswitch2.13-2.13.0-18flim.el8.x86_64

How reproducible:
Every time, by following the steps below.

Steps to Reproduce:
1. If TSO is not already enabled in ovs-dpdk, enable it.
   sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true

2. Launch a VM with a vhost-user socket in server mode.
   E.g. <source type='unix' path='/var/lib/vhost_sockets/vhue8ff400b-30' mode='server'/> in the libvirt XML.

   In OSP, create an instance.

3. Ensure the guest kernel driver detects TSO.
   sudo ethtool -k eth2 | egrep '(scatter|tcp|gso|csum|check|segment)'
   Check that tx-checksumming, scatter-gather and tcp-segmentation-offload report "on".

4. Disable TSO in ovs-dpdk.
   sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=false

5. Restart openvswitch.
   sudo systemctl restart openvswitch.service

6. Check /var/log/messages/ovs-vswitchd.service for the repeating messages reported in the problem description.
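
The manual check in step 3 could be scripted roughly as follows. This is a sketch only: tso_features_on is a hypothetical helper name, and it assumes the usual "feature: on|off" line layout of ethtool -k output.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and
# succeeds only if tx-checksumming, scatter-gather and
# tcp-segmentation-offload are all present and reported as "on".
tso_features_on() {
    awk -F': *' '
        $1 ~ /^(tx-checksumming|scatter-gather|tcp-segmentation-offload)$/ {
            seen++
            # Values such as "on [fixed]" still count as on.
            if ($2 !~ /^on/) bad++
        }
        END { exit !(seen == 3 && bad == 0) }
    '
}
```

Inside the guest it would be used as: sudo ethtool -k eth2 | tso_features_on && echo "TSO offloads are on"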

Actual results:
Repeated calls to destroy_connection.

Expected results:
No excessive attempts to destroy the vhost-user connection.

Additional info:

Comment 1 Gowrishankar Muthukrishnan 2020-05-07 07:22:17 UTC
Created attachment 1686073
ovs-vswitchd.log during restart

Comment 2 Gowrishankar Muthukrishnan 2020-05-07 07:24:01 UTC
Socket files used for VM vNICs:
[heat-admin@overcloud-computeovsdpdksriov-0 ~]$ sudo virsh dumpxml instance-00000053|grep vhu
      <source type='unix' path='/var/lib/vhost_sockets/vhue8ff400b-30' mode='server'/>
      <source type='unix' path='/var/lib/vhost_sockets/vhu85273654-c7' mode='server'/>
      <source type='unix' path='/var/lib/vhost_sockets/vhuc10acbd9-89' mode='server'/>