Bug 1828165

Summary: [qemu-kvm][vhost-user] guest virtio-net not allowed to enable offloads when ovs-dpdk datapath re-enabling TSO support
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Gowrishankar Muthukrishnan <gmuthukr>
Component: openvswitchAssignee: Timothy Redaelli <tredaelli>
openvswitch sub component: ovs-dpdk QA Contact: qding
Status: NEW --- Docs Contact:
Severity: high    
Priority: medium CC: apevec, chrisw, ctrautma, fleitner, froyo, hakhande, jhsiao, ktraynor, vchundur
Version: FDP 20.DKeywords: Triaged
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gowrishankar Muthukrishnan 2020-04-27 07:57:08 UTC
Description of problem:
KVM guest virtio-net seems not allowed to enable offloaded features (tcp csum, tso, ip csum) when the host datapath in ovs-dpdk re-enables TSO support.
Qemu runs as vhostuser and ovs-dpdk as vhostuser-client in this case. KVM guest is reboot once and verified if its virtio-net now have the ability to turn on TSO related offloads, after TSO is turned on in OVS-DPDK switch.

Impact of the problem:

* VM can not benefit from TSO related offloads when TSO is re-enabled in OVS DPDK datapath.
* Potential blocker for migrating VM from non-TSO to TSO enabled datapath.
Version-Release number of selected component (if applicable):
OSP 16.0

How reproducible:
Any time after restarting ovs-dpdk following TSO enablement in datapath.

Steps to Reproduce:
1. Start ovs-dpdk without TSO enabled in datapath.
   sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=false

2. Add a netdev bridge and vhost socket (dpdkvhostuserclient type) in ovs-dpdk bridge.
3. Launch KVM guest (VM1) with vhostuser backed ethernet interface.
4. Ensure tso turned off for the guest virtio-net interface.
5. Turn TSO on in ovs-dpdk datapath and restart ovs-dpdk daemon.
   sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true
   sudo systemctl restart openvswitch.service
   
6. Restart KVM guest (VM1) to ensure renegotiated offloads enabled for virtio-net interface in it.
   This test fails in our env.
   
   Expected result:
   ----------------
   transmit offload features (sg, tso) be turned on
   
   Actual result:
   --------------
   transmit offload features (sg, tso) remain off

7. Launch new KVM guest (VM2) to ensure negotiated offloads enabled for virtio-net interface.
   This test passes in our env.
   Expected result:
   ----------------
   transmit offload features (sg, tso) be turned on

   Actual result:
   ----------------
   N/A
   

Additional info:

Environment info:
-----------------
Host:
  openvswitch2.13-2.13.0-18.el8fdp.x86_64
  Red Hat Enterprise Linux release 8.1 (Ootpa)
  Linux overcloud-computeovsdpdksriov-1 4.18.0-147.5.1.el8_1.x86_64 #1 SMP Tue Jan 14 15:50:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  
Qemu:
  qemu-kvm-4.1.0-23
  
Guest:
  CentOS Linux release 8.1.1911 
  Linux iperf-server1 4.18.0-147.3.1.el8_1.x86_64 #1 SMP Fri Jan 3 23:55:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

VM1:
====
$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: off
        tx-scatter-gather: off [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]

HOST:
=====
$ sudo ovs-appctl dpctl/show -s | grep tso
  port 2: dpdk-link1-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=false)
  port 10: dpdk-link2-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=false)

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-vsctl get Open_vSwitch . other_config 
{dpdk-extra=" -n 4", dpdk-init="true", dpdk-lcore-mask="1001", dpdk-socket-mem="1024", emc-insert-inv-prob="100", pmd-cpu-mask="0x2002", userspace-tso-enable="false"}

<< enable TSO in OVS-DPDK datapath >>

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true

<< after OVS-DPDK restart >>

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-appctl dpctl/show -s | grep tso
  port 2: dpdk-link1-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=true)
  port 10: dpdk-link2-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=true)


VM1:
====
[centos@VM1 ~]$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: off
        tx-scatter-gather: off [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]

<< after guest reboot >>

[centos@VM1 ~]$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: off                           <<< NOT TURNED ON >>>
        tx-scatter-gather: off [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off                 <<< NOT TURNED ON >>>
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]

HOST:
=====
Checking ovs-vswitchd log..

2020-04-27T03:44:15.826Z|00531|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_FEATURES
2020-04-27T03:44:15.826Z|00532|dpdk|INFO|VHOST_CONFIG: negotiated Virtio features: 0x17020a782

#define VIRTIO_NET_F_HOST_TSO4  11

(gdb) p /x 0x17020a782 & (1 <<11)
$15 = 0x0

DPDK vhost-user client on reconnecting to qemu-kvm received negotiated features without TSO4 for an instance.

VM2:
====
[centos@VM2 ~]$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on

and we are allowed to turn off as well.

[centos@VM2 ~]$ sudo ethtool -K eth0 sg off tso off
Actual changes:
scatter-gather: off
        tx-scatter-gather: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off
        tx-tcp-ecn-segmentation: off
        tx-tcp6-segmentation: off
generic-segmentation-offload: off [requested on]
[centos@iperf-server1 ~]$

Comment 4 Gowrishankar Muthukrishnan 2020-05-27 06:58:53 UTC
I could confirm it will break VM in live migration (i.e VM after moving to host with TSO enabled OVS-DPDK would not be able to benefit from TSO) as below. Hence, this bug will block live-migrating VM to TSO enabled DP.

Controller node:
----------------

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-1.localdomain

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-0.localdomain
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| ID                                   | Name         | Status | Networks                                                                                           | Image        | Flavor |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| edf8f464-e44b-4937-ae54-21d961a27a6b | iperf_server | ACTIVE | mgmt_dpdk1=50.0.2.42, 172.20.0.223; testpmd_net_nic0_308=50.0.4.39; testpmd_net_nic1_309=50.0.5.93 | centos8cloud |        |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+

Instance before migration:

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
[centos@iperf-server ~]$ 


Source node:

[heat-admin@overcloud-computeovsdpdksriov-0 ~]$ sudo ovs-vsctl get Open_vSwitch . other_config
{dpdk-extra=" -n 4", dpdk-init="true", dpdk-lcore-mask="1001", dpdk-socket-mem="1024", emc-insert-inv-prob="100", pmd-cpu-mask="6006"}


Target node:

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-vsctl get Open_vSwitch . other_config
{dpdk-extra=" -n 4", dpdk-init="true", dpdk-lcore-mask="1001", dpdk-socket-mem="1024", emc-insert-inv-prob="100", pmd-cpu-mask="6006", userspace-tso-enable="true"}


Live migration:

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server migrate --live-migration  --block-migration --host  overcloud-computeovsdpdksriov-1.localdomain  iperf_server

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-0.localdomain

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-1.localdomain
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| ID                                   | Name         | Status | Networks                                                                                           | Image        | Flavor |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| edf8f464-e44b-4937-ae54-21d961a27a6b | iperf_server | ACTIVE | mgmt_dpdk1=50.0.2.42, 172.20.0.223; testpmd_net_nic0_308=50.0.4.39; testpmd_net_nic1_309=50.0.5.93 | centos8cloud |        |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+

ping stats to the VM during migration:

--- 172.20.0.223 ping statistics ---
65 packets transmitted, 55 received, 15.3846% packet loss, time 1052ms


Verification inside VM post migration:

actual:

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]

expected:

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
tx-checksumming: on
        tx-checksum-ip-generic: on
scatter-gather: on
        tx-scatter-gather: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
tx-gso-robust: on [fixed]

Work around for VM to have TSO benefit from backend is to "hard" reboot it.

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server reboot iperf_server

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
tx-checksumming: on
        tx-checksum-ip-generic: on
scatter-gather: on
        tx-scatter-gather: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
tx-gso-robust: on [fixed]
[centos@iperf-server ~]$ 

As shown above, tso features are available to VM after hard reboot.