Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
The FDP team is no longer accepting new bugs in Bugzilla. Please report your issues under FDP project in Jira. Thanks.

Bug 1828165

Summary: [qemu-kvm][vhost-user] guest virtio-net not allowed to enable offloads when ovs-dpdk datapath re-enabling TSO support
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Gowrishankar Muthukrishnan <gmuthukr>
Component: openvswitchAssignee: Timothy Redaelli <tredaelli>
openvswitch sub component: ovs-dpdk QA Contact: qding
Status: CLOSED EOL Docs Contact:
Severity: high    
Priority: medium CC: apevec, chrisw, ctrautma, fleitner, froyo, hakhande, jhsiao, ktraynor, vchundur
Version: FDP 20.DKeywords: Triaged
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-03-12 14:35:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gowrishankar Muthukrishnan 2020-04-27 07:57:08 UTC
Description of problem:
KVM guest virtio-net seems not allowed to enable offloaded features (tcp csum, tso, ip csum) when the host datapath in ovs-dpdk re-enables TSO support.
Qemu runs as vhostuser and ovs-dpdk as vhostuser-client in this case. KVM guest is reboot once and verified if its virtio-net now have the ability to turn on TSO related offloads, after TSO is turned on in OVS-DPDK switch.

Impact of the problem:

* VM can not benefit from TSO related offloads when TSO is re-enabled in OVS DPDK datapath.
* Potential blocker for migrating VM from non-TSO to TSO enabled datapath.
Version-Release number of selected component (if applicable):
OSP 16.0

How reproducible:
Any time after restarting ovs-dpdk following TSO enablement in datapath.

Steps to Reproduce:
1. Start ovs-dpdk without TSO enabled in datapath.
   sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=false

2. Add a netdev bridge and vhost socket (dpdkvhostuserclient type) in ovs-dpdk bridge.
3. Launch KVM guest (VM1) with vhostuser backed ethernet interface.
4. Ensure tso turned off for the guest virtio-net interface.
5. Turn TSO on in ovs-dpdk datapath and restart ovs-dpdk daemon.
   sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true
   sudo systemctl restart openvswitch.service
   
6. Restart KVM guest (VM1) to ensure renegotiated offloads enabled for virtio-net interface in it.
   This test fails in our env.
   
   Expected result:
   ----------------
   transmit offload features (sg, tso) be turned on
   
   Actual result:
   --------------
   transmit offload features (sg, tso) remain off

7. Launch new KVM guest (VM2) to ensure negotiated offloads enabled for virtio-net interface.
   This test passes in our env.
   Expected result:
   ----------------
   transmit offload features (sg, tso) be turned on

   Actual result:
   ----------------
   N/A
   

Additional info:

Environment info:
-----------------
Host:
  openvswitch2.13-2.13.0-18.el8fdp.x86_64
  Red Hat Enterprise Linux release 8.1 (Ootpa)
  Linux overcloud-computeovsdpdksriov-1 4.18.0-147.5.1.el8_1.x86_64 #1 SMP Tue Jan 14 15:50:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  
Qemu:
  qemu-kvm-4.1.0-23
  
Guest:
  CentOS Linux release 8.1.1911 
  Linux iperf-server1 4.18.0-147.3.1.el8_1.x86_64 #1 SMP Fri Jan 3 23:55:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

VM1:
====
$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: off
        tx-scatter-gather: off [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]

HOST:
=====
$ sudo ovs-appctl dpctl/show -s | grep tso
  port 2: dpdk-link1-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=false)
  port 10: dpdk-link2-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=false)

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-vsctl get Open_vSwitch . other_config 
{dpdk-extra=" -n 4", dpdk-init="true", dpdk-lcore-mask="1001", dpdk-socket-mem="1024", emc-insert-inv-prob="100", pmd-cpu-mask="0x2002", userspace-tso-enable="false"}

<< enable TSO in OVS-DPDK datapath >>

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true

<< after OVS-DPDK restart >>

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-appctl dpctl/show -s | grep tso
  port 2: dpdk-link1-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=true)
  port 10: dpdk-link2-port (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=3, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=3, requested_txq_descriptors=2048, rx_csum_offload=true, tx_tso_offload=true)


VM1:
====
[centos@VM1 ~]$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: off
        tx-scatter-gather: off [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]

<< after guest reboot >>

[centos@VM1 ~]$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: off                           <<< NOT TURNED ON >>>
        tx-scatter-gather: off [fixed]
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off                 <<< NOT TURNED ON >>>
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]

HOST:
=====
Checking ovs-vswitchd log..

2020-04-27T03:44:15.826Z|00531|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_FEATURES
2020-04-27T03:44:15.826Z|00532|dpdk|INFO|VHOST_CONFIG: negotiated Virtio features: 0x17020a782

#define VIRTIO_NET_F_HOST_TSO4  11

(gdb) p /x 0x17020a782 & (1 <<11)
$15 = 0x0

DPDK vhost-user client on reconnecting to qemu-kvm received negotiated features without TSO4 for an instance.

VM2:
====
[centos@VM2 ~]$ ethtool -k eth0 |egrep '(scatter|tcp)'
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on

and we are allowed to turn off as well.

[centos@VM2 ~]$ sudo ethtool -K eth0 sg off tso off
Actual changes:
scatter-gather: off
        tx-scatter-gather: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off
        tx-tcp-ecn-segmentation: off
        tx-tcp6-segmentation: off
generic-segmentation-offload: off [requested on]
[centos@iperf-server1 ~]$

Comment 4 Gowrishankar Muthukrishnan 2020-05-27 06:58:53 UTC
I could confirm it will break VM in live migration (i.e VM after moving to host with TSO enabled OVS-DPDK would not be able to benefit from TSO) as below. Hence, this bug will block live-migrating VM to TSO enabled DP.

Controller node:
----------------

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-1.localdomain

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-0.localdomain
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| ID                                   | Name         | Status | Networks                                                                                           | Image        | Flavor |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| edf8f464-e44b-4937-ae54-21d961a27a6b | iperf_server | ACTIVE | mgmt_dpdk1=50.0.2.42, 172.20.0.223; testpmd_net_nic0_308=50.0.4.39; testpmd_net_nic1_309=50.0.5.93 | centos8cloud |        |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+

Instance before migration:

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
[centos@iperf-server ~]$ 


Source node:

[heat-admin@overcloud-computeovsdpdksriov-0 ~]$ sudo ovs-vsctl get Open_vSwitch . other_config
{dpdk-extra=" -n 4", dpdk-init="true", dpdk-lcore-mask="1001", dpdk-socket-mem="1024", emc-insert-inv-prob="100", pmd-cpu-mask="6006"}


Target node:

[heat-admin@overcloud-computeovsdpdksriov-1 ~]$ sudo ovs-vsctl get Open_vSwitch . other_config
{dpdk-extra=" -n 4", dpdk-init="true", dpdk-lcore-mask="1001", dpdk-socket-mem="1024", emc-insert-inv-prob="100", pmd-cpu-mask="6006", userspace-tso-enable="true"}


Live migration:

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server migrate --live-migration  --block-migration --host  overcloud-computeovsdpdksriov-1.localdomain  iperf_server

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-0.localdomain

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server list --all-projects --host overcloud-computeovsdpdksriov-1.localdomain
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| ID                                   | Name         | Status | Networks                                                                                           | Image        | Flavor |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+
| edf8f464-e44b-4937-ae54-21d961a27a6b | iperf_server | ACTIVE | mgmt_dpdk1=50.0.2.42, 172.20.0.223; testpmd_net_nic0_308=50.0.4.39; testpmd_net_nic1_309=50.0.5.93 | centos8cloud |        |
+--------------------------------------+--------------+--------+----------------------------------------------------------------------------------------------------+--------------+--------+

ping stats to the VM during migration:

--- 172.20.0.223 ping statistics ---
65 packets transmitted, 55 received, 15.3846% packet loss, time 1052ms


Verification inside VM post migration:

actual:

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]

expected:

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
tx-checksumming: on
        tx-checksum-ip-generic: on
scatter-gather: on
        tx-scatter-gather: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
tx-gso-robust: on [fixed]

Work around for VM to have TSO benefit from backend is to "hard" reboot it.

(overcloud) [heat-admin@overcloud-controller-0 ~]$ openstack server reboot iperf_server

[centos@iperf-server ~]$ sudo ethtool -k eth2 | grep ': on'
rx-checksumming: on [fixed]
tx-checksumming: on
        tx-checksum-ip-generic: on
scatter-gather: on
        tx-scatter-gather: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
tx-gso-robust: on [fixed]
[centos@iperf-server ~]$ 

As shown above, tso features are available to VM after hard reboot.

Comment 9 Flavio Leitner 2024-03-12 14:35:44 UTC
Hi,
The Open vSwitch 2.13 is no longer supported.
If this issue is still relevant, please open a new ticket using a supported Open vSwitch version at https://issues.redhat.com/projects/FDP
Thanks