commit 90b219275d167a49a7743fef1fadb14a66231d94 Author: Eli Britstein <elibr> Date: Mon Jul 26 11:14:52 2021 +0300 dpif-netdev: Do not flush PMD offloads on reload. Before flushing offloads of a removed port was supported by [1], it was necessary to flush the 'marks'. In doing so, all offloads of the PMD are removed, include the ones that are not related to the removed port and that are not modified following this removal. As a result such flows are evicted from being offloaded, and won't resume offloading. As PMD offload flush is not necessary, avoid it. [1] 62d1c28e9ce0 ("dpif-netdev: Flush offload rules upon port deletion.") Signed-off-by: Eli Britstein <elibr> Reviewed-by: Gaetan Rivet <gaetanr> Reviewed-by: David Marchand <david.marchand> Signed-off-by: Ilya Maximets <i.maximets> commit b29b04f85fe0906f7472bc135ae8bf6b22481e24 Author: Eli Britstein <elibr> Date: Mon Jul 26 11:14:54 2021 +0300 dpif-netdev: Fix offloads of modified flows. Association of a mark to a flow is done as part of its offload handling, in the offloading thread. However, the PMD thread specifies whether an offload request is an "add" or "modify" by the association of a mark to the flow. This is exposed to a race condition. A flow might be created with actions that cannot be fully offloaded, for example flooding (before MAC learning), and later modified to have actions that can be fully offloaded. If the two requests are queued before the offload thread handling, they are both marked as "add". When the offload thread handles them, the first request is partially offloaded, and the second one is ignored as the flow is already considered as offloaded. Fix it by specifying add/modify of an offload request by the actual flow state change, without relying on the mark. Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.") Signed-off-by: Eli Britstein <elibr> Reviewed-by: Gaetan Rivet <gaetanr> Signed-off-by: Ilya Maximets <i.maximets> commit 1d0b89ea7b6fe4aaab3addfa3b19017b96455293 Author: Eli Britstein <elibr> Date: Mon Jul 26 11:14:53 2021 +0300 dpif-netdev: Fix flow modification after failure. dp_netdev_flow_offload_main thread is asynchronous, by the cited commit. There might be a case where there are modification requests of the same flow submitted before handled. Then, if the first handling fails, the rule for the flow is deleted, and the mark is freed. Then, the following one should not be handled as a modification, but rather as an "add". Fixes: 02bb2824e51d ("dpif-netdev: do hw flow offload in a thread") Signed-off-by: Eli Britstein <elibr> Signed-off-by: Ilya Maximets <i.maximets> commit 8d84a4b1668b712a428c336d3db587cd08d8261f Author: Eli Britstein <elibr> Date: Sun Jul 25 11:54:36 2021 +0300 netdev-offload-dpdk: Fix IPv6 rewrite cast-align warning. Fixes: b6207b1d2711 ("netdev-offload-dpdk: Support offload of set IPv6 actions.") Signed-off-by: Eli Britstein <elibr> Signed-off-by: Ilya Maximets <i.maximets> commit f3f7849cbbc8241c7472a0c0468361297e5e7c28 Author: Ilya Maximets <i.maximets> Date: Fri Jul 23 15:41:20 2021 +0200 daemon-unix: Fix leak of a fork error message. 19 bytes in 1 blocks are definitely lost in loss record 24 of 121 at 0x4839748: malloc (vg_replace_malloc.c:306) by 0x483BD63: realloc (vg_replace_malloc.c:834) by 0x521C26: xrealloc (util.c:149) by 0x478F91: ds_reserve (dynamic-string.c:63) by 0x47928B: ds_put_format_valist (dynamic-string.c:161) by 0x47920A: ds_put_format (dynamic-string.c:142) by 0x506DE5: process_status_msg (process.c:0) by 0x52A6D0: fork_and_wait_for_startup (daemon-unix.c:284) by 0x52A54D: daemonize_start (daemon-unix.c:453) by 0x40EB3E: main (ovs-vswitchd.c:91) Fixes: b925336a36e6 ("daemon: restart child process if it died before signaling its readiness") Signed-off-by: Ilya Maximets <i.maximets> Acked-by: Roi Dayan <roid> commit 8aa0f037471d1708fe94d9377d4999efda06703a Author: Dumitru Ceara <dceara> Date: Wed Jul 21 14:51:00 2021 +0200 ovsdb-cs: Perform forced reconnects without a backoff. The ovsdb-cs layer triggers a forced reconnect in various cases: - when an inconsistency is detected in the data received from the remote server. - when the remote server is running in clustered mode and transitioned to "follower", if the client is configured in "leader-only" mode. - when explicitly requested by upper layers (e.g., by the user application, through the IDL layer). In such cases it's desirable that reconnection should happen as fast as possible, without the current exponential backoff maintained by the underlying reconnect object. Furthermore, since 3c2d6274bcee ("raft: Transfer leadership before creating snapshots."), leadership changes inside the clustered database happen more often and, therefore, "leader-only" clients need to reconnect more often too. Forced reconnects call jsonrpc_session_force_reconnect() which will not reset backoff. To make sure clients reconnect as fast as possible in the aforementioned scenarios we first call the new API, jsonrpc_session_reset_backoff(), in ovsdb-cs, for sessions that are in state CS_S_MONITORING (i.e., the remote is likely still alive and functioning fine). jsonrpc_session_reset_backoff() resets the number of backoff-free reconnect retries to the number of remotes configured for the session, ensuring that all remotes are retried exactly once with backoff 0. This commit also updates the Python IDL and jsonrpc implementations. The Python IDL wasn't tracking the IDL_S_MONITORING state explicitly, we now do that too. Tests were also added to make sure the IDL forced reconnects happen without backoff. Reported-at: https://bugzilla.redhat.com/1977264 Suggested-by: Ilya Maximets <i.maximets> Signed-off-by: Dumitru Ceara <dceara> Signed-off-by: Ilya Maximets <i.maximets> commit ee4e034dc9a4d1134d37545a86733ed22452d9cb Author: Wilson Peng <pweisong> Date: Tue Jul 20 00:23:57 2021 -0700 datapath-windows:Correct checksum for DNAT action While testing OVS-windows flows for the DNAT action, the checksum In TCP header is set incorrectly when TCP offload is enabled by Default. As a result, the packet will be dropped on receiver linuxVM. >>>sample flow default configuration on both Windows VM and Linux VM (src=40.0.1.2,dst=10.150.0.1) --dnat--> (src=40.0.1.2,dst==30.1.0.2) Without the fix for some TCP packet(40.0.1.2->30.1.0.2 with payload len 207) the TCP checksum will be pseduo header checksum and the value is 0x01d6. With the fix the checksum will be 0x47ee, it could be got the correct TCP checksum on the receiver Linux VM. Signed-off-by: Wilson Peng<pweisong> Signed-off-by: Anand Kumar<kumaranand> Acked-by: Alin-Gabriel Serdean <aserdean> Signed-off-by: Alin-Gabriel Serdean <aserdean> commit 72132a9403a70cd89d84d0148c0ddddba3d993bc Author: Ilya Maximets <i.maximets> Date: Tue Jul 13 17:32:06 2021 +0200 bond: Fix broken rebalancing after link state changes. There are 3 constraints for moving hashes from one member to another: 1. The load difference exceeds ~ 3% of the load of one member. 2. The difference in load between members exceeds 100,000 bytes. 3. Moving the hash reduces the load difference by more than 10%. In the current implementation, if one of the members transitions to the DOWN state, all hashes assigned to it will be moved to the other members. After that, if the member goes UP, it will wait for rebalancing to get hashes. But in case we have more than 10 equally loaded hashes, it will never meet constraint # 3, because each hash will handle less than 10% of the load. The situation gets worse when the number of flows grows and it is almost impossible to transfer any hash when all 256 hash records are used, which is very likely when we have few hundred/thousand flows. As a result, if one of the members goes down and back up while traffic flows, it will never be used to transmit packets again. This will not be fixed even if we completely stop the traffic and start it again, because the first two constraints will block rebalancing in the earlier stages, while we have low traffic volume. Moving a single hash if the destination does not have any hashes, as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the rebalancing logic"), will not help, because a single hash is not enough to make the difference in load less than 10% of the total load, and this member will handle only that one hash forever. To fix this, let's try to move multiple hashes at the same time to meet constraint # 3. The implementation includes sorting the "records" to be able to collect records with a cumulative load close enough to the ideal value. Acked-by: Ben Pfaff <blp> Signed-off-by: Ilya Maximets <i.maximets> commit aa84cfe25d5d6614ebce1bfca96a5c645eb3fd20 Author: Mark Gray <mark.d.gray> Date: Fri Jul 16 06:17:35 2021 -0400 dpif-netlink: Fix report_loss() message. Fixes: 1579cf677fcb ("dpif-linux: Implement the API functions to allow multiple handler threads read upcall.") Signed-off-by: Mark Gray <mark.d.gray> Acked-by: Flavio Leitner <fbl> Acked-by: Aaron Conole <aconole> Signed-off-by: Ilya Maximets <i.maximets> commit aec05f7cd18477f83597dad40617ab84e43cc64a Author: Dumitru Ceara <dceara> Date: Wed Jul 14 09:21:19 2021 +0200 ovsdb-server: Fix memleak when failing to read storage. Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.") Signed-off-by: Dumitru Ceara <dceara> Signed-off-by: Ben Pfaff <blp> commit 05bdf11fc31f332b98fdca589062a77dad2f58d5 Author: Gaetan Rivet <grive> Date: Wed Jun 16 01:22:46 2021 +0200 conntrack: Init hash basis first at creation. The 'hash_basis' field is used sometimes during sub-systems init routine. It will be 0 by default before randomization. Sub-systems would then init some nodes with incorrect hash values. The timeout policies module is affected, making the default policy being referenced using an incorrect hash value. Fixes: 2078901a4c14 ("userspace: Add conntrack timeout policy support.") Signed-off-by: Gaetan Rivet <grive> Reviewed-by: Eli Britstein <elibr> Acked-by: William Tu <u9012063> Signed-off-by: Ilya Maximets <i.maximets> commit 94e3b9d9ce90934f00963ebdf661ecc918eb208c Author: Eelco Chaudron <echaudro> Date: Mon Jul 5 07:57:41 2021 -0400 netdev-linux: Ignore TSO packets when TSO is not enabled for userspace. When TSO is disabled from a userspace forwarding datapath perspective, but TSO has been wrongly enabled on the kernel side, log a warning message, and drop the packet. With the current implementation, OVS will crash. [i.maximets]: The call stack looks like this: 0 dp_packet_set_size (b=0x0, b=0x0, v=13028) at lib/dp-packet.h:578 1 netdev_linux_batch_rxq_recv_sock at lib/netdev-linux.c:1310 2 netdev_linux_rxq_recv at lib/netdev-linux.c 3 netdev_rxq_recv at lib/netdev.c 4 dp_netdev_process_rxq_port at lib/dpif-netdev.c The problem is that the code assumes that (mmsgs[i].msg_len > std_len) can only be true if userpace-tso is enabled and additional buffers are provided to the kernel. However, since recvmmsg() is called with MSG_TRUNC, the resulting msg_len reflects the original packet size before truncation, and it can be larger than the buffer if TSO / GRO is enabled on the network interface. If TSO support for user space is not enabled in OVS, the aux_bufs are not allocated and are left NULL, resulting in a crash. Fixes: 73858f9dbe83 ("netdev-linux: Prepend the std packet in the TSO packet") Fixes: 2109841b7984 ("Use batch process recv for tap and raw socket in netdev datapath") Signed-off-by: Eelco Chaudron <echaudro> Acked-by: Flavio Leitner <fbl> Signed-off-by: Ilya Maximets <i.maximets> commit 842bfb899f325fe758ac41b60261e9cc014abb35 Author: Paolo Valerio <pvalerio> Date: Tue Jul 6 15:03:05 2021 +0200 conntrack: Handle already natted packets. When a packet gets dnatted and then recirculated, it could be possible that it matches another rule that performs another nat action. The kernel datapath handles this situation turning to a no-op the second nat action, so natting only once the packet. In the userspace datapath instead, when the ct action gets executed, an initial lookup of the translated packet fails to retrieve the connection related to the packet, leading to the creation of a new entry in ct for the src nat action with a subsequent failure of the connection establishment. with the following flows: table=0,priority=30,in_port=1,ip,nw_dst=192.168.2.100,actions=ct(commit,nat(dst=10.1.1.2:80),table=1) table=0,priority=20,in_port=2,ip,actions=ct(nat,table=1) table=0,priority=10,ip,actions=resubmit(,2) table=0,priority=10,arp,actions=NORMAL table=0,priority=0,actions=drop table=1,priority=5,ip,actions=ct(commit,nat(src=10.1.1.240),table=2) table=2,in_port=ovs-l0,actions=2 table=2,in_port=ovs-r0,actions=1 Establishing a connection from 10.1.1.1 to 192.168.2.100 the outcome is: tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=4000,dport=80), reply=(src=10.1.1.2,dst=10.1.1.240,sport=80,dport=4000), protoinfo=(state=ESTABLISHED) tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=4000,dport=80), reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=4000), protoinfo=(state=ESTABLISHED) With this patch applied the outcome is: tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=4000,dport=80), reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=4000), protoinfo=(state=ESTABLISHED) The patch performs, for already natted packets, a lookup of the reverse key in order to retrieve the related entry, it also adds a test case that besides testing the scenario ensures that the other ct actions are executed. Reported-by: Dumitru Ceara <dceara> Signed-off-by: Paolo Valerio <pvalerio> Acked-by: Dumitru Ceara <dceara> Signed-off-by: Ilya Maximets <i.maximets> commit ab873c1afe05d2ac73503edfccf395b7d58a6136 Author: Eelco Chaudron <echaudro> Date: Thu Jun 10 11:24:15 2021 +0200 conntrack: Document all-zero IP SNAT behavior and add a test case. Currently, conntrack in the kernel has an undocumented feature referred to as all-zero IP address SNAT. Basically, when a source port collision is detected during the commit, the source port will be translated to an ephemeral port. If there is no collision, no SNAT is performed. This patchset documents this behavior and adds a self-test to verify it's not changing. In addition, a datapath feature flag is added for the all-zero IP SNAT case. This will help applications on top of OVS, like OVN, to determine this feature can be used. Signed-off-by: Eelco Chaudron <echaudro> Acked-by: Aaron Conole <aconole> Acked-by: Dumitru Ceara <dceara> Acked-by: Alin-Gabriel Serdean <aserdean> Acked-by: Paolo Valerio <pvalerio> Signed-off-by: Ilya Maximets <i.maximets> commit 86d6a9ee14456a76ef4a5d4bf633eca6bb7d0cd2 Author: Bodo Petermann <b.petermann> Date: Wed Jun 16 12:32:14 2021 +0200 python: Fix Idl.run change_seqno update. Fix an issue where Idl.run() returned False even if there was a change. If Idl.run() reads multiple messages from the database server, some may constitute changes and some may not. Changed the way change_seqno is reset: if a message is not a change, reset change_seqno only to the value before reading this message, not to the value before reading the first message. This will fix the return value in a scenario where some message was a change and the last one wasn't. The new change_seqno will now be the value after handling the message with the last change. Fixes: c39751e44539 ("python: Monitor Database table to manage lifecycle of IDL client.") Signed-off-by: Bodo Petermann <b.petermann> Acked-by: Alin Gabriel Serdean <aserdean> Signed-off-by: Ilya Maximets <i.maximets> commit 1ba0c83655a6baeb8b508357cbdf236cb7a834e2 Author: Ben Pfaff <blp> Date: Tue Jul 6 15:37:09 2021 -0700 bridge: Use correct (legacy) role names in database. The vswitchd database schema requires role names to be "master" or "slave", but this code tried to use "primary" and "secondary". Signed-off-by: Ben Pfaff <blp> Reported-at: https://github.com/openvswitch/ovs-issues/issues/218 Tested-at: https://github.com/openvswitch/ovs-issues/issues/218#issuecomment-875374045 Fixes: 807152a4ddfb ("Use primary/secondary, not master/slave, as names for OpenFlow roles.") commit 7e5293ea5a6864db446baf872546f34e9b5efdd0 Author: Ilya Maximets <i.maximets> Date: Thu Jul 1 20:19:52 2021 +0200 Prepare for 2.15.2. Acked-by: Aaron Conole <aconole> Signed-off-by: Ilya Maximets <i.maximets> commit b855bbc326a2a77b62003c593c074e8ae07a03e9 Author: Ilya Maximets <i.maximets> Date: Thu Jul 1 20:19:52 2021 +0200 Set release date for 2.15.1. Acked-by: Aaron Conole <aconole> Signed-off-by: Ilya Maximets <i.maximets> commit 007a4f48fec60df5347c1a19cbcf7f82111a8ea2 Author: Timothy Redaelli <tredaelli> Date: Thu Jun 24 00:09:50 2021 +0200 dpif-netdev: Apply subtable-lookup-prio-set on any datapath. Currently, if you try to set subtable-lookup-prio-set when you don't have any datapath (for example if an user wants to set AVX512 before creating any bridge) it sets it globally (dpcls_subtable_set_prio), but it returns an error: please specify an existing datapath ovs-appctl: ovs-vswitchd: server returned an error and, in this case, the exit code of ovs-appctl is 2. This commit changes the behaviour by removing the [datapath] optional parameter of subtable-lookup-prio-set and by changing the priority level on any datapath and globally. This means if you don't have any datapath or if you have only one datapath, the behaviour is the same as now, but without the confusing error when you don't have any datapath. Fixes: 3d018c3ea79d ("dpif-netdev: add subtable lookup prio set command.") Signed-off-by: Timothy Redaelli <tredaelli> Acked-by: Harry van Haaren <harry.van.haaren> Signed-off-by: Ilya Maximets <i.maximets> commit c93358a563d4a847f71486a58e744068e70ac589 Author: Toms Atteka <cpp.code.lv> Date: Mon Jun 7 11:31:39 2021 -0700 netlink: removed incorrect optimization This optimization caused FLOW_TNL_F_UDPIF flag not to be used in hash calculation for geneve tunnel when revalidating flows which resulted in different cache hash values and incorrect behaviour. Added test to prevent regression. CC: Jesse Gross <jesse> Fixes: 6728d578f64e ("dpif-netdev: Translate Geneve options per-flow, not per-packet.") Reported-at: https://github.com/vmware-tanzu/antrea/issues/897 Signed-off-by: Toms Atteka <cpp.code.lv> Acked-by: Ansis Atteka <aatteka> Signed-off-by: Ilya Maximets <i.maximets> commit 31626579fa7ae5f6b2987110c08b3dd185fc893f Author: Paolo Valerio <pvalerio> Date: Wed May 26 00:26:00 2021 +0200 ovs-actions.xml: Add missing bracket. A bracket is apparently missing in ovs-actions(7). The patch changes the relevant row from: ct(argument]...) to: ct([argument]...) Signed-off-by: Paolo Valerio <pvalerio> Signed-off-by: Ilya Maximets <i.maximets> commit 30596ec27855fa938d41dde4cb54edd6275a2e53 Author: Paolo Valerio <pvalerio> Date: Wed Apr 7 13:43:06 2021 +0200 netdev-offload-tc: Use nl_msg_put_flag for OVS_TUNNEL_KEY_ATTR_CSUM. When a tunnel port gets added to the bridge setting the checksum option to true: ovs-vsctl add-port br0 geneve0 \ -- set interface geneve0 type=geneve \ options:remote_ip=<remote_ip> options:key=<key> options:csum=true the flow dump for the outgoing traffic will include a "bad key length 1 ..." message: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, key6(bad key length 1, expected 0)(01)flags(key))) ,genev_sys_6081 This is due to a mismatch present between the expected length (zero for OVS_TUNNEL_KEY_ATTR_CSUM in ovs_tun_key_attr_lens) and the current one. With this patch the same flow dump becomes: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, flags(csum|key))),genev_sys_6081 Fixes: d9677a1f0eaf ("netdev-tc-offloads: TC csum option is not matched with tunnel configuration") Suggested-by: Ilya Maximets <i.maximets> Signed-off-by: Paolo Valerio <pvalerio> Acked-by: Eelco Chaudron <echaudro> Signed-off-by: Ilya Maximets <i.maximets> commit 728980291a17478d961429f8c21dcddfc025759b Author: Paolo Valerio <pvalerio> Date: Mon Mar 8 00:24:40 2021 +0100 conntrack: Increment coverage counter for all bad checksum cases. conntrack_l4csum_err gets incremented only when corrupted icmp pass through conntrack. Increase it for the remaining bad checksum cases including when checksum is offloaded. Fixes: 38c69ccf8e29 ("conntrack: Add coverage count for l4csum error.") Signed-off-by: Paolo Valerio <pvalerio> Acked-by: Tonghao Zhang <xiangxia.m.yue> Signed-off-by: Ilya Maximets <i.maximets>
Regression run and no blockers found.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openvswitch2.15 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3449