Bug 1964582
| Summary: | [21.E RHEL-8] Fast Datapath Release | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Timothy Redaelli <tredaelli> |
| Component: | openvswitch2.13 | Assignee: | Timothy Redaelli <tredaelli> |
| Status: | CLOSED ERRATA | QA Contact: | ovs-qe |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | FDP 21.E | CC: | ctrautma, jhsiao, kfida, ralongi |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openvswitch2.13-2.13.0-114.el8fdp | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-06-21 14:24:43 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Regression executed and no major issues found. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openvswitch2.13 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2508 |
commit 160bb80a321538141b48ba2b4bcff262f2635e5d Author: Adrian Moreno <amorenoz> Date: Tue May 18 09:50:27 2021 +0200 ofp_actions: Fix set_mpls_tc formatting. Apart from a cut-and-paste typo, the man page claims that mpls_labels can be provided in hexadecimal format but that's currently not the case. Fix mpls ofp-action formatting, add size checks on ofp-action parsing and add some unit tests. Signed-off-by: Adrian Moreno <amorenoz> Signed-off-by: Ilya Maximets <i.maximets> commit 32a197f845106da48833dd6194abcce01114cbe4 Author: Ilya Maximets <i.maximets> Date: Wed Apr 21 15:48:16 2021 +0200 dpif-netdev: Remove meter rate from the bucket size calculation. Implementation of meters supposed to be a classic token bucket with 2 typical parameters: rate and burst size. Burst size in this schema is the maximum number of bytes/packets that could pass without being rate limited. Recent changes to userspace datapath made meter implementation to be in line with the kernel one, and this uncovered several issues. The main problem is that maximum bucket size for unknown reason accounts not only burst size, but also the numerical value of rate. This creates a lot of confusion around behavior of meters. For example, if rate is configured as 1000 pps and burst size set to 1, this should mean that meter will tolerate bursts of 1 packet at most, i.e. not a single packet above the rate should pass the meter. However, current implementation calculates maximum bucket size as (rate + burst size), so the effective bucket size will be 1001. This means that first 1000 packets will not be rate limited and average rate might be twice as high as the configured rate. This also makes it practically impossible to configure meter that will have burst size lower than the rate, which might be a desirable configuration if the rate is high. Inability to configure low values of a burst size and overall inability for a user to predict what will be a maximum and average rate from the configured parameters of a meter without looking at the OVS and kernel code might be also classified as a security issue, because drop meters are frequently used as a way of protection from DoS attacks. This change removes rate from the calculation of a bucket size, making it in line with the classic token bucket algorithm and essentially making the rate and burst tolerance being predictable from a users' perspective. Same change will be proposed for the kernel implementation. Unit tests changed back to their correct version and enhanced. Signed-off-by: Ilya Maximets <i.maximets> Acked-by: Eelco Chaudron <echaudro> Reviewed-by: Tonghao Zhang <xiangxia.m.yue> commit 1b4d8fe1d9cd1b558f2f093f909090c49fcda6b4 Author: Dumitru Ceara <dceara> Date: Mon May 17 11:22:27 2021 +0200 ovsdb-idl: Consider all tables when computing expected cond seqno. In ovsdb_idl_db_set_condition(), take into account all pending condition changes for all tables when computing the db->cond_seqno at which the monitor is expected to be updated. In the following scenario, with two tables, A and B, the old code performed the following steps: 1. Initial db->cond_seqno = X. 2. Client changes condition for table A: - A->new_cond gets set - expected cond seqno returned to the client: X + 1 3. ovsdb-idl sends the monitor_cond_change for table A - A->req_cond <- A->new_cond 4. Client changes condition for table B: - B->new_cond gets set - expected cond seqno returned to the client: X + 1 - however, because the condition change at step 3 is still not replied to, table B's monitor_cond_change request is not sent yet. 5. ovsdb-idl receives the reply for the condition change at step 3: - db->cond_seqno <- X + 1 6. ovsdb-idl sends the monitor_cond_change for table B 7. ovsdb-idl receives the reply for the condition change at step 6: - db->cond_seqno <- X + 2 The client was incorrectly informed that it will have all relevant updates for table B at seqno X + 1 while actually that happens later, at seqno X + 2. Fixes: 46437c5232bd ("ovsdb-idl: Enhance conditional monitoring API") Acked-by: Ben Pfaff <blp> Signed-off-by: Dumitru Ceara <dceara> (cherry picked from commit b5bb044fbe4c1395dcde5cc7d5081ef0099bb8b3) Signed-off-by: Ilya Maximets <i.maximets> commit 9acbf2cff7df1389e069eeb74a78c7f625a2459f Author: Wang Yibo <bobxxwang> Date: Thu May 6 11:14:07 2021 +0800 ovs-ofctl: Fix coredump when using "add-groups" command. When using ovs-ofctl add-groups with only "switch" argument, a coredump is generated. The main reason is that the command "ovs-ofctl add-groups" need two arguments but the limitation of min-args of this command is set to 1. Fixes: 7395c05254df ("Implement OpenFlow 1.1+ "groups" protocol.") Submitted-at: https://github.com/openvswitch/ovs/pull/360 Signed-off-by: Wang Yibo <bobxxwang> Signed-off-by: Ilya Maximets <i.maximets> commit 058702e3dcc61700bd587621d750d80985660a54 Author: Ilya Maximets <i.maximets> Date: Thu May 6 14:47:31 2021 +0200 raft: Transfer leadership before creating snapshots. With a big database writing snapshot could take a lot of time, for example, on one of the systems compaction of 300MB database takes about 10 seconds to complete. For the clustered database, 40% of this time takes conversion of the database to the file transaction json format, the rest of time is formatting a string and writing to disk. Of course, this highly depends on the disc and CPU speeds. 300MB is the very possible database size for the OVN Southbound DB, and it might be even bigger than that. During compaction the database is not available and the ovsdb-server doesn't do any other tasks. If leader spends 10-15 seconds writing a snapshot, the cluster is not functional for that time period. Leader also, likely, has some monitors to serve, so the one poll interval may be 15-20 seconds long in the end. Systems with so big databases typically has very high election timers configured (16 seconds), so followers will start election only after this significant amount of time. Once leader is back to the operational state, it will re-connect and try to join the cluster back. In some cases, this might also trigger the 'connected' state flapping on the old leader triggering a re-connection of clients. This issue has been observed with large-scale OVN deployments. One of the methods to improve the situation is to transfer leadership before compacting. This allows to keep the cluster functional, while one of the servers writes a snapshot. Additionally logging the time spent for compaction if it was longer than 1 second. This adds a bit of visibility to 'unreasonably long poll interval's. Reported-at: https://bugzilla.redhat.com/1960391 Signed-off-by: Ilya Maximets <i.maximets> Acked-by: Dumitru Ceara <dceara> commit 3bc41e2b6a170dcd6b5c728a4ed3410a6ce0865d Author: Hariprasad Govindharajan <hariprasad.govindharajan> Date: Fri Apr 23 13:16:37 2021 +0100 dpdk: Use DPDK 19.11.8 release. Modify ci linux build script to use the latest DPDK stable release. Modify Documentation to use the latest DPDK stable release 19.11.8 Update NEWS file to reflect the latest DPDK stable release. Building DPDK releases 19.11.6 and 19.11.7,with meson was found to break the OvS build. DPDK 19.11.8 was released to fix the above said issue. Link to the discussion about the issue can be found below: http://mails.dpdk.org/archives/stable/2021-April/029786.html Signed-off-by: Hariprasad Govindharajan <hariprasad.govindharajan> Acked-by: Sunil Pai G <sunil.pai.g> Signed-off-by: Ian Stokes <ian.stokes> commit fbebe7377dc00bd0627588f48d37b9b2658b6532 Author: Ilya Maximets <i.maximets> Date: Tue May 11 15:20:49 2021 +0200 github: Fix up malformed /etc/hosts. For some reason /etc/hosts in GHA now contains a plain text line like this: Note: Don't Delete this file. Also, don't remove this line. ... This breaks libunbound and makes a series of unit tests to emit following warning: |00001|dns_resolve|WARN|Failed to read etc/hosts: syntax error Working around this issue by removing a bad line from /etc/hosts until this fixed properly by GitHub team. This in combination with other fixes should unblock CI. Bug for virtual-environments: https://github.com/actions/virtual-environments/issues/3353 Signed-off-by: Ilya Maximets <i.maximets> Reviewed-by: David Marchand <david.marchand> Acked-by: Aaron Conole <aconole> commit 2a3d1f497152505e2d27febf1ff5329e3a277164 Author: Ilya Maximets <i.maximets> Date: Mon May 10 17:57:45 2021 +0200 doc: automake: Add support for sphinx 4.0. File layout for man pages in sphinx 4 by default changed [1] from: Documentation/_ref/man/page.section to: Documentation/_ref/man/section/page.section Ajusting our build scripts so they will be able to locate files in new places. This fixes our CI build. [1] https://github.com/sphinx-doc/sphinx/issues/7996 Signed-off-by: Ilya Maximets <i.maximets> Acked-by: Ben Pfaff <blp> Reviewed-by: David Marchand <david.marchand> Reviewed-by: Aaron Conole <aconole> commit 4ab5524dbb6f71232230a4a7cd7e13313c5db28b Author: Ilya Maximets <i.maximets> Date: Thu May 6 14:27:23 2021 +0200 cirrus: Look up existing versions of python dependencies. FreeBSD sometimes changes the base version of python3 that is used for packages. This affects package names. For example, currently CI is broken, because there is no more py37- versions of sphinx and openssl available, only py38- ones: pkg: No packages available to install matching 'py37-openssl' have been found in the repositories pkg: No packages available to install matching 'py37-sphinx' have been found in the repositories We had the same issue last year with 3.6 -> 3.7 transition: dfa2e3d04948 ("cirrus: Use python 3.7 packages on FreeBSD.") Fixing that by searching for a package instead of using a specific version. This should help to avoid same issues in the future. Signed-off-by: Ilya Maximets <i.maximets> Reviewed-by: Aaron Conole <aconole> Acked-by: Kevin Traynor <ktraynor> Reviewed-by: David Marchand <david.marchand> commit ea98fe0743d6902ab4b521e54cfa8edc467e703d Author: Ben Pfaff <blp> Date: Thu May 6 10:41:40 2021 -0700 ofp-group: Use big-enough buffer in ofputil_format_group(). GCC 11 pointed out that ofputil_group_to_string()'s prototype asks for a buffer with one byte more than supplied. This fixes the problem. This wasn't a buffer overflow because ofputil_group_to_string() honors the buffer size passed in, which was correct. The worst that could happen was truncating the last byte of a group name. Signed-off-by: Ben Pfaff <blp> Acked-by: Paolo Valerio <pvalerio> commit 798a40ffb6b8702630c8c1298a43bcd9ec399259 Author: Gavin Li <gavinl> Date: Mon Apr 19 10:26:04 2021 +0300 ofproto/ofproto-dpif-sflow: Check sflow agent in case of race sFlow agent may not exist while calling dpif_sflow_received. The sFlow may be deleted in another thread, eg, by user, which will cause crash. Signed-off-by: Gavin Li <gavinl> Reviewed-by: Gavi Teitz <gavi> Signed-off-by: Simon Horman <simon.horman> commit 5641bf9d72c0bb4a56a4a84dfb9d673ef60b42ab Author: Timothy Redaelli <tredaelli> Date: Tue Apr 20 17:23:25 2021 +0200 Fix typo in rh-mock-srpm Thanks fbl for reporting commit 9984d6112495934bdb3ec668830ddcd0df7399c8 Author: Ilya Maximets <i.maximets> Date: Sun Apr 4 14:43:52 2021 +0200 dpif: Fix use of uninitialized execute hash. 'dpif_execute_helper_cb' doesn't initilalize the 'hash' field that may be passed down to datapath and might cause execution of a different set of actions, e.g. on recirculation. Thread 6 handler27: Conditional jump or move depends on uninitialised value(s) at 0x53A2C2: dpif_netlink_encode_execute (dpif-netlink.c:1841) by 0x53A2C2: dpif_netlink_operate__ (dpif-netlink.c:1919) by 0x53A82D: dpif_netlink_operate_chunks (dpif-netlink.c:2238) by 0x53A82D: dpif_netlink_operate (dpif-netlink.c:2297) by 0x48135F: dpif_operate (dpif.c:1366) by 0x481923: dpif_execute.part.24 (dpif.c:1320) by 0x481C46: dpif_execute (dpif.c:1312) by 0x481C46: dpif_execute_helper_cb (dpif.c:1243) by 0x4AE943: odp_execute_actions (odp-execute.c:865) by 0x47F272: dpif_execute_with_help (dpif.c:1296) by 0x4812FF: dpif_operate (dpif.c:1422) by 0x442226: handle_upcalls (ofproto-dpif-upcall.c:1617) by 0x442226: recv_upcalls.isra.36 (ofproto-dpif-upcall.c:855) by 0x442351: udpif_upcall_handler (ofproto-dpif-upcall.c:755) by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383) by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so) by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so) Uninitialised value was created by a stack allocation at 0x481966: dpif_execute_helper_cb (dpif.c:1159) Additionally added a missing comment to the 'struct dpif_execute'. Fixes: 0442bfb11d6c ("ofproto-dpif-upcall: Echo HASH attribute back to datapath.") Acked-by: Tonghao Zhang <xiangxia.m.yue> Signed-off-by: Ilya Maximets <i.maximets> commit 43c2a10922b9a0efee4b1c12fbefb257fc242707 Author: Ilya Maximets <i.maximets> Date: Sun Apr 4 14:32:29 2021 +0200 odp-util: Fix use of uninitialized erspan metadata. 'struct erspan_metadata' contains union with fields of different sizes, hence not all the memory initiliazed. This memory goes to syscalls and also used to compare ukeys with memcmp which may cause unexpected behavior. Thread 15 revalidator13: Conditional jump or move depends on uninitialised value(s) at 0x4C377B6: bcmp (vg_replace_strmem.c:1111) by 0x43F844: ofpbuf_equal (ofpbuf.h:273) by 0x43F844: revalidate_ukey__ (ofproto-dpif-upcall.c:2227) by 0x43F9C9: revalidate_ukey (ofproto-dpif-upcall.c:2294) by 0x4425C2: revalidate.isra.33 (ofproto-dpif-upcall.c:2734) by 0x4434B8: udpif_revalidator (ofproto-dpif-upcall.c:943) by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383) by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so) by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so) Uninitialised value was created by a stack allocation at 0x4B1CE0: tun_key_to_attr (odp-util.c:3129) Fixes: 98514eea21f4 ("erspan: add kernel datapath support") Acked-by: William Tu <u9012063> Signed-off-by: Ilya Maximets <i.maximets> commit 4f68040a5c5242197bd518e3d54564addf786829 Author: Yunjian Wang <wangyunjian> Date: Thu Apr 8 21:33:12 2021 +0800 dpif-netlink: Fix using uninitialized info.tc_modify_flow_deleted in out label. Before info.tc_modify_flow_deleted is assigned a value, error processing of other statements goes to the out label. In the out label, the uninitialized variant is used for condition determination, which may cause uncertain behavior. Fixes: 65b84d4a32bd ("dpif-netlink: avoid netlink modify flow put op failed after tc modify flow put op failed.") Signed-off-by: Mengfan Lv <lvmengfan> Signed-off-by: Yunjian Wang <wangyunjian> Reviewed-by: Simon Horman <simon.horman> Signed-off-by: Ilya Maximets <i.maximets> commit 1c2a5f170fd98b4884b39c249c5d215a80d511de Author: Paul Blakey <paulb> Date: Fri Apr 9 09:35:35 2021 -0400 netdev-offload-tc: Probe for support for any of the ct_state flags. Upstream kernel now rejects unsupported ct_state flags. Earlier kernels, ignored it but still echoed back the requested ct_state, if ct_state was supported. ct_state initial support had trk, new, est, and rel flags. If kernel echos back ct_state, assume support for trk, new, est, and rel. If kernel rejects a specific unsupported flag, continue and use reject mechanisim to probe for flags rep and inv. Disallow inserting rules with unnsupported ct_state flags. Signed-off-by: Paul Blakey <paulb> Acked-by: Roi Dayan <roid> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner> Signed-off-by: Simon Horman <simon.horman> (cherry picked from commit 1e4aa061ac8b71196c0b0c8ab23d1627cd2e4a27) Signed-off-by: Aaron Conole <aconole> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner> Signed-off-by: Ilya Maximets <i.maximets> commit a4730043a43e44bd66eb658849aca1425e3a260f Author: Paul Blakey <paulb> Date: Fri Apr 9 09:35:34 2021 -0400 compat: Add ct_state flags definitions. Add TCA_FLOWER_KEY_CT_FLAGS_REPLY, and TCA_FLOWER_KEY_CT_FLAGS_INVALID. Signed-off-by: Paul Blakey <paulb> Acked-by: Roi Dayan <roid> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner> Signed-off-by: Simon Horman <simon.horman> (cherry picked from commit 0a8bd432ae0cf750b5560312343ddaa05af8c876) Signed-off-by: Aaron Conole <aconole> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner> Signed-off-by: Ilya Maximets <i.maximets> commit bbd62d7a444e889ea7e05f7f42bedf76e8178f52 Author: Roi Dayan <roid> Date: Fri Apr 9 09:35:33 2021 -0400 tc: Use skip_hw flag when probing tc features. There is no need to pass tc rules to hw when just probing for tc features. this will avoid redundant errors from hw drivers that may happen. Signed-off-by: Roi Dayan <roid> Acked-By: Vlad Buslov <vladbu> Reviewed-by: Tonghao Zhang <xiangxia.m.yue> Signed-off-by: Simon Horman <simon.horman> (cherry picked from commit d5659751f65ebc17d9aec40b60c1cff3a2d87162) Signed-off-by: Aaron Conole <aconole> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner> Signed-off-by: Ilya Maximets <i.maximets>