Bug 2175928 - ovn-controller: Incremental processing may grow conjunctive flows in size indefinitely
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn23.03
Version: FDP 23.B
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Ales Musil
QA Contact: Ehsan Elahi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-03-06 20:25 UTC by Ilya Maximets
Modified: 2023-10-18 00:27 UTC (History)

Fixed In Version: ovn23.03-23.03.1-13.el9fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-10-18 00:27:48 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2721 0 None None None 2023-03-06 20:26:56 UTC
Red Hat Product Errata RHBA-2023:5822 0 None None None 2023-10-18 00:27:51 UTC

Description Ilya Maximets 2023-03-06 20:25:51 UTC
ofctrl_add_or_append_flow() is looking for the flow with the same match
and conjunction() action present in the action list, but it doesn't check
if the same conjunction is already in the action list.

At the same time, remove_flows_from_sb_to_flow(), which is supposed to clean
up flows before adding new ones during incremental processing, only removes
the flow reference from the list, but doesn't remove conjunctions from the
action list.

That leads to a situation where actions on a single flow may grow indefinitely
due to re-addition of the same conjunctions to the action list without ever
removing them, unless the whole flow has to be removed.

Also, stale conjunctions may stay in the flow after removal of the
logical flows that triggered their addition.

We were able to reproduce that behavior with hairpin_snat_ip before BZ2171423
got fixed.  But other types of flows may be affected as well.
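
To make the growth pattern concrete, here is a minimal toy model in C. It is
not the ovn-controller code and not the fix that was shipped: struct conj,
struct toy_flow, toy_append(), toy_append_dedup() and toy_reprocess() are all
hypothetical names invented for this sketch. It only assumes what is described
above, i.e. that each incremental pass re-adds the same conjunction to a flow
whose action list was never cleaned up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONJ 64   /* Arbitrary cap for this toy model. */

/* Hypothetical stand-in for one conjunction(id, clause/n_clauses) action. */
struct conj {
    uint32_t id;
    uint8_t clause;
    uint8_t n_clauses;
};

/* Hypothetical stand-in for an installed flow: one match, many conjunctions. */
struct toy_flow {
    struct conj actions[MAX_CONJ];
    size_t n_actions;
};

static bool
conj_equal(const struct conj *a, const struct conj *b)
{
    return a->id == b->id && a->clause == b->clause
           && a->n_clauses == b->n_clauses;
}

/* Behavior described in the report: append without looking at what is
 * already in the action list. */
static void
toy_append(struct toy_flow *f, struct conj c)
{
    assert(f->n_actions < MAX_CONJ);
    f->actions[f->n_actions++] = c;
}

/* One possible guard (an assumption, not the actual fix): skip the append
 * when an identical conjunction is already present. */
static void
toy_append_dedup(struct toy_flow *f, struct conj c)
{
    for (size_t i = 0; i < f->n_actions; i++) {
        if (conj_equal(&f->actions[i], &c)) {
            return;
        }
    }
    toy_append(f, c);
}

/* Simulates 'rounds' incremental passes that re-add the same conjunction
 * without the action list being cleared in between.  Returns the number
 * of conjunctions left on the flow. */
static size_t
toy_reprocess(size_t rounds, bool dedup)
{
    struct toy_flow f = { .n_actions = 0 };
    struct conj c = { .id = 1, .clause = 0, .n_clauses = 2 };

    for (size_t i = 0; i < rounds; i++) {
        if (dedup) {
            toy_append_dedup(&f, c);
        } else {
            toy_append(&f, c);
        }
    }
    return f.n_actions;
}
```

In this model, toy_reprocess(10, false) leaves ten copies of the same
conjunction on the flow, while toy_reprocess(10, true) leaves one.  The
dedup guard only addresses the growth; it does nothing about stale
conjunctions left behind by removed logical flows, which would need the
cleanup path to drop them as well.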

Comment 1 Ilya Maximets 2023-03-06 21:03:17 UTC
On a quick glance over the code, ACLs with address sets in them might be affected,
if the address set is frequently modified without modifying the ACL itself.
But I didn't test to confirm.

Comment 2 Mark Michelson 2023-03-10 14:28:10 UTC
Is the implication that a full recompute erases the extra unnecessary conjunctions?

Comment 3 Ilya Maximets 2023-03-10 14:30:35 UTC
(In reply to Mark Michelson from comment #2)
> Is the implication that a full recompute erases the extra unnecessary
> conjunctions?

Probably, yes.

Comment 5 Ales Musil 2023-08-22 09:45:31 UTC
I actually tried to reproduce this using ACLs, as this is the only thing
with complicated conjunctions that comes to my mind, but nothing. I have
tried several approaches, like updating the ACLs themselves and updating the
address sets in various ways; the result was always fine. I would suggest
closing this BZ, and if we come across a scenario in the future we can
reopen it. WDYT?

Thanks,
Ales

Comment 6 Ilya Maximets 2023-08-22 10:53:15 UTC
I'd keep it open.  It's a logical bug in ovn-controller and we need to
fix it before users run into it, even if it's not easy to reproduce
with the current code.  If you want a solid reproducer, you may revert the
fix for BZ2171423.

For the ACLs, is there some sort of recompute always involved?

Comment 7 Ales Musil 2023-08-22 11:09:15 UTC
I will probably need to try that revert, or try on 22.12,
because the ACLs are not triggering any recompute, e.g.:
Node: logical_flow_output
- recompute:            0
- compute:              2
- abort:                0
Node: physical_flow_output
- recompute:            0
- compute:              2
- abort:                0
Node: controller_output
- recompute:            0
- compute:              2
- abort:                0

Comment 9 OVN Bot 2023-09-15 04:06:45 UTC
ovn23.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239060
ovn23.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239062
ovn22.12 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239065
ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239067
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239068
ovn22.06 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239071
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239072
ovn22.03 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239075
ovn22.03 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239076

Comment 14 errata-xmlrpc 2023-10-18 00:27:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn23.03 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:5822

