Bug 2175928

Summary: ovn-controller: Incremental processing may grow conjunctive flows in size indefinitely
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Ilya Maximets <i.maximets>
Component: ovn23.03
Assignee: Ales Musil <amusil>
Status: CLOSED ERRATA
QA Contact: Ehsan Elahi <eelahi>
Severity: unspecified
Docs Contact:
Priority: medium
Version: FDP 23.B
CC: amusil, ctrautma, jiji, jishi, mmichels
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovn23.03-23.03.1-13.el9fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-10-18 00:27:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ilya Maximets 2023-03-06 20:25:51 UTC
ofctrl_add_or_append_flow() looks for a flow with the same match and a
conjunction() action present in the action list, but it doesn't check
whether the same conjunction is already in the action list.

At the same time, remove_flows_from_sb_to_flow(), which is supposed to clean
up flows before adding new ones during incremental processing, only removes
the flow reference from the list, but doesn't remove conjunctions from the
action list.

That leads to a situation where the actions on a single flow may grow
indefinitely due to re-addition of the same conjunctions to the action list
without ever removing them, unless the whole flow has to be removed.
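
The growth described above can be illustrated with a toy model. This is only
a sketch under stated assumptions, not ovn-controller code: the class names,
the data layout and the method bodies are hypothetical simplifications, and
only the two function names are borrowed from the description. It assumes
that two logical flows contribute conjunctions to the same OpenFlow match, so
that the flow is never removed as a whole.

```python
# Toy model only. Names mirror the report; the logic is a simplification
# and does not reproduce the real ofctrl.c data structures.

class DesiredFlow:
    def __init__(self, match):
        self.match = match
        self.actions = []        # conjunctions as (id, clause, n_clauses)
        self.references = set()  # logical flow UUIDs referencing this flow


class FlowTable:
    def __init__(self):
        self.flows = {}

    def add_or_append_flow(self, sb_uuid, match, conjunction):
        flow = self.flows.setdefault(match, DesiredFlow(match))
        flow.references.add(sb_uuid)
        # Behaviour described in the report: no check for an existing
        # identical conjunction before appending.
        flow.actions.append(conjunction)

    def remove_flows_from_sb_to_flow(self, sb_uuid):
        for match in list(self.flows):
            flow = self.flows[match]
            # Only the reference is dropped; the action list is untouched.
            flow.references.discard(sb_uuid)
            if not flow.references:
                del self.flows[match]


table = FlowTable()
table.add_or_append_flow("lflow-A", "ip4.dst == 10.0.0.1", (1, 1, 2))
table.add_or_append_flow("lflow-B", "ip4.dst == 10.0.0.1", (2, 1, 2))

# Incremental processing re-handles lflow-A five times.
for _ in range(5):
    table.remove_flows_from_sb_to_flow("lflow-A")
    table.add_or_append_flow("lflow-A", "ip4.dst == 10.0.0.1", (1, 1, 2))

actions = table.flows["ip4.dst == 10.0.0.1"].actions
print(len(actions))
```

In this model the action list gains one entry per reprocessing pass, because
the reference held by "lflow-B" keeps the flow alive between passes.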

Also, broken outdated conjunctions may stay in the flow after removal of
logical flows that triggered their addition.
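
One way to picture the behavior that would avoid both symptoms is to record
which logical flow asked for each conjunction. The following is a
hypothetical sketch, not the fix that was shipped: `TrackedFlow`,
`add_or_append` and `remove_from_sb` are made-up names, and the dictionary
layout is an assumption chosen for brevity.

```python
from collections import defaultdict


class TrackedFlow:
    """Hypothetical flow that remembers who contributed each conjunction."""

    def __init__(self):
        # conjunction (id, clause, n_clauses) -> set of logical flow UUIDs
        self.owners = defaultdict(set)

    @property
    def actions(self):
        return sorted(self.owners)


def add_or_append(flows, sb_uuid, match, conjunction):
    # Adding to a set makes the re-addition of a conjunction idempotent.
    flows.setdefault(match, TrackedFlow()).owners[conjunction].add(sb_uuid)


def remove_from_sb(flows, sb_uuid):
    for match in list(flows):
        flow = flows[match]
        for conj in list(flow.owners):
            flow.owners[conj].discard(sb_uuid)
            # Drop conjunctions that no logical flow asks for any more.
            if not flow.owners[conj]:
                del flow.owners[conj]
        if not flow.owners:
            del flows[match]


flows = {}
add_or_append(flows, "lflow-A", "m", (1, 1, 2))
add_or_append(flows, "lflow-B", "m", (2, 1, 2))
for _ in range(5):
    remove_from_sb(flows, "lflow-A")
    add_or_append(flows, "lflow-A", "m", (1, 1, 2))
after_reprocessing = flows["m"].actions

remove_from_sb(flows, "lflow-A")
after_removal = flows["m"].actions
```

With this bookkeeping, repeated reprocessing leaves the action list
unchanged, and removing a logical flow also removes the conjunctions that
only it had contributed.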

We were able to reproduce that behavior with hairpin_snat_ip before BZ2171423
got fixed.  But other types of flows might be affected as well.

Comment 1 Ilya Maximets 2023-03-06 21:03:17 UTC
On a quick glance over the code, ACLs with address sets in them might be affected,
if the address set is frequently modified without modifying the ACL itself.
But I didn't test to confirm.
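
When testing a suspected case, a small helper can flag the symptom in a
dumped action list. This is a sketch that assumes conjunctions appear in the
text as `conjunction(id,k/n)`; the function name and the sample action string
are made up for illustration and are not taken from a real flow dump.

```python
import re
from collections import Counter

# Assumed textual form of the action: conjunction(id,k/n)
CONJ_RE = re.compile(r"conjunction\(\s*(\d+)\s*,\s*(\d+)\s*/\s*(\d+)\s*\)")


def duplicate_conjunctions(actions):
    """Return {(id, clause, n_clauses): count} for repeated conjunctions."""
    counts = Counter(CONJ_RE.findall(actions))
    return {conj: n for conj, n in counts.items() if n > 1}


# Illustrative input only.
sample = "conjunction(7,1/2),conjunction(7,1/2),conjunction(9,2/2)"
print(duplicate_conjunctions(sample))
```

A non-empty result for any flow would suggest that the same conjunction was
appended more than once.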

Comment 2 Mark Michelson 2023-03-10 14:28:10 UTC
Is the implication that a full recompute erases the extra unnecessary conjunctions?

Comment 3 Ilya Maximets 2023-03-10 14:30:35 UTC
(In reply to Mark Michelson from comment #2)
> Is the implication that a full recompute erases the extra unnecessary
> conjunctions?

Probably, yes.

Comment 5 Ales Musil 2023-08-22 09:45:31 UTC
I actually tried to reproduce this using ACLs, as this is the only thing
with complicated conjunctions that comes to my mind, but nothing. I have
tried several approaches, like updating the ACLs themselves and updating the
address sets in various ways; the result was always fine. I would suggest
closing this BZ, and if we come across a scenario in the future we can
reopen it. WDYT?

Thanks,
Ales

Comment 6 Ilya Maximets 2023-08-22 10:53:15 UTC
I'd keep it open.  It's a logical bug in ovn-controller and we need to
fix it before users run into it, even if it's not easy to reproduce
with the current code.  If you want a solid reproducer, you may revert the
fix for BZ2171423.

For the ACLs, is there some sort of recompute always involved?

Comment 7 Ales Musil 2023-08-22 11:09:15 UTC
I will probably need to try that revert, or try on 22.12,
because the ACLs are not triggering any recompute, e.g.:
Node: logical_flow_output
- recompute:            0
- compute:              2
- abort:                0
Node: physical_flow_output
- recompute:            0
- compute:              2
- abort:                0
Node: controller_output
- recompute:            0
- compute:              2
- abort:                0
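
Counters in the format pasted above can be checked mechanically. The helper
below is a minimal sketch that assumes exactly this "Node:" / "- name: value"
layout; `parse_engine_stats` and `nodes_with_recompute` are hypothetical
names, not part of any OVN tool.

```python
def parse_engine_stats(text):
    """Parse 'Node: <name>' blocks followed by '- <counter>: <value>' lines."""
    stats = {}
    node = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Node:"):
            node = line.split(":", 1)[1].strip()
            stats[node] = {}
        elif line.startswith("-") and node is not None:
            key, _, value = line[1:].partition(":")
            stats[node][key.strip()] = int(value)
    return stats


def nodes_with_recompute(stats):
    """Names of the nodes whose recompute counter is non-zero."""
    return [name for name, c in stats.items() if c.get("recompute", 0) > 0]


sample = """\
Node: logical_flow_output
- recompute:            0
- compute:              2
- abort:                0
Node: physical_flow_output
- recompute:            0
- compute:              2
- abort:                0
"""
stats = parse_engine_stats(sample)
print(nodes_with_recompute(stats))
```

An empty list here corresponds to the observation that none of the listed
nodes went through a recompute.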

Comment 9 OVN Bot 2023-09-15 04:06:45 UTC
ovn23.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239060
ovn23.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239062
ovn22.12 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239065
ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239067
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239068
ovn22.06 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239071
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239072
ovn22.03 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239075
ovn22.03 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2239076

Comment 14 errata-xmlrpc 2023-10-18 00:27:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn23.03 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:5822