Hi all,
A fix has been added to OVN since this issue was filed that is intended to help with this situation.
Ales added a delay for multicast ARP packets. This avoids the full recomputes caused by race conditions that occur when multiple controllers receive a GARP simultaneously. The fix has already been backported to 21.12 and is first present in ovn-2021-21.12.0-94.
Updating to this version *should* prevent the constant 100% CPU issue.
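For anyone who wants to script the check across several nodes, here is a rough Python sketch that tests whether an installed version-release string is at or above 21.12.0-94. The function name and the input format (something like "21.12.0-94.el8fdp", as a package query might print it) are assumptions for illustration, not part of any OVN tooling, and the comparison is a plain numeric one rather than a full RPM version comparison.

```python
import re

# First ovn-2021 build that carries the multicast ARP delay fix
# (version 21.12.0, release 94), per the comment above.
FIXED_BUILD = (21, 12, 0, 94)


def has_garp_delay_fix(version_release: str) -> bool:
    """Return True if version_release is at or above 21.12.0-94.

    Hypothetical helper: assumes input shaped like "21.12.0-94.el8fdp",
    i.e. X.Y.Z-<release> followed by an optional dist suffix.
    """
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", version_release.strip())
    if match is None:
        raise ValueError(f"unrecognized version-release string: {version_release!r}")
    # Compare (major, minor, patch, release) as integers, left to right.
    build = tuple(int(group) for group in match.groups())
    return build >= FIXED_BUILD
```

Feed it whatever your package manager reports for the ovn-2021 package on each node. A False result means that node still has a build older than -94.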
There is also a second patch, present in ovn22.09 and later, that adds MAC_Binding aging. This ensures that MAC_Bindings are eventually deleted after a certain time. This was alluded to in the copied comments above. It has not been backported to ovn-2021 because it's a new feature, not a bug fix. While it is likely to result in smaller SB database sizes, it's not expected to contribute directly to fixing the 100% CPU issue.
It would be good to know if updating ovn-2021 to -94 or newer fixes the problem. The original issue was opened on 30 September 2022, and the fix in -94 was committed on 20 October 2022, so it's reasonable to assume that an update could alleviate the problem. Please let us know if this helps.