Bug 2103668 - ovnkube-node pod fails to start - unable to add OVN masquerade route to host, error: failed to add route for subnet - after upgrading to 4.10
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.12.0
Assignee: Tim Rozet
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-07-04 12:42 UTC by siva kanakala
Modified: 2023-01-17 19:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:51:19 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 1289 | 0 | None | Merged | [DownstreamMerge] 9-23-22 b - dualstack fixed | 2022-10-11 17:50:27 UTC
Github | ovn-org ovn-kubernetes pull 3136 | 0 | None | open | ovn node, gw router: set node masquerade IP on br-ex | 2022-09-08 22:22:46 UTC
Red Hat Product Errata | RHSA-2022:7399 | 0 | None | None | None | 2023-01-17 19:51:37 UTC

Comment 3 Tim Rozet 2022-07-12 19:27:29 UTC
The issue here is that the default gateway route moves to an interface other than br-ex before ovnkube comes up. Combined with 4.10.18, which is missing the fix in https://github.com/ovn-org/ovn-kubernetes/pull/2782, this causes an invalid route to be added via the wrong interface. 4.10.20 and later include the fix, but they would still have failed with an error because no default route is available via br-ex. There are, however, other non-default routes available via br-ex.

In local gateway mode all traffic egresses via the kernel routing stack, so egress traffic leaving through a non-br-ex interface would work. Given this behavior, there is no logical reason that we *have* to use a default-route next hop via br-ex for the masquerade/service routes; any next hop would do. To fix this issue we can therefore add code that also treats non-default routes as legitimate sources of next hops for these routes. I believe this should only be supported in local gateway mode. In shared gateway mode traffic egresses the br-ex bridge directly, so without a default route via br-ex we cannot send external pod traffic. I don't really see that as a valid use case.
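
The next-hop selection proposed in this comment could be sketched roughly as follows. This is an illustration only, not the ovn-kubernetes implementation (which is written in Go): the `Route` type, the `pick_next_hop` function, and the example addresses are all hypothetical. It prefers a default route via br-ex and, in local gateway mode only, falls back to any other route via br-ex that has a gateway.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Iterable, Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified view of one kernel route entry."""
    dst: str                # destination CIDR; "0.0.0.0/0" or "::/0" is a default route
    gateway: Optional[str]  # next hop IP, or None for an on-link route
    dev: str                # outgoing interface name


def is_default(route: Route) -> bool:
    # A default route covers the whole address space (prefix length 0).
    return ip_network(route.dst).prefixlen == 0


def pick_next_hop(routes: Iterable[Route],
                  bridge: str = "br-ex",
                  gateway_mode: str = "local") -> Optional[str]:
    """Return a next hop reachable via `bridge`, or None if there is none.

    Assumed policy, following the comment above:
      1. Prefer the gateway of a default route via the bridge.
      2. In local gateway mode only, accept the gateway of any
         non-default route via the bridge.
    """
    candidates = [r for r in routes if r.dev == bridge and r.gateway]
    for r in candidates:
        if is_default(r):
            return r.gateway
    if gateway_mode == "local" and candidates:
        return candidates[0].gateway
    return None


# The situation from this bug: the default route sits on another
# interface, and br-ex only carries a non-default route.
routes = [
    Route("0.0.0.0/0", "192.0.2.1", "eth1"),
    Route("10.20.0.0/16", "198.51.100.1", "br-ex"),
]
local_hop = pick_next_hop(routes, gateway_mode="local")
shared_hop = pick_next_hop(routes, gateway_mode="shared")
```

With these example routes, local gateway mode finds a usable next hop via br-ex, while shared gateway mode finds none, which matches the reasoning that shared gateway mode cannot work without a default route via br-ex.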

Comment 9 Tim Rozet 2022-09-08 22:22:32 UTC
We have come up with a better fix that requires neither any routes nor a valid next hop external to the cluster. This has been posted upstream and will be backportable to 4.10.

Comment 14 errata-xmlrpc 2023-01-17 19:51:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

