Bug 2106862

Summary: After ovnkube-node restart, external traffic policy local no longer works
Product: OpenShift Container Platform Reporter: Tim Rozet <trozet>
Component: Networking Assignee: Surya Seetharaman <surya>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: surya
Version: 4.10   
Target Milestone: ---   
Target Release: 4.12.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-01-17 19:52:40 UTC Type: Bug
Bug Blocks: 2106855    

Description Tim Rozet 2022-07-13 17:23:43 UTC
Description of problem:
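In local gateway mode, when a service uses external traffic policy Local, the source address should not be NAT'ed on its way to the destination endpoint. This works as expected until ovnkube-node is restarted. After the restart, the iptables rules are configured in the wrong order: the catch-all SNAT rule is placed ahead of the RETURN rules that exempt the service ports. Because iptables evaluates a chain from top to bottom, the RETURN rules are never reached and the packets are SNAT'ed to the management (mp0) interface: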

Before restart:
Chain OVN-KUBE-SNAT-MGMTPORT (1 references)
target     prot opt source               destination         
RETURN     tcp  --  anywhere             anywhere             tcp dpt:30486
RETURN     udp  --  anywhere             anywhere             udp dpt:32397
SNAT       all  --  anywhere             anywhere             /* OVN SNAT to Management Port */ to:10.244.1.2


After restart:

Chain OVN-KUBE-SNAT-MGMTPORT (1 references)
target     prot opt source               destination
SNAT       all  --  anywhere             anywhere             /* OVN SNAT to Management Port */ to:10.244.1.2
RETURN     tcp  --  anywhere             anywhere             tcp dpt:30486
RETURN     udp  --  anywhere             anywhere             udp dpt:32397
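
The ordering problem shown in the two listings can be checked mechanically. The following is a minimal sketch, assuming the chain is available as text in the `iptables -L` format shown above; the function name `snat_precedes_return` and the `BEFORE`/`AFTER` samples are illustrative and not part of ovn-kubernetes.

```python
BEFORE = """\
Chain OVN-KUBE-SNAT-MGMTPORT (1 references)
target     prot opt source               destination
RETURN     tcp  --  anywhere             anywhere             tcp dpt:30486
RETURN     udp  --  anywhere             anywhere             udp dpt:32397
SNAT       all  --  anywhere             anywhere             /* OVN SNAT to Management Port */ to:10.244.1.2
"""

AFTER = """\
Chain OVN-KUBE-SNAT-MGMTPORT (1 references)
target     prot opt source               destination
SNAT       all  --  anywhere             anywhere             /* OVN SNAT to Management Port */ to:10.244.1.2
RETURN     tcp  --  anywhere             anywhere             tcp dpt:30486
RETURN     udp  --  anywhere             anywhere             udp dpt:32397
"""


def snat_precedes_return(chain_listing: str) -> bool:
    """Return True if an SNAT rule is listed before any RETURN rule.

    That is the broken order: the SNAT rule matches first, so the
    RETURN rules after it never take effect.
    """
    targets = []
    for line in chain_listing.splitlines():
        fields = line.split()
        # Skip blank lines, the "Chain ..." line, and the column header.
        if not fields or fields[0] in ("Chain", "target"):
            continue
        targets.append(fields[0])
    if "SNAT" not in targets:
        return False
    first_snat = targets.index("SNAT")
    return "RETURN" in targets[first_snat + 1:]


if __name__ == "__main__":
    print("before restart broken:", snat_precedes_return(BEFORE))
    print("after restart broken: ", snat_precedes_return(AFTER))
```

The check only looks at the rule targets, which is enough here because the chain contains a single catch-all SNAT rule.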

Accessing service endpoint before and after restart:
[root@trozet3 /]# python -m SimpleHTTPServer 80
Serving HTTP on 0.0.0.0 port 80 ...
172.18.0.4 - - [13/Jul/2022 17:11:13] "GET / HTTP/1.1" 200 -
172.18.0.4 - - [13/Jul/2022 17:15:36] "GET / HTTP/1.1" 200 -
10.244.1.2 - - [13/Jul/2022 17:16:57] "GET / HTTP/1.1" 200 - <--- After restart, the source IP is SNAT'ed to mp0
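
The same symptom can be spotted in the server log by looking for requests whose client address is the management port IP. This is a rough sketch, assuming the SimpleHTTPServer log format shown above and the mp0 address 10.244.1.2 from the chain listings; the helper name `snated_requests` is hypothetical.

```python
import re

MGMT_PORT_IP = "10.244.1.2"  # mp0 address taken from the SNAT rule in this report

# Matches the start of a SimpleHTTPServer access log line:
#   <client ip> - - [<timestamp>] ...
LOG_LINE = re.compile(r"^(\d+\.\d+\.\d+\.\d+) - - \[([^\]]+)\]")

SAMPLE_LOG = """\
172.18.0.4 - - [13/Jul/2022 17:11:13] "GET / HTTP/1.1" 200 -
172.18.0.4 - - [13/Jul/2022 17:15:36] "GET / HTTP/1.1" 200 -
10.244.1.2 - - [13/Jul/2022 17:16:57] "GET / HTTP/1.1" 200 -
"""


def snated_requests(log_text: str, mgmt_ip: str = MGMT_PORT_IP) -> list:
    """Return timestamps of requests that arrived with the management port IP."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(1) == mgmt_ip:
            hits.append(match.group(2))
    return hits


if __name__ == "__main__":
    for timestamp in snated_requests(SAMPLE_LOG):
        print("source IP was SNAT'ed at", timestamp)
```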

Comment 1 Tim Rozet 2022-07-13 17:24:33 UTC
Related upstream issue: https://github.com/ovn-org/ovn-kubernetes/issues/2969

Comment 2 Surya Seetharaman 2022-07-15 10:20:19 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=2107309#c1 will fix the cause of this issue

Comment 7 errata-xmlrpc 2023-01-17 19:52:40 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399