Bug 1943637

Summary: upgrade from ocp 4.5 to 4.6 does not clear SNAT rules on ovn
Product: OpenShift Container Platform Reporter: Nabeel Cocker <ncocker>
Component: Networking Assignee: Tim Rozet <trozet>
Networking sub component: ovn-kubernetes QA Contact: Arti Sood <asood>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: anbhat, astoycos, rbrattai, trozet, zzhao
Version: 4.6.z   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1947097 Environment:
Last Closed: 2021-07-27 22:56:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1947097    

Description Nabeel Cocker 2021-03-26 17:14:07 UTC
Description of problem:
Observed that when a cluster was upgraded from OCP 4.5.16 to 4.6.17, there appear to be SNAT rules that are not getting removed/cleaned up. The output below is from an OCP 4.5.16 cluster upgraded to OCP 4.6.17.

Version-Release number of selected component (if applicable):

OCP 4.6.17
How reproducible:
Upgrade from 4.5.16 to 4.6.17.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

oc rsh -n openshift-ovn-kubernetes ovnkube-master-x2tnl
sh-4.4# 

sh-4.4# ovn-nbctl lr-nat-list GR_master-0
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             169.254.33.2                        172.10.0.0/16
sh-4.4# 
sh-4.4# 
sh-4.4# 
sh-4.4# ovn-nbctl lr-nat-list GR_master-1
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             169.254.33.2                        172.10.0.0/16
sh-4.4# 
sh-4.4# 
sh-4.4# ovn-nbctl lr-nat-list GR_master-2
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             169.254.33.2                        172.10.0.0/16
sh-4.4# 
sh-4.4# 
sh-4.4# 
sh-4.4# ovn-nbctl lr-nat-list GR_worker-13
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             169.254.33.2                        172.10.0.0/16
sh-4.4# ovn-nbctl lr-nat-list GR_worker-14
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             169.254.33.2                        172.10.0.0/16

Comment 1 Tim Rozet 2021-03-29 16:21:56 UTC
These SNAT rules are left over from the "old local gateway" mode after the transition to the new one. The old local gateway mode used the br-local bridge with a 169.254.x.x address, with ovn-k8s-gw0 as the GR. In the new local gateway mode (same topology as shared), the GR connects to the shared gateway bridge. During this upgrade from 4.5 (old mode) to 4.6 (new mode), it looks like we are not removing the old SNAT entry. In 4.6 we deploy the new local gateway mode with disable-snat-multiple-gws, which means there should be no subnet-wide SNAT on the GR. The multiple gateways will not be SNAT'ed, so they should not have any SNAT entries. The only thing that may have a SNAT entry is egress IP.
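
To illustrate the leftover entries described above, here is a minimal sketch of how they could be spotted in `ovn-nbctl lr-nat-list` output. It is not the fix that shipped for this bug. The function names `stale_snat_logical_ips` and `print_cleanup_cmds` are hypothetical, and the sketch assumes that any `snat` row with an external IP in 169.254.0.0/16 is a leftover from the old local gateway mode. It only prints candidate `ovn-nbctl lr-nat-del` commands (a dry run); whether deleting the entries by hand is safe on a given cluster is not confirmed in this report.

```shell
#!/bin/sh
# Hypothetical helper (not from this bug): reads `ovn-nbctl lr-nat-list` output
# on stdin and prints the logical IP of every snat row whose external IP is in
# 169.254.0.0/16, the range used by the old local gateway mode.
# Assumption: for snat rows the logical IP is the last populated column.
stale_snat_logical_ips() {
    awk '$1 == "snat" && $2 ~ /^169\.254\./ { print $NF }'
}

# Dry run: print (do not execute) one lr-nat-del command per stale entry.
# $1 is the gateway router name, e.g. GR_master-0.
print_cleanup_cmds() {
    router=$1
    stale_snat_logical_ips | while read -r logical_ip; do
        echo "ovn-nbctl lr-nat-del $router snat $logical_ip"
    done
}
```

Inside the ovnkube-master pod, this could be used as `ovn-nbctl lr-nat-list GR_master-0 | print_cleanup_cmds GR_master-0`, reviewing the printed commands before running any of them.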

Comment 9 errata-xmlrpc 2021-07-27 22:56:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438