Bug 2130939

Summary: OVN-Kubernetes: SNAT not applied for existing connections during egress IP failover
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Patryk Diak <pdiak>
Component: OVN
Assignee: Dumitru Ceara <dceara>
Status: CLOSED WONTFIX
QA Contact: Jianlin Shi <jishi>
Severity: urgent
Priority: urgent
Version: FDP 22.L
CC: ctrautma, dceara, jiji, mmichels, skanakal
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2023-01-13 10:34:39 UTC
Type: Bug

Description Patryk Diak 2022-09-29 13:58:53 UTC
Description of problem:

During egress IP failover, traffic sent from a matching pod is redirected to a new node, where it should be SNATed on the gateway router (GR).
It was observed that SNAT is not always applied when the connection was initiated on one node and the egress IP then moved to another.

OVN-Kubernetes bug: https://issues.redhat.com/browse/OCPBUGS-283
Slack thread: https://coreos.slack.com/archives/C01G7T6SYSD/p1664296212811939


Version-Release number of selected component (if applicable):
OpenShift version: 4.12.0-0.ci-2022-09-28-013725
ovn-nbctl 22.06.1
Open vSwitch Library 2.17.90
DB Schema 6.3.0

See the slack thread for reproducer: https://coreos.slack.com/archives/C01G7T6SYSD/p1664379654907989?thread_ts=1664296212.811939&cid=C01G7T6SYSD

egress IP: 10.0.128.101
pod IP: 10.0.128.101

Egress IP node before failover: pdiak-09-28-2022-6ml7m-worker-c-rlgb4
Egress IP node after failover: pdiak-09-28-2022-6ml7m-worker-b-7l2x8

Please find the attached network must-gathers for NB/SB databases.

Actual results:
After egress IP failover, packets belonging to existing connections are not always SNATed and are sent out with the pod IP as source.

Expected results:
After egress IP failover, packets that belong to an existing connection should be SNATed.

Comment 6 Dumitru Ceara 2022-11-30 15:29:42 UTC
Checking the datapath flows after egress IP moved, on the chassis that
now owns the egress IP, for the FIN+ACK packet we see:

recirc_id(0x40),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-rpl+trk),eth(),eth_type(0x0800),ipv4(src=10.244.0.6,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)
recirc_id(0x41),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(src=d6:db:93:1b:af:5c,dst=ea:8c:57:97:14:b7),eth_type(0x0800),ipv4(dst=10.64.0.0/255.224.0.0,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4

So we first try to SNAT: actions:ct(commit,nat(src=10.89.0.199))

But for some reason this doesn't happen.

This is the case only if there was no data traffic on the session after
the egress IP move happened.  That is, no conntrack entry exists for the
session on this chassis.

If instead we first generate data traffic, the conntrack session on the
new chassis gets created and moves to ESTABLISHED before the FIN+ACK
packet is processed.  SNAT happens fine for all packets then.
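
For readers less used to datapath flow dumps, the ct_state field of the recirc_id(0x41) flow above can be decoded with a few lines of Python. This is only an illustrative sketch: parse_ct_state and FLOW_0X41 are hypothetical names, and the flow string is abbreviated from the dump above. It shows that after the ct(commit,nat(src=10.89.0.199)) action the packet comes back tracked but invalid (+inv) rather than established.

```python
import re


def parse_ct_state(flow: str) -> dict[str, bool]:
    """Return the ct_state flags of one datapath flow as {flag: is_set}.

    Hypothetical helper for illustration; not part of OVS or OVN tooling.
    """
    match = re.search(r"ct_state\(([^)]*)\)", flow)
    if match is None:
        return {}
    # Each flag is written as "+name" (set) or "-name" (not set).
    return {
        name: sign == "+"
        for sign, name in re.findall(r"([+-])([a-z]+)", match.group(1))
    }


# Match portion of the recirc_id(0x41) flow quoted above (abbreviated).
FLOW_0X41 = (
    "recirc_id(0x41),in_port(2),"
    "ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1)"
)

flags = parse_ct_state(FLOW_0X41)
print("tracked:", flags["trk"], "invalid:", flags["inv"], "established:", flags["est"])
```

The same helper applied to a flow captured after data traffic has been sent would be expected to show +est instead of +inv, which matches the observation that SNAT works once the session is ESTABLISHED on the new chassis.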

Comment 7 Dumitru Ceara 2022-11-30 15:40:11 UTC
I had missed a datapath flow above, for completeness:

recirc_id(0),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({class=0x102,type=0x80,len=4,0x10003/0x7fffffff}),flags(-df+csum+key)),in_port(2),ct_state(-new-est-rel-rpl-inv-trk),ct_mark(0/0x3),eth(src=0a:58:64:40:00:01,dst=0a:58:64:40:00:03),eth_type(0x0800),ipv4(src=10.244.0.4/255.255.255.252,dst=10.89.0.1,ttl=63,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:set(eth(src=d6:db:93:1b:af:5c,dst=ea:8c:57:97:14:b7)),set(ipv4(ttl=62)),ct(zone=5,nat),recirc(0x40)
recirc_id(0x40),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-rpl+trk),eth(),eth_type(0x0800),ipv4(src=10.244.0.6,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)
recirc_id(0x41),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(src=d6:db:93:1b:af:5c,dst=ea:8c:57:97:14:b7),eth_type(0x0800),ipv4(dst=10.64.0.0/255.224.0.0,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4
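
The three flows above form a chain through their recirculation IDs. A minimal Python sketch of how to follow that chain is given here, assuming flows are available as text lines in the format shown; recirc_chain and FLOWS are hypothetical names, and the flow strings are abbreviated from the dump above (match fields and the set() actions of the first flow are omitted).

```python
import re


def recirc_chain(flows: list[str]) -> list[tuple[int, str]]:
    """Follow recirc() actions starting at recirc_id(0).

    Returns a list of (recirc_id, actions) in processing order.
    Hypothetical helper for illustration; not part of OVS or OVN tooling.
    """
    by_id: dict[int, str] = {}
    for flow in flows:
        rid = re.match(r"recirc_id\((0x[0-9a-f]+|\d+)\)", flow)
        act = re.search(r"actions:(.*)$", flow)
        if rid and act:
            by_id[int(rid.group(1), 0)] = act.group(1).strip()

    chain: list[tuple[int, str]] = []
    seen: set[int] = set()
    current = 0
    while current in by_id and current not in seen:
        seen.add(current)
        actions = by_id[current]
        chain.append((current, actions))
        nxt = re.search(r"recirc\((0x[0-9a-f]+|\d+)\)", actions)
        if nxt is None:
            break  # final flow: the packet is output, not recirculated
        current = int(nxt.group(1), 0)
    return chain


# Abbreviated versions of the three flows quoted above.
FLOWS = [
    "recirc_id(0x41),in_port(2),ct_state(-new-est-rel-rpl+inv+trk), "
    "actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4",
    "recirc_id(0),in_port(2),ct_state(-new-est-rel-rpl-inv-trk), "
    "actions:ct(zone=5,nat),recirc(0x40)",
    "recirc_id(0x40),in_port(2),ct_state(-new-rpl+trk), "
    "actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)",
]

for rid, actions in recirc_chain(FLOWS):
    print(f"recirc_id({rid:#x}): {actions}")
```

Read in this order, the SNAT attempt is the middle step: recirc_id(0) sends the packet through conntrack zone 5, recirc_id(0x40) tries to commit with nat(src=10.89.0.199), and recirc_id(0x41) then matches the packet as +inv.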

Comment 9 Dumitru Ceara 2023-01-13 10:34:39 UTC
PR https://github.com/ovn-org/ovn-kubernetes/pull/3349 tries to minimize the race condition window in which packets can reach a gateway (after failover) before SNAT is configured in OpenFlow.

Closing this BZ for now; bug 2130939 tracks the not-SNAT-ed FIN and RST packets.

Comment 10 Dumitru Ceara 2023-01-16 13:10:37 UTC
(In reply to Dumitru Ceara from comment #9)

> Closing this BZ for now; bug 2130939 tracks the not-SNAT-ed FIN and RST
> packets.

Bug 2160685 is actually the one tracking the not-SNAT-ed FIN and RST packets.