Bug 2130939 - OVN-Kubernetes: SNAT not applied for existing connections during egress IP failover
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: FDP 22.L
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Dumitru Ceara
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-09-29 13:58 UTC by Patryk Diak
Modified: 2023-01-16 13:10 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-13 10:34:39 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2330 0 None None None 2022-09-29 14:01:48 UTC

Description Patryk Diak 2022-09-29 13:58:53 UTC
Description of problem:

During egress IP failover, traffic sent from a matching pod is redirected to a new node, where it should be SNATed on the gateway router (GR).
We observed that SNAT is not always applied when the connection was initiated on one node and the egress IP then moved to another.

OVN-Kubernetes bug: https://issues.redhat.com/browse/OCPBUGS-283
Slack thread: https://coreos.slack.com/archives/C01G7T6SYSD/p1664296212811939


Version-Release number of selected component (if applicable):
OpenShift version: 4.12.0-0.ci-2022-09-28-013725
ovn-nbctl 22.06.1
Open vSwitch Library 2.17.90
DB Schema 6.3.0

See the Slack thread for a reproducer: https://coreos.slack.com/archives/C01G7T6SYSD/p1664379654907989?thread_ts=1664296212.811939&cid=C01G7T6SYSD

egress IP: 10.0.128.101
pod IP: 10.0.128.101

Egress IP node before failover: pdiak-09-28-2022-6ml7m-worker-c-rlgb4
Egress IP node after failover: pdiak-09-28-2022-6ml7m-worker-b-7l2x8

Please find the attached network must-gathers for NB/SB databases.

Actual results:
After egress IP failover, packets belonging to existing connections are not always SNATed and are sent out with the pod IP as the source.

Expected results:
After egress IP failover, packets that belong to an existing connection should be SNATed.

Comment 6 Dumitru Ceara 2022-11-30 15:29:42 UTC
Checking the datapath flows after egress IP moved, on the chassis that
now owns the egress IP, for the FIN+ACK packet we see:

recirc_id(0x40),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-rpl+trk),eth(),eth_type(0x0800),ipv4(src=10.244.0.6,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)
recirc_id(0x41),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(src=d6:db:93:1b:af:5c,dst=ea:8c:57:97:14:b7),eth_type(0x0800),ipv4(dst=10.64.0.0/255.224.0.0,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4

So we first try to SNAT: actions:ct(commit,nat(src=10.89.0.199))

But for some reason this doesn't happen.

This is the case only if there was no data traffic on the session after
the egress IP move happened.  That is, no conntrack entry exists for the
session on this chassis.

If instead we first generate data traffic, the conntrack session on the
new chassis gets created and moves to ESTABLISHED before the FIN+ACK
packet is processed.  SNAT happens fine for all packets then.
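
To make the observation above easier to check on other dumps, here is a minimal Python sketch that reads the ct_state(...) match and the nat(src=...) action out of datapath flow text in the format pasted above. The helper names (parse_ct_state, nat_source) are hypothetical, and the two flow strings are abbreviated copies of the flows in this comment (tunnel and eth fields dropped); this is only a text-parsing aid, not anything OVS or OVN provides.

```python
import re


def parse_ct_state(flow):
    """Return {flag: True/False} for the ct_state(...) match of a datapath flow.

    '+inv' becomes {'inv': True}, '-new' becomes {'new': False}.
    Returns an empty dict if the flow has no ct_state match.
    """
    m = re.search(r"ct_state\(([^)]*)\)", flow)
    if m is None:
        return {}
    return {name: sign == "+" for sign, name in re.findall(r"([+-])([a-z]+)", m.group(1))}


def nat_source(flow):
    """Return the address in a nat(src=...) action, or None if there is none."""
    m = re.search(r"nat\(src=([0-9.]+)\)", flow)
    return m.group(1) if m else None


# Abbreviated copies of the two flows above (assumption: the dropped
# tunnel/eth fields do not matter for this check).
flow_0x40 = ("recirc_id(0x40),in_port(2),ct_state(-new-rpl+trk),eth_type(0x0800),"
             "ipv4(src=10.244.0.6,frag=no), flags:F., "
             "actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)")
flow_0x41 = ("recirc_id(0x41),in_port(2),ct_state(-new-est-rel-rpl+inv+trk),"
             "ct_mark(0/0x1),eth_type(0x0800),ipv4(dst=10.64.0.0/255.224.0.0,frag=no), "
             "flags:F., actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4")

for label, flow in (("0x40", flow_0x40), ("0x41", flow_0x41)):
    state = parse_ct_state(flow)
    print(label, "nat(src):", nat_source(flow), "inv:", state.get("inv"), "est:", state.get("est"))
```

The pattern to look for is a flow that requests nat(src=...) followed by a flow whose match has +inv and -est: the SNAT action was issued, but the packet came back from conntrack marked invalid.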

Comment 7 Dumitru Ceara 2022-11-30 15:40:11 UTC
I had missed a datapath flow above, for completeness:

recirc_id(0),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({class=0x102,type=0x80,len=4,0x10003/0x7fffffff}),flags(-df+csum+key)),in_port(2),ct_state(-new-est-rel-rpl-inv-trk),ct_mark(0/0x3),eth(src=0a:58:64:40:00:01,dst=0a:58:64:40:00:03),eth_type(0x0800),ipv4(src=10.244.0.4/255.255.255.252,dst=10.89.0.1,ttl=63,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:set(eth(src=d6:db:93:1b:af:5c,dst=ea:8c:57:97:14:b7)),set(ipv4(ttl=62)),ct(zone=5,nat),recirc(0x40)
recirc_id(0x40),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-rpl+trk),eth(),eth_type(0x0800),ipv4(src=10.244.0.6,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)
recirc_id(0x41),tunnel(tun_id=0x2,src=10.89.0.5,dst=10.89.0.6,geneve({}{}),flags(-df+csum+key)),in_port(2),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(src=d6:db:93:1b:af:5c,dst=ea:8c:57:97:14:b7),eth_type(0x0800),ipv4(dst=10.64.0.0/255.224.0.0,frag=no), packets:4, bytes:264, used:0.899s, flags:F., actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4
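
The three flows above form one chain linked by recirculation IDs (0, then 0x40, then 0x41). As a reading aid, the following Python sketch rebuilds that chain from flow text by pairing each flow's recirc_id(...) match with the recirc(...) in its actions. The function names are hypothetical and the flow strings are abbreviated copies of the ones above; it assumes at most one flow per recirc_id, which holds for this paste but not for a full dump.

```python
import re


def recirc_edges(flows):
    """Map each flow's recirc_id to (next recirc id or None, actions string)."""
    edges = {}
    for flow in flows:
        match_part, _, actions = flow.partition("actions:")
        src = re.search(r"recirc_id\((0x[0-9a-f]+|0)\)", match_part)
        dst = re.search(r"recirc\((0x[0-9a-f]+)\)", actions)
        if src:
            nxt = int(dst.group(1), 16) if dst else None
            edges[int(src.group(1), 16)] = (nxt, actions.strip())
    return edges


def walk(edges, start=0):
    """Follow recirculations from `start`; return [(recirc_id, actions), ...]."""
    path, seen, rid = [], set(), start
    while rid is not None and rid in edges and rid not in seen:
        seen.add(rid)  # guard against loops in malformed input
        nxt, actions = edges[rid]
        path.append((rid, actions))
        rid = nxt
    return path


# Abbreviated copies of the three flows above (match fields trimmed).
flows = [
    "recirc_id(0),in_port(2),ct_state(-new-est-rel-rpl-inv-trk),ipv4(dst=10.89.0.1), "
    "actions:set(ipv4(ttl=62)),ct(zone=5,nat),recirc(0x40)",
    "recirc_id(0x40),in_port(2),ct_state(-new-rpl+trk),ipv4(src=10.244.0.6), "
    "actions:ct(commit,nat(src=10.89.0.199)),recirc(0x41)",
    "recirc_id(0x41),in_port(2),ct_state(-new-est-rel-rpl+inv+trk), "
    "actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),4",
]

for rid, actions in walk(recirc_edges(flows)):
    print(hex(rid), "->", actions)
```

Printed in order, the chain shows the untracked packet entering zone 5, the SNAT attempt at 0x40, and the final output to port 4 at 0x41 after the packet was matched as +inv.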

Comment 9 Dumitru Ceara 2023-01-13 10:34:39 UTC
PR https://github.com/ovn-org/ovn-kubernetes/pull/3349 tries to minimize the race window in which packets can reach a gateway (after failover) before SNAT is configured in OpenFlow.

Closing this BZ for now; bug 2130939 tracks the not-SNAT-ed FIN and RST packets.

Comment 10 Dumitru Ceara 2023-01-16 13:10:37 UTC
(In reply to Dumitru Ceara from comment #9)

> Closing this BZ for now; bug 2130939 tracks the not-SNAT-ed FIN and RST
> packets.

Bug 2160685 is actually the one tracking the not-SNAT-ed FIN and RST packets.

