Description of problem:

Traffic is still SNATed when all of the following hold:
1. the load balancer has skip_snat=true
2. the logical router has lb_force_snat_ip=router_ip
3. the load balancer has affinity_timeout set

In the context of OVN-Kubernetes, traffic for services with externalTrafficPolicy: Local and sessionAffinity: ClientIP is still SNATed.

Version-Release number of selected component (if applicable):
ovn main branch (commit 30952c248d4f804c25af9b1c9565f23c0045e915)

How reproducible:
Always

Steps to Reproduce:
(greatly helped by reusing instructions from bz1995326)

1. in OVN sandbox:

# Create the first logical switch with one port
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 192.168.0.2"
ovs-vsctl add-port br-int sw0-port1 -- set interface sw0-port1 type=internal external_ids:iface-id=sw0-port1
ip netns add sw0-port1
ip link set sw0-port1 netns sw0-port1
ip netns exec sw0-port1 ip link set sw0-port1 address 50:54:00:00:00:01
ip netns exec sw0-port1 ip link set sw0-port1 up
ip netns exec sw0-port1 ip addr add 192.168.0.2/24 dev sw0-port1
ip netns exec sw0-port1 ip route add default via 192.168.0.1

# Create the second logical switch with one port
ovn-nbctl ls-add sw1
ovn-nbctl lsp-add sw1 sw1-port1
ovn-nbctl lsp-set-addresses sw1-port1 "50:54:00:00:00:03 11.0.0.2"
ovs-vsctl add-port br-int sw1-port1 -- set interface sw1-port1 type=internal external_ids:iface-id=sw1-port1
ip netns add sw1-port1
ip link set sw1-port1 netns sw1-port1
ip netns exec sw1-port1 ip link set sw1-port1 address 50:54:00:00:00:03
ip netns exec sw1-port1 ip link set sw1-port1 up
ip netns exec sw1-port1 ip addr add 11.0.0.2/24 dev sw1-port1
ip netns exec sw1-port1 ip route add default via 11.0.0.1

# Create a logical router and attach both logical switches
ovn-nbctl lr-add lr0
ovn-nbctl lrp-add lr0 lrp0 00:00:00:00:ff:01 192.168.0.1/24
ovn-nbctl lsp-add sw0 lrp0-attachment
ovn-nbctl lsp-set-type lrp0-attachment router
ovn-nbctl lsp-set-addresses lrp0-attachment 00:00:00:00:ff:01
ovn-nbctl lsp-set-options lrp0-attachment router-port=lrp0
ovn-nbctl lrp-add lr0 lrp1 00:00:00:00:ff:02 11.0.0.1/24
ovn-nbctl lsp-add sw1 lrp1-attachment
ovn-nbctl lsp-set-type lrp1-attachment router
ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
ovn-nbctl set Logical_Router lr0 options:chassis=chassis-1
ovn-nbctl set Logical_Router lr0 options:lb_force_snat_ip=router_ip

ovn-nbctl lb-add lb0 11.0.0.200:1234 192.168.0.2:8080
ovn-nbctl set Load_Balancer lb0 options:skip_snat=true
ovn-nbctl set load_balancer lb0 options:affinity_timeout=1200
ovn-nbctl lr-lb-add lr0 lb0
ovn-sbctl dump-flows lr0 | grep lr_in_dnat
ovn-nbctl --wait=hv sync

ip netns exec sw0-port1 python3 -m http.server 8080 &
ip netns exec sw1-port1 curl 11.0.0.200:1234
ip netns exec sw1-port1 curl 11.0.0.200:1234

Actual results:
At least the second curl succeeds, but after SNAT:
192.168.0.1 - - [20/Jul/2023 09:24:39] "GET / HTTP/1.1" 200 -

Expected results:
curl succeeds with the proper source IP:
11.0.0.2 - - [20/Jul/2023 09:27:27] "GET / HTTP/1.1" 200 -
11.0.0.2 - - [20/Jul/2023 09:27:27] "GET / HTTP/1.1" 200 -

(as is the case when removing the affinity_timeout with
ovn-nbctl remove load_balancer lb0 options affinity_timeout=1200
)

Additional info:
also RH case https://access.redhat.com/support/cases/#/case/03563137
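The comparison between "Actual results" and "Expected results" can be scripted. The sketch below is only an illustration: the function names client_ip and source_ip_preserved are hypothetical helpers, not part of OVN, and it assumes the access-log format that python3 -m http.server printed in this report, where the client address is the first field.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OVN): decide whether a request was SNATed
# by looking at the client address in a python3 http.server access-log line.

# Print the client address, i.e. the first whitespace-separated field.
client_ip() {
    printf '%s\n' "$1" | awk '{print $1}'
}

# Return 0 if the logged client address equals the expected, un-NATed one.
source_ip_preserved() {
    [ "$(client_ip "$1")" = "$2" ]
}

# Log lines copied from this report.
snatted='192.168.0.1 - - [20/Jul/2023 09:24:39] "GET / HTTP/1.1" 200 -'
preserved='11.0.0.2 - - [20/Jul/2023 09:27:27] "GET / HTTP/1.1" 200 -'

for line in "$snatted" "$preserved"; do
    if source_ip_preserved "$line" 11.0.0.2; then
        echo "source IP preserved: $(client_ip "$line")"
    else
        echo "SNATed, server saw: $(client_ip "$line")"
    fi
done
```

In the sandbox above, 11.0.0.2 is the address of sw1-port1 (the client) and 192.168.0.1 is the address of lrp0, so the first log line is reported as SNATed and the second as preserved.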
Patch posted upstream: https://patchwork.ozlabs.org/project/ovn/patch/20230720125708.132830-1-amusil@redhat.com/
I tried it and it works (thanks!), note that this fails now:

ip netns exec sw0-port1 curl 11.0.0.200:1234

(the fun case of the pod contacting the service for which it is its own endpoint, and thus requires the hairpin thing)
@amusil Thanks, can you make sure that this gets backported to whichever version of OVN is present in OCP 4.12?
(In reply to François Rigault from comment #2)
> I tried it and it works (thanks!), note that this fails now:
>
> ip netns exec sw0-port1 curl 11.0.0.200:1234
>
> (the fun case of the pod contacting the service for which it is its own
> endpoint, and thus requires the hairpin thing)

That also fails when you remove the affinity_timeout (on current main). AFAIK that's correct.

(In reply to Scott Dodson from comment #4)
> @amusil Thanks, can you make sure that this gets backported to
> whichever version of OVN is present in OCP 4.12?

Yeah, I'll make sure it gets backported.

Thanks,
Ales
ovn23.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2236359
ovn23.03 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2236360
ovn22.12 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2236361
verified on version:

:: [ 02:44:04 ] :: [ BEGIN ] :: Running 'ip netns exec bar1 python3 -m http.server 8080 &'
:: [ 02:44:04 ] :: [ PASS ] :: Command 'ip netns exec bar1 python3 -m http.server 8080 &' (Expected 0, got 0)
Serving HTTP on :: port 8080 (http://[::]:8080/) ...
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
:: [ 02:44:11 ] :: [ BEGIN ] :: Running 'ip netns exec foo1 curl 192.168.1.100:1234 '
::ffff:192.168.1.2 - - [05/Dec/2023 02:44:12] "GET / HTTP/1.1" 200 -
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Directory listing for /</title>
</head>
<body>
<h1>Directory listing for /</h1>
<hr>
<ul>
<li><a href="bar1.log">bar1.log</a></li>
<li><a href="foo1.pcap">foo1.pcap</a></li>
<li><a href="Makefile">Makefile</a></li>
<li><a href="PURPOSE">PURPOSE</a></li>
<li><a href="runtest.sh">runtest.sh</a></li>
<li><a href="testinfo.desc">testinfo.desc</a></li>
</ul>
<hr>
</body>
</html>
:: [ 02:44:12 ] :: [ PASS ] :: Command 'ip netns exec foo1 curl 192.168.1.100:1234 ' (Expected 0, got 0)
:: [ 02:44:16 ] :: [ BEGIN ] :: Running 'tcpdump -r foo1.pcap -nnle |grep 192.168.1.2'
reading from file foo1.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
dropped privs to tcpdump
02:44:11.281246 ? Out ifindex 117 f0:00:00:01:02:03 ethertype ARP (0x0806), length 48: Request who-has 192.168.1.100 tell 192.168.1.2, length 28
02:44:11.281700 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 80: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [S], seq 2148007364, win 64240, options [mss 1460,sackOK,TS val 143038351 ecr 0,nop,wscale 7], length 0
02:44:12.310273 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 80: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [S], seq 2148007364, win 64240, options [mss 1460,sackOK,TS val 143039380 ecr 0,nop,wscale 7], length 0
02:44:12.311819 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 80: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [S.], seq 2960556580, ack 2148007365, win 65160, options [mss 1460,sackOK,TS val 3716241302 ecr 143039380,nop,wscale 7], length 0
02:44:12.311863 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [.], ack 1, win 502, options [nop,nop,TS val 143039382 ecr 3716241302], length 0
02:44:12.311923 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 154: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [P.], seq 1:83, ack 1, win 502, options [nop,nop,TS val 143039382 ecr 3716241302], length 82
02:44:12.312297 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [.], ack 83, win 509, options [nop,nop,TS val 3716241303 ecr 143039382], length 0
02:44:12.313970 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 227: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [P.], seq 1:156, ack 83, win 509, options [nop,nop,TS val 3716241305 ecr 143039382], length 155
02:44:12.313988 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [.], ack 156, win 501, options [nop,nop,TS val 143039384 ecr 3716241305], length 0
02:44:12.314054 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 629: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [P.], seq 156:713, ack 83, win 509, options [nop,nop,TS val 3716241305 ecr 143039384], length 557
02:44:12.314061 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [.], ack 713, win 501, options [nop,nop,TS val 143039384 ecr 3716241305], length 0
02:44:12.314139 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [F.], seq 713, ack 83, win 509, options [nop,nop,TS val 3716241305 ecr 143039384], length 0
02:44:12.314184 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [F.], seq 83, ack 714, win 501, options [nop,nop,TS val 143039384 ecr 3716241305], length 0
02:44:12.314237 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 72: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [.], ack 84, win 509, options [nop,nop,TS val 3716241305 ecr 143039384], length 0
:: [ 02:44:16 ] :: [ PASS ] :: Command 'tcpdump -r foo1.pcap -nnle |grep 192.168.1.2' (Expected 0, got 0)
:: [ 02:44:16 ] :: [ BEGIN ] :: Running 'tcpdump -r foo1.pcap -nnle |grep 192.168.2.1'
reading from file foo1.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
dropped privs to tcpdump
:: [ 02:44:16 ] :: [ PASS ] :: Command 'tcpdump -r foo1.pcap -nnle |grep 192.168.2.1' (Expected 1, got 1)

set verified.
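The two grep checks in the log above (one address must appear in the capture, the other must not) can be written a little more strictly. The following is a sketch, not the actual test script: count_lines_with_ip is a hypothetical helper, and it assumes a grep that supports -w (GNU and BSD grep do). A plain grep 192.168.1.2 treats the dots as wildcards and would also match 192.168.1.20, so the helper uses -F for literal dots and -w to avoid matching a longer final octet.

```shell
#!/bin/sh
# Hypothetical helper: count decoded tcpdump lines on stdin that mention
# an exact IPv4 address.
# -F matches the dots literally; -w rejects longer octets such as 192.168.1.20.
# grep -c exits 1 when the count is 0, so "|| true" keeps the helper's status 0.
count_lines_with_ip() {
    grep -cwF -- "$1" || true
}

# Two lines copied from the capture in this report (SYN and SYN-ACK).
sample='02:44:11.281700 ? Out ifindex 117 f0:00:00:01:02:03 ethertype IPv4 (0x0800), length 80: 192.168.1.2.42242 > 192.168.1.100.1234: Flags [S], seq 2148007364, win 64240, options [mss 1460,sackOK,TS val 143038351 ecr 0,nop,wscale 7], length 0
02:44:12.311819 ? In ifindex 117 00:00:01:01:02:03 ethertype IPv4 (0x0800), length 80: 192.168.1.100.1234 > 192.168.1.2.42242: Flags [S.], seq 2960556580, ack 2148007365, win 65160, options [mss 1460,sackOK,TS val 3716241302 ecr 143039380,nop,wscale 7], length 0'

seen=$(printf '%s\n' "$sample" | count_lines_with_ip 192.168.1.2)
absent=$(printf '%s\n' "$sample" | count_lines_with_ip 192.168.2.1)
echo "lines with client address: $seen, lines with unexpected address: $absent"
```

Against a real capture the input would come from tcpdump -r foo1.pcap -nnle instead of the inline sample.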
version:
# rpm -qa|grep ovn
ovn23.09-23.09.0-73.el9fdp.x86_64
ovn23.09-central-23.09.0-73.el9fdp.x86_64
ovn23.09-host-23.09.0-73.el9fdp.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn23.09 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:0392