Description of problem: Note: this was raised by the integration team, and I was asked to report the bug. When an application exposed via a service receives an ICMP "fragmentation needed" message, there is no guarantee that the message will reach the right pod. What happens is that the ICMP message is delivered to the wrong pod behind the service, so the MTU is not reduced for the affected connection. For services with a large number of pods, it may take many reconnections until it converges. One additional note: the customer is using local traffic policy, so this needs to be taken into account when addressing the issue (if it is addressable).
I managed to get the setup ready thanks to the instructions left by Konstantinos in comment 13! ++karma points. The rest is on me to solve. Here is what I have gathered so far. The "fragmentation needed" packet (ICMP type 3, code 4) makes its way into the entry node (ovn-worker in my case) -> goes to breth0 -> goes into OVN... and then I lose it. It is not going into the geneve interface to reach ovn-control-plane. I have a strong feeling this is down to the way OVN load balancers work: when I create a TCP load balancer, ICMP packets towards the VIP are probably not allowed? I need to check with the OVN team. For now, I can see from the tcpdump that packets go into OVN but don't exit via the geneve tunnel:

root@ovn-worker:/# tcpdump -nnevvvp -i any 'icmp and icmp[0] == 3 and icmp[1] == 4' or host 192.168.10.0
tcpdump: data link type LINUX_SLL2
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
14:27:08.494442 eth0 In ifindex 481 02:42:ac:12:00:06 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 63, id 31667, offset 0, flags [DF], proto TCP (6), length 52)
    172.19.0.3.49744 > 192.168.10.0.80: Flags [.], cksum 0x76e5 (incorrect -> 0x8764), seq 1338881475, ack 2174118587, win 501, options [nop,nop,TS val 637811908 ecr 3160974064], length 0
14:27:08.494472 breth0 In ifindex 6 02:42:ac:12:00:06 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 63, id 31667, offset 0, flags [DF], proto TCP (6), length 52)
    172.19.0.3.49744 > 192.168.10.0.80: Flags [.], cksum 0x76e5 (incorrect -> 0x8764), seq 0, ack 1, win 501, options [nop,nop,TS val 637811908 ecr 3160974064], length 0
14:27:08.498611 breth0 Out ifindex 6 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 61, id 3922, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.10.0.80 > 172.19.0.3.49744: Flags [.], cksum 0x76e5 (incorrect -> 0x4d6e), seq 14829, ack 1, win 505, options [nop,nop,TS val 3161036500 ecr 637749477], length 0
14:27:08.498625 eth0 Out ifindex 481 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 61, id 3922, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.10.0.80 > 172.19.0.3.49744: Flags [.], cksum 0x76e5 (incorrect -> 0x4d6e), seq 14829, ack 1, win 505, options [nop,nop,TS val 3161036500 ecr 637749477], length 0

I tried an OVS trace, and it shows the packet going to the geneve interface, but I am not really able to find the specific flows in OVS that do this. I will confirm with the OVN team whether the blockage is at the GR or not and go from there.
OVS trace: oc exec -n ovn-kubernetes ovnkube-node-wtjvz -- ovs-appctl ofproto/trace breth0 ct_state=rel,in_port=LOCAL,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,icmp,icmp_type=3,icmp_code=4,nw_src=172.18.0.6,nw_dst=10.96.85.121,nw_ttl=64,dp_hash=1 Flow: dp_hash=0x1,ct_state=rel,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=172.18.0.6,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 bridge("breth0") ---------------- 0. ip,in_port=LOCAL,nw_dst=10.96.0.0/16, priority 500, cookie 0xdeff105 ct(commit,table=2,zone=64001,nat(src=169.254.169.2)) nat(src=169.254.169.2) -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 2. -> Sets the packet to an untracked state, and clears all the conntrack fields. Final flow: dp_hash=0x1,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=172.18.0.6,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 Megaflow: recirc_id=0,eth,ip,in_port=LOCAL,nw_dst=10.96.0.0/16,nw_frag=no Datapath actions: ct(commit,zone=64001,nat(src=169.254.169.2)),recirc(0x1272) =============================================================================== recirc(0x1272) - resume conntrack with default ct_state=trk|new (use --ct-next to customize) Replacing src/dst IP/ports to simulate NAT: Initial flow: nw_src=172.18.0.6,tp_src=3,nw_dst=10.96.85.121,tp_dst=4 Modified flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.96.85.121,tp_dst=4 =============================================================================== Flow: recirc_id=0x1272,dp_hash=0x1,ct_state=new|trk,ct_zone=64001,eth,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 bridge("breth0") ---------------- thaw Resuming from table 2 2. 
priority 32768, cookie 0xdeff105 set_field:02:42:ac:12:00:02->eth_dst output:2 bridge("br-int") ---------------- 0. in_port=4,vlan_tci=0x0000/0x1000, priority 100, cookie 0xa03c160d set_field:0x9->reg11 set_field:0xa->reg12 set_field:0xa->metadata set_field:0x1->reg14 resubmit(,8) 8. metadata=0xa, priority 50, cookie 0x675b1766 set_field:0/0x1000->reg10 resubmit(,73) 73. reg0=0x2, priority 0 drop move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111] -> NXM_NX_XXREG0[111] is now 0 resubmit(,9) 9. metadata=0xa, priority 0, cookie 0xa01a2cb3 resubmit(,10) 10. metadata=0xa, priority 0, cookie 0xa61c6270 resubmit(,11) 11. metadata=0xa, priority 0, cookie 0x80f7e256 resubmit(,12) 12. metadata=0xa, priority 0, cookie 0xc2f1fcfa resubmit(,13) 13. ip,reg14=0x1,metadata=0xa, priority 110, cookie 0xbcd19ddd resubmit(,14) 14. metadata=0xa, priority 0, cookie 0x3bd3a1a8 resubmit(,15) 15. metadata=0xa, priority 65535, cookie 0x2ca64e66 resubmit(,16) 16. metadata=0xa, priority 65535, cookie 0xe10553c3 resubmit(,17) 17. metadata=0xa, priority 0, cookie 0xd290a09a resubmit(,18) 18. metadata=0xa, priority 0, cookie 0x847b5a53 resubmit(,19) 19. metadata=0xa, priority 0, cookie 0x57d98c45 resubmit(,20) 20. metadata=0xa, priority 0, cookie 0xbc41f65b resubmit(,21) 21. metadata=0xa, priority 0, cookie 0xf755a524 resubmit(,22) 22. metadata=0xa, priority 0, cookie 0xab2e9bcf resubmit(,23) 23. metadata=0xa, priority 0, cookie 0x23a0f81 resubmit(,24) 24. metadata=0xa, priority 0, cookie 0xfe1a39f0 resubmit(,25) 25. reg14=0x1,metadata=0xa, priority 100, cookie 0x1ddb8244 resubmit(,26) 26. metadata=0xa, priority 0, cookie 0xf4fecff4 resubmit(,27) 27. metadata=0xa, priority 0, cookie 0x13a55265 resubmit(,28) 28. metadata=0xa, priority 0, cookie 0x4b52d41b resubmit(,29) 29. metadata=0xa, priority 0, cookie 0x949b0327 resubmit(,30) 30. metadata=0xa, priority 0, cookie 0x7fc507f7 resubmit(,31) 31. metadata=0xa,dl_dst=02:42:ac:12:00:02, priority 50, cookie 0x2b8dd290 set_field:0x2->reg15 resubmit(,37) 37. 
priority 0 resubmit(,38) 38. reg15=0x2,metadata=0xa, priority 100, cookie 0x17a6a94c set_field:0x9->reg11 set_field:0xa->reg12 resubmit(,39) 39. priority 0 set_field:0->reg0 set_field:0->reg1 set_field:0->reg2 set_field:0->reg3 set_field:0->reg4 set_field:0->reg5 set_field:0->reg6 set_field:0->reg7 set_field:0->reg8 set_field:0->reg9 resubmit(,40) 40. ip,reg15=0x2,metadata=0xa, priority 110, cookie 0xfc6ba79d resubmit(,41) 41. metadata=0xa, priority 0, cookie 0x89b8716f resubmit(,42) 42. metadata=0xa, priority 0, cookie 0xad065d45 resubmit(,43) 43. metadata=0xa, priority 65535, cookie 0x9c0fb0a8 resubmit(,44) 44. metadata=0xa, priority 65535, cookie 0x6f927feb resubmit(,45) 45. metadata=0xa, priority 0, cookie 0x17df8f35 resubmit(,46) 46. metadata=0xa, priority 0, cookie 0x9fcd5617 resubmit(,47) 47. metadata=0xa, priority 0, cookie 0x31f290a resubmit(,48) 48. metadata=0xa, priority 0, cookie 0x1c286523 set_field:0/0x1000->reg10 resubmit(,75) 75. reg0=0x2, priority 0 drop move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111] -> NXM_NX_XXREG0[111] is now 0 resubmit(,49) 49. metadata=0xa, priority 0, cookie 0x932b1117 resubmit(,64) 64. priority 0 resubmit(,65) 65. reg15=0x2,metadata=0xa, priority 100, cookie 0x17a6a94c clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x8->reg11,set_field:0x7->metadata,set_field:0x2->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8)) ct_clear set_field:0->reg11 set_field:0->reg12 set_field:0->reg13 set_field:0x8->reg11 set_field:0x7->metadata set_field:0x2->reg14 set_field:0->reg10 set_field:0->reg15 set_field:0->reg0 set_field:0->reg1 set_field:0->reg2 set_field:0->reg3 set_field:0->reg4 set_field:0->reg5 set_field:0->reg6 set_field:0->reg7 set_field:0->reg8 set_field:0->reg9 resubmit(,8) 8. 
reg14=0x2,metadata=0x7,dl_dst=02:42:ac:12:00:02, priority 50, cookie 0x63ff44ef set_field:0x242ac1200020000000000000000/0xffffffffffff0000000000000000->xxreg0 resubmit(,9) 9. metadata=0x7, priority 0, cookie 0x5d391334 set_field:0x4/0x4->xreg4 resubmit(,10) 10. reg9=0/0x8,metadata=0x7, priority 100, cookie 0x147d0e01 resubmit(,11) 11. metadata=0x7, priority 0, cookie 0xb3b01a32 resubmit(,12) 12. metadata=0x7, priority 0, cookie 0x6d08ac9f resubmit(,13) 13. ip,metadata=0x7,nw_dst=10.96.85.121, priority 100, cookie 0x8e7b2b70 set_field:0xa605579000000000000000000000000/0xffffffff000000000000000000000000->xxreg0 ct(table=14,zone=NXM_NX_REG11[0..15],nat) nat -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 14. -> Sets the packet to an untracked state, and clears all the conntrack fields. Final flow: recirc_id=0x1272,dp_hash=0x1,ct_state=new|trk,ct_zone=64001,eth,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 Megaflow: recirc_id=0x1272,ct_state=+new-est+trk,ct_mark=0/0x2,eth,icmp,in_port=LOCAL,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=168.0.0.0/6,nw_dst=10.96.85.121,nw_ttl=64,nw_frag=no Datapath actions: set(eth(dst=02:42:ac:12:00:02)),ct(zone=8,nat),recirc(0x1276) =============================================================================== recirc(0x1276) - resume conntrack with default ct_state=trk|new (use --ct-next to customize) Replacing src/dst IP/ports to simulate NAT: Initial flow: Modified flow: =============================================================================== Flow: 
recirc_id=0x1276,dp_hash=0x1,ct_state=new|trk,ct_zone=8,eth,icmp,reg0=0xa605579,reg1=0xac120002,reg9=0x4,reg11=0x8,reg14=0x2,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 bridge("breth0") ---------------- thaw Resuming from table 14 14. ct_state=+new+trk,ip,reg0=0xa605579,metadata=0x7, priority 110, cookie 0x25a58ee5 set_field:0x8/0x8->reg10 group:3 -> using bucket 0 bucket 0 ct(commit,table=15,zone=NXM_NX_REG11[0..15],nat(dst=10.244.2.7),exec(set_field:0x2/0x2->ct_mark)) nat(dst=10.244.2.7) set_field:0x2/0x2->ct_mark -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 15. -> Sets the packet to an untracked state, and clears all the conntrack fields. Final flow: recirc_id=0x1276,dp_hash=0x1,ct_state=new|trk,ct_zone=8,eth,icmp,reg0=0xa605579,reg1=0xac120002,reg9=0x4,reg10=0x8,reg11=0x8,reg14=0x2,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 Megaflow: recirc_id=0x1276,dp_hash=0x1/0xf,ct_state=+new-est+trk,ct_mark=0/0x2,eth,ip,in_port=4,nw_frag=no Datapath actions: ct(commit,zone=8,mark=0x2/0x2,nat(dst=10.244.2.7)),recirc(0x1277) =============================================================================== recirc(0x1277) - resume conntrack with default ct_state=trk|new (use --ct-next to customize) Replacing src/dst IP/ports to simulate NAT: Initial flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.96.85.121,tp_dst=4 Modified flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.244.2.7,tp_dst=4 =============================================================================== Flow: 
recirc_id=0x1277,dp_hash=0x1,ct_state=new|trk,ct_zone=8,ct_mark=0x2,eth,icmp,reg0=0xa605579,reg1=0xac120002,reg9=0x4,reg10=0x8,reg11=0x8,reg14=0x2,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4 bridge("breth0") ---------------- thaw Resuming from table 15 15. metadata=0x7, priority 0, cookie 0x55306c60 resubmit(,16) 16. metadata=0x7, priority 0, cookie 0xc8e906fb resubmit(,17) 17. metadata=0x7, priority 0, cookie 0x96fbdd00 resubmit(,18) 18. metadata=0x7, priority 0, cookie 0x9ba30ecb set_field:0/0xffffffff->xxreg1 resubmit(,19) 19. ip,reg7=0,metadata=0x7,nw_dst=10.244.0.0/16, priority 49, cookie 0xc2f154 dec_ttl() set_field:0/0xffff00000000->xreg4 set_field:0x64400001000000000000000000000000/0xffffffff000000000000000000000000->xxreg0 set_field:0x644000040000000000000000/0xffffffff0000000000000000->xxreg0 set_field:0a:58:64:40:00:04->eth_src set_field:0x1->reg15 set_field:0x1/0x1->reg10 resubmit(,20) 20. reg8=0/0xffff,metadata=0x7, priority 150, cookie 0x72ae663a resubmit(,21) 21. metadata=0x7, priority 0, cookie 0x8ffb031 set_field:0/0xffff00000000->xreg4 resubmit(,22) 22. reg8=0/0xffff,metadata=0x7, priority 150, cookie 0x667193ad resubmit(,23) 23. ip,metadata=0x7, priority 0, cookie 0xe1496db1 push:NXM_NX_REG0[] push:NXM_NX_XXREG0[96..127] pop:NXM_NX_REG0[] -> NXM_NX_REG0[] is now 0x64400001 set_field:00:00:00:00:00:00->eth_dst resubmit(,66) 66. reg0=0x64400001,reg15=0x1,metadata=0x7, priority 100, cookie 0xcdb61423 set_field:0a:58:64:40:00:01->eth_dst set_field:0x40/0x40->reg10 pop:NXM_NX_REG0[] -> NXM_NX_REG0[] is now 0x64400001 resubmit(,24) 24. metadata=0x7, priority 0, cookie 0xea88d1b1 resubmit(,25) 25. metadata=0x7, priority 0, cookie 0xb58db599 resubmit(,26) 26. metadata=0x7, priority 0, cookie 0x1a3bb75c resubmit(,27) 27. metadata=0x7, priority 0, cookie 0x6695c541 resubmit(,37) 37. priority 0 resubmit(,38) 38. 
reg15=0x1,metadata=0x7, priority 100, cookie 0xc260d7b3 set_field:0x8->reg11 resubmit(,39) 39. priority 0 set_field:0->reg0 set_field:0->reg1 set_field:0->reg2 set_field:0->reg3 set_field:0->reg4 set_field:0->reg5 set_field:0->reg6 set_field:0->reg7 set_field:0->reg8 set_field:0->reg9 resubmit(,40) 40. metadata=0x7, priority 0, cookie 0xe6226cf5 set_field:0/0x10->xreg4 resubmit(,41) 41. ip,metadata=0x7, priority 50, cookie 0xa39487b0 set_field:0x1/0x1->reg10 ct(table=42,zone=NXM_NX_REG11[0..15],nat) nat -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 42. -> Sets the packet to an untracked state, and clears all the conntrack fields. Final flow: recirc_id=0x1277,dp_hash=0x1,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4 Megaflow: recirc_id=0x1277,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x1,eth,ip,in_port=4,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_dst=10.244.0.0/16,nw_ttl=64,nw_frag=no Datapath actions: set(eth(src=0a:58:64:40:00:04,dst=0a:58:64:40:00:01)),set(ipv4(ttl=63)),ct(zone=8,nat),recirc(0x1278) =============================================================================== recirc(0x1278) - resume conntrack with default ct_state=trk|new (use --ct-next to customize) Replacing src/dst IP/ports to simulate NAT: Initial flow: Modified flow: =============================================================================== Flow: recirc_id=0x1278,dp_hash=0x1,ct_state=new|trk,ct_zone=8,ct_mark=0x2,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4 bridge("breth0") ---------------- thaw Resuming from table 42 42. 
ct_state=+new+trk,ip,metadata=0x7, priority 50, cookie 0x73b392f ct(commit,zone=NXM_NX_REG11[0..15],nat(src)) nat(src) -> Sets the packet to an untracked state, and clears all the conntrack fields. resubmit(,43) 43. ip,reg10=0x8/0x8,reg15=0x1,metadata=0x7, priority 110, cookie 0xaa7da553 ct(commit,table=44,zone=NXM_NX_REG12[0..15],nat(src=100.64.0.4)) nat(src=100.64.0.4) -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 44. -> Sets the packet to an untracked state, and clears all the conntrack fields. Final flow: recirc_id=0x1278,dp_hash=0x1,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4 Megaflow: recirc_id=0x1278,ct_state=+new+trk,eth,ip,in_port=4,nw_frag=no Datapath actions: ct(commit,zone=8,nat(src)),ct(commit,nat(src=100.64.0.4)),recirc(0x1279) =============================================================================== recirc(0x1279) - resume conntrack with default ct_state=trk|new (use --ct-next to customize) Replacing src/dst IP/ports to simulate NAT: Initial flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.244.2.7,tp_dst=4 Modified flow: nw_src=100.64.0.4,tp_src=3,nw_dst=10.244.2.7,tp_dst=4 =============================================================================== Flow: recirc_id=0x1279,dp_hash=0x1,ct_state=new|trk,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=100.64.0.4,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4 bridge("breth0") ---------------- thaw Resuming from table 44 44. metadata=0x7, priority 0, cookie 0xc7eac39d resubmit(,45) 45. metadata=0x7, priority 0, cookie 0x99a1f85e resubmit(,46) 46. reg15=0x1,metadata=0x7, priority 100, cookie 0xb9cdbffb resubmit(,64) 64. 
reg10=0x1/0x1,reg15=0x1,metadata=0x7, priority 100, cookie 0xc260d7b3 push:NXM_OF_IN_PORT[] set_field:ANY->in_port resubmit(,65) 65. reg15=0x1,metadata=0x7, priority 100, cookie 0xc260d7b3 clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x6->reg11,set_field:0x1->reg12,set_field:0x2->metadata,set_field:0x4->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8)) ct_clear set_field:0->reg11 set_field:0->reg12 set_field:0->reg13 set_field:0x6->reg11 set_field:0x1->reg12 set_field:0x2->metadata set_field:0x4->reg14 set_field:0->reg10 set_field:0->reg15 set_field:0->reg0 set_field:0->reg1 set_field:0->reg2 set_field:0->reg3 set_field:0->reg4 set_field:0->reg5 set_field:0->reg6 set_field:0->reg7 set_field:0->reg8 set_field:0->reg9 resubmit(,8) 8. metadata=0x2, priority 50, cookie 0x675b1766 set_field:0/0x1000->reg10 resubmit(,73) 73. reg0=0x2, priority 0 drop move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111] -> NXM_NX_XXREG0[111] is now 0 resubmit(,9) 9. metadata=0x2, priority 0, cookie 0xa01a2cb3 resubmit(,10) 10. metadata=0x2, priority 0, cookie 0xa61c6270 resubmit(,11) 11. metadata=0x2, priority 0, cookie 0x80f7e256 resubmit(,12) 12. metadata=0x2, priority 0, cookie 0xc2f1fcfa resubmit(,13) 13. ip,reg14=0x4,metadata=0x2, priority 110, cookie 0xf6d92dd resubmit(,14) 14. metadata=0x2, priority 0, cookie 0x3bd3a1a8 resubmit(,15) 15. metadata=0x2, priority 65535, cookie 0x2ca64e66 resubmit(,16) 16. metadata=0x2, priority 65535, cookie 0xe10553c3 resubmit(,17) 17. metadata=0x2, priority 0, cookie 0xd290a09a resubmit(,18) 18. metadata=0x2, priority 0, cookie 0x847b5a53 resubmit(,19) 19. metadata=0x2, priority 0, cookie 0x57d98c45 resubmit(,20) 20. metadata=0x2, priority 0, cookie 0xbc41f65b resubmit(,21) 21. metadata=0x2, priority 0, cookie 0xf755a524 resubmit(,22) 22. 
metadata=0x2, priority 0, cookie 0xab2e9bcf resubmit(,23) 23. metadata=0x2, priority 0, cookie 0x23a0f81 resubmit(,24) 24. metadata=0x2, priority 0, cookie 0xfe1a39f0 resubmit(,25) 25. metadata=0x2, priority 0, cookie 0xdeb00bf1 resubmit(,26) 26. metadata=0x2, priority 0, cookie 0xf4fecff4 resubmit(,27) 27. metadata=0x2, priority 0, cookie 0x13a55265 resubmit(,28) 28. metadata=0x2, priority 0, cookie 0x4b52d41b resubmit(,29) 29. metadata=0x2, priority 0, cookie 0x949b0327 resubmit(,30) 30. metadata=0x2, priority 0, cookie 0x7fc507f7 resubmit(,31) 31. metadata=0x2,dl_dst=0a:58:64:40:00:01, priority 50, cookie 0xfe5621e5 set_field:0x1->reg15 resubmit(,37) 37. priority 0 resubmit(,38) 38. reg15=0x1,metadata=0x2, priority 100, cookie 0xbedd970f set_field:0x6->reg11 set_field:0x1->reg12 resubmit(,39) 39. priority 0 set_field:0->reg0 set_field:0->reg1 set_field:0->reg2 set_field:0->reg3 set_field:0->reg4 set_field:0->reg5 set_field:0->reg6 set_field:0->reg7 set_field:0->reg8 set_field:0->reg9 resubmit(,40) 40. ip,reg15=0x1,metadata=0x2, priority 110, cookie 0x4e61006d resubmit(,41) 41. metadata=0x2, priority 0, cookie 0x89b8716f resubmit(,42) 42. metadata=0x2, priority 0, cookie 0xad065d45 resubmit(,43) 43. metadata=0x2, priority 65535, cookie 0x9c0fb0a8 resubmit(,44) 44. metadata=0x2, priority 65535, cookie 0x6f927feb resubmit(,45) 45. metadata=0x2, priority 0, cookie 0x17df8f35 resubmit(,46) 46. metadata=0x2, priority 0, cookie 0x9fcd5617 resubmit(,47) 47. metadata=0x2, priority 0, cookie 0x31f290a resubmit(,48) 48. metadata=0x2, priority 0, cookie 0x1c286523 set_field:0/0x1000->reg10 resubmit(,75) 75. reg0=0x2, priority 0 drop move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111] -> NXM_NX_XXREG0[111] is now 0 resubmit(,49) 49. metadata=0x2, priority 0, cookie 0x932b1117 resubmit(,64) 64. priority 0 resubmit(,65) 65. 
reg15=0x1,metadata=0x2, priority 100, cookie 0xbedd970f clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x2->reg11,set_field:0x3->reg12,set_field:0x1->metadata,set_field:0x1->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8)) ct_clear set_field:0->reg11 set_field:0->reg12 set_field:0->reg13 set_field:0x2->reg11 set_field:0x3->reg12 set_field:0x1->metadata set_field:0x1->reg14 set_field:0->reg10 set_field:0->reg15 set_field:0->reg0 set_field:0->reg1 set_field:0->reg2 set_field:0->reg3 set_field:0->reg4 set_field:0->reg5 set_field:0->reg6 set_field:0->reg7 set_field:0->reg8 set_field:0->reg9 resubmit(,8) 8. reg14=0x1,metadata=0x1,dl_dst=0a:58:64:40:00:01, priority 50, cookie 0x30c52ddd set_field:0xa58644000010000000000000000/0xffffffffffff0000000000000000->xxreg0 resubmit(,9) 9. metadata=0x1, priority 0, cookie 0x5d391334 set_field:0x4/0x4->xreg4 resubmit(,10) 10. reg9=0/0x8,metadata=0x1, priority 100, cookie 0x147d0e01 resubmit(,11) 11. metadata=0x1, priority 0, cookie 0xb3b01a32 resubmit(,12) 12. metadata=0x1, priority 0, cookie 0x6d08ac9f resubmit(,13) 13. metadata=0x1, priority 0, cookie 0x63fe1fc2 resubmit(,14) 14. metadata=0x1, priority 0, cookie 0x750d1480 resubmit(,15) 15. metadata=0x1, priority 0, cookie 0x55306c60 resubmit(,16) 16. metadata=0x1, priority 0, cookie 0xc8e906fb resubmit(,17) 17. metadata=0x1, priority 0, cookie 0x96fbdd00 resubmit(,18) 18. metadata=0x1, priority 0, cookie 0x9ba30ecb set_field:0/0xffffffff->xxreg1 resubmit(,19) 19. 
ip,metadata=0x1,nw_dst=10.244.2.0/24, priority 74, cookie 0x71f49e7e dec_ttl() set_field:0/0xffff00000000->xreg4 move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127] -> NXM_NX_XXREG0[96..127] is now 0xaf40207 set_field:0xaf402010000000000000000/0xffffffff0000000000000000->xxreg0 set_field:0a:58:0a:f4:02:01->eth_src set_field:0x3->reg15 set_field:0x1/0x1->reg10 resubmit(,20) 20. reg8=0/0xffff,metadata=0x1, priority 150, cookie 0x72ae663a resubmit(,21) 21. metadata=0x1, priority 0, cookie 0x8ffb031 set_field:0/0xffff00000000->xreg4 resubmit(,22) 22. reg8=0/0xffff,metadata=0x1, priority 150, cookie 0x667193ad resubmit(,23) 23. reg0=0xaf40207,reg15=0x3,metadata=0x1, priority 100, cookie 0xce04e12f set_field:0a:58:0a:f4:02:07->eth_dst resubmit(,24) 24. metadata=0x1, priority 0, cookie 0xea88d1b1 resubmit(,25) 25. metadata=0x1, priority 0, cookie 0xb58db599 resubmit(,26) 26. reg15=0x3,metadata=0x1, priority 50, cookie 0x6c85f190 set_field:0x5->reg15 resubmit(,27) 27. metadata=0x1, priority 0, cookie 0x6695c541 resubmit(,37) 37. reg15=0x5,metadata=0x1, priority 100, cookie 0x6dbb654c set_field:0x1/0xffffff->tun_id set_field:0x5->tun_metadata0 move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30] -> NXM_NX_TUN_METADATA0[16..30] is now 0x1 output:2 -> output to kernel tunnel resubmit(,38) 38. 
reg0=0x2, priority 0 drop pop:NXM_OF_IN_PORT[] -> NXM_OF_IN_PORT[] is now 4
Final flow: unchanged
Megaflow: recirc_id=0x1279,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x3,eth,ip,tun_id=0/0xffffff,tun_metadata0=NP,in_port=4,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=100.64.0.4,nw_dst=10.244.2.7,nw_ecn=0,nw_ttl=63,nw_frag=no
Datapath actions: ct_clear,set(tunnel(tun_id=0x1,dst=172.18.0.4,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x10005}),flags(df|csum|key))),set(eth(src=0a:58:0a:f4:02:01,dst=0a:58:0a:f4:02:07)),set(ipv4(ttl=62)),2

Bridge br-int
    fail_mode: secure
    datapath_type: system
    Port ovn-6f5615-0
        Interface ovn-6f5615-0
            type: geneve
            options: {csum="true", key=flow, remote_ip="172.18.0.4"}

 2(ovn-6f5615-0): addr:f2:98:e2:94:8c:da
     config: 0
     state: 0
     speed: 0 Mbps now, 0 Mbps max

Port 2 is the geneve tunnel port (ovn-6f5615-0), so according to the trace the packet should leave via the tunnel, but I don't see the frag packets on the geneve tunnel in tcpdump :/
(In reply to Surya Seetharaman from comment #20)
> Hmmm disregard my previous comments, I can see the RELATED ICMP type 3
> packet indeed getting DNAT-ed at the host thanks to conntrack and heading
> towards breth0. The last stage mangle POSTROUTING was hit. In breth0 we see:
>
> 12:08:46.044178 breth0 In ifindex 6 02:42:ac:12:00:06 ethertype IPv4
> (0x0800), length 596: (tos 0xc0, ttl 64, id 7313, offset 0, flags [none],
> proto ICMP (1), length 576)
> 172.18.0.6 > 192.168.10.0: ICMP 172.19.0.3 unreachable - need to frag
> (mtu 1200), length 556
> (tos 0x0, ttl 61, id 34752, offset 0, flags [DF], proto TCP (6),
> length 1400)
> 192.168.10.0.80 > 172.19.0.3.37486: Flags [.], seq 264:1612, ack 84, win
> 505, options [nop,nop,TS val 3178929358 ecr 655651396], length 1348: HTTP
>
>
> So I think we back to the original question in comment17. Do OVN LB's allow
> "RELATED" ICMP packets through if protocol is set to TCP or UDP ?

The answer is no, they don't. I opened https://bugzilla.redhat.com/show_bug.cgi?id=2126083; that needs to merge first, and then we can retest to make sure the rest is fine. The original purpose of this bug was to see whether ICMP packets indeed go to the same backend when we have, say, 1000 endpoints. I think this should be the case for conntrack-ed connections: the ICMP frag needed is a related packet, so there should be no problem delivering it to the same backend.
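To make the capture quoted in this comment easier to read, here is a minimal Python sketch of what an ICMP "fragmentation needed" message carries: the type and code bytes that the tcpdump filter `icmp[0] == 3 and icmp[1] == 4` matches, the next-hop MTU defined by RFC 1191, and the embedded header of the original packet. The function names `build_frag_needed` and `parse_frag_needed` are hypothetical helpers for illustration only, not part of OVN or ovn-kubernetes; the checksum is left at zero and IP options are assumed absent when building the sample.

```python
import socket
import struct

ICMP_DEST_UNREACH = 3   # ICMP type matched by icmp[0] == 3
ICMP_FRAG_NEEDED = 4    # ICMP code matched by icmp[1] == 4


def build_frag_needed(mtu, inner_src, inner_dst, sport, dport):
    """Build a sample ICMP 'fragmentation needed' message (checksum left as 0)."""
    # ICMP header: type, code, checksum, unused, next-hop MTU (RFC 1191).
    icmp_hdr = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, 0, 0, mtu)
    # Embedded original IPv4 header: 20 bytes, no options, DF set, protocol 6 (TCP).
    inner_ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 1400, 0, 0x4000, 61, 6, 0,
        socket.inet_aton(inner_src), socket.inet_aton(inner_dst),
    )
    # First 8 bytes of the original TCP header: ports and sequence number.
    inner_l4 = struct.pack("!HHI", sport, dport, 0)
    return icmp_hdr + inner_ip + inner_l4


def parse_frag_needed(icmp):
    """Return MTU and embedded tuple, or None if not type 3 / code 4."""
    icmp_type, icmp_code, _csum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if (icmp_type, icmp_code) != (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED):
        return None
    inner = icmp[8:]
    ihl = (inner[0] & 0x0F) * 4  # inner IP header length in bytes
    sport, dport = struct.unpack("!HH", inner[ihl:ihl + 4])
    return {
        "mtu": mtu,
        "proto": inner[9],
        "src": socket.inet_ntoa(inner[12:16]),
        "dst": socket.inet_ntoa(inner[16:20]),
        "sport": sport,
        "dport": dport,
    }


# Sample values taken from the capture: mtu 1200, inner 192.168.10.0.80 > 172.19.0.3.37486.
sample = build_frag_needed(1200, "192.168.10.0", "172.19.0.3", 80, 37486)
print(parse_frag_needed(sample))
```

The embedded source and destination are those of the packet that was too big, which is what lets a connection tracker associate the ICMP error with an existing TCP connection.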
Update: I tested the OVN fix that was provided, but it still didn't work as expected. Today Dumitru, Ales and I had a debugging session and found that, for this to work well, we need to fix two more bugs in OVN: https://bugzilla.redhat.com/show_bug.cgi?id=2126083#c12 So we need to wait until those fixes go in.
I have tested the latest fixes that went in, and we are looking good now. The upstream fix went in: https://github.com/ovn-org/ovn-kubernetes/pull/3330
Also, regarding the original bug description: we shouldn't be having that problem. Since conntrack tracks the state as "related", it will deliver the frag needed packet to the same backend. At least this is what I see when I test things; if that's not the case, let me know after testing it in the lab setup. I was unable to reproduce the case where, with multiple backends, the frag needed packet gets sent to a different backend. IMHO it's sent to the backend that the connection was established to.
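The reasoning in this comment can be illustrated with a toy model in Python. This is only a sketch of the idea, not how OVN or the kernel conntrack is implemented: the class `ToyConntrack`, its methods, and the second backend address 10.244.1.5 are hypothetical. It assumes the tracker records the backend chosen at DNAT time per connection, and that the packet embedded in the ICMP error is in the reply direction of that connection.

```python
from typing import Dict, NamedTuple, Optional


class FiveTuple(NamedTuple):
    proto: str
    src: str
    sport: int
    dst: str
    dport: int


class ToyConntrack:
    """Toy model: remembers which backend each connection was DNAT-ed to."""

    def __init__(self) -> None:
        self._backend_by_conn: Dict[FiveTuple, str] = {}

    def commit(self, orig: FiveTuple, backend: str) -> None:
        # Called once per new connection, after the load balancer picked a backend.
        self._backend_by_conn[orig] = backend

    def related_backend(self, inner: FiveTuple) -> Optional[str]:
        # The packet embedded in the ICMP error was sent by the service side,
        # so it is the reply direction; invert it to get the original tuple.
        orig = FiveTuple(inner.proto, inner.dst, inner.dport, inner.src, inner.sport)
        return self._backend_by_conn.get(orig)


ct = ToyConntrack()
# Two client connections to the same VIP, load-balanced to different backends.
ct.commit(FiveTuple("tcp", "172.19.0.3", 37486, "192.168.10.0", 80), "10.244.2.7")
ct.commit(FiveTuple("tcp", "172.19.0.3", 49744, "192.168.10.0", 80), "10.244.1.5")

# Tuple embedded in the frag needed message: 192.168.10.0.80 > 172.19.0.3.37486.
inner = FiveTuple("tcp", "192.168.10.0", 80, "172.19.0.3", 37486)
print(ct.related_backend(inner))
```

Because the lookup is keyed on the embedded tuple rather than on a fresh load-balancing decision, the number of backends does not affect which one receives the ICMP error in this model.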
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira. https://issues.redhat.com/browse/OCPBUGS-9079