Bug 2041746 - ICMP fragmentation needed sent to pods behind a service don't seem to reach the pods
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.13.0
Assignee: Surya Seetharaman
QA Contact: Konstantinos
Depends On: 2126083
Reported: 2022-01-18 08:16 UTC by Federico Paolinelli
Modified: 2023-07-24 07:33 UTC (History)
Last Closed: 2023-03-09 01:11:17 UTC
surya: needinfo-
surya: needinfo-
kkarampo: needinfo-


Links
System ID Private Priority Status Summary Last Updated
Github openshift ovn-kubernetes pull 1468 0 None open Bug 2041746: Bump OVN to 22.12.0-4 2023-01-11 12:00:41 UTC
Github ovn-org ovn-kubernetes pull 3330 0 None Merged Bump OVN to 22.12 2022-12-29 12:32:34 UTC

Description Federico Paolinelli 2022-01-18 08:16:16 UTC
Description of problem:

Note: this was raised by the integration team, and I was asked to report the bug.

When an application exposed via a service receives a fragmentation needed ICMP message, there is no guarantee that the message will reach the right pod.

So what happens is that the ICMP message reaches the wrong pod behind the service, and the MTU is not reduced. For services with a large number of pods, it may take many reconnections until it converges.

One additional note: the customer is using local traffic policy, so this needs to be taken into account when addressing the issue (if addressable).
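
A rough sketch of why convergence could be slow with many pods. It assumes, purely for illustration, that each fragmentation needed message lands on one of N backends uniformly at random and independently of previous attempts; the real distribution depends on the load balancer and is not stated in this report. The function names are hypothetical.

```python
def p_converged(n_backends: int, attempts: int) -> float:
    """Probability that at least one of `attempts` messages hit the right
    backend, under the uniform-random assumption stated above."""
    miss = 1.0 - 1.0 / n_backends
    return 1.0 - miss ** attempts


def expected_attempts(n_backends: int) -> float:
    """Mean of a geometric distribution with success probability 1/N."""
    return float(n_backends)


if __name__ == "__main__":
    for n in (2, 10, 1000):
        print(n, expected_attempts(n), round(p_converged(n, n), 3))
```

Under this toy model the expected number of attempts grows linearly with the number of backends, which matches the concern raised for large services.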

Comment 17 Surya Seetharaman 2022-09-10 15:15:51 UTC
I managed to get the setup ready thanks to the instructions left by Konstantinos in comment 13! ++karma points. Rest is on me to solve.

Here is what I have gathered so far. The fragmentation needed packet (ICMP type 3, code 4) makes its way into the entry node (ovn-worker in my case), goes to breth0, and goes into OVN... then I lose it. It does not go into the geneve interface to reach ovn-control-plane. I have a strong feeling this is down to the way OVN load balancers work: when I create a TCP load balancer, ICMP packets towards the VIP are probably not allowed? I need to check with the OVN team.

But currently I can see from the tcpdump that packets go into OVN but don't exit via geneve tunnel:


root@ovn-worker:/# tcpdump -nnevvvp -i any 'icmp and icmp[0] == 3 and icmp[1] == 4' or host 192.168.10.0
tcpdump: data link type LINUX_SLL2
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
14:27:08.494442 eth0  In  ifindex 481 02:42:ac:12:00:06 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 63, id 31667, offset 0, flags [DF], proto TCP (6), length 52)
    172.19.0.3.49744 > 192.168.10.0.80: Flags [.], cksum 0x76e5 (incorrect -> 0x8764), seq 1338881475, ack 2174118587, win 501, options [nop,nop,TS val 637811908 ecr 3160974064], length 0
14:27:08.494472 breth0 In  ifindex 6 02:42:ac:12:00:06 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 63, id 31667, offset 0, flags [DF], proto TCP (6), length 52)
    172.19.0.3.49744 > 192.168.10.0.80: Flags [.], cksum 0x76e5 (incorrect -> 0x8764), seq 0, ack 1, win 501, options [nop,nop,TS val 637811908 ecr 3160974064], length 0
14:27:08.498611 breth0 Out ifindex 6 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 61, id 3922, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.10.0.80 > 172.19.0.3.49744: Flags [.], cksum 0x76e5 (incorrect -> 0x4d6e), seq 14829, ack 1, win 505, options [nop,nop,TS val 3161036500 ecr 637749477], length 0
14:27:08.498625 eth0  Out ifindex 481 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 72: (tos 0x0, ttl 61, id 3922, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.10.0.80 > 172.19.0.3.49744: Flags [.], cksum 0x76e5 (incorrect -> 0x4d6e), seq 14829, ack 1, win 505, options [nop,nop,TS val 3161036500 ecr 637749477], length 0
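
For reference, the `icmp[0] == 3 and icmp[1] == 4` part of the capture filter tests the first two bytes of the ICMP header (type and code). A minimal sketch of the same check in Python, assuming the RFC 1191 layout where the next-hop MTU sits in the last two bytes of the 8-byte header. The function name, the sample bytes, and the MTU value of 1200 are made up for illustration, and the checksum is left at zero.

```python
import struct

ICMP_DEST_UNREACH = 3   # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4    # ICMP code: fragmentation needed and DF set


def parse_frag_needed(icmp: bytes):
    """Return the next-hop MTU if `icmp` starts with a type 3 / code 4
    header, otherwise None."""
    if len(icmp) < 8:
        return None
    icmp_type, icmp_code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp[:8]
    )
    if icmp_type == ICMP_DEST_UNREACH and icmp_code == ICMP_FRAG_NEEDED:
        return mtu
    return None


# Hypothetical header: type 3, code 4, checksum 0, unused 0, MTU 1200.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1200)
echo_request = struct.pack("!BBHHH", 8, 0, 0, 0, 0)

if __name__ == "__main__":
    print(parse_frag_needed(sample))
    print(parse_frag_needed(echo_request))
```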



I tried an OVS trace, which shows the packet going to the geneve interface, but I am not really able to find the specific flows in OVS that do this. I will confirm with the OVN team whether the blockage is at the GR or not and go from there.

Comment 21 Surya Seetharaman 2022-09-11 12:38:13 UTC
OVS trace:

oc exec -n ovn-kubernetes ovnkube-node-wtjvz -- ovs-appctl ofproto/trace breth0 ct_state=rel,in_port=LOCAL,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,icmp,icmp_type=3,icmp_code=4,nw_src=172.18.0.6,nw_dst=10.96.85.121,nw_ttl=64,dp_hash=1

Flow: dp_hash=0x1,ct_state=rel,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=172.18.0.6,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4

bridge("breth0")
----------------
 0. ip,in_port=LOCAL,nw_dst=10.96.0.0/16, priority 500, cookie 0xdeff105
    ct(commit,table=2,zone=64001,nat(src=169.254.169.2))
    nat(src=169.254.169.2)
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 2.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: dp_hash=0x1,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=172.18.0.6,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4
Megaflow: recirc_id=0,eth,ip,in_port=LOCAL,nw_dst=10.96.0.0/16,nw_frag=no
Datapath actions: ct(commit,zone=64001,nat(src=169.254.169.2)),recirc(0x1272)

===============================================================================
recirc(0x1272) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
 Initial flow: nw_src=172.18.0.6,tp_src=3,nw_dst=10.96.85.121,tp_dst=4
 Modified flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.96.85.121,tp_dst=4
===============================================================================

Flow: recirc_id=0x1272,dp_hash=0x1,ct_state=new|trk,ct_zone=64001,eth,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4

bridge("breth0")
----------------
    thaw
        Resuming from table 2
 2. priority 32768, cookie 0xdeff105
    set_field:02:42:ac:12:00:02->eth_dst
    output:2

bridge("br-int")
----------------
 0. in_port=4,vlan_tci=0x0000/0x1000, priority 100, cookie 0xa03c160d
    set_field:0x9->reg11
    set_field:0xa->reg12
    set_field:0xa->metadata
    set_field:0x1->reg14
    resubmit(,8)
 8. metadata=0xa, priority 50, cookie 0x675b1766
    set_field:0/0x1000->reg10
    resubmit(,73)
    73. reg0=0x2, priority 0
            drop
    move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
     -> NXM_NX_XXREG0[111] is now 0
    resubmit(,9)
 9. metadata=0xa, priority 0, cookie 0xa01a2cb3
    resubmit(,10)
10. metadata=0xa, priority 0, cookie 0xa61c6270
    resubmit(,11)
11. metadata=0xa, priority 0, cookie 0x80f7e256
    resubmit(,12)
12. metadata=0xa, priority 0, cookie 0xc2f1fcfa
    resubmit(,13)
13. ip,reg14=0x1,metadata=0xa, priority 110, cookie 0xbcd19ddd
    resubmit(,14)
14. metadata=0xa, priority 0, cookie 0x3bd3a1a8
    resubmit(,15)
15. metadata=0xa, priority 65535, cookie 0x2ca64e66
    resubmit(,16)
16. metadata=0xa, priority 65535, cookie 0xe10553c3
    resubmit(,17)
17. metadata=0xa, priority 0, cookie 0xd290a09a
    resubmit(,18)
18. metadata=0xa, priority 0, cookie 0x847b5a53
    resubmit(,19)
19. metadata=0xa, priority 0, cookie 0x57d98c45
    resubmit(,20)
20. metadata=0xa, priority 0, cookie 0xbc41f65b
    resubmit(,21)
21. metadata=0xa, priority 0, cookie 0xf755a524
    resubmit(,22)
22. metadata=0xa, priority 0, cookie 0xab2e9bcf
    resubmit(,23)
23. metadata=0xa, priority 0, cookie 0x23a0f81
    resubmit(,24)
24. metadata=0xa, priority 0, cookie 0xfe1a39f0
    resubmit(,25)
25. reg14=0x1,metadata=0xa, priority 100, cookie 0x1ddb8244
    resubmit(,26)
26. metadata=0xa, priority 0, cookie 0xf4fecff4
    resubmit(,27)
27. metadata=0xa, priority 0, cookie 0x13a55265
    resubmit(,28)
28. metadata=0xa, priority 0, cookie 0x4b52d41b
    resubmit(,29)
29. metadata=0xa, priority 0, cookie 0x949b0327
    resubmit(,30)
30. metadata=0xa, priority 0, cookie 0x7fc507f7
    resubmit(,31)
31. metadata=0xa,dl_dst=02:42:ac:12:00:02, priority 50, cookie 0x2b8dd290
    set_field:0x2->reg15
    resubmit(,37)
37. priority 0
    resubmit(,38)
38. reg15=0x2,metadata=0xa, priority 100, cookie 0x17a6a94c
    set_field:0x9->reg11
    set_field:0xa->reg12
    resubmit(,39)
39. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. ip,reg15=0x2,metadata=0xa, priority 110, cookie 0xfc6ba79d
    resubmit(,41)
41. metadata=0xa, priority 0, cookie 0x89b8716f
    resubmit(,42)
42. metadata=0xa, priority 0, cookie 0xad065d45
    resubmit(,43)
43. metadata=0xa, priority 65535, cookie 0x9c0fb0a8
    resubmit(,44)
44. metadata=0xa, priority 65535, cookie 0x6f927feb
    resubmit(,45)
45. metadata=0xa, priority 0, cookie 0x17df8f35
    resubmit(,46)
46. metadata=0xa, priority 0, cookie 0x9fcd5617
    resubmit(,47)
47. metadata=0xa, priority 0, cookie 0x31f290a
    resubmit(,48)
48. metadata=0xa, priority 0, cookie 0x1c286523
    set_field:0/0x1000->reg10
    resubmit(,75)
    75. reg0=0x2, priority 0
            drop
    move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
     -> NXM_NX_XXREG0[111] is now 0
    resubmit(,49)
49. metadata=0xa, priority 0, cookie 0x932b1117
    resubmit(,64)
64. priority 0
    resubmit(,65)
65. reg15=0x2,metadata=0xa, priority 100, cookie 0x17a6a94c
    clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x8->reg11,set_field:0x7->metadata,set_field:0x2->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))
    ct_clear
    set_field:0->reg11
    set_field:0->reg12
    set_field:0->reg13
    set_field:0x8->reg11
    set_field:0x7->metadata
    set_field:0x2->reg14
    set_field:0->reg10
    set_field:0->reg15
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,8)
 8. reg14=0x2,metadata=0x7,dl_dst=02:42:ac:12:00:02, priority 50, cookie 0x63ff44ef
    set_field:0x242ac1200020000000000000000/0xffffffffffff0000000000000000->xxreg0
    resubmit(,9)
 9. metadata=0x7, priority 0, cookie 0x5d391334
    set_field:0x4/0x4->xreg4
    resubmit(,10)
10. reg9=0/0x8,metadata=0x7, priority 100, cookie 0x147d0e01
    resubmit(,11)
11. metadata=0x7, priority 0, cookie 0xb3b01a32
    resubmit(,12)
12. metadata=0x7, priority 0, cookie 0x6d08ac9f
    resubmit(,13)
13. ip,metadata=0x7,nw_dst=10.96.85.121, priority 100, cookie 0x8e7b2b70
    set_field:0xa605579000000000000000000000000/0xffffffff000000000000000000000000->xxreg0
    ct(table=14,zone=NXM_NX_REG11[0..15],nat)
    nat
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 14.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: recirc_id=0x1272,dp_hash=0x1,ct_state=new|trk,ct_zone=64001,eth,icmp,in_port=LOCAL,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4
Megaflow: recirc_id=0x1272,ct_state=+new-est+trk,ct_mark=0/0x2,eth,icmp,in_port=LOCAL,dl_src=02:42:ac:12:00:02,dl_dst=0a:58:64:40:00:04,nw_src=168.0.0.0/6,nw_dst=10.96.85.121,nw_ttl=64,nw_frag=no
Datapath actions: set(eth(dst=02:42:ac:12:00:02)),ct(zone=8,nat),recirc(0x1276)

===============================================================================
recirc(0x1276) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
 Initial flow: 
 Modified flow: 
===============================================================================

Flow: recirc_id=0x1276,dp_hash=0x1,ct_state=new|trk,ct_zone=8,eth,icmp,reg0=0xa605579,reg1=0xac120002,reg9=0x4,reg11=0x8,reg14=0x2,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4

bridge("breth0")
----------------
    thaw
        Resuming from table 14
14. ct_state=+new+trk,ip,reg0=0xa605579,metadata=0x7, priority 110, cookie 0x25a58ee5
    set_field:0x8/0x8->reg10
    group:3
     -> using bucket 0
    bucket 0
            ct(commit,table=15,zone=NXM_NX_REG11[0..15],nat(dst=10.244.2.7),exec(set_field:0x2/0x2->ct_mark))
            nat(dst=10.244.2.7)
            set_field:0x2/0x2->ct_mark
             -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 15.
             -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: recirc_id=0x1276,dp_hash=0x1,ct_state=new|trk,ct_zone=8,eth,icmp,reg0=0xa605579,reg1=0xac120002,reg9=0x4,reg10=0x8,reg11=0x8,reg14=0x2,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.96.85.121,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4
Megaflow: recirc_id=0x1276,dp_hash=0x1/0xf,ct_state=+new-est+trk,ct_mark=0/0x2,eth,ip,in_port=4,nw_frag=no
Datapath actions: ct(commit,zone=8,mark=0x2/0x2,nat(dst=10.244.2.7)),recirc(0x1277)

===============================================================================
recirc(0x1277) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
 Initial flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.96.85.121,tp_dst=4
 Modified flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.244.2.7,tp_dst=4
===============================================================================

Flow: recirc_id=0x1277,dp_hash=0x1,ct_state=new|trk,ct_zone=8,ct_mark=0x2,eth,icmp,reg0=0xa605579,reg1=0xac120002,reg9=0x4,reg10=0x8,reg11=0x8,reg14=0x2,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=4

bridge("breth0")
----------------
    thaw
        Resuming from table 15
15. metadata=0x7, priority 0, cookie 0x55306c60
    resubmit(,16)
16. metadata=0x7, priority 0, cookie 0xc8e906fb
    resubmit(,17)
17. metadata=0x7, priority 0, cookie 0x96fbdd00
    resubmit(,18)
18. metadata=0x7, priority 0, cookie 0x9ba30ecb
    set_field:0/0xffffffff->xxreg1
    resubmit(,19)
19. ip,reg7=0,metadata=0x7,nw_dst=10.244.0.0/16, priority 49, cookie 0xc2f154
    dec_ttl()
    set_field:0/0xffff00000000->xreg4
    set_field:0x64400001000000000000000000000000/0xffffffff000000000000000000000000->xxreg0
    set_field:0x644000040000000000000000/0xffffffff0000000000000000->xxreg0
    set_field:0a:58:64:40:00:04->eth_src
    set_field:0x1->reg15
    set_field:0x1/0x1->reg10
    resubmit(,20)
20. reg8=0/0xffff,metadata=0x7, priority 150, cookie 0x72ae663a
    resubmit(,21)
21. metadata=0x7, priority 0, cookie 0x8ffb031
    set_field:0/0xffff00000000->xreg4
    resubmit(,22)
22. reg8=0/0xffff,metadata=0x7, priority 150, cookie 0x667193ad
    resubmit(,23)
23. ip,metadata=0x7, priority 0, cookie 0xe1496db1
    push:NXM_NX_REG0[]
    push:NXM_NX_XXREG0[96..127]
    pop:NXM_NX_REG0[]
     -> NXM_NX_REG0[] is now 0x64400001
    set_field:00:00:00:00:00:00->eth_dst
    resubmit(,66)
    66. reg0=0x64400001,reg15=0x1,metadata=0x7, priority 100, cookie 0xcdb61423
            set_field:0a:58:64:40:00:01->eth_dst
            set_field:0x40/0x40->reg10
    pop:NXM_NX_REG0[]
     -> NXM_NX_REG0[] is now 0x64400001
    resubmit(,24)
24. metadata=0x7, priority 0, cookie 0xea88d1b1
    resubmit(,25)
25. metadata=0x7, priority 0, cookie 0xb58db599
    resubmit(,26)
26. metadata=0x7, priority 0, cookie 0x1a3bb75c
    resubmit(,27)
27. metadata=0x7, priority 0, cookie 0x6695c541
    resubmit(,37)
37. priority 0
    resubmit(,38)
38. reg15=0x1,metadata=0x7, priority 100, cookie 0xc260d7b3
    set_field:0x8->reg11
    resubmit(,39)
39. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. metadata=0x7, priority 0, cookie 0xe6226cf5
    set_field:0/0x10->xreg4
    resubmit(,41)
41. ip,metadata=0x7, priority 50, cookie 0xa39487b0
    set_field:0x1/0x1->reg10
    ct(table=42,zone=NXM_NX_REG11[0..15],nat)
    nat
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 42.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: recirc_id=0x1277,dp_hash=0x1,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4
Megaflow: recirc_id=0x1277,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x1,eth,ip,in_port=4,dl_src=02:42:ac:12:00:02,dl_dst=02:42:ac:12:00:02,nw_dst=10.244.0.0/16,nw_ttl=64,nw_frag=no
Datapath actions: set(eth(src=0a:58:64:40:00:04,dst=0a:58:64:40:00:01)),set(ipv4(ttl=63)),ct(zone=8,nat),recirc(0x1278)

===============================================================================
recirc(0x1278) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
 Initial flow: 
 Modified flow: 
===============================================================================

Flow: recirc_id=0x1278,dp_hash=0x1,ct_state=new|trk,ct_zone=8,ct_mark=0x2,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4

bridge("breth0")
----------------
    thaw
        Resuming from table 42
42. ct_state=+new+trk,ip,metadata=0x7, priority 50, cookie 0x73b392f
    ct(commit,zone=NXM_NX_REG11[0..15],nat(src))
    nat(src)
     -> Sets the packet to an untracked state, and clears all the conntrack fields.
    resubmit(,43)
43. ip,reg10=0x8/0x8,reg15=0x1,metadata=0x7, priority 110, cookie 0xaa7da553
    ct(commit,table=44,zone=NXM_NX_REG12[0..15],nat(src=100.64.0.4))
    nat(src=100.64.0.4)
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 44.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: recirc_id=0x1278,dp_hash=0x1,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4
Megaflow: recirc_id=0x1278,ct_state=+new+trk,eth,ip,in_port=4,nw_frag=no
Datapath actions: ct(commit,zone=8,nat(src)),ct(commit,nat(src=100.64.0.4)),recirc(0x1279)

===============================================================================
recirc(0x1279) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
 Initial flow: nw_src=169.254.169.2,tp_src=3,nw_dst=10.244.2.7,tp_dst=4
 Modified flow: nw_src=100.64.0.4,tp_src=3,nw_dst=10.244.2.7,tp_dst=4
===============================================================================

Flow: recirc_id=0x1279,dp_hash=0x1,ct_state=new|trk,eth,icmp,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=4,vlan_tci=0x0000,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=100.64.0.4,nw_dst=10.244.2.7,nw_tos=0,nw_ecn=0,nw_ttl=63,icmp_type=3,icmp_code=4

bridge("breth0")
----------------
    thaw
        Resuming from table 44
44. metadata=0x7, priority 0, cookie 0xc7eac39d
    resubmit(,45)
45. metadata=0x7, priority 0, cookie 0x99a1f85e
    resubmit(,46)
46. reg15=0x1,metadata=0x7, priority 100, cookie 0xb9cdbffb
    resubmit(,64)
64. reg10=0x1/0x1,reg15=0x1,metadata=0x7, priority 100, cookie 0xc260d7b3
    push:NXM_OF_IN_PORT[]
    set_field:ANY->in_port
    resubmit(,65)
    65. reg15=0x1,metadata=0x7, priority 100, cookie 0xc260d7b3
            clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x6->reg11,set_field:0x1->reg12,set_field:0x2->metadata,set_field:0x4->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))
            ct_clear
            set_field:0->reg11
            set_field:0->reg12
            set_field:0->reg13
            set_field:0x6->reg11
            set_field:0x1->reg12
            set_field:0x2->metadata
            set_field:0x4->reg14
            set_field:0->reg10
            set_field:0->reg15
            set_field:0->reg0
            set_field:0->reg1
            set_field:0->reg2
            set_field:0->reg3
            set_field:0->reg4
            set_field:0->reg5
            set_field:0->reg6
            set_field:0->reg7
            set_field:0->reg8
            set_field:0->reg9
            resubmit(,8)
         8. metadata=0x2, priority 50, cookie 0x675b1766
            set_field:0/0x1000->reg10
            resubmit(,73)
            73. reg0=0x2, priority 0
                    drop
            move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
             -> NXM_NX_XXREG0[111] is now 0
            resubmit(,9)
         9. metadata=0x2, priority 0, cookie 0xa01a2cb3
            resubmit(,10)
        10. metadata=0x2, priority 0, cookie 0xa61c6270
            resubmit(,11)
        11. metadata=0x2, priority 0, cookie 0x80f7e256
            resubmit(,12)
        12. metadata=0x2, priority 0, cookie 0xc2f1fcfa
            resubmit(,13)
        13. ip,reg14=0x4,metadata=0x2, priority 110, cookie 0xf6d92dd
            resubmit(,14)
        14. metadata=0x2, priority 0, cookie 0x3bd3a1a8
            resubmit(,15)
        15. metadata=0x2, priority 65535, cookie 0x2ca64e66
            resubmit(,16)
        16. metadata=0x2, priority 65535, cookie 0xe10553c3
            resubmit(,17)
        17. metadata=0x2, priority 0, cookie 0xd290a09a
            resubmit(,18)
        18. metadata=0x2, priority 0, cookie 0x847b5a53
            resubmit(,19)
        19. metadata=0x2, priority 0, cookie 0x57d98c45
            resubmit(,20)
        20. metadata=0x2, priority 0, cookie 0xbc41f65b
            resubmit(,21)
        21. metadata=0x2, priority 0, cookie 0xf755a524
            resubmit(,22)
        22. metadata=0x2, priority 0, cookie 0xab2e9bcf
            resubmit(,23)
        23. metadata=0x2, priority 0, cookie 0x23a0f81
            resubmit(,24)
        24. metadata=0x2, priority 0, cookie 0xfe1a39f0
            resubmit(,25)
        25. metadata=0x2, priority 0, cookie 0xdeb00bf1
            resubmit(,26)
        26. metadata=0x2, priority 0, cookie 0xf4fecff4
            resubmit(,27)
        27. metadata=0x2, priority 0, cookie 0x13a55265
            resubmit(,28)
        28. metadata=0x2, priority 0, cookie 0x4b52d41b
            resubmit(,29)
        29. metadata=0x2, priority 0, cookie 0x949b0327
            resubmit(,30)
        30. metadata=0x2, priority 0, cookie 0x7fc507f7
            resubmit(,31)
        31. metadata=0x2,dl_dst=0a:58:64:40:00:01, priority 50, cookie 0xfe5621e5
            set_field:0x1->reg15
            resubmit(,37)
        37. priority 0
            resubmit(,38)
        38. reg15=0x1,metadata=0x2, priority 100, cookie 0xbedd970f
            set_field:0x6->reg11
            set_field:0x1->reg12
            resubmit(,39)
        39. priority 0
            set_field:0->reg0
            set_field:0->reg1
            set_field:0->reg2
            set_field:0->reg3
            set_field:0->reg4
            set_field:0->reg5
            set_field:0->reg6
            set_field:0->reg7
            set_field:0->reg8
            set_field:0->reg9
            resubmit(,40)
        40. ip,reg15=0x1,metadata=0x2, priority 110, cookie 0x4e61006d
            resubmit(,41)
        41. metadata=0x2, priority 0, cookie 0x89b8716f
            resubmit(,42)
        42. metadata=0x2, priority 0, cookie 0xad065d45
            resubmit(,43)
        43. metadata=0x2, priority 65535, cookie 0x9c0fb0a8
            resubmit(,44)
        44. metadata=0x2, priority 65535, cookie 0x6f927feb
            resubmit(,45)
        45. metadata=0x2, priority 0, cookie 0x17df8f35
            resubmit(,46)
        46. metadata=0x2, priority 0, cookie 0x9fcd5617
            resubmit(,47)
        47. metadata=0x2, priority 0, cookie 0x31f290a
            resubmit(,48)
        48. metadata=0x2, priority 0, cookie 0x1c286523
            set_field:0/0x1000->reg10
            resubmit(,75)
            75. reg0=0x2, priority 0
                    drop
            move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
             -> NXM_NX_XXREG0[111] is now 0
            resubmit(,49)
        49. metadata=0x2, priority 0, cookie 0x932b1117
            resubmit(,64)
        64. priority 0
            resubmit(,65)
        65. reg15=0x1,metadata=0x2, priority 100, cookie 0xbedd970f
            clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x2->reg11,set_field:0x3->reg12,set_field:0x1->metadata,set_field:0x1->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))
            ct_clear
            set_field:0->reg11
            set_field:0->reg12
            set_field:0->reg13
            set_field:0x2->reg11
            set_field:0x3->reg12
            set_field:0x1->metadata
            set_field:0x1->reg14
            set_field:0->reg10
            set_field:0->reg15
            set_field:0->reg0
            set_field:0->reg1
            set_field:0->reg2
            set_field:0->reg3
            set_field:0->reg4
            set_field:0->reg5
            set_field:0->reg6
            set_field:0->reg7
            set_field:0->reg8
            set_field:0->reg9
            resubmit(,8)
         8. reg14=0x1,metadata=0x1,dl_dst=0a:58:64:40:00:01, priority 50, cookie 0x30c52ddd
            set_field:0xa58644000010000000000000000/0xffffffffffff0000000000000000->xxreg0
            resubmit(,9)
         9. metadata=0x1, priority 0, cookie 0x5d391334
            set_field:0x4/0x4->xreg4
            resubmit(,10)
        10. reg9=0/0x8,metadata=0x1, priority 100, cookie 0x147d0e01
            resubmit(,11)
        11. metadata=0x1, priority 0, cookie 0xb3b01a32
            resubmit(,12)
        12. metadata=0x1, priority 0, cookie 0x6d08ac9f
            resubmit(,13)
        13. metadata=0x1, priority 0, cookie 0x63fe1fc2
            resubmit(,14)
        14. metadata=0x1, priority 0, cookie 0x750d1480
            resubmit(,15)
        15. metadata=0x1, priority 0, cookie 0x55306c60
            resubmit(,16)
        16. metadata=0x1, priority 0, cookie 0xc8e906fb
            resubmit(,17)
        17. metadata=0x1, priority 0, cookie 0x96fbdd00
            resubmit(,18)
        18. metadata=0x1, priority 0, cookie 0x9ba30ecb
            set_field:0/0xffffffff->xxreg1
            resubmit(,19)
        19. ip,metadata=0x1,nw_dst=10.244.2.0/24, priority 74, cookie 0x71f49e7e
            dec_ttl()
            set_field:0/0xffff00000000->xreg4
            move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127]
             -> NXM_NX_XXREG0[96..127] is now 0xaf40207
            set_field:0xaf402010000000000000000/0xffffffff0000000000000000->xxreg0
            set_field:0a:58:0a:f4:02:01->eth_src
            set_field:0x3->reg15
            set_field:0x1/0x1->reg10
            resubmit(,20)
        20. reg8=0/0xffff,metadata=0x1, priority 150, cookie 0x72ae663a
            resubmit(,21)
        21. metadata=0x1, priority 0, cookie 0x8ffb031
            set_field:0/0xffff00000000->xreg4
            resubmit(,22)
        22. reg8=0/0xffff,metadata=0x1, priority 150, cookie 0x667193ad
            resubmit(,23)
        23. reg0=0xaf40207,reg15=0x3,metadata=0x1, priority 100, cookie 0xce04e12f
            set_field:0a:58:0a:f4:02:07->eth_dst
            resubmit(,24)
        24. metadata=0x1, priority 0, cookie 0xea88d1b1
            resubmit(,25)
        25. metadata=0x1, priority 0, cookie 0xb58db599
            resubmit(,26)
        26. reg15=0x3,metadata=0x1, priority 50, cookie 0x6c85f190
            set_field:0x5->reg15
            resubmit(,27)
        27. metadata=0x1, priority 0, cookie 0x6695c541
            resubmit(,37)
        37. reg15=0x5,metadata=0x1, priority 100, cookie 0x6dbb654c
            set_field:0x1/0xffffff->tun_id
            set_field:0x5->tun_metadata0
            move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30]
             -> NXM_NX_TUN_METADATA0[16..30] is now 0x1
            output:2
             -> output to kernel tunnel
            resubmit(,38)
        38. reg0=0x2, priority 0
            drop
    pop:NXM_OF_IN_PORT[]
     -> NXM_OF_IN_PORT[] is now 4

Final flow: unchanged
Megaflow: recirc_id=0x1279,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x3,eth,ip,tun_id=0/0xffffff,tun_metadata0=NP,in_port=4,dl_src=0a:58:64:40:00:04,dl_dst=0a:58:64:40:00:01,nw_src=100.64.0.4,nw_dst=10.244.2.7,nw_ecn=0,nw_ttl=63,nw_frag=no
Datapath actions: ct_clear,set(tunnel(tun_id=0x1,dst=172.18.0.4,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x10005}),flags(df|csum|key))),set(eth(src=0a:58:0a:f4:02:01,dst=0a:58:0a:f4:02:07)),set(ipv4(ttl=62)),2


Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port ovn-6f5615-0
            Interface ovn-6f5615-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="172.18.0.4"}

 2(ovn-6f5615-0): addr:f2:98:e2:94:8c:da
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max


But I don't see the frag needed packets on the geneve tunnel in tcpdump :/
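
The hex register values in the trace above are easier to follow once decoded. A small sketch, assuming the 32-bit registers hold IPv4 addresses in network order and that the Geneve option value packs the value written to `tun_metadata0` in its low 16 bits and the `reg14` value in bits 16..30, as the `set_field` and `move` actions at table 37 suggest. The helper names are hypothetical.

```python
import ipaddress


def reg_to_ip(value: int) -> str:
    """Interpret a 32-bit register value as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


def decode_tun_metadata0(value: int):
    """Split the Geneve option value into (bits 16..30, bits 0..15)."""
    low = value & 0xFFFF            # value written by set_field:0x5->tun_metadata0
    high = (value >> 16) & 0x7FFF   # value moved from NXM_NX_REG14[0..14]
    return high, low


if __name__ == "__main__":
    for name, value in [
        ("reg0", 0xA605579),       # service VIP matched at table 13
        ("reg1", 0xAC120002),
        ("next hop", 0x64400001),  # value popped into REG0 at table 23
        ("pod", 0xAF40207),        # NXM_NX_XXREG0[96..127] at table 19
    ]:
        print(f"{name:9s} {value:#010x} -> {reg_to_ip(value)}")
    print(decode_tun_metadata0(0x10005))
```

Decoded this way, `reg0` is the service VIP 10.96.85.121 and the final destination is the backend 10.244.2.7, which lines up with the `nat(dst=10.244.2.7)` action in the trace.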

Comment 22 Surya Seetharaman 2022-09-12 11:39:43 UTC
(In reply to Surya Seetharaman from comment #20)
> Hmmm disregard my previous comments, I can see the RELATED ICMP type 3
> packet indeed getting DNAT-ed at the host thanks to conntrack and heading
> towards breth0. The last stage mangle POSTROUTING was hit. In breth0 we see:
> 
> 12:08:46.044178 breth0 In  ifindex 6 02:42:ac:12:00:06 ethertype IPv4
> (0x0800), length 596: (tos 0xc0, ttl 64, id 7313, offset 0, flags [none],
> proto ICMP (1), length 576)
>     172.18.0.6 > 192.168.10.0: ICMP 172.19.0.3 unreachable - need to frag
> (mtu 1200), length 556
>         (tos 0x0, ttl 61, id 34752, offset 0, flags [DF], proto TCP (6),
> length 1400)
>     192.168.10.0.80 > 172.19.0.3.37486: Flags [.], seq 264:1612, ack 84, win
> 505, options [nop,nop,TS val 3178929358 ecr 655651396], length 1348: HTTP
> 
> 
> So I think we are back to the original question in comment 17. Do OVN LBs allow
> "RELATED" ICMP packets through if the protocol is set to TCP or UDP?

The answer is no, they don't. I opened https://bugzilla.redhat.com/show_bug.cgi?id=2126083; that needs to merge first, and then we can retest to make sure the rest is fine. The original purpose of this bug was to check whether ICMP packets go to the same backend when a service has, say, 1000 endpoints. I think this should be the case for conntrack-ed connections: the ICMP frag needed message is a related packet, so there should be no problem delivering it to the same backend.
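
To illustrate why a related ICMP error can follow the original connection: the error quotes the IP header and the first 8 bytes of the packet that triggered it, so the original flow can be recovered from the ICMP payload. The sketch below is a toy model only, not how conntrack or OVN implement it (real matching also involves NAT, zones, and direction). The lookup table and the backend label `backend-a` are hypothetical; the addresses and ports mirror the quoted capture.

```python
import socket
import struct


def embedded_tuple(icmp: bytes):
    """Extract (proto, src, sport, dst, dport) of the packet quoted
    inside an ICMP error message."""
    inner = icmp[8:]                 # skip the 8-byte ICMP header
    ihl = (inner[0] & 0x0F) * 4      # quoted IPv4 header length in bytes
    proto = inner[9]
    src = socket.inet_ntoa(inner[12:16])
    dst = socket.inet_ntoa(inner[16:20])
    sport, dport = struct.unpack("!HH", inner[ihl:ihl + 4])
    return proto, src, sport, dst, dport


def backend_for_error(icmp: bytes, table: dict):
    """Toy lookup: map the quoted flow to the backend that owns it."""
    return table.get(embedded_tuple(icmp))


# Hypothetical sample: frag needed (MTU 1200) quoting a TCP segment
# 192.168.10.0:80 -> 172.19.0.3:37486. Checksums are left at zero.
icmp_header = struct.pack("!BBHHH", 3, 4, 0, 0, 1200)
inner_ip = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 1400, 34752, 0x4000, 61, 6, 0,
    socket.inet_aton("192.168.10.0"), socket.inet_aton("172.19.0.3"),
)
inner_tcp = struct.pack("!HHI", 80, 37486, 0)
sample_error = icmp_header + inner_ip + inner_tcp

connections = {(6, "192.168.10.0", 80, "172.19.0.3", 37486): "backend-a"}

if __name__ == "__main__":
    print(embedded_tuple(sample_error))
    print(backend_for_error(sample_error, connections))
```

Because the lookup key comes from the quoted packet rather than from the ICMP message's own source and destination, the result does not depend on how many backends the service has.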

Comment 23 Surya Seetharaman 2022-12-06 13:29:25 UTC
Update: I tested the OVN fix that was provided, but it still didn't work as expected.
Today Dumitru, Ales and I had a debugging session and found that, for this to work well, we need to fix two more bugs in OVN: https://bugzilla.redhat.com/show_bug.cgi?id=2126083#c12

So we need to wait till those fixes go in.

Comment 24 Surya Seetharaman 2022-12-29 12:32:34 UTC
I have tested the latest fixes that went in. We are looking good now.
The upstream fix went in: https://github.com/ovn-org/ovn-kubernetes/pull/3330

Comment 25 Surya Seetharaman 2022-12-29 12:38:39 UTC
Also, regarding the original bug description: we shouldn't be having that problem. Since conntrack tracks the state as "related", it will hand the frag needed packet to the same backend. At least this is what I see when I test things; if that's not the case, let me know after testing it in the lab setup. I was unable to reproduce the case where, with multiple backends, the frag needed packet gets sent to a different backend. IMHO it's sent to the backend that the connection was established to.

Comment 38 Shiftzilla 2023-03-09 01:11:17 UTC
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-9079

