Bug 1870359 - OVN allows invalid CT tracked packets to be sent
Summary: OVN allows invalid CT tracked packets to be sent
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Numan Siddique
QA Contact: Jianlin Shi
Reported: 2020-08-19 20:46 UTC by Tim Rozet
Modified: 2020-10-27 09:49 UTC
Last Closed: 2020-10-27 09:49:12 UTC


Attachments
OVN DBs (267.55 KB, application/gzip)
2020-08-19 21:03 UTC, Tim Rozet


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4356 None None None 2020-10-27 09:49:48 UTC

Description Tim Rozet 2020-08-19 20:46:15 UTC
Description of problem:
When sending packets which are part of an existing conntrack session, invalid packets will fall through the OpenFlow pipeline and still be sent to the destination. Consider the following:

client ----> k8s service ----> server
              (OVN load balancer)

The client opens a TCP connection to the k8s service (the OVN load balancer VIP). OVN DNATs the packet and sends it to the backend server. The server responds and the connection is established. The server then sends a packet with an invalid sequence number. The packet is sent to CT as it enters OVN; CT marks the packet as invalid and sends it to table 43. However, table 43 only has flows matching on positive CT states (not +inv). The packet therefore falls through to the default priority-0 flow at the end of table 43 and ends up being sent out of the pipeline to the client. The client then receives a packet from the server that was never un-DNATed, and responds with a TCP RST.

Table 43 should have a high-priority flow that matches +inv+trk and drops the packet. The existing priority-65534 flows only match -inv:
cookie=0x4f0293c3, duration=22115.967s, table=43, n_packets=0, n_bytes=0, idle_age=22115, priority=65534,ct_state=-new+est-rel-inv+trk,ct_label=0x2/0x2,metadata=0x5 actions=load:0x1->NXM_NX_XXREG0[98],resubmit(,44)
 cookie=0xad5674dc, duration=22115.966s, table=43, n_packets=0, n_bytes=0, idle_age=22115, priority=65534,ct_state=-new+est-rel-inv+trk,ct_label=0x2/0x2,metadata=0x3 actions=load:0x1->NXM_NX_XXREG0[98],resubmit(,44)
 cookie=0xfa6f76a2, duration=22115.965s, table=43, n_packets=788, n_bytes=108679, idle_age=1730, priority=65534,ct_state=-new+est-rel-inv+trk,ct_label=0x2/0x2,metadata=0x4 actions=load:0x1->NXM_NX_XXREG0[98],resubmit(,44)
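Why the invalid packet misses these flows can be shown with a small Python model of ct_state matching. This is only a sketch, not OVS code: the helper names parse_ct_state and matches are hypothetical, and the model ignores the ct_label and metadata parts of the match.

```python
# Simplified model of OpenFlow ct_state matching (illustrative only, not OVS code).

def parse_ct_state(spec):
    """Split a spec like '-new+est-rel-inv+trk' into (required, forbidden) flag sets."""
    required, forbidden = set(), set()
    sign, name = None, ""
    for ch in spec + "+":          # trailing sentinel flushes the last flag
        if ch in "+-":
            if sign == "+":
                required.add(name)
            elif sign == "-":
                forbidden.add(name)
            sign, name = ch, ""
        else:
            name += ch
    return required, forbidden


def matches(spec, packet_flags):
    """True if the packet's ct_state flags satisfy the match spec."""
    required, forbidden = parse_ct_state(spec)
    return required <= packet_flags and not (forbidden & packet_flags)


# The three priority-65534 flows quoted above share this ct_state match.
existing = "-new+est-rel-inv+trk"
valid_reply = {"est", "trk"}
invalid_packet = {"inv", "trk"}    # ct_state=inv|trk

print(matches(existing, valid_reply))        # True
print(matches(existing, invalid_packet))     # False: falls to the priority-0 flow
print(matches("+inv+trk", invalid_packet))   # True: the proposed drop match
```

Because the existing flows forbid inv, a packet in state inv|trk can only hit the priority-0 default, which resubmits it to the next table.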


ofproto trace showing the invalid packet making it to the client (port 31):
Flow: tcp,in_port=30,vlan_tci=0x0000,dl_src=0a:58:0a:f4:01:1b,dl_dst=0a:58:0a:f4:01:1c,nw_src=10.244.1.27,nw_dst=10.244.1.28,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9000,tp_dst=3131,tcp_flags=0

bridge("br-int")
----------------
 0. in_port=30, priority 100, cookie 0xd6e5226c
    set_field:0x1f->reg13
    set_field:0x8->reg11
    set_field:0x3->reg12
    set_field:0x4->metadata
    set_field:0x4->reg14
    resubmit(,8)
 8. reg14=0x4,metadata=0x4,dl_src=0a:58:0a:f4:01:1b, priority 50, cookie 0x89bba098
    resubmit(,9)
 9. ip,reg14=0x4,metadata=0x4,dl_src=0a:58:0a:f4:01:1b,nw_src=10.244.1.27, priority 90, cookie 0x68f65df9
    resubmit(,10)
10. metadata=0x4, priority 0, cookie 0xbd3212b
    resubmit(,11)
11. metadata=0x4, priority 0, cookie 0x1105bf77
    resubmit(,12)
12. metadata=0x4, priority 0, cookie 0x98ecb9aa
    resubmit(,13)
13. metadata=0x4, priority 0, cookie 0xf7f53d2b
    resubmit(,14)
14. metadata=0x4, priority 0, cookie 0xe038fa3f
    resubmit(,15)
15. metadata=0x4, priority 0, cookie 0x99688f3a
    resubmit(,16)
16. metadata=0x4, priority 0, cookie 0xd3a49ca9
    resubmit(,17)
17. metadata=0x4, priority 0, cookie 0xd1edadb
    resubmit(,18)
18. metadata=0x4, priority 0, cookie 0x37209040
    resubmit(,19)
19. metadata=0x4, priority 0, cookie 0x62776d1b
    resubmit(,20)
20. metadata=0x4, priority 0, cookie 0xc2cc5b08
    resubmit(,21)
21. metadata=0x4, priority 0, cookie 0xf759e368
    resubmit(,22)
22. metadata=0x4, priority 0, cookie 0x8b4faee3
    resubmit(,23)
23. metadata=0x4, priority 0, cookie 0xa73b5704
    resubmit(,24)
24. metadata=0x4, priority 0, cookie 0xab02af23
    resubmit(,25)
25. metadata=0x4, priority 0, cookie 0xb8b6401d
    resubmit(,26)
26. metadata=0x4, priority 0, cookie 0x2dc0adf6
    resubmit(,27)
27. metadata=0x4,dl_dst=0a:58:0a:f4:01:1c, priority 50, cookie 0x58d49bbf
    set_field:0x5->reg15
    resubmit(,32)
32. priority 0
    resubmit(,33)
33. reg15=0x5,metadata=0x4, priority 100
    set_field:0x20->reg13
    set_field:0x8->reg11
    set_field:0x3->reg12
    resubmit(,34)
34. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. ip,metadata=0x4, priority 100, cookie 0x2ec3d7b9
    set_field:0x1000000000000000000000000/0x1000000000000000000000000->xxreg0
    resubmit(,41)
41. metadata=0x4, priority 0, cookie 0xe46d6be1
    resubmit(,42)
42. ip,reg0=0x1/0x1,metadata=0x4, priority 100, cookie 0xcdf67991
    ct(table=43,zone=NXM_NX_REG13[0..15])
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 43.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: tcp,reg0=0x1,reg11=0x8,reg12=0x3,reg13=0x20,reg14=0x4,reg15=0x5,metadata=0x4,in_port=30,vlan_tci=0x0000,dl_src=0a:58:0a:f4:01:1b,dl_dst=0a:58:0a:f4:01:1c,nw_src=10.244.1.27,nw_dst=10.244.1.28,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9000,tp_dst=3131,tcp_flags=0
Megaflow: recirc_id=0,ct_state=-new-est-rel-inv-trk,ct_label=0/0x2,eth,ip,in_port=30,vlan_tci=0x0000/0x1000,dl_src=0a:58:0a:f4:01:1b,dl_dst=0a:58:0a:f4:01:1c,nw_src=10.244.1.27,nw_dst=10.244.1.28/30,nw_frag=no
Datapath actions: ct(zone=32),recirc(0x104)

===============================================================================
recirc(0x104) - resume conntrack with ct_state=inv|trk
===============================================================================

Flow: recirc_id=0x104,ct_state=inv|trk,ct_zone=32,eth,tcp,reg0=0x1,reg11=0x8,reg12=0x3,reg13=0x20,reg14=0x4,reg15=0x5,metadata=0x4,in_port=30,vlan_tci=0x0000,dl_src=0a:58:0a:f4:01:1b,dl_dst=0a:58:0a:f4:01:1c,nw_src=10.244.1.27,nw_dst=10.244.1.28,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=9000,tp_dst=3131,tcp_flags=0

bridge("br-int")
----------------
    thaw
        Resuming from table 43
43. metadata=0x4, priority 0, cookie 0x9e861491
    resubmit(,44)
44. metadata=0x4, priority 0, cookie 0xcd7b2b8d
    resubmit(,45)
45. metadata=0x4, priority 0, cookie 0x26b52d0e
    resubmit(,46)
46. metadata=0x4, priority 0, cookie 0x184ffe2f
    resubmit(,47)
47. metadata=0x4, priority 0, cookie 0xac3acd4
    resubmit(,48)
48. ip,reg15=0x5,metadata=0x4,dl_dst=0a:58:0a:f4:01:1c,nw_dst=10.244.1.28, priority 90, cookie 0x719d0c53
    resubmit(,49)
49. reg15=0x5,metadata=0x4,dl_dst=0a:58:0a:f4:01:1c, priority 50, cookie 0xd19055ca
    resubmit(,64)
64. priority 0
    resubmit(,65)
65. reg15=0x5,metadata=0x4, priority 100, cookie 0xe269684c
    output:31

Final flow: unchanged
Megaflow: recirc_id=0x104,ct_state=-new-est-rel+inv+trk,ct_label=0/0x2,eth,ip,in_port=30,dl_src=0a:58:0a:f4:01:1b,dl_dst=0a:58:0a:f4:01:1c,nw_src=10.244.1.16/28,nw_dst=10.244.1.28,nw_frag=no
Datapath actions: 10
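
To read the recirculated half of the trace more quickly, the matched priority per table can be pulled out with a short Python sketch. The function name table_priorities and the regular expression are illustrative assumptions about the ofproto/trace text layout shown above, not an OVS tool.

```python
import re

# Flow lines of the recirculated trace quoted above (tables 43 to 65).
TRACE = """\
43. metadata=0x4, priority 0, cookie 0x9e861491
    resubmit(,44)
44. metadata=0x4, priority 0, cookie 0xcd7b2b8d
    resubmit(,45)
45. metadata=0x4, priority 0, cookie 0x26b52d0e
    resubmit(,46)
46. metadata=0x4, priority 0, cookie 0x184ffe2f
    resubmit(,47)
47. metadata=0x4, priority 0, cookie 0xac3acd4
    resubmit(,48)
48. ip,reg15=0x5,metadata=0x4,dl_dst=0a:58:0a:f4:01:1c,nw_dst=10.244.1.28, priority 90, cookie 0x719d0c53
    resubmit(,49)
49. reg15=0x5,metadata=0x4,dl_dst=0a:58:0a:f4:01:1c, priority 50, cookie 0xd19055ca
    resubmit(,64)
64. priority 0
    resubmit(,65)
65. reg15=0x5,metadata=0x4, priority 100, cookie 0xe269684c
    output:31
"""

# A flow line starts with "<table>." and contains "priority <n>".
FLOW_LINE = re.compile(r"^[ \t]*(\d+)\.\s.*?priority (\d+)", re.MULTILINE)


def table_priorities(trace):
    """Return [(table, priority), ...] for every flow line in the trace text."""
    return [(int(t), int(p)) for t, p in FLOW_LINE.findall(trace)]


hits = table_priorities(TRACE)
fallthrough = [table for table, prio in hits if prio == 0]
print(fallthrough)   # [43, 44, 45, 46, 47, 64]
```

Tables 43 through 47 are all traversed via priority-0 flows, so nothing between conntrack and the output stage inspects the inv state.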

Comment 1 Tim Rozet 2020-08-19 21:03:26 UTC
Created attachment 1711943
OVN DBs

Comment 2 Tim Rozet 2020-08-28 14:11:37 UTC
Numan and I discussed this and concluded that we cannot simply add a flow that matches +inv and drops. All return traffic is sent to CT, but only some of that traffic really belongs to a previous session. The rest is marked as inv because CT has no session for it, and it should be sent on its way rather than dropped. Put more simply, we need a way to differentiate packets that are invalid but were part of a session from packets that are invalid only because CT has no current session. To do this we can also match on CT_MARK and CT_LABEL, which will be zero if there was no previous session for this traffic.
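
The distinction described here can be written down as a small decision function. This Python sketch only models the idea as discussed in this comment; it is not the fix that was eventually submitted, and the name classify_invalid and the string results are hypothetical.

```python
def classify_invalid(ct_flags, ct_label=0, ct_mark=0):
    """Decide what to do with a packet after conntrack, per the idea in this comment.

    ct_flags is a set of ct_state flag names such as {"inv", "trk"}.
    """
    if "trk" not in ct_flags or "inv" not in ct_flags:
        return "continue"   # untracked or valid: handled by the normal flows
    if ct_label != 0 or ct_mark != 0:
        return "drop"       # invalid, and CT metadata points to a previous session
    return "continue"       # invalid only because CT has no session for it


print(classify_invalid({"inv", "trk"}, ct_label=0x2))  # drop
print(classify_invalid({"inv", "trk"}))                # continue
print(classify_invalid({"est", "trk"}, ct_label=0x2))  # continue
```

The sketch assumes, as the comment does, that CT_MARK and CT_LABEL are zero whenever there was no previous session for the traffic.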

Comment 3 Numan Siddique 2020-09-07 12:45:48 UTC
Submitted a patch that fixes this issue by sending all packets to conntrack if an LB is associated with a logical switch.
This is not good in terms of performance if the logical switch has no ACLs with allow-related configured.

https://patchwork.ozlabs.org/project/ovn/patch/20200907124320.830247-1-numans@ovn.org/

But the behavior needs to be accurate.

Comment 8 Jianlin Shi 2020-10-13 02:59:13 UTC
Set up the LB with the following scripts:

server:
systemctl start openvswitch                                                                                                                           
systemctl start ovn-northd                                                                                         
ovn-nbctl set-connection ptcp:6641                                                                                            
ovn-sbctl set-connection ptcp:6642                                                                                      
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:1.1.23.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=1.1.23.25
systemctl restart ovn-controller                                                                                                                                                              
ip netns add server0                                    
ip link add veth0_s0 netns server0 type veth peer name veth0_s0_p                                                             
ip netns exec server0 ip link set lo up                                                              
ip netns exec server0 ip link set veth0_s0 up
ip netns exec server0 ip link set veth0_s0 address 00:00:00:01:01:02
ip netns exec server0 ip addr add 192.168.1.1/24 dev veth0_s0
ip netns exec server0 ip -6 addr add 2001::1/64 dev veth0_s0                  
ip netns exec server0 ip route add default via 192.168.1.254 dev veth0_s0
ip netns exec server0 ip -6 route add default via 2001::a dev veth0_s0        
ovs-vsctl add-port br-int veth0_s0_p                                          
ip link set veth0_s0_p up                                                     
ovs-vsctl set interface veth0_s0_p external_ids:iface-id=ls1p1           
                                                                              
ovn-nbctl ls-add ls1                                                          
ovn-nbctl lsp-add ls1 ls1p1                                                   
#ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02 2001::1 192.168.1.1"
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02 192.168.1.1 2001::1"
ovn-nbctl lsp-add ls1 ls1p2                                      
ovn-nbctl lsp-set-addresses ls1p2 "00:00:00:01:02:02 192.168.1.2 2001::2"
ovn-nbctl lr-add lr1                         
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:00:01 192.168.1.254/24 2001::a/64
ovn-nbctl lsp-add ls1 ls1-lr1                                
ovn-nbctl lsp-set-addresses ls1-lr1 "00:00:00:00:00:01 192.168.1.254 2001::a"                                                                                    
ovn-nbctl lsp-set-type ls1-lr1 router                                    
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1                 
                                                                 
ovn-nbctl lrp-add lr1 lr1-ls2 00:00:00:00:00:02 192.168.2.254/24 2002::a/64
                                             
ovn-nbctl ls-add ls2                                                
ovn-nbctl lsp-add ls2 ls2-lr1                                
ovn-nbctl lsp-set-addresses ls2-lr1 "00:00:00:00:00:02 192.168.2.254 2002::a"
ovn-nbctl lsp-set-type ls2-lr1 router                                    
ovn-nbctl lsp-set-options ls2-lr1 router-port=lr1-ls2                 
                                    
ovn-nbctl lsp-add ls2 ls2p1       
ovn-nbctl lsp-set-addresses ls2p1 "00:00:00:02:01:02 192.168.2.1 2002::1"

ovn-nbctl lsp-add ls1 ls1p3
ovn-nbctl lsp-set-addresses ls1p3 "00:00:00:01:03:02 192.168.1.3 2001::3"

ip netns add server2
ip link add veth0_s2 netns server2 type veth peer name veth0_s2_p
ip netns exec server2 ip link set lo up
ip netns exec server2 ip link set veth0_s2 up
ip netns exec server2 ip link set veth0_s2 address 00:00:00:01:03:02
ip netns exec server2 ip addr add 192.168.1.3/24 dev veth0_s2
ip netns exec server2 ip -6 addr add 2001::3/64 dev veth0_s2
ip netns exec server2 ip route add default via 192.168.1.254 dev veth0_s2
ip netns exec server2 ip -6 route add default via 2001::a dev veth0_s2

ovs-vsctl add-port br-int veth0_s2_p
ip link set veth0_s2_p up
ovs-vsctl set interface veth0_s2_p external_ids:iface-id=ls1p3


ovn-nbctl lb-add lb0 192.168.1.100 192.168.1.1,192.168.1.2
ovn-nbctl ls-lb-add ls2 lb0


client:

#!/bin/bash

systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:1.1.23.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=1.1.23.26

systemctl start ovn-controller

ip netns add server1
ip link add veth0_s1 netns server1 type veth peer name veth0_s1_p
ip netns exec server1 ip link set lo up
ip netns exec server1 ip link set veth0_s1 up
ip netns exec server1 ip link set veth0_s1 address 00:00:00:01:02:02
ip netns exec server1 ip addr add 192.168.1.2/24 dev veth0_s1
ip netns exec server1 ip -6 addr add 2001::2/64 dev veth0_s1
ip netns exec server1 ip route add default via 192.168.1.254 dev veth0_s1
ip netns exec server1 ip -6 route add default via 2001::a dev veth0_s1

ovs-vsctl add-port br-int veth0_s1_p
ip link set veth0_s1_p up
ovs-vsctl set interface veth0_s1_p external_ids:iface-id=ls1p2

ip netns add client0
ip link add veth0_c0 netns client0 type veth peer name veth0_c0_p
ip netns exec client0 ip link set lo up
ip netns exec client0 ip link set veth0_c0 up
ip netns exec client0 ip link set veth0_c0 address 00:00:00:02:01:02
ip netns exec client0 ip addr add 192.168.2.1/24 dev veth0_c0
ip netns exec client0 ip -6 addr add 2002::1/64 dev veth0_c0
ip netns exec client0 ip route add default via 192.168.2.254 dev veth0_c0
ip netns exec client0 ip -6 route add default via 2002::a dev veth0_c0

ovs-vsctl add-port br-int veth0_c0_p
ip link set veth0_c0_p up
ovs-vsctl set interface veth0_c0_p external_ids:iface-id=ls2p1

Reproduced on ovn20.06.2-11:

[root@wsfd-advnetlab19 bz1870359]# rpm -qa | grep -E "openvswitch|ovn"
openvswitch2.13-2.13.0-51.el7fdp.x86_64
ovn2.13-host-20.06.2-11.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-15.el7fdp.noarch
ovn2.13-central-20.06.2-11.el7fdp.x86_64
ovn2.13-20.06.2-11.el7fdp.x86_64


Ping the LB from the client in the background:
[root@wsfd-advnetlab19 bz1870359]# ip netns exec client0 ping -q 192.168.1.100 &
Ping an LB backend from the client in the background:
[root@wsfd-advnetlab19 bz1870359]# ip netns exec client0 ping -q 192.168.1.1 &

check conntrack:

[root@wsfd-advnetlab19 bz1870359]# ovs-appctl -t ovs-vswitchd dpctl/dump-conntrack | grep 192.168.1.1
icmp,orig=(src=192.168.2.1,dst=192.168.1.100,id=23730,type=8,code=0),reply=(src=192.168.1.2,dst=192.168.2.1,id=23730,type=0,code=0),zone=2,labels=0x2

<=== only dst=192.168.1.100 is conntracked, dst=192.168.1.1 is not conntracked

Verified on ovn20.09.0-2:

[root@wsfd-advnetlab19 ovn20.09.0-2]# rpm -qa | grep -E "openvswitch|ovn"                                                                            
openvswitch2.13-2.13.0-51.el7fdp.x86_64                                                              
openvswitch-selinux-extra-policy-1.0-15.el7fdp.noarch                                                                                                
ovn2.13-20.09.0-2.el7fdp.x86_64                                       
ovn2.13-host-20.09.0-2.el7fdp.x86_64   
ovn2.13-central-20.09.0-2.el7fdp.x86_64


[root@wsfd-advnetlab19 ovn20.09.0-2]# ovs-appctl -t ovs-vswitchd dpctl/dump-conntrack | grep 192.168.1.1
icmp,orig=(src=192.168.2.1,dst=192.168.1.100,id=23730,type=8,code=0),reply=(src=192.168.1.2,dst=192.168.2.1,id=23730,type=0,code=0),zone=2,labels=0x2
icmp,orig=(src=192.168.2.1,dst=192.168.1.1,id=23731,type=8,code=0),reply=(src=192.168.1.1,dst=192.168.2.1,id=23731,type=0,code=0),zone=2

<==== dst=192.168.1.1 is also conntracked
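
The before/after comparison can also be checked mechanically. The Python sketch below parses the two dumps shown above and lists the original-direction destinations that have a conntrack entry; the name tracked_destinations and the regular expression are assumptions based on the dump format in this comment.

```python
import re

# Capture the dst= field of the orig=(...) tuple on each conntrack line.
ORIG_DST = re.compile(r"orig=\(src=[^,]+,dst=([^,)]+)")


def tracked_destinations(dump):
    """Return the set of original-direction destination IPs in a conntrack dump."""
    return set(ORIG_DST.findall(dump))


# Dump from ovn20.06.2-11 (reproduced) as shown above.
BEFORE = """\
icmp,orig=(src=192.168.2.1,dst=192.168.1.100,id=23730,type=8,code=0),reply=(src=192.168.1.2,dst=192.168.2.1,id=23730,type=0,code=0),zone=2,labels=0x2
"""

# Dump from ovn20.09.0-2 (verified) as shown above.
AFTER = BEFORE + """\
icmp,orig=(src=192.168.2.1,dst=192.168.1.1,id=23731,type=8,code=0),reply=(src=192.168.1.1,dst=192.168.2.1,id=23731,type=0,code=0),zone=2
"""

print("192.168.1.1" in tracked_destinations(BEFORE))  # False
print("192.168.1.1" in tracked_destinations(AFTER))   # True
```

Comparing the parsed dst field exactly avoids the ambiguity of `grep 192.168.1.1`, which also matches 192.168.1.100.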

Comment 9 Jianlin Shi 2020-10-14 01:27:12 UTC
Verified on rhel8 version:

[root@wsfd-advnetlab19 bz1870359]# rpm -qa | grep -E "openvswitch|ovn"
openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch                                                 
openvswitch2.13-2.13.0-61.el8fdp.x86_64
ovn2.13-host-20.09.0-2.el8fdp.x86_64                                                                  
ovn2.13-20.09.0-2.el8fdp.x86_64
ovn2.13-central-20.09.0-2.el8fdp.x86_64

[root@wsfd-advnetlab19 bz1870359]# ovs-appctl -t ovs-vswitchd dpctl/dump-conntrack | grep 192.168.1.1 
icmp,orig=(src=192.168.2.1,dst=192.168.1.100,id=24651,type=8,code=0),reply=(src=192.168.1.2,dst=192.168.2.1,id=24651,type=0,code=0),zone=1,labels=0x2
icmp,orig=(src=192.168.2.1,dst=192.168.1.1,id=24656,type=8,code=0),reply=(src=192.168.1.1,dst=192.168.2.1,id=24656,type=0,code=0),zone=1
icmp,orig=(src=192.168.2.1,dst=192.168.1.1,id=24648,type=8,code=0),reply=(src=192.168.1.1,dst=192.168.2.1,id=24648,type=0,code=0),zone=1

Comment 11 errata-xmlrpc 2020-10-27 09:49:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4356

