Bug 1973465 - check_pkt_larger action userspace handling fails to CT packet
Summary: check_pkt_larger action userspace handling fails to CT packet
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.13
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Aaron Conole
QA Contact: ovs-qe
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-17 21:49 UTC by Tim Rozet
Modified: 2023-09-15 01:34 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-14 15:30:03 UTC
Target Upstream Version:
Embargoed:


Attachments
output showing the datapath and ofproto traces (181.34 KB, text/plain)
2021-06-17 21:54 UTC, Tim Rozet


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-1379 0 None None None 2022-09-14 15:39:31 UTC

Description Tim Rozet 2021-06-17 21:49:17 UTC
Description of problem:
When the check_pkt_larger action is used on an older kernel whose datapath does not support it, userspace handles the packet. However, the packet never reaches the next action, which sends it to conntrack.

Version-Release number of selected component (if applicable):
openvswitch2.13-2.13.0-90.el7fdp.x86_64
RHEL 7.9
sh-4.2# uname -a
Linux ip-10-0-58-75.us-east-2.compute.internal 3.10.0-1160.31.1.el7.x86_64 #1 SMP Wed May 26 20:18:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

The scenario here is OVN with OCP, where a packet arrives at a node and is to be DNAT'ed by OVN after it passes through the OCP bridge br-ex. The topology looks like this:

client ----> eth0---br-ex----br-int--- pod

The client sends an http request towards the node IP of 10.0.55.195 port 32721.

br-ex contains a flow:
cookie=0xa360a49979cae3e0, duration=1397.161s, table=0, n_packets=0, n_bytes=0, idle_age=16891, priority=100,tcp,in_port=1,tp_dst=32721 actions=check_pkt_larger(8915)->NXM_NX_REG0[0],resubmit(,11)
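For reference, the effect of the check_pkt_larger action in this flow can be modelled roughly as follows. This is a minimal Python sketch, not OVS code: the function name and the treatment of the register as a plain integer are assumptions made for illustration. It sets bit 0 of reg0 when the packet is longer than 8915 bytes and clears it otherwise.

```python
def check_pkt_larger(pkt_len: int, threshold: int, reg: int, bit: int = 0) -> int:
    """Hypothetical model of check_pkt_larger(threshold)->REG[bit].

    Returns the register value with the chosen bit set to 1 when the
    packet length is strictly greater than the threshold, else 0.
    """
    mask = 1 << bit
    if pkt_len > threshold:
        return reg | mask   # packet is larger: set the result bit
    return reg & ~mask      # packet fits: clear the result bit


# A 9000-byte packet against the 8915-byte threshold from the flow above.
reg0 = check_pkt_larger(pkt_len=9000, threshold=8915, reg=0)
```

Later tables can then match on that register bit to decide how to treat oversized packets.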

The check_pkt_larger action causes the packet to be punted to userspace, and the following flow in OVN sends it to CT zone 29:

12. ip,metadata=0x19,nw_dst=10.0.55.195, priority 100, cookie 0xc0e55757
    ct(table=13,zone=NXM_NX_REG11[0..15])
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 13.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: unchanged
Megaflow: recirc_id=0,ct_state=-new-est-rel-rpl-inv-trk,ct_label=0/0x3,eth,tcp,in_port=1,dl_src=02:3b:8f:03:17:04,dl_dst=02:18:09:84:de:de,nw_src=10.0.64.0/18,nw_dst=10.0.55.195,nw_ttl=64,nw_frag=no,tp_dst=32721
Datapath actions: ct_clear,ct(zone=29),recirc(0x70)
This flow is handled by the userspace slow path because it:
  - Uses action(s) not supported by datapath.

===============================================================================
recirc(0x70) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
===============================================================================

Flow: recirc_id=0x70,dp_hash=0x1,ct_state=new|trk,ct_zone=29,eth,tcp,reg0=0x218,reg1=0x984dede,reg9=0x4,reg11=0x1d,reg13=0x21,reg14=0x2,metadata=0x19,in_port=9,vlan_tci=0x0000,dl_src=02:3b:8f:03:17:04,dl_dst=02:18:09:84:de:de,nw_src=10.0.74.171,nw_dst=10.0.55.195,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=2222,tp_dst=32721,tcp_flags=0

However, the packet never hits kernel conntrack: no conntrack error counters are incremented, no event is logged, and no conntrack entry is created.
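As a sanity check on the trace, the zone used by ct(table=13,zone=NXM_NX_REG11[0..15]) can be recomputed from the register value shown in the recirculated flow (reg11=0x1d). The sketch below is illustrative only; the helper name is an assumption and is not part of OVS.

```python
def extract_bits(value: int, start: int, end: int) -> int:
    """Return bits start..end (inclusive, bit 0 = least significant) of value."""
    width = end - start + 1
    return (value >> start) & ((1 << width) - 1)


reg11 = 0x1d                       # reg11 as reported in the recirc flow
zone = extract_bits(reg11, 0, 15)  # NXM_NX_REG11[0..15]
print(zone)                        # 29
```

The result matches the ct(zone=29) datapath action, so the zone selection itself appears consistent in the trace.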

If we override the flow in br-ex so that it skips the check_pkt_larger check, the traffic works as expected.

Comment 1 Tim Rozet 2021-06-17 21:54:10 UTC
Created attachment 1791919
output showing the datapath and ofproto traces

Comment 4 Tim Rozet 2021-06-18 18:59:29 UTC
I think the urgency of this bug is reduced because we've decided to have ovn-kubernetes detect whether the OVS datapath supports the action and disable those flows if it does not:
https://github.com/ovn-org/ovn-kubernetes/pull/2267
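A detection step of this kind could look roughly like the sketch below, which parses the text printed by `ovs-appctl dpif/show-dp-features <bridge>`. This is not taken from the linked PR, which may work differently. The feature label string ("Check pkt length action") and the Yes/No output format are assumptions that should be verified against the installed OVS version, and the function names are hypothetical.

```python
import re
import subprocess

# Assumed label as printed by "ovs-appctl dpif/show-dp-features"; verify locally.
FEATURE_LABEL = "Check pkt length action"


def parse_feature(features_text: str, label: str = FEATURE_LABEL) -> bool:
    """Return True only if the feature line is present and reports 'Yes'."""
    pattern = re.compile(
        r"^\s*" + re.escape(label) + r"\s*:\s*(\S+)\s*$", re.MULTILINE
    )
    match = pattern.search(features_text)
    return bool(match) and match.group(1).lower() == "yes"


def datapath_supports_check_pkt_len(bridge: str) -> bool:
    """Query OVS for the bridge's datapath features (requires a running OVS)."""
    result = subprocess.run(
        ["ovs-appctl", "dpif/show-dp-features", bridge],
        check=True, capture_output=True, text=True,
    )
    return parse_feature(result.stdout)
```

With a check like this, a caller could skip installing the check_pkt_larger flows on br-ex when the function returns False, so packets are not forced through the userspace slow path.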

Comment 8 Red Hat Bugzilla 2023-09-15 01:34:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days

