Bug 1973465

Summary: check_pkt_larger action userspace handling fails to CT packet
Product: Red Hat Enterprise Linux Fast Datapath
Component: openvswitch2.13
Version: RHEL 8.0
Status: CLOSED INSUFFICIENT_DATA
Severity: medium
Priority: medium
Reporter: Tim Rozet <trozet>
Assignee: Aaron Conole <aconole>
QA Contact: ovs-qe
CC: aconole, ctrautma, jhsiao, nusiddiq, ralongi
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: ---
Doc Type: If docs needed, set a value
Last Closed: 2022-09-14 15:30:03 UTC
Type: Bug
Attachments: output showing the datapath and ofproto traces (flags: none)

Description Tim Rozet 2021-06-17 21:49:17 UTC
Description of problem:
When the check_pkt_larger action is used on an older kernel whose datapath does not support it, the packet is handled in userspace. However, the packet never reaches the next action, which sends it to conntrack.

Version-Release number of selected component (if applicable):
openvswitch2.13-2.13.0-90.el7fdp.x86_64
RHEL 7.9
sh-4.2# uname -a
Linux ip-10-0-58-75.us-east-2.compute.internal 3.10.0-1160.31.1.el7.x86_64 #1 SMP Wed May 26 20:18:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

The scenario here is OVN with OCP where we have a packet arriving at a node destined to be DNAT'ed by OVN after it passes through the OCP bridge br-ex. So topology is like this:

client ----> eth0---br-ex----br-int--- pod

The client sends an http request towards the node IP of 10.0.55.195 port 32721.

br-ex contains a flow:
cookie=0xa360a49979cae3e0, duration=1397.161s, table=0, n_packets=0, n_bytes=0, idle_age=16891, priority=100,tcp,in_port=1,tp_dst=32721 actions=check_pkt_larger(8915)->NXM_NX_REG0[0],resubmit(,11)
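As a rough model of what check_pkt_larger(8915)->NXM_NX_REG0[0] is expected to compute (store 1 in bit 0 of reg0 when the packet is longer than 8915 bytes, otherwise 0), here is a minimal Python sketch. The function name and the way the register is represented are illustrative assumptions, not OVS code:

```python
# Threshold taken from the br-ex flow above.
PKT_LEN_THRESHOLD = 8915


def check_pkt_larger(packet_len, reg0, threshold=PKT_LEN_THRESHOLD):
    """Model of check_pkt_larger(threshold)->NXM_NX_REG0[0].

    Returns reg0 with bit 0 set to 1 if packet_len is strictly larger
    than threshold, and cleared to 0 otherwise. Other bits are untouched.
    (Hypothetical helper for illustration only.)
    """
    bit = 1 if packet_len > threshold else 0
    return (reg0 & ~1) | bit


if __name__ == "__main__":
    # A small TCP SYN is far below the threshold, so bit 0 stays 0 and the
    # packet simply continues to table 11 via resubmit(,11).
    for length in (74, 8915, 8916):
        print(length, check_pkt_larger(length, reg0=0))
```

The comparison itself is trivial; the problem reported here is what happens to the packet after the action is evaluated in userspace.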

The check_pkt_larger action causes the packet to be punted to userspace, and the following flow in OVN then sends it to CT zone 29:

12. ip,metadata=0x19,nw_dst=10.0.55.195, priority 100, cookie 0xc0e55757
    ct(table=13,zone=NXM_NX_REG11[0..15])
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 13.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: unchanged
Megaflow: recirc_id=0,ct_state=-new-est-rel-rpl-inv-trk,ct_label=0/0x3,eth,tcp,in_port=1,dl_src=02:3b:8f:03:17:04,dl_dst=02:18:09:84:de:de,nw_src=10.0.64.0/18,nw_dst=10.0.55.195,nw_ttl=64,nw_frag=no,tp_dst=32721
Datapath actions: ct_clear,ct(zone=29),recirc(0x70)
This flow is handled by the userspace slow path because it:
  - Uses action(s) not supported by datapath.

===============================================================================
recirc(0x70) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
===============================================================================

Flow: recirc_id=0x70,dp_hash=0x1,ct_state=new|trk,ct_zone=29,eth,tcp,reg0=0x218,reg1=0x984dede,reg9=0x4,reg11=0x1d,reg13=0x21,reg14=0x2,metadata=0x19,in_port=9,vlan_tci=0x0000,dl_src=02:3b:8f:03:17:04,dl_dst=02:18:09:84:de:de,nw_src=10.0.74.171,nw_dst=10.0.55.195,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=2222,tp_dst=32721,tcp_flags=0

However, the packet never reaches kernel conntrack: no conntrack error counters increment, no event is logged, and no conntrack entry is created.
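One way to confirm the missing entry is to search a conntrack dump for the zone and destination port from the trace. The sketch below is Python and works on text that has already been captured. It assumes the line layout printed by `ovs-appctl dpctl/dump-conntrack` (an `orig=(...)` tuple and an optional `,zone=N` field); the helper name and the sample line are made up for illustration and are not output from the affected node:

```python
import re


def find_ct_entries(dump, zone, dport):
    """Return dump lines whose original tuple has the given dport and zone.

    Assumes lines shaped like
      tcp,orig=(src=..,dst=..,sport=..,dport=..),reply=(..),zone=29,...
    and that a line without a zone field belongs to zone 0.
    """
    matches = []
    for line in dump.splitlines():
        orig_m = re.search(r"orig=\(([^)]*)\)", line)
        if not orig_m:
            continue  # not a conntrack entry line
        zone_m = re.search(r"(?:^|,)zone=(\d+)", line)
        line_zone = int(zone_m.group(1)) if zone_m else 0
        fields = dict(
            kv.split("=", 1) for kv in orig_m.group(1).split(",") if "=" in kv
        )
        if line_zone == zone and fields.get("dport") == str(dport):
            matches.append(line)
    return matches


# Hypothetical sample line, showing what a matching entry would look like.
SAMPLE_DUMP = (
    "tcp,orig=(src=10.0.74.171,dst=10.0.55.195,sport=2222,dport=32721),"
    "reply=(src=10.0.55.195,dst=10.0.74.171,sport=32721,dport=2222),"
    "zone=29,protoinfo=(state=SYN_SENT)\n"
)

if __name__ == "__main__":
    print(find_ct_entries(SAMPLE_DUMP, zone=29, dport=32721))
```

In the failing case described here, the same search against the real dump would return an empty list.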

If we override the flow in br-ex so that it skips the check_pkt_larger check, the traffic works as expected.

Comment 1 Tim Rozet 2021-06-17 21:54:10 UTC
Created attachment 1791919 [details]
output showing the datapath and ofproto traces

Comment 4 Tim Rozet 2021-06-18 18:59:29 UTC
I think the urgency of this bug is reduced because we've decided to have ovn-kubernetes detect whether the OVS datapath supports the action, and to disable those flows if it does not:
https://github.com/ovn-org/ovn-kubernetes/pull/2267
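A minimal sketch of that kind of detection, in Python. It is not taken from the linked pull request. It assumes the datapath feature list has already been captured as text (for example from `ovs-appctl dpif/show-dp-features br-ex`), that each line has the form `Feature name: Yes|No`, and that the relevant feature is labelled "Check pkt length action"; the label and the helper names are assumptions to verify against the OVS version in use:

```python
def parse_dp_features(text):
    """Parse 'Name: value' lines into a dict (hypothetical helper)."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if sep:  # skip lines without a colon
            features[name.strip()] = value.strip()
    return features


def supports_check_pkt_len(text, label="Check pkt length action"):
    """True only if the datapath reports the feature as 'Yes'.

    The label is an assumption about how OVS names the feature.
    A missing line is treated as 'not supported'.
    """
    return parse_dp_features(text).get(label, "No").lower() == "yes"


# Illustrative input, not captured from the affected node.
SAMPLE_FEATURES = """\
Masked set action: Yes
Conntrack clear: Yes
Check pkt length action: No
"""

if __name__ == "__main__":
    if not supports_check_pkt_len(SAMPLE_FEATURES):
        print("datapath lacks check_pkt_len; skip check_pkt_larger flows")
```

Treating a missing feature line as unsupported is the conservative choice here: flows that use check_pkt_larger are only installed when the datapath explicitly reports support.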

Comment 8 Red Hat Bugzilla 2023-09-15 01:34:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days