Bug 1983894 - Hostnetwork pod to service backed by hostnetwork on the same node is not working with OVN Kubernetes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Xin Long
QA Contact: Li Shuang
URL:
Whiteboard:
Depends On: 1953278 1961063 1986662
Blocks: 2014673 2024410 2024411
Reported: 2021-07-20 06:01 UTC by zenghui.shi
Modified: 2023-04-04 03:09 UTC (History)

Fixed In Version: kernel-4.18.0-355.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2024410 2024411 (view as bug list)
Environment:
Last Closed: 2022-05-10 14:59:47 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/rhel/src/kernel rhel-8 merge_requests 1675 | 0 | None | None | None | 2021-11-17 17:08:34 UTC
Red Hat Product Errata | RHSA-2022:1988 | 0 | None | None | None | 2022-05-10 15:00:26 UTC

Description zenghui.shi 2021-07-20 06:01:43 UTC
Description of problem:

Hostnetwork pod to service backed by hostnetwork on the same node is not working with OVN Kubernetes when ovs hardware offload is enabled.

Tested with the following ovs configurations

1. hw-offload=true + tc-policy=none   ->  Not working
2. hw-offload=true + tc-policy=skip_hw   -> Not working
3. hw-offload=true + tc-policy=skip_sw   -> Working


Version-Release number of selected component (if applicable):
Kernel: 4.18.0-322.el8.mr942_210708_1548.x86_64
OVS: openvswitch2.15-2.15.0-24.el8fdp.x86_64
OVN: ovn2.13-20.12.0-115.el8fdp.x86_64
OVN-Kubernetes: Built with https://github.com/ovn-org/ovn-kubernetes/pull/2042

How reproducible:
100%

Additional info:

1. hw-offload=true + tc-policy=none   ->  Not working

Service IP: 172.30.35.139:8081
Node IP: 192.168.111.27
Pod IP: 192.168.111.27:8081

======= Original direction =======

ufid:eec1af9e-6ab3-4af7-8dbc-64113a1924b9, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.0.0/255.255.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:27, bytes:2563, used:0.030s, dp:tc, actions:ct(commit,zone=64001,nat(src=169.254.169.2)),recirc(0x2b8)

ZONE 64001

ufid:6e5e6cc7-289e-4774-b91e-0a041a26c949, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2b8),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:0, bytes:0, used:3.390s, dp:tc, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x2b9)

ufid:f0b3c74f-d731-47c1-8055-f2caa77016f0, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2b8),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x2b9)
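
For readers following the dumps, the ct_state(value/mask) fields can be decoded with a small helper. This is only a sketch: the bit values are the OVS_CS_F_* flags from the Linux openvswitch uAPI header as I understand them, and the function name decode_ct_state is made up for illustration.

```python
# Bit values assumed to match OVS_CS_F_* in the Linux openvswitch uAPI header.
CT_STATE_BITS = {
    0x01: "new",
    0x02: "est",
    0x04: "rel",
    0x08: "rpl",
    0x10: "inv",
    0x20: "trk",
    0x40: "snat",
    0x80: "dnat",
}


def decode_ct_state(field: str) -> dict:
    """Decode a 'value/mask' string such as '0x21/0x23' (hypothetical helper).

    Returns a dict mapping each flag covered by the mask to True (bit set)
    or False (bit clear). Flags outside the mask are wildcarded and omitted.
    A bare value without '/mask' is treated as fully masked.
    """
    value_text, _, mask_text = field.partition("/")
    value = int(value_text, 16)
    mask = int(mask_text, 16) if mask_text else 0xFF
    return {
        name: bool(value & bit)
        for bit, name in CT_STATE_BITS.items()
        if mask & bit
    }


if __name__ == "__main__":
    for field in ("0x21/0x23", "0x22/0x22", "0/0"):
        print(field, decode_ct_state(field))
```

Under these assumptions, the two flows above split the traffic into new tracked connections (0x21/0x23) and established tracked connections (0x22/0x22).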

ZONE 40

ufid:d4666138-c62c-4f18-a51b-0c0bd5eeb476, recirc_id(0x2b9),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs, actions:hash(l4(0)),recirc(0x2ce)

ufid:93929299-d6c4-4166-811b-cadb8ac8b8fc, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2b9),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct(zone=40,nat),recirc(0x2bb)

ufid:d0110001-df86-4b17-ae99-b1b63235c866, recirc_id(0x2ce),dp_hash(0x4/0xf),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),recirc(0x2bb)

ZONE 40
combination of CT/CT(nat) in the datapath

ufid:718eefe2-7118-4cbc-80b8-4e54a005b10e, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),recirc_id(0x2bb),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:0, bytes:0, used:3.390s, dp:tc, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x2bc)

ufid:8c0faf6f-b5cc-4b41-8197-9d114ca5d645, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x3e),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),recirc_id(0x2bb),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:5, bytes:740, used:0.030s, dp:tc, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x2bc)

ZONE 0

ufid:7a53f6e5-5fc9-449f-8578-a47c9164a39a, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2bc),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:3.390s, dp:tc, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x2cf)
ufid:9d928400-1942-4489-b23a-86d20e8d2fcc, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2bc),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x2cf)

ZONE 64001

ufid:46af91f0-2a0a-4f40-9fa4-2929bad59586, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2cf),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct(commit,zone=64002,nat(src=169.254.169.1)),recirc(0x2d0)

ZONE 64002

ufid:8d2f8f6e-dbb2-453b-9a26-e8af9740186e, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2d0),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:13, bytes:2858, used:0.030s, dp:tc, actions:set(eth(src=52:54:00:56:00:31,dst=0c:42:a1:08:0a:da)),br-ex


======= Reply direction =======


ufid:9600f203-cad2-468e-aa42-f3dfbf6de620, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=169.254.169.1,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:8, bytes:2118, used:1.570s, dp:tc, actions:ct(zone=64002,nat),recirc(0x2d1)

ZONE 64002

ufid:fd440817-c51f-49c1-97aa-00fd19e34edd, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2d1),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:8, bytes:2118, used:1.570s, dp:tc, actions:ct(commit,zone=64001,nat),recirc(0x2b8)

ZONE 64001

ufid:6e5e6cc7-289e-4774-b91e-0a041a26c949, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2b8),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:0, bytes:0, used:3.390s, dp:tc, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x2b9)

ufid:f0b3c74f-d731-47c1-8055-f2caa77016f0, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2b8),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x2b9)

ZONE 40  (asymmetric paths for original and reply traffic )

ufid:d4666138-c62c-4f18-a51b-0c0bd5eeb476, recirc_id(0x2b9),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs, actions:hash(l4(0)),recirc(0x2ce)

ufid:93929299-d6c4-4166-811b-cadb8ac8b8fc, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2b9),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct(zone=40,nat),recirc(0x2bb)

ufid:d0110001-df86-4b17-ae99-b1b63235c866, recirc_id(0x2ce),dp_hash(0x4/0xf),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),recirc(0x2bb)

ZONE 40

ufid:718eefe2-7118-4cbc-80b8-4e54a005b10e, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),recirc_id(0x2bb),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:0, bytes:0, used:3.390s, dp:tc, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x2bc)

ufid:8c0faf6f-b5cc-4b41-8197-9d114ca5d645, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x3e),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),recirc_id(0x2bb),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0), packets:5, bytes:740, used:0.030s, dp:tc, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x2bc)

ZONE 0

ufid:7a53f6e5-5fc9-449f-8578-a47c9164a39a, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2bc),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:3.390s, dp:tc, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x2cf)
ufid:9d928400-1942-4489-b23a-86d20e8d2fcc, skb_priority(0/0),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2bc),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x2cf)

ZONE 64001

ufid:46af91f0-2a0a-4f40-9fa4-2929bad59586, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2cf),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:5, bytes:740, used:0.030s, dp:tc, actions:ct(commit,zone=64002,nat(src=169.254.169.1)),recirc(0x2d0)

ZONE 64002

ufid:8d2f8f6e-dbb2-453b-9a26-e8af9740186e, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x2d0),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:13, bytes:2858, used:0.030s, dp:tc, actions:set(eth(src=52:54:00:56:00:31,dst=0c:42:a1:08:0a:da)),br-ex


3. hw-offload=true + tc-policy=skip_sw   -> Working

Service IP: 172.30.128.247:8081
Node IP: 192.168.111.27
Pod IP: 192.168.111.27:8081

======= Original direction =======

ufid:c1caaed1-9c3e-4762-9b2a-dbfb64418ba5, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.0.0/255.255.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:33, bytes:4133, used:0.596s, flags:SFP., dp:ovs, actions:ct(commit,zone=64001,nat(src=169.254.169.2)),recirc(0x306)

ZONE 64001

ufid:d652a04b-9f09-4247-a78d-54ab6ba41347, recirc_id(0x306),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0),tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)

ufid:8707e569-5bad-4cff-990e-8fec8c666390, recirc_id(0x306),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0),tcp_flags(0/0), packets:4, bytes:264, used:0.596s, flags:F., dp:ovs, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)

ZONE 40

ufid:0f4f9b52-b45d-4bdb-be81-12c5df2d690f, recirc_id(0x308),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs, actions:hash(l4(0)),recirc(0x372)


ufid:964d18b5-7863-4484-9597-e134836e7e1a, recirc_id(0x308),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:5, bytes:413, used:0.596s, flags:FP., dp:ovs, actions:ct(zone=40,nat),recirc(0x309)

ufid:fb55ccad-fb55-4823-b650-3023609984de, recirc_id(0x372),dp_hash(0xa/0xf),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),recirc(0x309)

ZONE 40
combination of CT/CT(nat) in the datapath (I assume this is fine as our issue here is broken traffic)

ufid:4ddf19a4-dd21-4640-a083-35bfce8187d2, recirc_id(0x309),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=0/0,tos=0/0,ttl=64,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x30a)

ufid:d39c5b03-e29e-4e08-82dc-0b0523a523fa, recirc_id(0x309),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=0/0,tos=0/0,ttl=64,frag=no), packets:4, bytes:264, used:0.595s, flags:F., dp:ovs, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x30a)

ZONE 0

ufid:9567edd0-2194-4a4a-9da6-44ab096258fe, recirc_id(0x30a),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x373)

ufid:facf3210-2fd1-42af-9e96-d33325253063, recirc_id(0x30a),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:4, bytes:264, used:0.596s, flags:F., dp:ovs, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x373)

ZONE 64001

ufid:cc3aaf83-dda8-405c-a6cf-5d9468c0187a, recirc_id(0x373),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:6, bytes:479, used:0.595s, flags:FP., dp:ovs, actions:ct(commit,zone=64002,nat(src=169.254.169.1)),recirc(0x374)

ZONE 64002

ufid:d33c1eef-2398-44b2-b95e-115567064b3f, recirc_id(0x374),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:12, bytes:1238, used:0.596s, flags:SFP., dp:ovs, actions:set(eth(src=52:54:00:56:00:31,dst=0c:42:a1:08:0a:da)),br-ex


======= Reply direction =======

ufid:b30fa22b-63a5-4f7c-8f20-8dd8d7f8967d, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=169.254.169.1,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:5, bytes:685, used:0.595s, flags:FP., dp:ovs, actions:ct(zone=64002,nat),recirc(0x375)

ZONE 64002

ufid:a7490624-14fa-432e-b095-b92e18b08174, recirc_id(0x375),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:5, bytes:685, used:0.596s, flags:FP., dp:ovs, actions:ct(commit,zone=64001,nat),recirc(0x306)

ZONE 64001

ufid:d652a04b-9f09-4247-a78d-54ab6ba41347, recirc_id(0x306),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0),tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)

ufid:8707e569-5bad-4cff-990e-8fec8c666390, recirc_id(0x306),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0),tcp_flags(0/0), packets:4, bytes:264, used:0.596s, flags:F., dp:ovs, actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)

ZONE 40  (asymmetric paths for original and reply traffic )

ufid:0f4f9b52-b45d-4bdb-be81-12c5df2d690f, recirc_id(0x308),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs, actions:hash(l4(0)),recirc(0x372)

ufid:964d18b5-7863-4484-9597-e134836e7e1a, recirc_id(0x308),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:5, bytes:413, used:0.596s, flags:FP., dp:ovs, actions:ct(zone=40,nat),recirc(0x309)

ufid:fb55ccad-fb55-4823-b650-3023609984de, recirc_id(0x372),dp_hash(0xa/0xf),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),recirc(0x309)

ZONE 40

ufid:4ddf19a4-dd21-4640-a083-35bfce8187d2, recirc_id(0x309),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=0/0,tos=0/0,ttl=64,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x30a)

ufid:d39c5b03-e29e-4e08-82dc-0b0523a523fa, recirc_id(0x309),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x3f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x1),eth(src=0c:42:a1:08:0a:da,dst=0c:42:a1:08:0a:da),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=128.0.0.0/192.0.0.0,proto=0/0,tos=0/0,ttl=64,frag=no), packets:4, bytes:264, used:0.595s, flags:F., dp:ovs, actions:set(eth(dst=52:54:00:56:00:31)),set(ipv4(ttl=63)),ct(commit,nat(src=192.168.111.27)),recirc(0x30a)

ZONE 0

ufid:9567edd0-2194-4a4a-9da6-44ab096258fe, recirc_id(0x30a),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x373)

ufid:facf3210-2fd1-42af-9e96-d33325253063, recirc_id(0x30a),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0x22/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=192.168.111.27,dst=169.254.169.2,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:4, bytes:264, used:0.596s, flags:F., dp:ovs, actions:ct_clear,ct(commit,zone=64001,nat(dst=192.168.111.27)),recirc(0x373)

ZONE 64001

ufid:cc3aaf83-dda8-405c-a6cf-5d9468c0187a, recirc_id(0x373),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:6, bytes:479, used:0.595s, flags:FP., dp:ovs, actions:ct(commit,zone=64002,nat(src=169.254.169.1)),recirc(0x374)

ZONE 64002

ufid:d33c1eef-2398-44b2-b95e-115567064b3f, recirc_id(0x374),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:12, bytes:1238, used:0.596s, flags:SFP., dp:ovs, actions:set(eth(src=52:54:00:56:00:31,dst=0c:42:a1:08:0a:da)),br-ex

Comment 3 Marcelo Ricardo Leitner 2021-07-21 03:11:04 UTC
(In reply to zenghui.shi from comment #0)
> Tested with the following ovs configurations
> 
> 1. hw-offload=true + tc-policy=none   ->  Not working

More below.

> 2. hw-offload=true + tc-policy=skip_hw   -> Not working

I don't see the details on this one?

> 3. hw-offload=true + tc-policy=skip_sw   -> Working

This one ends up using dp:ovs everywhere, so it's effectively not using TC+CT.
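
A quick way to check this on a saved dump is to tally the dp: tag of each flow. The sketch below assumes the dump was saved as plain text with one flow per line starting with "ufid:", as in comment #0; the function name count_by_datapath is hypothetical.

```python
import re
from collections import Counter


def count_by_datapath(dump: str) -> Counter:
    """Count flows per 'dp:<name>' tag in saved datapath flow dump text."""
    counts = Counter()
    for line in dump.splitlines():
        # Only flow lines are of interest; skip notes such as "ZONE 40".
        if not line.startswith("ufid:"):
            continue
        # "\bdp:" does not match "dp_hash(...)" because of the underscore.
        match = re.search(r"\bdp:(\w+)", line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "ufid:aaaa, recirc_id(0),dp_hash(0/0), packets:5, bytes:740, "
        "used:0.030s, dp:tc, actions:br-ex\n"
        "ZONE 40\n"
        "ufid:bbbb, recirc_id(0x2b9),dp_hash(0/0), packets:0, bytes:0, "
        "used:never, dp:ovs, actions:recirc(0x2ce)\n"
    )
    print(count_by_datapath(sample))
```

A dump where every flow lands under "ovs" would match the observation that the skip_sw case is not exercising TC+CT at all.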

> Additional info:
> 1. hw-offload=true + tc-policy=none   ->  Not working
...
> 
> ======= Original direction =======
...
> 
> ZONE 40
> 
> ufid:d4666138-c62c-4f18-a51b-0c0bd5eeb476,
> recirc_id(0x2b9),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),
> ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:
> 00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),
> eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.35.139,proto=6,tos=0/0,
> ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:0, bytes:0,
> used:never, dp:ovs, actions:hash(l4(0)),recirc(0x2ce)

This one is okay to be dp:ovs

...
> ufid:d0110001-df86-4b17-ae99-b1b63235c866,
> recirc_id(0x2ce),dp_hash(0x4/0xf),skb_priority(0/0),in_port(br-ex),
> skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
> eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:
> 00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,
> proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs,
> actions:ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),
> recirc(0x2bb)
> 

But not this one. Same happens on the reply direction.
Ok, dp:ovs could still configure this conntrack entry to be committed and with that NAT info, but if nothing changes, this entry won't be on a flowtable and won't be offloaded.

Yet, AFAICT ATM, this shouldn't break it. Just not offload it.

Comment 4 Marcelo Ricardo Leitner 2021-07-21 03:24:07 UTC
And they have 0 pkts handled..
With this use case, br-ex is the src and dst of all packets here.
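
To list which flows never saw a packet, something like the following could be run over the saved dump text. It is a sketch that assumes the field order shown in comment #0 (ufid, then packets:, then dp:) and uses a hypothetical helper name, idle_flows.

```python
import re

# Assumes each flow line carries "ufid:<id>," then "packets:<n>," then "dp:<name>".
FLOW_RE = re.compile(
    r"ufid:(?P<ufid>[0-9a-f-]+),.*?packets:(?P<packets>\d+),.*?\bdp:(?P<dp>\w+)"
)


def idle_flows(dump: str) -> list:
    """Return (ufid, datapath) pairs for flows whose packet counter is 0."""
    result = []
    for line in dump.splitlines():
        match = FLOW_RE.search(line)
        if match and int(match.group("packets")) == 0:
            result.append((match.group("ufid"), match.group("dp")))
    return result


if __name__ == "__main__":
    sample = (
        "ufid:d4666138-c62c-4f18-a51b-0c0bd5eeb476, recirc_id(0x2b9),"
        "dp_hash(0/0), packets:0, bytes:0, used:never, dp:ovs, "
        "actions:hash(l4(0)),recirc(0x2ce)\n"
        "ufid:93929299-d6c4-4166-811b-cadb8ac8b8fc, recirc_id(0x2b9),"
        "dp_hash(0/0), packets:5, bytes:740, used:0.030s, dp:tc, "
        "actions:ct(zone=40,nat),recirc(0x2bb)\n"
    )
    for ufid, dp in idle_flows(sample):
        print(ufid, dp)
```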

Zenghui, can you please capture packets on br-ex? Thanks.

Comment 9 Marcelo Ricardo Leitner 2021-07-22 13:36:45 UTC
Oh oh oh. I was so focused on the broken flow that I missed this:

(In reply to zenghui.shi from comment #0)
> ZONE 40  (asymmetric paths for original and reply traffic )

So we have 2 bugs here:
- OVN needs to fix the asymmetric path
- Somehow dp:tc is differing from dp:ovs and is causing the flow to break

From comment #0, these zones are being used and they are quite asymmetric:

         gap here              swapped
          vvvvv             vv----------v
original:          64001 -> 40 -> 40 -> 0 -> 64001 -> 64002
reply:    64002 -> 64001 -> 40 -> 40 -> 0 -> 64001 -> 64002
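
The zone sequences in this diagram can be rebuilt from the actions of each flow. The sketch below is an illustration only: it assumes a ct() action without an explicit zone= means zone 0 (which is how comment #0 labels that hop), and the helper names ct_zones and leading_gap are made up.

```python
import re


def ct_zones(actions: str) -> list:
    """Return the conntrack zone of every ct(...) action, in order.

    ct_clear is not matched. A ct(...) without zone= is reported as zone 0
    (assumption taken from the "ZONE 0" label in comment #0).
    """
    zones = []
    for match in re.finditer(r"\bct\(", actions):
        # Walk forward to the parenthesis that closes this ct( ... ).
        depth, i = 1, match.end()
        while i < len(actions) and depth:
            depth += {"(": 1, ")": -1}.get(actions[i], 0)
            i += 1
        end = i - 1 if depth == 0 else i
        body = actions[match.end():end]
        zone = re.search(r"\bzone=(\d+)", body)
        zones.append(int(zone.group(1)) if zone else 0)
    return zones


def leading_gap(original: list, reply: list) -> list:
    """Return the zones the reply path visits before it lines up with the
    original path, or None if the original is not a tail of the reply."""
    extra = len(reply) - len(original)
    if extra >= 0 and reply[extra:] == original:
        return reply[:extra]
    return None


if __name__ == "__main__":
    original = [64001, 40, 40, 0, 64001, 64002]
    reply = [64002, 64001, 40, 40, 0, 64001, 64002]
    print(ct_zones("ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x2b9)"))
    print(leading_gap(original, reply))
```

With the two paths from the diagram, leading_gap reports the single extra 64002 hop at the start of the reply direction, which is the gap marked above.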

Zenghui, can you please report a new bz, towards OVN, to fix the asymmetric paths? Then let's keep this one for the broken flow. Thanks.

Comment 10 Marcelo Ricardo Leitner 2021-07-22 14:56:15 UTC
AFAICT from the capture so far:

The peers are able to establish a connection - the TCP handshake is done.
The client issues an HTTP request, which is received by the server.
The server sends the HTTP reply, which is NOT received by the client.

After that:
The client keeps retransmitting the request, because it never got an ack from server saying that it was received.
The server keeps retransmitting the reply, because, well, the client never got it.

The server HTTP reply has TCP FIN flag on it already.
There is a special handling for TCP FIN in act_ct, but I don't see how it can cause issues here.

Comment 11 Marcelo Ricardo Leitner 2021-07-22 20:42:57 UTC
Hangbin, a special note on the kernel that was used: 4.18.0-322.el8.mr942_210708_1548.x86_64
It has these commits: https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/942//commits
They were made for bz1961063 and are now being tracked at bz1980537 and bz1980532.

On how HWOL can have an impact here: act_ct instantiates 1 flowtable per zone, regardless of the interfaces involved. That means traffic to another node can lead to act_ct being instantiated on offloaded filters, and that causes the conntrack entries to be offloaded to that NIC, even if they don't need to be for the use case here.
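To make the per-zone sharing concrete, here is a toy Python model. It is not kernel code; it only assumes what is stated above, namely that a flowtable is looked up by zone alone. The names `FlowTable`, `FlowTableRegistry` and `attach_filter`, and the NIC port name `ens1f0`, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FlowTable:
    """Toy stand-in for the per-zone flowtable that act_ct keeps."""
    zone: int
    devices: set = field(default_factory=set)  # devices whose filters use this zone
    offloaded: bool = False                    # set once any user is a HW-offloaded filter


class FlowTableRegistry:
    """Flowtables keyed by conntrack zone only, never by (zone, device)."""

    def __init__(self):
        self._by_zone = {}

    def attach_filter(self, zone, device, hw_offloaded):
        # Reuse the existing table for this zone, whatever device asks for it.
        table = self._by_zone.setdefault(zone, FlowTable(zone))
        table.devices.add(device)
        table.offloaded = table.offloaded or hw_offloaded
        return table


registry = FlowTableRegistry()
# Same-node traffic: software-only filter on br-ex using zone 40.
local = registry.attach_filter(zone=40, device="br-ex", hw_offloaded=False)
# Traffic to another node: offloaded filter on a NIC port, same zone.
remote = registry.attach_filter(zone=40, device="ens1f0", hw_offloaded=True)

print(local is remote)   # both filters share one flowtable
print(local.offloaded)   # so entries created via br-ex are subject to offload too
```

In this model the br-ex filter never asked for offload, yet it ends up on an offloaded table simply because another filter uses the same zone.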

Something in dp:tc is not behaving similarly to dp:ovs. It is possible that once the OVN team fixes the asymmetric path above this issue will go away automatically, but we need to understand what is causing the flow to break here so that we can understand how impactful this difference is.

Comment 12 zenghui.shi 2021-07-27 01:08:42 UTC
(In reply to Marcelo Ricardo Leitner from comment #9)
> Oh oh oh. I was so focused on the broken flow that I missed this:
> 
> (In reply to zenghui.shi from comment #0)
> > ZONE 40  (asymmetric paths for original and reply traffic )
> 
> So we have 2 bugs here:
> - OVN needs to fix the asymmetric path
> - Somehow dp:tc is differing from dp:ovs and is causing the flow to break
> 
> From comment #0, these zones are being used and they are quite asymmetric:
> 
>          gap here              swapped
>           vvvvv             vv----------v
> original:          64001 -> 40 -> 40 -> 0 -> 64001 -> 64002
> reply:    64002 -> 64001 -> 40 -> 40 -> 0 -> 64001 -> 64002
> 
> Zenghui, can you please report a new bz, towards OVN, to fix the asymmetric
> paths? Then lets keep this one for the broken flow. Thanks.

Marcelo, I think you mean creating a bug towards ovn-k8s, right?

Comment 13 Marcelo Ricardo Leitner 2021-07-27 10:50:15 UTC
(In reply to zenghui.shi from comment #12)
> Marcelo, I think you mean creating a bug towards ovn-k8s, right?

Ah yes. Right.

Comment 14 zenghui.shi 2021-07-28 03:24:51 UTC
(In reply to Marcelo Ricardo Leitner from comment #13)
> (In reply to zenghui.shi from comment #12)
> > Marcelo, I think you mean creating a bug towards ovn-k8s, right?
> 
> Ah yes. Right.

New BZ created to track the asymmetric issue for host -> service -> host endpoint on same node flows: https://bugzilla.redhat.com/show_bug.cgi?id=1986662

Comment 15 zenghui.shi 2021-07-30 02:29:25 UTC
(In reply to zenghui.shi from comment #0)
> Description of problem:
> 
> Hostnetwork pod to service backed by hostnetwork on the same node is not
> working with OVN Kubernetes when ovs hardware offload is enabled.
> 
> Tested with the following ovs configurations
> 
> 1. hw-offload=true + tc-policy=none   ->  Not working
> 2. hw-offload=true + tc-policy=skip_hw   -> Not working
> 3. hw-offload=true + tc-policy=skip_sw   -> Working
> 
> 
> Version-Release number of selected component (if applicable):
> Kernel: 4.18.0-322.el8.mr942_210708_1548.x86_64
> OVS: openvswitch2.15-2.15.0-24.el8fdp.x86_64
> OVN: ovn2.13-20.12.0-115.el8fdp.x86_64
> OVN-Kubernetes: Built with
> https://github.com/ovn-org/ovn-kubernetes/pull/2042

> 3. hw-offload=true + tc-policy=skip_sw   -> Working
> 
> Service IP: 172.30.128.247:8081
> Node IP: 192.168.111.27
> Pod IP: 192.168.111.27:8081
> 
> ======= Original direction =======
> 
> ufid:c1caaed1-9c3e-4762-9b2a-dbfb64418ba5,
> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),
> ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:
> 00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),
> eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.0.0/255.255.0.0,proto=0/
> 0,tos=0/0,ttl=0/0,frag=no), packets:33, bytes:4133, used:0.596s, flags:SFP.,
> dp:ovs, actions:ct(commit,zone=64001,nat(src=169.254.169.2)),recirc(0x306)
> 
> ZONE 64001
> 
> ufid:d652a04b-9f09-4247-a78d-54ab6ba41347,
> recirc_id(0x306),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),
> ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:
> 08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.
> 0,dst=172.30.128.247,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0),
> tcp_flags(0/0), packets:0, bytes:0, used:never, dp:ovs,
> actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)
> 

CT(zone=40) flow

> ufid:8707e569-5bad-4cff-990e-8fec8c666390,
> recirc_id(0x306),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),
> ct_state(0x22/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=0c:42:a1:
> 08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),ipv4(src=128.0.0.0/192.0.0.
> 0,dst=172.30.128.247,proto=6,tos=0/0,ttl=64,frag=no),tcp(src=0/0,dst=0/0),
> tcp_flags(0/0), packets:4, bytes:264, used:0.596s, flags:F., dp:ovs,
> actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)
> 
> ZONE 40
> 
> ufid:0f4f9b52-b45d-4bdb-be81-12c5df2d690f,
> recirc_id(0x308),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),
> ct_state(0x21/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:
> 00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),
> eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,
> ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:0, bytes:0,
> used:never, dp:ovs, actions:hash(l4(0)),recirc(0x372)
> 
> 
> ufid:964d18b5-7863-4484-9597-e134836e7e1a,
> recirc_id(0x308),dp_hash(0/0),skb_priority(0/0),in_port(br-ex),skb_mark(0/0),
> ct_state(0x22/0x22),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:
> 00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),
> eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=172.30.128.247,proto=6,tos=0/0,
> ttl=0/0,frag=no),tcp(src=0/0,dst=8081),tcp_flags(0/0), packets:5, bytes:413,
> used:0.596s, flags:FP., dp:ovs, actions:ct(zone=40,nat),recirc(0x309)
> 
> ufid:fb55ccad-fb55-4823-b650-3023609984de,
> recirc_id(0x372),dp_hash(0xa/0xf),skb_priority(0/0),in_port(br-ex),
> skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
> eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:
> 00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,
> proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:0, bytes:0, used:never, dp:ovs,
> actions:ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),
> recirc(0x309)
> 

CT(zone=40,nat) flow

> ZONE 40
> combination of CT/CT(nat) in the datapath (I assume this is fine as our
> issue here is broken traffic)

The combined use of CT/CT(nat) may not be the cause of the broken flow, but it will
prevent flows from being offloaded to Mellanox NICs (e.g. CX-5).
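A minimal Python sketch for spotting this combination in a flow dump: it reports the zones that appear both in a plain ct(zone=N) action and in a ct(...,nat...) action. The helper names `ct_actions` and `zones_mixing_ct_and_nat` are made up for illustration; the action strings are copied from the flows quoted above.

```python
import re
from collections import defaultdict

# Match "ct(" but not the tail of another word.
CT_START = re.compile(r"(?<![A-Za-z0-9_])ct\(")


def ct_actions(actions):
    """Yield the argument string of every ct(...) action, honouring nested parens."""
    for match in CT_START.finditer(actions):
        depth, start = 1, match.end()
        for pos in range(start, len(actions)):
            if actions[pos] == "(":
                depth += 1
            elif actions[pos] == ")":
                depth -= 1
                if depth == 0:
                    yield actions[start:pos]
                    break


def zones_mixing_ct_and_nat(action_lists):
    """Return the sorted zones used both as ct(zone=N) and as ct(zone=N,nat...)."""
    kinds = defaultdict(set)
    for actions in action_lists:
        for args in ct_actions(actions):
            zone = re.search(r"zone=(\d+)", args)
            if zone:
                has_nat = re.search(r"\bnat\b", args) is not None
                kinds[int(zone.group(1))].add("nat" if has_nat else "plain")
    return sorted(z for z, k in kinds.items() if k == {"plain", "nat"})


flows = [
    "ct(commit,zone=64001,nat(src=169.254.169.2)),recirc(0x306)",
    "ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)",
    "ct(zone=40,nat),recirc(0x309)",
    "ct(commit,zone=40,label=0x2/0x2,nat(dst=169.254.169.2:8081)),recirc(0x309)",
]
print(zones_mixing_ct_and_nat(flows))  # → [40]
```

Zone 64001 is not reported because it only shows up with nat in these flows.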

Comment 16 zenghui.shi 2021-07-30 02:33:30 UTC
> 
> The combined use of CT/CT(nat) may not be the cause of broken flow, but it
> will
> prevent flows from being offloaded to Mellanox NICs (e.g. CX-5).

Created bz1988189 to track the combined use of CT/CT(nat) issue.

Comment 17 zenghui.shi 2021-08-09 02:46:32 UTC
Reran the test with OVN version 21.09-host-21.09.0-8.el8fdp; the original issue remains.

Comment 18 Hangbin Liu 2021-08-12 07:00:11 UTC
Talked with Sushil; assigning it back to nst-kernel.

Comment 19 Marcelo Ricardo Leitner 2021-08-17 14:32:17 UTC
I was reviewing this bug with Xin Long today and then we realized that the traffic is broken because it hits the same situation as https://bugzilla.redhat.com/show_bug.cgi?id=1961063.

(In reply to zenghui.shi from comment #0)
...
> Additional info:
> 
> 1. hw-offload=true + tc-policy=none   ->  Not working
...
> ======= Original direction =======
> 
> ufid:eec1af9e-6ab3-4af7-8dbc-64113a1924b9,
> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),
> ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/0,
                                          ^^^^^^^^^^^^^^
...
> ZONE 64002
> 
> ufid:8d2f8f6e-dbb2-453b-9a26-e8af9740186e,
> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),
> ct_label(0/0),recirc_id(0x2d0),dp_hash(0/0),in_port(br-ex),packet_type(ns=0/
> 0,id=0/0),eth(src=0c:42:a1:08:0a:da,dst=52:54:00:56:00:31),eth_type(0x0800),
> ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,
> frag=no), packets:13, bytes:2858, used:0.030s, dp:tc,
> actions:set(eth(src=52:54:00:56:00:31,dst=0c:42:a1:08:0a:da)),br-ex
                                                                ^^^^^

That said, this bz now comes down to waiting for the fixes from bz1961063 and from bz1986662 (comment #14).
With that, I can take this bz for now as a placeholder.
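A rough Python sketch for finding such flows in a dump: it flags flows whose output port is the same as their in_port, which are the two fields marked with carets above. Reading the carets that way is an assumption, and the helper names `split_top_level` and `outputs_to_in_port` are hypothetical. The sample flows are shortened versions of flows quoted in this bz.

```python
import re


def split_top_level(actions):
    """Split an action list on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in actions:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def outputs_to_in_port(flow):
    """Return (dp, port) if the flow sends packets back out of its in_port, else None."""
    in_port = re.search(r"in_port\(([^)]+)\)", flow)
    dp = re.search(r"dp:(\w+)", flow)
    actions = re.search(r"actions:(.*)$", flow)
    if not (in_port and actions):
        return None
    # Bare tokens are port names or keyword actions such as ct_clear;
    # only an exact match with the in_port name counts as a hit.
    bare = [a for a in split_top_level(actions.group(1)) if "(" not in a]
    if in_port.group(1) in bare:
        return (dp.group(1) if dp else None, in_port.group(1))
    return None


tc_flow = ("recirc_id(0x2d0),in_port(br-ex),eth(src=0c:42:a1:08:0a:da,"
           "dst=52:54:00:56:00:31),eth_type(0x0800), packets:13, bytes:2858, "
           "used:0.030s, dp:tc, "
           "actions:set(eth(src=52:54:00:56:00:31,dst=0c:42:a1:08:0a:da)),br-ex")
ovs_flow = ("recirc_id(0x306),in_port(br-ex),eth_type(0x0800), packets:4, "
            "bytes:264, used:0.596s, dp:ovs, "
            "actions:ct_clear,set(eth(dst=0c:42:a1:08:0a:da)),ct(zone=40),recirc(0x308)")

print(outputs_to_in_port(tc_flow))
print(outputs_to_in_port(ovs_flow))
```

The first flow is reported with its datapath type, which makes it easy to see whether the flows in question sit in dp:tc or dp:ovs.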

Comment 20 zenghui.shi 2021-08-20 06:08:58 UTC
(In reply to Marcelo Ricardo Leitner from comment #19)
> I was reviewing this bug with Xin Long today and then we realized that the
> traffic is broken because it hits the same situation of
> https://bugzilla.redhat.com/show_bug.cgi?id=1961063.
> 

I tried running the test (same node) with a kernel that includes the fix for bz1961063; the original issue in this bug remains.
Is this an indication that additional fixes might be required for this bug?

Comment 21 Marcelo Ricardo Leitner 2021-09-02 21:06:20 UTC
Adding dependencies per comment #14 and #16.

Btw, the flows we're observing in the test environment are quite different from the ones in https://bugzilla.redhat.com/show_bug.cgi?id=1986662#c9 .

Comment 22 Xin Long 2021-09-20 10:24:49 UTC
Debugged with Marcelo and Zenghui; another fix is needed:
https://lore.kernel.org/netdev/cover.1632133123.git.lucien.xin@gmail.com/

Comment 23 Marcelo Ricardo Leitner 2021-11-16 16:06:00 UTC
Latest version:
https://patchwork.kernel.org/project/netdevbpf/list/?series=579339

Comment 40 errata-xmlrpc 2022-05-10 14:59:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1988

