Bug 1975826

Summary: ovn-kubernetes host directed traffic cannot be offloaded as CT zone 64000 is not established
Product: OpenShift Container Platform
Reporter: Alaa Hleihel (NVIDIA Mellanox) <ahleihel>
Component: Networking
Assignee: Tim Rozet <trozet>
Networking sub component: ovn-kubernetes
QA Contact: Ying Wang <yingwang>
Status: CLOSED ERRATA
Severity: high
Priority: high
CC: adrianc, ahleihel, astoycos, bbennett, mleitner, trozet, zzhao
Version: 4.9
Flags: yingwang: needinfo-
Target Milestone: ---
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2022-03-10 16:04:21 UTC
Type: Bug

Description Alaa Hleihel (NVIDIA Mellanox) 2021-06-24 14:00:53 UTC
Description of problem:

Today, pod network traffic to k8s workers (host network)
is sent through the gateway router (GR) to be SNATed and then
out to the underlay. On the receiving end, traffic is sent to
conntrack (CT) on the physical bridge and forwarded to the host,
i.e. the LOCAL port, or the host representor port in the case of Smart-NICs.

By design, the conntrack state in zone 64000 is never
established for host-destined traffic, which may prevent
offload-capable NICs from offloading this type of traffic.

Branch: Master

Comment 2 Adrian Chiris 2021-06-24 14:52:59 UTC
PR https://github.com/ovn-org/ovn-kubernetes/pull/2277 proposes a partial solution to the issue, covering pod to host network traffic.

Comment 3 Tim Rozet 2021-08-17 22:37:09 UTC
Will retarget to 4.10.

Comment 4 Marcelo Ricardo Leitner 2021-11-09 21:51:04 UTC
Do we need to set Target Release to 4.10 now? I see the MR was merged upstream, https://github.com/openshift/ovn-kubernetes/pull/796 .

Comment 6 Ying Wang 2021-12-06 10:12:04 UTC
Verified this bug on 4.10.0-0.nightly-2021-12-03-213835.

1. Create a hostNetwork pod and a NodePort service for it from the JSON file below:

{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
    {
      "kind": "Pod",
      "apiVersion": "v1",
      "metadata": {
        "name": "http-nodeport-hostnetwork-server",
        "labels": {
          "name": "http-nodeport-hostnetwork-server"
        }
      },
      "spec": {
        "hostNetwork": true,
        "containers": [
          {
            "image": "quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95",
            "name": "http-container",
            "imagePullPolicy": "IfNotPresent",
            "ports": [
              {
                "hostPort": 8080,
                "containerPort": 8080
              }
            ]
          }
        ]
      }
    },
    {
      "kind": "Service",
      "apiVersion": "v1",
      "metadata": {
        "name": "http-nodeport-hostnetwork-server",
        "labels": {
          "name": "http-nodeport-hostnetwork-server"
        }
      },
      "spec": {
        "ports": [
          {
            "name": "http",
            "protocol": "TCP",
            "port": 27017,
            "targetPort": 8080
          }
        ],
        "type": "NodePort",
        "selector": {
          "name": "http-nodeport-hostnetwork-server"
        }
      }
    }
  ]
}

2. Log in to one worker node and curl the hostNetwork service using node IP + node port:

sh-4.4# curl 10.0.74.72:30825
Hello OpenShift!

3. Check conntrack; the output includes an entry in zone 64000:
sh-4.4# conntrack -L | grep 30825
tcp      6 117 TIME_WAIT src=10.0.54.55 dst=10.0.74.72 sport=44922 dport=30825 src=10.0.74.72 dst=10.0.54.55 sport=30825 dport=44922 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 zone=64000 use=1
tcp      6 117 TIME_WAIT src=10.0.54.55 dst=10.0.74.72 sport=44922 dport=30825 src=10.0.74.72 dst=10.0.54.55 sport=30825 dport=44922 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
conntrack v1.4.4 (conntrack-tools): 1125 flow entries have been shown.

4. For contrast, ran the same test on version 4.9.0; the conntrack output is below and has no entry in zone 64000:
sh-4.4# conntrack -L | grep 31105
tcp      6 99 TIME_WAIT src=10.0.183.190 dst=10.0.158.206 sport=38344 dport=31105 src=10.0.158.206 dst=10.0.183.190 sport=31105 dport=38344 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
conntrack v1.4.4 (conntrack-tools): 1101 flow entries have been shown.
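
The manual comparison in steps 3 and 4 could be scripted roughly as follows. This is only a sketch, not part of the fix or of any test suite: the function names are made up for illustration, the sample text is copied from the outputs above, and it assumes that an entry printed without a zone= field belongs to the default zone 0.

```python
# Sample `conntrack -L | grep <nodeport>` outputs copied from steps 3 and 4.
OUTPUT_4_10 = """\
tcp      6 117 TIME_WAIT src=10.0.54.55 dst=10.0.74.72 sport=44922 dport=30825 src=10.0.74.72 dst=10.0.54.55 sport=30825 dport=44922 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 zone=64000 use=1
tcp      6 117 TIME_WAIT src=10.0.54.55 dst=10.0.74.72 sport=44922 dport=30825 src=10.0.74.72 dst=10.0.54.55 sport=30825 dport=44922 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
conntrack v1.4.4 (conntrack-tools): 1125 flow entries have been shown.
"""

OUTPUT_4_9 = """\
tcp      6 99 TIME_WAIT src=10.0.183.190 dst=10.0.158.206 sport=38344 dport=31105 src=10.0.158.206 dst=10.0.183.190 sport=31105 dport=38344 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
conntrack v1.4.4 (conntrack-tools): 1101 flow entries have been shown.
"""


def parse_conntrack_line(line):
    """Parse one TCP entry of `conntrack -L` into a dict (hypothetical helper)."""
    tokens = line.split()
    # Layout for TCP entries: proto, proto number, timeout, state, key=value...
    entry = {"proto": tokens[0], "state": tokens[3]}
    for tok in tokens[4:]:
        if "=" not in tok:
            continue  # flags such as [ASSURED]
        key, value = tok.split("=", 1)
        # src/dst/sport/dport appear twice (original, then reply direction);
        # keep the first occurrence, i.e. the original direction.
        entry.setdefault(key, value)
    # Assumption: no zone= field means the default zone 0.
    entry["zone"] = int(entry.get("zone", 0))
    return entry


def zones_for_port(output, dport):
    """Return the set of CT zones holding a TCP entry whose original dport matches."""
    zones = set()
    for line in output.splitlines():
        if not line.startswith("tcp"):
            continue  # skip the conntrack summary line
        entry = parse_conntrack_line(line)
        if entry.get("dport") == str(dport):
            zones.add(entry["zone"])
    return zones


print(64000 in zones_for_port(OUTPUT_4_10, 30825))  # True
print(64000 in zones_for_port(OUTPUT_4_9, 31105))   # False
```

On a live node the input would come from `conntrack -L` itself instead of the pasted samples.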

@adrianc, please confirm whether this is enough to verify this bug.

Comment 7 zhaozhanqi 2021-12-08 03:06:19 UTC
@trozet Could you advise whether the test in the comment above is enough to verify this bug? Thanks.

Comment 8 Adrian Chiris 2021-12-12 08:00:44 UTC
 > login one worker node and curl to the hostnetwork service with node ip + node port

I assume that's the worker's node IP.
This scenario seems OK; just need to ensure that, on the receiving end, the connection is established in zone 64000.

Comment 9 Ying Wang 2021-12-16 09:13:52 UTC
Thanks Adrian. I checked on the receiving end, and the connection is established in zone 64000.

sh-4.4# conntrack -L | grep 10.0.128.6 
tcp      6 290 ESTABLISHED src=10.0.128.5 dst=10.0.128.6 sport=10250 dport=56044 src=10.0.128.6 dst=10.0.128.5 sport=56044 dport=10250 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 zone=64000 use=1
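
A rough sketch of the same receiving-end check in script form, in case it is useful for repeating the verification. The function name is hypothetical and the sample line is the one pasted above. Unlike the plain grep, it compares whole src=/dst= tokens, so 10.0.128.6 does not also match an address such as 10.0.128.60. It assumes that an entry without a zone= field is in the default zone 0.

```python
import re

# Sample entry copied from the receiving-end output above.
SAMPLE = (
    "tcp      6 290 ESTABLISHED src=10.0.128.5 dst=10.0.128.6 sport=10250 "
    "dport=56044 src=10.0.128.6 dst=10.0.128.5 sport=56044 dport=10250 "
    "[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 zone=64000 use=1\n"
)

ZONE_RE = re.compile(r"\bzone=(\d+)\b")


def established_in_zone(output, zone, ip):
    """True if some TCP entry involving `ip` is ESTABLISHED in CT zone `zone`."""
    for line in output.splitlines():
        fields = line.split()
        # TCP entries carry the connection state in the fourth column.
        if len(fields) < 4 or fields[0] != "tcp" or fields[3] != "ESTABLISHED":
            continue
        match = ZONE_RE.search(line)
        # Assumption: no zone= field means the default zone 0.
        line_zone = int(match.group(1)) if match else 0
        # Exact token comparison avoids prefix matches on the address.
        involves_ip = f"src={ip}" in fields or f"dst={ip}" in fields
        if line_zone == zone and involves_ip:
            return True
    return False


print(established_in_zone(SAMPLE, 64000, "10.0.128.6"))  # True
```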

Comment 12 errata-xmlrpc 2022-03-10 16:04:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056