Bug 1939045

Summary: [OCPv4.6] pod to pod communication broken on PFCP protocol over UDP
Product: OpenShift Container Platform
Component: Networking
Networking sub component: ovn-kubernetes
Version: 4.6
Target Release: 4.9.0
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: urgent
Status: CLOSED ERRATA
Reporter: Angelo Gabrieli <agabriel>
Assignee: Tim Rozet <trozet>
QA Contact: zhaozhanqi <zzhao>
CC: alosadag, anbhat, bbennett, dcbw, djuran, fbaudin, fpan, fpaoline, fsoppels, hchatter, mavazque, mschwabe, openshift-bugs-escalate, pabeni, pibanezr, rkhan, trozet, zzhao
Type: Bug
Clones: 2024910, 2024911, 2024914
Bug Depends On: 1939676
Bug Blocks: 2024910
Last Closed: 2021-10-18 17:29:21 UTC

Description Angelo Gabrieli 2021-03-15 13:55:31 UTC
Description of problem:

Pod-to-pod network communication between two pods on the same OpenShift worker node using the PFCP protocol over UDP stops working after a while.
More details are in the private comments.


Version-Release number of selected component (if applicable):
OCPv4.6.17 with OVNKubernetes



Actual results:
The network flow works at first, then stops after a while.

Expected results:
The network flow should keep working.



Comment 3 David Juran 2021-03-15 15:07:40 UTC
A further, possibly significant, observation:

We saw that if we send something from pfcp-endpoint -> data-plane continuously (1 message/second), the connection keeps working.
So the problem seems to occur when this "path" is not in use for a while.
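
As an illustration of that workaround (not part of the original report): a minimal keepalive loop run from the pfcp-endpoint pod could look like the line below, where 10.129.2.8 stands in for the data-plane pod IP and 8805 is the standard PFCP UDP port; both values are placeholders.

/ # while true; do echo keepalive | nc -u -w1 10.129.2.8 8805; sleep 1; done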

Comment 9 Tim Rozet 2021-03-16 19:47:19 UTC
After looking through this more, I believe the issue is caused by IP/port collisions between the pods and the service. We can see in OVS that there is a failure to commit the entry into conntrack, presumably because a conflicting entry already exists:

2021-03-12T08:12:40.670Z|00004|dpif(handler10)|WARN|system@ovs-system: execute ct(commit,zone=84,label=0/0x1),ct(zone=85),recirc(0x19590) failed (Invalid argument) on packet udp,vlan_tci=0x0000,dl_src=0a:58:0a:81:02:07,dl_dst=0a:58:0a:81:02:08,nw_src=10.129.2.7,nw_dst=10.129.2.8,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=5054,tp_dst=5088 udp_csum:14c4
 with metadata skb_priority(0),skb_mark(0),ct_state(0x21),ct_zone(0x54),ct_tuple4(src=10.129.2.7,dst=10.129.2.8,proto=17,tp_src=5054,tp_dst=5088),in_port(16) mtu 0

conflicting zone 84 entry:
udp,orig=(src=10.129.2.7,dst=172.30.9.90,sport=5054,dport=5088),reply=(src=10.129.2.8,dst=10.129.2.7,sport=5088,dport=5054),zone=84,labels=0x2

I've filed an OVN bug to handle this case:
https://bugzilla.redhat.com/show_bug.cgi?id=1939676

With this potential fix we would SNAT(0.0.0.0) in conntrack, which would change the source port of the traffic only if there is a collision. The caveat is that the packet may then arrive at the server with a different source port, which may or may not be desirable. To fully avoid this type of scenario, the service and/or application configuration should be changed so that such port collisions cannot occur.
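
To make the collision concrete (an illustration built from the log above, not from the report): from the client pod 10.129.2.7, one flow goes through the service VIP 172.30.9.90 and is DNATed to the pod 10.129.2.8, while a second flow goes to that pod directly using the same fixed source port. After DNAT both flows carry the 5-tuple 10.129.2.7:5054 -> 10.129.2.8:5088, so the second ct(commit) in zone 84 fails with "Invalid argument":

/ # echo hello | nc -u -w1 -p 5054 172.30.9.90 5088
/ # echo hello | nc -u -w1 -p 5054 10.129.2.8 5088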

Comment 15 Tim Rozet 2021-08-17 20:18:41 UTC
4.9 contains OVN 21.09 with the relevant fix, as well as openvswitch2.15-2.15.0-28.el8fdp.x86_64.

Comment 17 zhaozhanqi 2021-08-19 06:58:15 UTC
Verified this bug on 4.9.0-0.nightly-2021-08-18-144658

with versions:

openvswitch2.15-2.15.0-28.el8fdp.x86_64
ovn21.09-21.09.0-13.el8fdp.x86_64

steps:

1. Create a new project, z1.
2. Create a test pod and service with the following JSON file (creation command shown after the file):

{
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        {
            "apiVersion": "v1",
            "kind": "ReplicationController",
            "metadata": {
                "labels": {
                    "name": "test-rc"
                },
                "name": "test-rc"
            },
            "spec": {
                "replicas": 1,
                "template": {
                    "metadata": {
                        "labels": {
                            "name": "test-pods"
                        }
                    },
                    "spec": {
                        "containers": [
                            {
                                "image": "quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95",
                                "name": "test-pod",
                                "imagePullPolicy": "IfNotPresent"
                            }
                        ]
                    }
                }
            }
        },
        {
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {
                "labels": {
                    "name": "test-service"
                },  
                "name": "test-service"
            },
            "spec": {
                "ports": [
                    {
                        "name": "http",
                        "port": 27017,
                        "protocol": "TCP",
                        "targetPort": 8080
                    }
                ],  
                "selector": {
                    "name": "test-pods"
                }   
            }
        }
    ]
}
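
Assuming the list above is saved as test-rc-svc.json (the filename is arbitrary), it can be created in the project with:

$ oc create -f test-rc-svc.json -n z1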

3. Create another client pod, which needs to be scheduled on the same node as the pod above (creation command shown after the spec):

{
  "kind": "Pod",
  "apiVersion":"v1",
  "metadata": {
        "generateName": "hello-pod2",
        "labels": {
                "name": "hello-pod2"
        }
  },
  "spec": {
      "containers": [{
        "name": "hello-pod",
        "image": "quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95"
      }],
  "nodeName" : "ip-10-0-158-114.us-east-2.compute.internal"
  }
}
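
Assuming this spec is saved as hello-pod2.json (again an arbitrary filename; adjust nodeName to a worker in your cluster):

$ oc create -f hello-pod2.json -n z1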

4. Check that the two pods are running on the same worker:

$ oc get pod -n z1 -o wide
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
hello-pod22ww4h   1/1     Running   0          34m   10.131.0.15   ip-10-0-158-114.us-east-2.compute.internal   <none>           <none>
test-rc-bcwzm     1/1     Running   0          39m   10.131.0.14   ip-10-0-158-114.us-east-2.compute.internal   <none>           <none>


$ oc get svc -n z1
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
test-service   ClusterIP   172.30.226.85   <none>        27017/TCP   41m

5. Open a connection from the client pod to the service address, using a fixed source port:

$ oc rsh -n z1 hello-pod22ww4h

/ # nc 172.30.226.85 27017 -p 5555

6. Open another terminal and make a second connection, from the same client pod and the same source port, directly to the test pod (the collision scenario from comment 9):

$ oc rsh -n z1 hello-pod22ww4h

/ # nc 10.131.0.14 8080 -p 5555


7. Wait for a while; the connection from step 6 should not stall.

8. oc rsh to this worker and check conntrack:

sh-4.4# conntrack -L | grep 10.131.0.15
tcp      6 431829 ESTABLISHED src=10.131.0.15 dst=10.131.0.14 sport=50151 dport=8080 src=10.131.0.14 dst=10.131.0.15 sport=8080 dport=50151 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=30 use=1
tcp      6 431821 ESTABLISHED src=10.131.0.15 dst=172.30.226.85 sport=5555 dport=27017 src=10.131.0.14 dst=10.131.0.15 sport=8080 dport=5555 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=31 use=1
tcp      6 431829 ESTABLISHED src=10.131.0.15 dst=10.131.0.14 sport=5555 dport=8080 src=10.131.0.14 dst=10.131.0.15 sport=8080 dport=50151 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=31 use=1
tcp      6 431821 ESTABLISHED src=10.131.0.15 dst=10.131.0.14 sport=5555 dport=8080 src=10.131.0.14 dst=10.131.0.15 sport=8080 dport=5555 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=30 use=1
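
Note that one of the two colliding connections had its source port rewritten by conntrack (5555 -> 50151), so both entries can coexist in zone 30; this is the SNAT collision-avoidance behavior described in comment 9, applied instead of a failed commit.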


9. Check the logs in ovs-vswitchd.log; the "Invalid argument" commit failures should no longer appear:

sh-4.4# tail /var/log/openvswitch/ovs-vswitchd.log | grep "Invalid argument"    # should show nothing

Comment 20 errata-xmlrpc 2021-10-18 17:29:21 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759