Bug 1917240

Summary: [4.6] Network Policies are not working as expected with OVN-Kubernetes when traffic hairpins back to the same source through a service
Product: OpenShift Container Platform
Reporter: Andrew Stoycos <astoycos>
Component: Networking
Assignee: Andrew Stoycos <astoycos>
Networking sub component: ovn-kubernetes
QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: aconstan, rbost
Version: 4.6
Keywords: UpcomingSprint
Target Milestone: ---
Target Release: 4.6.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-02-08 13:51:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1903651
Bug Blocks:

Description Andrew Stoycos 2021-01-18 07:25:54 UTC
Description of problem:

  + Network policies do not work as expected after migrating from OpenShift-SDN to OVN-Kubernetes. This happens both on a cluster upgraded from OCP v4.5.16 to v4.6.1 and on a fresh installation of OCP v4.6.1. The failures appear when traffic hairpins back to the same source pod through a service.


Version-Release number of selected component (if applicable):

  + OCP v4.6
  + OVN-Kubernetes


How reproducible:

  + Frequent


Steps to Reproduce:

1. Create a Namespace:
~~~
kind: Namespace
apiVersion: v1
metadata:
  name: test-network-policy
~~~


2. Create Deployment:
~~~
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
  namespace: test-network-policy
spec:
  selector:
    matchLabels:
      app: webserver
  replicas: 3
  template:
    metadata:
      labels:
        app: webserver
    spec:
      containers:
        - name: webserver
          image: ds0034oj:5000/rhel7/support-tools:latest
          args:
            - python
            - -m
            - SimpleHTTPServer
            - "8080"
          ports:
            - containerPort: 8080
              protocol: TCP
              name: http
~~~


3. Create Service:
~~~
apiVersion: v1
kind: Service
metadata:
  name: webserver
  namespace: test-network-policy
spec:
  selector:
    app: webserver
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
~~~


4. Create Route:
~~~
kind: Route
apiVersion: route.openshift.io/v1
metadata:
  name: webserver
  namespace: test-network-policy
spec:
  host: webserver-test-network-policy.apps.npd.dc1.cloud.safran
  to:
    kind: Service
    name: webserver
    weight: 100
  port:
    targetPort: 8080
  tls:
    termination: edge
    insecureEdgeTerminationPolicy: Redirect
  wildcardPolicy: None
~~~


5. Test the service from the pods:
~~~
# for pod in $(oc get pods -n test-network-policy -o name); do oc -n test-network-policy exec $(echo $pod | awk -F "/" '{print $2}') -- curl -s -I -m 2 webserver:8080 | grep HTTP/1.0 ; done

HTTP/1.0 200 OK
HTTP/1.0 200 OK
HTTP/1.0 200 OK
~~~


6. Apply the network policies:
~~~
---
# DENY ALL
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: test-network-policy
spec:
  podSelector: null
---

# ALLOW FROM SAME NAMESPACE
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-same-namespace
  namespace: test-network-policy
spec:
  podSelector: null
  ingress:
    - from:
        - podSelector: {}
---

# ALLOW FROM INGRESS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress
  namespace: test-network-policy
spec:
  podSelector: null
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              network.openshift.io/policy-group: ingress
        - podSelector: {}
  policyTypes:
    - Ingress
~~~


7. Test the service from the pods again:
~~~
# for pod in $(oc get pods -n test-network-policy -o name); do oc -n test-network-policy exec $(echo $pod | awk -F "/" '{print $2}') -- curl -s -I -m 2 webserver:8080 | grep HTTP/1.0 ; done

Output not OK; random timeouts on the service (curl exit code 28 means the operation timed out):

HTTP/1.0 200 OK
command terminated with exit code 28  <----
HTTP/1.0 200 OK
~~~
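
Because the timeouts in step 7 are intermittent, a single pass over the pods can miss them. The sketch below repeats the request from one pod and counts failures. The function name `count_service_failures` is hypothetical and not part of the original report. If the failing requests are the ones that hairpin back to the calling pod, and assuming the service spreads requests evenly over the three replicas, roughly one request in three would be expected to fail; that ratio is an assumption, not a measured result.

```shell
# Hypothetical helper (not from the original report): send N requests to the
# service from a single pod and count how many fail.
count_service_failures() {
  ns=$1; pod=$2; total=$3
  failed=0
  i=0
  while [ "$i" -lt "$total" ]; do
    # oc exec returns curl's exit status; it is non-zero (28 on timeout)
    # when the request does not complete within 2 seconds.
    if ! oc -n "$ns" exec "$pod" -- curl -s -o /dev/null -m 2 webserver:8080; then
      failed=$((failed + 1))
    fi
    i=$((i + 1))
  done
  echo "$failed/$total requests failed from $pod"
}

# Usage sketch (requires a logged-in oc session):
#   pod=$(oc get pods -n test-network-policy -o name | head -n 1)
#   count_service_failures test-network-policy "${pod#pod/}" 30
```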


8. Test the route:
~~~
# for pod in $(oc get pods -n test-network-policy -o name); do oc -n test-network-policy exec $(echo $pod | awk -F "/" '{print $2}') -- curl -s -I -m 2  -k https://webserver-test-network-policy.apps.npd.dc1.cloud.safran | grep HTTP/1.0 ; done

Output not OK; route not available (503):

command terminated with exit code 28
command terminated with exit code 28
command terminated with exit code 28
~~~
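
The note in step 8 mentions a 503, while the captured output only shows curl timeouts (exit code 28), and `grep HTTP/1.0` hides any other status line. A minimal sketch for telling these cases apart follows. The helper name `classify_result` is hypothetical; the `-w '%{http_code}'` option is a standard curl feature.

```shell
# Hypothetical helper: turn curl's exit status and the HTTP status code
# into a short verdict.
classify_result() {
  rc=$1; code=$2
  if [ "$rc" -eq 28 ]; then
    echo "timeout"
  elif [ "$rc" -ne 0 ]; then
    echo "curl error $rc"
  elif [ "$code" = "503" ]; then
    echo "route unavailable (503)"
  else
    echo "HTTP $code"
  fi
}

# Usage sketch (requires a logged-in oc session and a pod name in $pod):
#   code=$(oc -n test-network-policy exec "$pod" -- curl -s -k -o /dev/null -m 2 \
#     -w '%{http_code}' https://webserver-test-network-policy.apps.npd.dc1.cloud.safran)
#   classify_result "$?" "$code"
```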



Actual results:

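  + After the NetworkPolicies are applied, requests to the service time out intermittently (curl exit code 28) and every request through the route times out, as shown in steps 7 and 8.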



Expected results:
 
  + The curl requests to both the service and the route should still succeed after the NetworkPolicies are applied.

Comment 5 errata-xmlrpc 2021-02-08 13:51:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.6.16 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0308