Bug 1927841

Summary: Network Policies are not working as expected with OVN-Kubernetes when using default policies
Product: OpenShift Container Platform
Reporter: Jon <jharding>
Component: Networking
Assignee: Aniket Bhat <anbhat>
Networking sub component: ovn-kubernetes
QA Contact: Arti Sood <asood>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
CC: aconstan, anbhat, aos-bugs, astoycos, atn, joboyer, jokerman, mateusz.bacal, mmckiern, mszczewski, obockows, openshift-bugs-escalate, sbelmasg, swasthan, zzhao
Version: 4.6.z
Keywords: Reopened
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-05-12 15:28:24 UTC
Type: Bug
Attachments:
configs and policy (flags: none)

Description Jon 2021-02-11 16:27:46 UTC
Created attachment 1756405 [details]
configs and policy

Description of problem:
After setting up default network policies by following the documentation at
https://docs.openshift.com/container-platform/4.6/networking/network_policy/multitenant-network-policy.html
and
https://docs.openshift.com/container-platform/4.6/post_installation_configuration/network-configuration.html
traffic from outside the cluster cannot reach the test cakephp-mysql-example application.
If I add a specific rule to allow all ingress to the app, it works.
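
For reference, a rule that allows all ingress to every pod in a namespace could look like the sketch below. This is an illustration only: the policy the reporter actually used is in the attachment and is not reproduced in this report, and the name allow-all-ingress is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress   # hypothetical name, not taken from the attachment
spec:
  podSelector: {}           # empty selector: applies to every pod in the namespace
  ingress:
  - {}                      # a single empty rule matches all sources and all ports
  policyTypes:
  - Ingress
```

A policy like this overrides the effect of deny-by-default for the selected pods, which is why it masks the problem rather than explaining it.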

Version-Release number of selected component (if applicable):
Started after 4.6.4; currently running 4.6.16 (* the update to 4.6.16 fixed same-namespace connections)

How reproducible:
100%

Steps to Reproduce:
1. Create a new project (* network policies are applied by the default template)
2. Install the test cakephp-example app
3. Verify the deployment: open a browser and use the route provided in the console

Actual results:
Cannot reach the application 

Expected results:
Access to the cakephp-example web page 

Additional info:
This was working in a project I had set up right after I built the cluster on 4.6.4, where everything worked as expected. I then upgraded to 4.6.12/13 and now to 4.6.16.

Comment 1 Andrew Stoycos 2021-02-18 17:44:37 UTC
Hi Jon, 

I spun up a 4.6.16 cluster and followed your instructions but was not able to reproduce the issue.

1. Make namespace test 

[astoycos@localhost demo]$ oc create ns test
namespace/test created

2. Apply the following network policies to ns test

[astoycos@localhost demo]$ cat test_ingress_bug.yaml 
---

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-by-default
spec:
  podSelector:
  ingress: []

--- 

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
  podSelector: {}
  policyTypes:
  - Ingress

---

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-monitoring
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: monitoring
  podSelector: {}
  policyTypes:
  - Ingress

---

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-same-namespace
spec:
  podSelector:
  ingress:
  - from:
    - podSelector: {}

---

[astoycos@localhost demo]$ oc apply -f test_ingress_bug.yaml -n test
networkpolicy.networking.k8s.io/deny-by-default created
networkpolicy.networking.k8s.io/allow-from-openshift-ingress created
networkpolicy.networking.k8s.io/allow-from-openshift-monitoring created
networkpolicy.networking.k8s.io/allow-same-namespace created


3. Go into the console and deploy the cakephp demo application

[astoycos@localhost demo]$ oc get pods -n test
NAME                                  READY   STATUS      RESTARTS   AGE
cakephp-mysql-persistent-1-build      0/1     Completed   0          96s
cakephp-mysql-persistent-1-deploy     1/1     Running     0          43s
cakephp-mysql-persistent-1-hook-pre   0/1     Completed   0          40s
cakephp-mysql-persistent-1-n4lvc      0/1     Running     0          33s
mysql-1-deploy                        0/1     Completed   0          96s
mysql-1-njdmc                         1/1     Running     0          93s

4. Go to the route created by default

[astoycos@localhost demo]$ oc get route
NAME                       HOST/PORT                                                                                    PATH   SERVICES                   PORT    TERMINATION   WILDCARD
cakephp-mysql-persistent   cakephp-mysql-persistent-test.apps.ci-ln-mg1xi3k-f76d1.origin-ci-int-gce.dev.openshift.com          cakephp-mysql-persistent   <all>                 None


I am able to access the demo application's start page in both the browser and the terminal

[astoycos@localhost demo]$ curl cakephp-mysql-persistent-test.apps.ci-ln-mg1xi3k-f76d1.origin-ci-int-gce.dev.openshift.com
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
  <title>Welcome to OpenShift</title>

......



MOVING FORWARD 

Let me know if there was anything I did differently from your setup process.

Please attach the output of the following commands, run on your broken cluster:

1. `oc adm must-gather` 
2. `oc adm must-gather -- gather_network_logs`

Thanks, 
Andrew

Comment 2 Jon 2021-02-18 18:29:02 UTC
Hello Andrew,
The only difference I can see is that my cluster was built on 4.6.4 and upgraded 4.6.12 -> 4.6.13 -> 4.6.15 -> 4.6.16. When I built the cluster on 4.6.4, I tested network policies and they worked. I did not test again until 4.6.15, where they did not work: I could not even connect within the same namespace. After upgrading to 4.6.16, I was able to connect within the same namespace. I have not rebuilt on 4.6.16.

I have run the requested commands and will attach the files to the case

Comment 3 Andrew Stoycos 2021-02-18 21:53:14 UTC
After doing some research: if an ingresscontroller specifies "spec.endpointPublishingStrategy.type: HostNetwork", the ingress router pods are host-networked. This strategy is used because bare metal does not support cloud load balancers, and using the host network means that an external load balancer can easily be configured to use ports 80/443/1936 on the nodes hosting router pods.
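
As a sketch, an ingresscontroller using this publishing strategy would carry a fragment like the one below. The name and namespace shown are the usual defaults and are assumptions about the reporter's cluster, not values taken from the must-gather.

```yaml
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default                          # assumed: the default ingresscontroller
  namespace: openshift-ingress-operator  # assumed: the usual operator namespace
spec:
  endpointPublishingStrategy:
    type: HostNetwork                    # router pods run on the node's host network
```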

There is no supported way to accept traffic from host-networked routers via network policy in OVN-Kubernetes in either 4.6 or 4.7. The workaround of adding a label to the default namespace (https://docs.openshift.com/container-platform/4.6/post_installation_configuration/network-configuration.html) is an OpenShift-SDN-specific oddity and needs to be documented as such. Therefore I am reassigning this bug to the Documentation team to fix.
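
To illustrate the workaround: it amounts to the default namespace carrying the label that the allow-from-openshift-ingress policy selects on. Written declaratively, that is roughly the fragment below. This is a sketch only; the linked documentation describes adding the label to the existing namespace, and per the above it only has an effect with OpenShift-SDN.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: default
  labels:
    # Matched by the namespaceSelector in allow-from-openshift-ingress.
    # With OpenShift-SDN this lets host-networked router traffic through;
    # with OVN-Kubernetes it does not.
    network.openshift.io/policy-group: ingress
```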

Note: There is an upstream enhancement targeting 4.8 -> https://github.com/openshift/enhancements/blob/master/enhancements/network/allow-from-router-networkpolicy.md  

Thanks, 
Andrew

Comment 4 Mike McKiernan 2021-02-19 21:57:07 UTC
Doc PR, PTAL: https://github.com/openshift/openshift-docs/pull/29633

Comment 21 Andrew Stoycos 2021-03-19 14:47:57 UTC
*** Bug 1937008 has been marked as a duplicate of this bug. ***

Comment 22 Andrew Stoycos 2021-04-01 19:45:41 UTC
Closing since this was fixed by https://github.com/openshift/openshift-docs/pull/29633

Comment 24 msi_bacalm 2021-04-02 08:06:45 UTC
This bug has been closed based on an update to the docs, but the original issue tracked here should only be closed once the RFE mentioned a few comments back is implemented:
>> Note: There is an upstream enhancement targeting 4.8 -> https://github.com/openshift/enhancements/blob/master/enhancements/network/allow-from-router-networkpolicy.md

Comment 27 msi_bacalm 2021-05-05 07:48:34 UTC
An additional comment from me:
The RFE refers to router pods running on the host network, while the issue affects ALL types of pods running on the host network.

Simple example:
1. Project A has pod-1 (host network) and pod-2 (pod network)
2. Project B has pod-1 (pod network), which exposes a service on port 443
3. Now I specify the following network policy in project A
   'Allow all incoming traffic from project B'
4. Now when I use curl from pod-1.project-B (host network) to access the service at pod-1.project-A:443 (pod network), it will NOT work
5. Now when I use curl from pod-2.project-B (pod network) to access the service at pod-1.project-A:443, it will work

What I am trying to say:
The RFE refers only to routers, while the issue is wider (OVN). So when will this be fixed overall, not only for router pods?
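
For illustration, the policy in step 3 could be written along the lines of the sketch below. The policy name, the namespace name, and the namespace label are all hypothetical; the comment above does not give the actual manifest.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-project-b   # hypothetical name
  namespace: project-a         # hypothetical namespace for "project A"
spec:
  podSelector: {}              # applies to every pod in project A
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: project-b      # hypothetical label, assumed to be set on project B's namespace
  policyTypes:
  - Ingress
```

As described in steps 4 and 5, a selector like this matches traffic from pods on the pod network, but traffic from a host-networked pod in the same project is not matched.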

Comment 28 Aniket Bhat 2021-05-12 00:38:09 UTC
@mateusz.bacal you can use the solution implemented by the enhancement tracked here: https://github.com/openshift/enhancements/blob/master/enhancements/network/allow-from-router-networkpolicy.md to also allow traffic from host network.

You will have to create a policy that selects pods using the label selector policy-group.network.openshift.io/host-network: "". Note that this feature will get released in OCP 4.8 first and then find its way through our backport process to 4.6.z.
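
A minimal sketch of such a policy follows. It assumes, as in the linked enhancement, that the label is matched through a namespaceSelector; the policy name is hypothetical, and the label only has an effect on releases that include the feature.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-hostnetwork   # hypothetical name
spec:
  podSelector: {}                # applies to every pod in the namespace
  ingress:
  - from:
    - namespaceSelector:         # assumption: matched per namespace, as in the enhancement
        matchLabels:
          policy-group.network.openshift.io/host-network: ""
  policyTypes:
  - Ingress
```

It would be applied alongside the existing default policies, for example with oc apply -f <file> -n <namespace>.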

Comment 29 Aniket Bhat 2021-05-12 15:28:24 UTC
This enhancement is implemented now in OCP 4.8 and will be available when the 4.8 release happens. For backport to 4.7.z and 4.6.z please track: https://bugzilla.redhat.com/show_bug.cgi?id=1942603 and https://bugzilla.redhat.com/show_bug.cgi?id=1942604 respectively.

Closing this bug.