Bug 1959711

Summary: Egressnetworkpolicy doesn't work when configure the EgressIP
Product: OpenShift Container Platform Reporter: huirwang
Component: Networking Assignee: Alexander Constantinescu <aconstan>
Networking sub component: openshift-sdn QA Contact: huirwang
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aconstan, huirwang, jtanenba
Version: 4.8   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 23:08:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1971669    

Description huirwang 2021-05-12 07:33:52 UTC
Description of problem:
The egressnetworkpolicy doesn't work when an EgressIP is configured.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-05-11-192605  

How reproducible:
Always

Steps to Reproduce:
1. Create a namespace named test and a pod in it.
2. Create an egressnetworkpolicy in project test.
3. Check that the egressnetworkpolicy takes effect and outbound traffic is blocked.
oc get egressnetworkpolicy -n test -o yaml
apiVersion: v1
.........

  spec:
    egress:
    - to:
        cidrSelector: 0.0.0.0/0
      type: Deny
...........
oc rsh -n test hello-pod
/ # curl --connect-timeout 5 172.31.249.80:9095
curl: (28) Connection timed out after 5001 milliseconds

4. Patch the EgressIP onto one node.
5. Patch the EgressIP onto the namespace test.
oc get hostsubnet
NAME              HOST              HOST IP          SUBNET          EGRESS CIDRS   EGRESS IPS
compute-0         compute-0         172.31.248.100   10.131.0.0/23                  ["172.31.249.12"]
compute-1         compute-1         172.31.248.96    10.128.2.0/23                  
compute-2         compute-2         172.31.248.94    10.129.2.0/23                  []
control-plane-0   control-plane-0   172.31.248.101   10.128.0.0/23                  []
control-plane-1   control-plane-1   172.31.248.102   10.130.0.0/23                  []
control-plane-2   control-plane-2   172.31.248.95    10.129.0.0/23                  []
oc get netnamespace test
NAME   NETID     EGRESS IPS
test   4651997   ["172.31.249.12"]
6. Check the outbound traffic from project test
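
Steps 4 to 6 can be scripted. The sketch below is one possible way to do that, not the procedure the reporter ran. It assumes the egress IP was assigned with JSON merge patches on the hostsubnet and netnamespace objects (the report only says "patch"), reuses the node, namespace, pod and addresses from the output above, and assumes that oc rsh passes through the exit status of the remote curl, so that curl's timeout exit code 28 can be read as "blocked". The helper names (repro_commands, classify, run) are hypothetical. By default it only prints the commands.

```python
import json
import shlex
import subprocess

EGRESS_IP = "172.31.249.12"    # from the hostsubnet output above
NODE = "compute-0"             # node that hosts the egress IP
NAMESPACE = "test"
POD = "hello-pod"
TARGET = "172.31.249.80:9095"  # external endpoint used in step 3

CURL_TIMEOUT_EXIT = 28         # curl's exit code for "operation timed out"


def repro_commands():
    """Build the oc invocations for steps 4-6 as argument lists."""
    # Assumption: a merge patch on egressIPs; the report does not show the patch.
    patch = json.dumps({"egressIPs": [EGRESS_IP]})
    return [
        ["oc", "patch", "hostsubnet", NODE, "--type=merge", "-p", patch],
        ["oc", "patch", "netnamespace", NAMESPACE, "--type=merge", "-p", patch],
        ["oc", "rsh", "-n", NAMESPACE, POD,
         "curl", "--connect-timeout", "5", TARGET],
    ]


def classify(curl_exit_code):
    """Map the exit code of the curl probe to the outcome the report cares about."""
    if curl_exit_code == CURL_TIMEOUT_EXIT:
        return "blocked"       # expected: the Deny rule still applies
    if curl_exit_code == 0:
        return "allowed"       # the behaviour reported in this bug
    return "inconclusive"      # any other curl or oc failure


def run(dry_run=True):
    """Print the commands; execute them only when dry_run is False."""
    *setup, probe = repro_commands()
    for cmd in setup + [probe]:
        print(shlex.join(cmd))
    if dry_run:
        return None
    for cmd in setup:
        subprocess.run(cmd, check=True)  # stop if a patch fails
    return classify(subprocess.run(probe).returncode)


if __name__ == "__main__":
    run(dry_run=True)
```

Calling run(dry_run=False) against a test cluster would apply both patches and return "allowed" when the bug reproduces.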

Actual results:
Outbound traffic is allowed; the egressnetworkpolicy does not take effect.
oc rsh -n test hello-pod
/ # curl --connect-timeout 5 172.31.249.80:9095
172.31.249.12/
/ # curl -I www.google.com
HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Date: Wed, 12 May 2021 07:30:44 GMT
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
Expires: Wed, 12 May 2021 07:30:44 GMT
Cache-Control: private
Set-Cookie: 1P_JAR=2021-05-12-07; expires=Fri, 11-Jun-2021 07:30:44 GMT; path=/; domain=.google.com; Secure
Set-Cookie: NID=215=mWV_Per9VI4d1SH-QUE7InTiq11vrOneb8kqw3hHkeozRLHdvISpUplhLIscHcm3In2JX3ZAbGd7bvl0a0X_-RN1mFHU2Pntb7PsWmsivrQTJOZh8b0diRvKJJ9iQf6_S7HV2VjrFZM5aYWhlLy6wraVso6EV4eGGQ0LiB7LSfY; expires=Thu, 11-Nov-2021 07:30:44 GMT; path=/; domain=.google.com; HttpOnly

Expected results:
The egressnetworkpolicy works and outbound traffic is blocked.

Additional info:

Comment 1 Alexander Constantinescu 2021-05-12 12:52:34 UTC
@Jacob: I see you modified the action from `action: goto_table:101` (https://github.com/openshift/sdn/blob/release-4.7/pkg/network/node/ovscontroller.go#L792) to `action output:tun0` (https://github.com/openshift/sdn/blob/master/pkg/network/node/ovscontroller.go#L795).

This breaks egress IP with egress network policy as this bug shows. Could you please have a look at that?

Comment 2 Alexander Constantinescu 2021-05-12 14:09:11 UTC
Expanding on my previous comment: I suspect you did that because the `goto_table` action is not allowed for OVS groups (I tested locally and verified the behavior). Winship and I discussed this in the bug scrum, and one idea was to apply the egress networkpolicy flows before the egress IP ones, instead of the reverse order used until now.

I am marking this as a blocker, since shipping 4.8 with this bug will raise a CVE. If it can't be fixed by then, we might need to revert the feature.
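
The interaction described in the two comments above can be pictured with a toy model. This is a sketch under assumptions, not the openshift-sdn implementation: it assumes table 101 is where the egress networkpolicy rules are evaluated (implied by the `goto_table:101` reference), models a flow table as a plain function, and uses made-up names such as egress_policy_table. It only illustrates why sending a packet straight to `output:tun0` skips the Deny rule, and why evaluating the policy first would restore it.

```python
import ipaddress

# Deny rule from the egressnetworkpolicy shown in the description.
DENY_CIDRS = [ipaddress.ip_network("0.0.0.0/0")]


def egress_policy_table(dst):
    """Stand-in for table 101 (assumed): drop traffic matching a Deny CIDR."""
    addr = ipaddress.ip_address(dst)
    if any(addr in net for net in DENY_CIDRS):
        return "drop"
    return "output:tun0"


def pipeline_4_7(dst):
    """Egress IP flow ends with goto_table:101, so the policy is still consulted."""
    return egress_policy_table(dst)


def pipeline_regressed(dst):
    """Egress IP flow ends with output:tun0, so table 101 is never reached."""
    return "output:tun0"


def pipeline_policy_first(dst):
    """Idea from the bug scrum: evaluate the policy before the egress IP flows."""
    if egress_policy_table(dst) == "drop":
        return "drop"
    return "output:tun0"  # only traffic the policy allows reaches the egress IP action


if __name__ == "__main__":
    dst = "172.31.249.80"  # destination used in the reproduction steps
    for name, fn in [("4.7", pipeline_4_7),
                     ("regressed", pipeline_regressed),
                     ("policy first", pipeline_policy_first)]:
        print(f"{name:>12}: {fn(dst)}")
```

In this model only the regressed pipeline lets the packet out, which matches the "Actual results" in the description.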

Comment 8 errata-xmlrpc 2021-07-27 23:08:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

Comment 9 Red Hat Bugzilla 2023-09-15 01:06:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days