Bug 1959711 - Egressnetworkpolicy doesn't work when configure the EgressIP [NEEDINFO]
Summary: Egressnetworkpolicy doesn't work when configure the EgressIP
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Alexander Constantinescu
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks: 1971669
Reported: 2021-05-12 07:33 UTC by huirwang
Modified: 2021-08-16 22:19 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 23:08:01 UTC
Target Upstream Version:
aconstan: needinfo? (jtanenba)



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift sdn pull 299 | 0 | None | open | Bug 1959711: Reverse table order for egress IP and egress network policy set up | 2021-05-14 14:12:00 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 23:08:18 UTC

Description huirwang 2021-05-12 07:33:52 UTC
Description of problem:
The egressnetworkpolicy does not work when an EgressIP is configured.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-05-11-192605  

How reproducible:
Always

Steps to Reproduce:
1. Create a namespace named test and a pod in it.
2. Create an egressnetworkpolicy in project test.
3. Check that the egressnetworkpolicy takes effect and that outbound traffic is blocked.
oc get egressnetworkpolicy -n test -o yaml
apiVersion: v1
.........

  spec:
    egress:
    - to:
        cidrSelector: 0.0.0.0/0
      type: Deny
...........
oc rsh -n test hello-pod
/ # curl --connect-timeout 5 172.31.249.80:9095
curl: (28) Connection timed out after 5001 milliseconds

4. Patch an EgressIP onto one node.
5. Patch the same EgressIP onto the namespace test.
oc get hostsubnet
NAME              HOST              HOST IP          SUBNET          EGRESS CIDRS   EGRESS IPS
compute-0         compute-0         172.31.248.100   10.131.0.0/23                  ["172.31.249.12"]
compute-1         compute-1         172.31.248.96    10.128.2.0/23                  
compute-2         compute-2         172.31.248.94    10.129.2.0/23                  []
control-plane-0   control-plane-0   172.31.248.101   10.128.0.0/23                  []
control-plane-1   control-plane-1   172.31.248.102   10.130.0.0/23                  []
control-plane-2   control-plane-2   172.31.248.95    10.129.0.0/23                  []
oc get netnamespace test
NAME   NETID     EGRESS IPS
test   4651997   ["172.31.249.12"]
6. Check the outbound traffic from project test

Actual results:
Outbound traffic is allowed; the egressnetworkpolicy does not take effect.
oc rsh -n test hello-pod
/ # curl --connect-timeout 5 172.31.249.80:9095
172.31.249.12/
/ # curl -I www.google.com
HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Date: Wed, 12 May 2021 07:30:44 GMT
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
Expires: Wed, 12 May 2021 07:30:44 GMT
Cache-Control: private
Set-Cookie: 1P_JAR=2021-05-12-07; expires=Fri, 11-Jun-2021 07:30:44 GMT; path=/; domain=.google.com; Secure
Set-Cookie: NID=215=mWV_Per9VI4d1SH-QUE7InTiq11vrOneb8kqw3hHkeozRLHdvISpUplhLIscHcm3In2JX3ZAbGd7bvl0a0X_-RN1mFHU2Pntb7PsWmsivrQTJOZh8b0diRvKJJ9iQf6_S7HV2VjrFZM5aYWhlLy6wraVso6EV4eGGQ0LiB7LSfY; expires=Thu, 11-Nov-2021 07:30:44 GMT; path=/; domain=.google.com; HttpOnly

Expected results:
The egressnetworkpolicy works and outbound traffic is blocked.
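
The expected behaviour of the deny rule from step 3 can be sketched as a first-match evaluation over the policy's rules. This is a minimal illustration, not the SDN implementation: the function name `egress_verdict` is hypothetical, the rule layout is flattened compared to the real spec (where `cidrSelector` sits under `to`), and DNS-name rules are ignored.

```python
from ipaddress import ip_address, ip_network


def egress_verdict(rules, dst):
    """Return the type of the first rule whose CIDR contains dst.

    Assumption for this sketch: traffic that matches no rule is allowed.
    """
    for rule in rules:
        if ip_address(dst) in ip_network(rule["cidrSelector"]):
            return rule["type"]
    return "Allow"


# The single deny-all rule shown in the reproducer.
rules = [{"type": "Deny", "cidrSelector": "0.0.0.0/0"}]
print(egress_verdict(rules, "172.31.249.80"))
```

With the policy from this report, every destination matches `0.0.0.0/0`, so the curl to 172.31.249.80 should be denied whether or not an EgressIP is configured.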

Additional info:

Comment 1 Alexander Constantinescu 2021-05-12 12:52:34 UTC
@Jacob: I see you modified the action from `action: goto_table:101` (https://github.com/openshift/sdn/blob/release-4.7/pkg/network/node/ovscontroller.go#L792) to `action output:tun0` (https://github.com/openshift/sdn/blob/master/pkg/network/node/ovscontroller.go#L795).

This breaks egress IP with egress network policy as this bug shows. Could you please have a look at that?

Comment 2 Alexander Constantinescu 2021-05-12 14:09:11 UTC
Expanding on my previous comment: I suspect you did that because the `goto_table` action is not allowed for OVS groups (I tested locally and verified the behavior). Winship and I discussed this in the bug scrum, and one idea was to apply the egress network policy flows before the egress IP ones, instead of the reverse order used before.

I am marking this as a blocker: if we ship 4.8 with this bug, it will raise a CVE. If it can't be fixed by then, we might need to revert the feature.
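
The ordering idea discussed in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions: the function `forward`, its parameters, and the stage names are hypothetical and not taken from openshift/sdn. It assumes, as described in comment 1, that the egress IP stage outputs the packet directly instead of continuing to the policy table.

```python
def forward(policy_denies, has_egress_ip, order):
    """Toy model of two pipeline stages; returns "drop" or "output".

    order="egress_ip_first": the egress IP stage runs first. When it matches,
        it outputs the packet directly and the policy stage is never reached
        (the behaviour reported in this bug).
    order="policy_first": the policy stage runs first, so a deny verdict
        drops the packet before the egress IP stage can output it.
    """
    if order == "egress_ip_first":
        if has_egress_ip:
            return "output"  # policy stage is skipped
        return "drop" if policy_denies else "output"
    if order == "policy_first":
        if policy_denies:
            return "drop"
        return "output"  # via the egress IP stage if one is configured
    raise ValueError(f"unknown order: {order}")


# Namespace with a deny-all policy and an egress IP, as in the reproducer.
print(forward(policy_denies=True, has_egress_ip=True, order="egress_ip_first"))
print(forward(policy_denies=True, has_egress_ip=True, order="policy_first"))
```

In this model the first call yields "output" (the policy is bypassed) and the second yields "drop", which is the effect the reversed order is meant to achieve.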

Comment 8 errata-xmlrpc 2021-07-27 23:08:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

