Description of problem:
While trying to use an egress firewall to limit access to external resources, following the official documentation [1], which specifies the EgressNetworkPolicy configuration in JSON format, the policy does not work as expected. If the JSON file contains more than two egress entries, only the last entry is taken into consideration. With only two entries it works as expected. If the same EgressNetworkPolicy configuration is applied from a YAML file, it works as expected with any number of entries.

[1] https://docs.openshift.com/container-platform/3.11/admin_guide/managing_networking.html#admin-guide-limit-pod-access-egress

Version-Release number of selected component (if applicable):
[root@master-0 ~]# oc version
oc v3.11.219
kubernetes v1.11.0+d4cacc0

How reproducible:
100% reproducible.

Steps to Reproduce:
Step 1 - Verify that you are using either ovs-multitenant or ovs-networkpolicy:
# oc get clusterNetwork
You must have the ovs-multitenant or ovs-networkpolicy plug-in enabled in order to limit pod access via egress policy.
Step 2 - Create an egress JSON file, say `egress.json`:
~~~
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "default"
  },
  "spec": {
    "egress": [
      {
        "type": "Deny",
        "to": {
          "dnsName": "registry.access.redhat.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "dnsName": "google.com"
        }
      },
      {
        "type": "Allow",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
~~~
Step 3 - Switch to the required project, for example:
# oc project openshift-web-console
Step 4 - Apply the EgressNetworkPolicy:
# oc apply -f egress.json
Step 5 - Verify that the EgressNetworkPolicy was created:
# oc get egressnetworkpolicy
Step 6 - Delete the pod and let it be recreated:
# oc delete pod webconsole-xxxxx-yyyy
Step 7 - RSH into the new pod:
# oc rsh webconsole-aaaaa-bbbbbb
Step 8 - Try to curl `registry.access.redhat.com`:
sh-4.2$ curl -kv registry.access.redhat.com

Actual results:
The curl succeeds even though access to `registry.access.redhat.com` was denied. In this case, the entry that takes effect is the last `Allow` entry for all CIDRs. However, with only two entries (the JSON file here has three) it works as expected. If the configuration file is in YAML format, it also works as expected with any number of entries. For example:
~~~
---
kind: EgressNetworkPolicy
apiVersion: v1
metadata:
  name: default
spec:
  egress:
  - type: Deny
    to:
      dnsName: registry.access.redhat.com
  - type: Deny
    to:
      dnsName: google.com
  - type: Allow
    to:
      cidrSelector: 0.0.0.0/0
~~~
Applying this EgressNetworkPolicy works as expected.

Expected results:
The EgressNetworkPolicy should work as expected when using the JSON format with more than two entries.

Additional info:
As per the official documentation [2], the example JSON file would not give the expected results.

[2] https://docs.openshift.com/container-platform/3.11/admin_guide/managing_networking.html#admin-guide-limit-pod-access-egress
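For reference, the documentation cited in the report says that egress rules are checked in order and the first matching rule takes effect, so both Deny entries should win over the trailing Allow. A minimal Python sketch of that ordering follows. It assumes the dnsName entries have already been resolved to IP addresses; the addresses are placeholders from the 203.0.113.0/24 documentation range, not real lookups, and `evaluate` is a hypothetical helper, not part of any OpenShift API.

```python
import ipaddress

# Policy from the report, with each dnsName replaced by a placeholder IP.
# These addresses are illustrative assumptions, not real DNS answers.
RULES = [
    ("Deny", ["203.0.113.10/32"]),   # registry.access.redhat.com (placeholder)
    ("Deny", ["203.0.113.20/32"]),   # google.com (placeholder)
    ("Allow", ["0.0.0.0/0"]),        # cidrSelector from the report
]


def evaluate(dest_ip, rules=RULES):
    """Return the type of the first rule whose network contains dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    for rule_type, cidrs in rules:
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return rule_type
    # Assumption: traffic that matches no rule is allowed.
    return "Allow"


print(evaluate("203.0.113.10"))   # address covered by the first Deny rule
print(evaluate("198.51.100.7"))   # address covered only by the final Allow
```

Under this model the number of entries and the file format play no role; only rule order and the resolved addresses do.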
Assigning to the main dev branch to see whether this is still an issue. Once we know that, we can work out whether a fix is needed or, if it was already fixed, whether the fix can be safely backported.
Does not appear in version 4.5.

[jtanenba@dell-pe-fm120-1b 4.5]$ oc version
Client Version: 4.5.0-0.ci-2020-05-29-203954
Server Version: 4.5.0-0.ci-2020-06-18-032733
Kubernetes Version: v1.18.3
[jtanenba@dell-pe-fm120-1b 4.5]$ cat egressFirewall.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "default"
  },
  "spec": {
    "egress": [
      {
        "type": "Deny",
        "to": {
          "dnsName": "registry.access.redhat.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "dnsName": "google.com"
        }
      },
      {
        "type": "Allow",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
[jtanenba@dell-pe-fm120-1b 4.5]$ oc create -f egressFirewall.json
egressnetworkpolicy.network.openshift.io/default created
[jtanenba@dell-pe-fm120-1b 4.5]$ oc get pods
NAME           READY   STATUS    RESTARTS   AGE
udp-rc-vqbsr   1/1     Running   0          7m42s
[jtanenba@dell-pe-fm120-1b 4.5]$ oc rsh udp-rc-vqbsr
~ $ curl -kv registry.access.redhat.com
* Rebuilt URL to: registry.access.redhat.com/
* Trying 104.112.183.144...
* TCP_NODELAY set
^C
~ $ ping registry.access.redhat.com
PING e14353.d.akamaiedge.net (104.112.183.144) 56(84) bytes of data.
^C
--- e14353.d.akamaiedge.net ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5134ms
~ $
Tested on the latest 3.11 using docker-in-docker.

[jtanenba@dell-pe-fm120-1b origin]$ cat egressFirewall.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "default"
  },
  "spec": {
    "egress": [
      {
        "type": "Deny",
        "to": {
          "dnsName": "registry.access.redhat.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "dnsName": "google.com"
        }
      },
      {
        "type": "Allow",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
[jtanenba@dell-pe-fm120-1b origin]$ oc create -f egressFirewall.json
egressnetworkpolicy.network.openshift.io/default created
[jtanenba@dell-pe-fm120-1b origin]$ oc rsh udp-rc-tk4qf
/ $ curl -kv registry.access.redhat.com
* Rebuilt URL to: registry.access.redhat.com/
* Trying 104.122.188.239...
* TCP_NODELAY set
PING e14353.d.akamaiedge.net (104.122.188.239) 56(84) bytes of data.
^C
--- e14353.d.akamaiedge.net ping statistics ---
11 packets transmitted, 0 received, 100% packet loss, time 10266ms

I cannot reproduce this. Can the customer try upgrading to the latest 3.11?
Hi Jacob, thanks for your time on this.

Yes, it is working as expected with `registry.access.redhat.com`. I guess the title has to be changed slightly. Actually it is not denying access to `registry-1.docker.io` despite giving the deny entry.

[quicklab@master-0 ~]$ cat egressFirewall.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "default"
  },
  "spec": {
    "egress": [
      {
        "type": "Deny",
        "to": {
          "dnsName": "registry.access.redhat.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "dnsName": "registry.access.redhat.com.edgekey.net"
        }
      },
      {
        "type": "Deny",
        "to": {
          "dnsName": "registry-1.docker.io"
        }
      },
      {
        "type": "Allow",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
[quicklab@master-0 ~]$ oc create -f egressFirewall.json
egressnetworkpolicy.network.openshift.io/default created
[quicklab@master-0 ~]$ oc rsh webconsole-85494cdb8c-bhdnp
sh-4.2$ curl -kv registry-1.docker.io
* About to connect() to registry-1.docker.io port 80 (#0)
* Trying 23.22.155.84...
* Connected to registry-1.docker.io (23.22.155.84) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: registry-1.docker.io
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< content-length: 0
< location: https://registry-1.docker.io/
<
* Connection #0 to host registry-1.docker.io left intact
sh-4.2$ curl -kv https://registry-1.docker.io
* About to connect() to registry-1.docker.io port 443 (#0)
* Trying 3.223.220.229...
* Connection timed out
* Trying 52.1.121.53...
* Connected to registry-1.docker.io (52.1.121.53) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=*.docker.io
* start date: May 23 00:00:00 2020 GMT
* expire date: Jun 23 12:00:00 2021 GMT
* common name: *.docker.io
* issuer: CN=Amazon,OU=Server CA 1B,O=Amazon,C=US
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: registry-1.docker.io
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Date: Fri, 19 Jun 2020 05:48:11 GMT
< Content-Length: 0
< Strict-Transport-Security: max-age=31536000
<
* Connection #0 to host registry-1.docker.io left intact
sh-4.2$ curl -v https://registry-1.docker.io
* About to connect() to registry-1.docker.io port 443 (#0)
* Trying 52.4.20.24...
* Connection timed out
* Trying 52.5.11.128...
* Connected to registry-1.docker.io (52.5.11.128) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=*.docker.io
* start date: May 23 00:00:00 2020 GMT
* expire date: Jun 23 12:00:00 2021 GMT
* common name: *.docker.io
* issuer: CN=Amazon,OU=Server CA 1B,O=Amazon,C=US
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: registry-1.docker.io
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Date: Fri, 19 Jun 2020 05:48:07 GMT
< Content-Length: 0
< Strict-Transport-Security: max-age=31536000
<
* Connection #0 to host registry-1.docker.io left intact
sh-4.2$ curl -v registry-1.docker.io
* About to connect() to registry-1.docker.io port 80 (#0)
* Trying 35.174.73.84...
* Connection timed out
* Trying 34.195.246.183...
* Connected to registry-1.docker.io (34.195.246.183) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: registry-1.docker.io
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< content-length: 0
< location: https://registry-1.docker.io/
<
* Connection #0 to host registry-1.docker.io left intact

However, it is denying `registry.access.redhat.com` and `registry.access.redhat.com.edgekey.net`.

Do you think it would be better to open a new bug? Let me know if something needs to be addressed from my end.

-/Rejeeb
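To make the pattern in the curl output of this comment easier to see, here is a small Python tally of the connection attempts. The attempt list is transcribed by hand from that output, and grouping by outcome is only a reading aid. The idea that the timed-out addresses are the ones the policy had resolved at that moment is an assumption, not something the log proves.

```python
# (ip, outcome) pairs transcribed from the four curl runs in the comment.
ATTEMPTS = [
    ("23.22.155.84", "connected"),
    ("3.223.220.229", "timed out"),
    ("52.1.121.53", "connected"),
    ("52.4.20.24", "timed out"),
    ("52.5.11.128", "connected"),
    ("35.174.73.84", "timed out"),
    ("34.195.246.183", "connected"),
]


def tally(attempts):
    """Group the attempted IP addresses by their outcome."""
    by_outcome = {}
    for ip, outcome in attempts:
        by_outcome.setdefault(outcome, []).append(ip)
    return by_outcome


result = tally(ATTEMPTS)
for outcome, ips in result.items():
    print(f"{outcome}: {len(ips)} of {len(ATTEMPTS)} -> {', '.join(ips)}")
```

Seven distinct addresses appear across the four runs; three time out and four connect. That is what one would expect if only some of the addresses behind `registry-1.docker.io` are covered by the Deny rule at a given time.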
I still see it working for me using the 3.11 master branch.

[jtanenba@dell-pe-fm120-1b origin]$ oc get egressnetworkpolicy -o yaml
apiVersion: v1
items:
- apiVersion: network.openshift.io/v1
  kind: EgressNetworkPolicy
  metadata:
    creationTimestamp: "2020-06-22T14:27:26Z"
    name: default
    namespace: z1
    resourceVersion: "4493"
    selfLink: /apis/network.openshift.io/v1/namespaces/z1/egressnetworkpolicies/default
    uid: 7f241dee-b494-11ea-bade-024255060b37
  spec:
    egress:
    - to:
        dnsName: registry.access.redhat.com
      type: Deny
    - to:
        dnsName: registry.access.redhat.com.edgekey.net
      type: Deny
    - to:
        dnsName: registry-1.docker.io
      type: Deny
    - to:
        cidrSelector: 0.0.0.0/0
      type: Allow
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[jtanenba@dell-pe-fm120-1b origin]$ oc rsh hello-pod
/ # ping registry-1.docker.io
PING registry-1.docker.io (3.223.220.229) 56(84) bytes of data.
^C
--- registry-1.docker.io ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3110ms
/ # ping registry.access.redhat.com
PING e14353.d.akamaiedge.net (104.82.119.143) 56(84) bytes of data.
^C
--- e14353.d.akamaiedge.net ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4115ms
/ # ping registry.access.redhat.com.edgekey.net
PING e14353.d.akamaiedge.net (104.82.119.143) 56(84) bytes of data.
^C
--- e14353.d.akamaiedge.net ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4082ms
/ # exit
command terminated with exit code 1
@jtanenba Reproduced the issue in v3.11.232.

## with three egress entries in JSON file
[root@ip-172-18-7-3 ~]# oc rsh webconsole-847944fcc-vvhhd
sh-4.2$ curl -kv registry.access.redhat.com
* About to connect() to registry.access.redhat.com port 80 (#0)
* Trying 104.80.113.87...
* Connected to registry.access.redhat.com (104.80.113.87) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: registry.access.redhat.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: AkamaiGHost
< Content-Length: 0
< Location: https://access.redhat.com/containers/
< Expires: Mon, 22 Jun 2020 18:43:46 GMT
< Cache-Control: max-age=0, no-cache, no-store
< Pragma: no-cache
< Date: Mon, 22 Jun 2020 18:43:46 GMT
< Connection: keep-alive
<
* Connection #0 to host registry.access.redhat.com left intact

## with three egress entries in YAML file
[root@ip-172-18-7-3 ~]# oc rsh webconsole-847944fcc-8xmsl
sh-4.2$ curl -kv registry.access.redhat.com
* About to connect() to registry.access.redhat.com port 80 (#0)
* Trying 23.203.22.61...
* Connection timed out
* Failed connect to registry.access.redhat.com:80; Connection timed out
* Closing connection 0
curl: (7) Failed connect to registry.access.redhat.com:80; Connection timed out
sh-4.2$ exit
@jtanenba As Weibin Liang said, I could also reproduce the same issue in the latest version of 3.11 -/Rejeeb
Weibin also reproduced on 4.5
This does not have to do with the file type; it has to do with how the egressDNS is used and changed.

*** This bug has been marked as a duplicate of bug 1850060 ***
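One way to read the closing comment, offered as an assumption rather than a statement of how the SDN is implemented: a Deny rule on a dnsName can only cover the IP addresses the firewall has resolved for that name, so a name backed by a rotating pool of records can leak when the pod receives an answer the firewall has not seen. A minimal Python sketch with made-up addresses from the 192.0.2.0/24 documentation range; `POOL`, `policy_view` and `pod_can_reach` are hypothetical names.

```python
# Hypothetical pool of A records that a DNS provider rotates through.
POOL = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]

# Assumption: the firewall resolved the name once and saw only two records.
policy_view = POOL[:2]


def pod_can_reach(pod_answer, firewall_ips):
    """True when the pod's own DNS answer is not among the denied IPs."""
    return pod_answer not in set(firewall_ips)


# Addresses the pod could still reach even though the name is "denied".
leaks = [ip for ip in POOL if pod_can_reach(ip, policy_view)]
print(leaks)
```

In this model the result depends on which addresses each side happened to resolve, not on whether the policy was written in JSON or YAML, which would fit the mixed results reported in this thread.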