Description of problem:
In egress network policy configuration:
- If a Deny rule uses 0.0.0.0/0, the policy also blocks the dnsName permitted by an Allow rule.
- If the Deny rule targets a single IP address such as 98.138.253.109/32, the policy does not block the allowed dnsName.
- If the Deny rule uses 0.0.0.0/0 but the Allow rule uses a cidrSelector instead of a dnsName, the feature works fine.

Version-Release number of selected component (if applicable):
oc v3.6.74
kubernetes v1.6.1+5115d708d7

How reproducible:
Every time

Steps to Reproduce:

1. Deny 0.0.0.0/0 blocks the allowed dnsName in the egress network policy:

[root@ip-172-18-13-83 ~]# cat policy.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "policy-test"
  },
  "spec": {
    "egress": [
      {
        "type": "Allow",
        "to": {
          "dnsName": "www.baidu.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
[root@ip-172-18-13-83 ~]# oc create -f policy.json
egressnetworkpolicy "policy-test" created
[root@ip-172-18-13-83 ~]# oc get pods
NAME        READY     STATUS    RESTARTS   AGE
hello-pod   1/1       Running   0          2m
[root@ip-172-18-13-83 ~]# oc rsh hello-pod
/ # ping www.baidu.com
^C
/ # ping www.yahoo.com
^C
/ # exit

2. If the Deny rule targets a single IP address such as 98.138.253.109/32, the policy does not block the allowed dnsName:

[root@ip-172-18-13-83 ~]# oc delete egressnetworkpolicy policy-test
egressnetworkpolicy "policy-test" deleted
[root@ip-172-18-13-83 ~]# cat policy.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "policy-test"
  },
  "spec": {
    "egress": [
      {
        "type": "Allow",
        "to": {
          "dnsName": "www.baidu.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "cidrSelector": "98.138.253.109/32"
        }
      }
    ]
  }
}
[root@ip-172-18-13-83 ~]# oc create -f policy.json
egressnetworkpolicy "policy-test" created
[root@ip-172-18-13-83 ~]# oc rsh hello-pod
/ # ping www.baidu.com
PING www.baidu.com (103.235.46.39): 56 data bytes
64 bytes from 103.235.46.39: seq=0 ttl=37 time=231.139 ms
64 bytes from 103.235.46.39: seq=1 ttl=37 time=230.988 ms
64 bytes from 103.235.46.39: seq=2 ttl=37 time=231.568 ms
^C
--- www.baidu.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 230.988/231.231/231.568 ms
/ # ping 98.138.253.109
PING 98.138.253.109 (98.138.253.109): 56 data bytes
^C
--- 98.138.253.109 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
/ #

3. If the Deny rule uses 0.0.0.0/0 but the Allow rule uses a cidrSelector instead of a dnsName, the feature works fine:

[root@ip-172-18-13-83 ~]# oc delete egressnetworkpolicy policy-test
egressnetworkpolicy "policy-test" deleted
[root@ip-172-18-13-83 ~]# cat policy.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "policy-test"
  },
  "spec": {
    "egress": [
      {
        "type": "Allow",
        "to": {
          "cidrSelector": "103.235.46.39/32"
        }
      },
      {
        "type": "Deny",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
[root@ip-172-18-13-83 ~]# oc create -f policy.json
egressnetworkpolicy "policy-test" created
[root@ip-172-18-13-83 ~]# oc rsh hello-pod
/ # ping 103.235.46.39
PING 103.235.46.39 (103.235.46.39): 56 data bytes
64 bytes from 103.235.46.39: seq=0 ttl=37 time=231.189 ms
64 bytes from 103.235.46.39: seq=1 ttl=37 time=230.903 ms
64 bytes from 103.235.46.39: seq=2 ttl=37 time=230.890 ms
^C
--- 103.235.46.39 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 230.890/230.994/231.189 ms
/ #

Actual results:
/ # ping www.baidu.com
^C

Expected results:
/ # ping www.baidu.com
PING www.baidu.com (103.235.46.39): 56 data bytes
64 bytes from 103.235.46.39: seq=0 ttl=37 time=231.139 ms
64 bytes from 103.235.46.39: seq=1 ttl=37 time=230.988 ms
64 bytes from 103.235.46.39: seq=2 ttl=37 time=231.568 ms

Additional info:
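For context on why case 1 fails while case 3 succeeds: egress rules are evaluated in order and the first matching rule wins, so a trailing Deny 0.0.0.0/0 denies every destination not explicitly allowed by an earlier rule, including the cluster's DNS server. The following is a minimal Python sketch of that evaluation order, not the openshift-sdn implementation; `egress_verdict` and the sample rule list are illustrative only:

```python
import ipaddress

def egress_verdict(dst_ip, rules):
    """First matching rule wins; traffic matching no rule is allowed."""
    addr = ipaddress.ip_address(dst_ip)
    for rule_type, cidr in rules:
        if addr in ipaddress.ip_network(cidr):
            return rule_type
    return "Allow"

# Rules from the failing policy, with the dnsName already resolved to an IP.
rules = [
    ("Allow", "103.235.46.39/32"),  # www.baidu.com, resolved
    ("Deny", "0.0.0.0/0"),          # catch-all deny
]

print(egress_verdict("103.235.46.39", rules))  # Allow: matches the /32 first
print(egress_verdict("8.8.8.8", rules))        # Deny: only the catch-all matches
```

Any DNS server outside the explicitly allowed set falls through to the catch-all Deny, so resolution of the allowed dnsName can fail before its Allow rule ever gets a chance to match.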
Test failed because a Deny 0.0.0.0/0 rule also blocks DNS resolution against the local nameserver.

[root@ip-172-18-3-73 ~]# oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
hello-openshift-4-322jq   1/1       Running   0          1m
hello-openshift-4-4r7dg   1/1       Running   0          1m
hello-openshift-4-ht0kh   1/1       Running   0          1m
hello-openshift-4-s2rm7   1/1       Running   0          1m
hello-openshift-4-wn7dg   1/1       Running   0          1m
hello-pod-4-39stw         1/1       Running   0          1m
hello-pod-4-760sj         1/1       Running   0          1m
hello-pod-4-g478k         1/1       Running   0          1m
hello-pod-4-rhq8n         1/1       Running   0          1m
hello-pod-4-sg3ld         1/1       Running   0          1m
[root@ip-172-18-3-73 ~]# oc rsh hello-openshift-4-322jq
/ $ cat /etc/resolv.conf
nameserver 172.18.1.118
search test.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5
/ $

Suggestion: allow traffic to the nameserver (172.18.1.118) by default even when Deny 0.0.0.0/0 is in use; the test case above will then pass.
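Until traffic to the nameserver is allowed by default, the same effect can be approximated in the policy itself by adding an explicit Allow for the nameserver ahead of the catch-all Deny. This is only a sketch of a workaround: 172.18.1.118/32 is the nameserver from the resolv.conf above and will differ per cluster.

```json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "policy-test"
  },
  "spec": {
    "egress": [
      {
        "type": "Allow",
        "to": {
          "cidrSelector": "172.18.1.118/32"
        }
      },
      {
        "type": "Allow",
        "to": {
          "dnsName": "www.baidu.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
```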
We need to work out why a node IP address is being used in resolv.conf rather than the node SDN address. Alternatively, it would not be unreasonable to allow all traffic to the local node's IP addresses, since we already allow traffic to the node SDN address.
Replying to myself: we can't (easily) make the installer set up resolv.conf with the SDN address because we don't have one until the node starts up and registers itself. So I think that when an EgressNetworkPolicy is in use, we should just add a rule to OVS that allows traffic to the local host's default IP address (or perhaps to all addresses on that node) by default.
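The kind of flow the proposal describes might look like the following. This is purely illustrative: the table number, priority, and output action are placeholders and do not reflect the actual openshift-sdn flow layout, and NODE_IP stands in for the node's default IP address.

```shell
NODE_IP=172.18.13.83   # placeholder: the node's default IP address

# Hypothetical high-priority rule exempting traffic destined for the
# node's own IP from egress-policy filtering.
ovs-ofctl -O OpenFlow13 add-flow br0 \
  "table=100, priority=300, ip, nw_dst=${NODE_IP}, actions=output:2"
```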
https://github.com/openshift/origin/pull/14924
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/74f0bafa0351a08e45bdb735302032ecb2494c9d

add the node's local IP address to OVS rules

This change adds the node's local IP address to the OVS rules when EgressNetworkPolicies are used to limit egress from the cluster. Adding the node's local IP allows DNS resolution when DNS is accessible on the node.

bug 1458849

changelog:
- moved the rules creation into SetupOVS()
- made both UDP and TCP rules the same priority
Tested on the latest OCP 3.6 env; it seems the issue can still be reproduced.

openshift v3.6.135
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

[root@host-8-174-52 ~]# oc describe egressnetworkpolicy policy-test
Name:        policy-test
Namespace:   d2
Created:     9 minutes ago
Labels:      <none>
Annotations: <none>
Rule:        Allow to www.baidu.com
Rule:        Deny to 0.0.0.0/0
[root@host-8-174-52 ~]# oc rsh hello-pod
/ # ping www.baidu.com
ping: bad address 'www.baidu.com'
What does /etc/resolv.conf contain inside hello-pod? And what is the output of "ovs-ofctl -O OpenFlow13 dump-flows br0" on the node?
openshift v3.6.135
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

[root@host-8-174-52 ~]# oc get pod
NAME        READY     STATUS    RESTARTS   AGE
hello-pod   1/1       Running   0          14s
[root@host-8-174-52 ~]# oc rsh hello-pod
/ # ping www.baidu.com
PING www.baidu.com (103.235.46.39): 56 data bytes
64 bytes from 103.235.46.39: seq=0 ttl=36 time=252.119 ms
64 bytes from 103.235.46.39: seq=1 ttl=36 time=251.419 ms
[root@host-8-174-52 ~]# oc create -f policy.json
egressnetworkpolicy "policy-test" created
[root@host-8-174-52 ~]# oc describe egressnetworkpolicy policy-test
Name:        policy-test
Namespace:   d3
Created:     11 seconds ago
Labels:      <none>
Annotations: <none>
Rule:        Allow to www.baidu.com
Rule:        Deny to 0.0.0.0/0
[root@host-8-174-52 ~]# oc rsh hello-pod
/ # ping www.baidu.com
ping: bad address 'www.baidu.com'
/ # cat /etc/resolv.conf
nameserver 172.16.120.15
search d3.svc.cluster.local svc.cluster.local cluster.local openstacklocal host.centralci.eng.rdu2.redhat.com
options ndots:5
Created attachment 1295159 [details] openflow log
Could you verify whether this works if you use the installer to set up the cluster? That should set up the dnsmasq instances that allow this to work.
Actually, the env in comment 9 was set up by the installer. Here is the installer package version: openshift-ansible-3.6.138-1.git.0.2c647a9.el7.noarch.rpm
Can we get the output of 'ip a' on the node that the OpenFlow dump came from? Or, if that env no longer exists, can you collect the following from a new node that exhibits the bug:
- ip a
- cat /etc/resolv.conf
- ovs-ofctl -O OpenFlow13 dump-flows br0

Thanks
Interestingly, 172.16.120.15 is the address of a different node... it looks like the installer didn't set up a local dnsmasq, or perhaps the pod had moved to a different node than the one the OVS flow dump was taken from? Anyway, I will look at the installer to see what it does. Can you post the hosts file you used so I can see what options were set (if any)? Also, please post the info requested in the previous comment. Thanks.
Tested on the latest OCP v3.6.140 and it works fine.

[root@ip-172-18-3-187 ~]# oc create -f policy.json
egressnetworkpolicy "policy-test" created
[root@ip-172-18-3-187 ~]# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/pod-for-ping.json
pod "hello-pod" created
[root@ip-172-18-3-187 ~]# cat policy.json
{
  "kind": "EgressNetworkPolicy",
  "apiVersion": "v1",
  "metadata": {
    "name": "policy-test"
  },
  "spec": {
    "egress": [
      {
        "type": "Allow",
        "to": {
          "dnsName": "www.baidu.com"
        }
      },
      {
        "type": "Deny",
        "to": {
          "cidrSelector": "0.0.0.0/0"
        }
      }
    ]
  }
}
[root@ip-172-18-3-187 ~]# oc get pod
NAME        READY     STATUS    RESTARTS   AGE
hello-pod   1/1       Running   0          15s
[root@ip-172-18-3-187 ~]# oc describe egressnetworkpolicy policy-test
Name:        policy-test
Namespace:   p1
Created:     About a minute ago
Labels:      <none>
Annotations: <none>
Rule:        Allow to www.baidu.com
Rule:        Deny to 0.0.0.0/0
[root@ip-172-18-3-187 ~]# oc rsh hello-pod
/ # ping www.baidu.com
PING www.baidu.com (103.235.46.39): 56 data bytes
64 bytes from 103.235.46.39: seq=0 ttl=37 time=240.932 ms
64 bytes from 103.235.46.39: seq=1 ttl=37 time=240.692 ms
^C
--- www.baidu.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 240.692/240.812/240.932 ms
/ # ping www.cisco.com
PING www.cisco.com (23.196.96.28): 56 data bytes
^C
--- www.cisco.com ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
/ # exit
command terminated with exit code 1
[root@ip-172-18-3-187 ~]# oc version
oc v3.6.140
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-3-187.ec2.internal:8443
openshift v3.6.140
kubernetes v1.6.1+5115d708d7
[root@ip-172-18-3-187 ~]#
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188