Bug 1766583 - [3.11] EgressIP doesn't work with NetworkPolicy unless traffic from default project is allowed
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Juan Luis de Sousa-Valadas
QA Contact: huirwang
URL:
Whiteboard: SDN-CUST-IMPACT
Depends On: 1700431 1741477 1741499
Blocks:
Reported: 2019-10-29 13:04 UTC by K Chandra Sekar
Modified: 2023-12-15 16:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1741477
Environment:
Last Closed: 2020-09-03 13:47:22 UTC
Target Upstream Version:
Embargoed:



Description K Chandra Sekar 2019-10-29 13:04:20 UTC
+++ This bug was initially created as a clone of Bug #1741477 +++

+++ This bug was initially created as a clone of Bug #1700431 +++

Description of problem:
The customer reports that when a networkPolicy is combined with an egressIP, egress traffic is dropped unless traffic from the default project is allowed.


Version-Release number of selected component (if applicable):
3.11

How reproducible:
Always

Steps to Reproduce:
1. Create a project X using egressIP
2. Add egressIP to node A
3. Create a pod in project X which is *not* running on node A.
4. Create a networkPolicy which only allows traffic from itself:

- apiVersion: extensions/v1beta1
  kind: NetworkPolicy
  metadata:
    name: deny-ingress-from-other-namespaces
  spec:
    ingress:
    - from:
      - podSelector: {}
    podSelector: {}
    policyTypes:
    - Ingress
5. Go to the pod in project X and try to reach a resource outside OpenShift. Traffic is dropped.
6. Create a rule that allows traffic from the default project (this assumes the default netnamespace has netid 0 and the project carries the label project=default):
- apiVersion: extensions/v1beta1
  kind: NetworkPolicy
  metadata:
    name: allow from default
  spec:
    ingress:
    - from:
      - podSelector: {}
      - namespaceSelector:
          matchLabels:
            project: default
    podSelector: {}
    policyTypes:
    - Ingress
Traffic works
7. Completely remove every networkPolicy. Traffic also works
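
The difference between the policies in steps 4 and 6 can be illustrated with a small toy model. This is only a sketch of generic NetworkPolicy ingress matching, not openshift-sdn code; the class and function names are hypothetical. Whether the returning egressIP traffic is really attributed to the default project is a hypothesis suggested by step 6, not something this bug confirms.

```python
# Toy model of the ingress rules in steps 4 and 6; not openshift-sdn code.
from dataclasses import dataclass, field


@dataclass
class IngressPolicy:
    namespace: str                      # namespace the policy lives in
    allow_same_namespace: bool = False  # models "- podSelector: {}" under "from"
    # Models namespaceSelector.matchLabels. In this toy model an empty dict
    # means "no namespaceSelector" (a real empty selector matches everything).
    allowed_namespace_labels: dict = field(default_factory=dict)


def ingress_allowed(policy, src_namespace, src_labels):
    """Return True if a packet attributed to src_namespace passes the policy."""
    if policy.allow_same_namespace and src_namespace == policy.namespace:
        return True
    wanted = policy.allowed_namespace_labels
    if wanted and all(src_labels.get(k) == v for k, v in wanted.items()):
        return True
    return False


# "x" stands in for project X from the steps above.
step4 = IngressPolicy(namespace="x", allow_same_namespace=True)
step6 = IngressPolicy(namespace="x", allow_same_namespace=True,
                      allowed_namespace_labels={"project": "default"})

default_labels = {"project": "default"}
print(ingress_allowed(step4, "default", default_labels))  # False
print(ingress_allowed(step6, "default", default_labels))  # True
```

If the reply packets were attributed to the default project, the step 4 policy would reject them and the step 6 policy would accept them, which matches the observed behaviour.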

Actual results:
Packet is dropped somewhere.

Expected results:
Packet goes through the egressIP and comes back

Additional info:
Additionally, the customer has an egressNetworkPolicy which allows traffic to the destination and denies all other traffic by default. I believe this egressNetworkPolicy is unrelated to the issue.

Opening this new bug because the customer is still facing the issue reported in the above Bugzilla; the erratum [1] did not fix the issue for them.

[1] - https://access.redhat.com/errata/RHBA-2019:2816

Comment 4 zhaozhanqi 2019-12-12 06:23:35 UTC
Hi huiran, could you check whether this can be reproduced?

Comment 10 Juan Luis de Sousa-Valadas 2020-01-02 10:39:22 UTC
Hi Chandra,
Because QA cannot reproduce the issue and 3.11.146 should already have the fix, I'm going to need the following information:

1- oc get namespace <project with egressIP>
2- oc get hostsubnet/<node hosting the pod> hostsubnet/<node with the egressIP>
3- oc get pod <affected pod> -o wide
4- oc get clusternetwork
5- In both nodes (the one with the egressIP and the one hosting the pod): oc rsh <pod name> ovs-ofctl -O OpenFlow13 dump-flows br0
6- In both nodes: iptables-save
7- In both nodes: The file /etc/origin/node/node-config.yaml
8- SDN pod logs of both nodes (I don't really expect anything useful here, but let's give it a shot anyway).

The problem might be:
1- The conntrack action not being added to the flows
2- The conntrack action being added but not being honored by OVS
3- The fix being fine and we're having an unrelated problem
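
The first hypothesis above can be checked mechanically once the dump-flows output from item 5 is available. The following is a minimal sketch, assuming the output is saved as text; the helper name is hypothetical and the sample flow lines are made up in ovs-ofctl style, not taken from a real cluster. It only detects whether a ct(...) action is present, so it cannot tell whether OVS honors it (hypothesis 2).

```python
import re

# Matches the OVS conntrack action "ct(...)" inside the actions of a flow line.
CT_ACTION = re.compile(r"\bct\(")


def flows_with_ct(dump_flows_output):
    """Return the flow lines whose actions include a ct(...) action."""
    hits = []
    for line in dump_flows_output.splitlines():
        # Only look at the part after "actions=", so match fields are ignored.
        _, sep, actions = line.partition("actions=")
        if sep and CT_ACTION.search(actions):
            hits.append(line.strip())
    return hits


# Made-up sample lines in ovs-ofctl dump-flows style (an assumption).
SAMPLE = """\
 cookie=0x0, duration=10.0s, table=0, n_packets=5, n_bytes=490, priority=200,ip,in_port=1 actions=ct(commit,table=30)
 cookie=0x0, duration=10.0s, table=30, n_packets=5, n_bytes=490, priority=0 actions=drop
"""

hits = flows_with_ct(SAMPLE)
print(len(hits))
```

An empty result on either node would point toward the first hypothesis.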

Comment 19 Juan Luis de Sousa-Valadas 2020-01-17 13:04:05 UTC
Chandra, I cannot reproduce it in my environment.

Attempt to reproduce:
# oc get netnamespace test
NAME      NETID     EGRESS IPS
test      48985     [172.17.0.230]

# oc get hostsubnet
NAME                    HOST                    HOST IP      SUBNET          EGRESS CIDRS   EGRESS IPS
openshift-master-node   openshift-master-node   172.17.0.2   10.130.0.0/23   []             []
openshift-node-1        openshift-node-1        172.17.0.3   10.129.0.0/23   []             []
openshift-node-2        openshift-node-2        172.17.0.4   10.128.0.0/23   []             [172.17.0.230]

# oc get networkpolicy -o yaml -n test
apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: NetworkPolicy
  metadata:
    creationTimestamp: "2020-01-17T12:52:04Z"
    generation: 1
    name: default-deny
    namespace: test
    resourceVersion: "113338"
    selfLink: /apis/extensions/v1beta1/namespaces/test/networkpolicies/default-deny
    uid: 294f1b1b-3928-11ea-842b-0242ac110002
  spec:
    podSelector: {}
    policyTypes:
    - Ingress
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

# oc get pod -o wide -n test
NAME                      READY     STATUS    RESTARTS   AGE       IP           NODE               NOMINATED NODE
hello-openshift-3-2cmbf   1/1       Running   0          1m        10.128.0.6   openshift-node-2   <none>
hello-openshift-3-r9lr6   1/1       Running   0          17m       10.129.0.9   openshift-node-1   <none>

# oc rsh -n test hello-openshift-3-2cmbf curl 192.168.97.133:8000  -o /dev/null  -s
# oc rsh -n test hello-openshift-3-r9lr6 curl 192.168.97.133:8000  -o /dev/null  -s


And the application log of  192.168.97.133:8000:
$ python -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
172.17.0.230 - - [17/Jan/2020 13:56:02] "GET / HTTP/1.1" 200 -
172.17.0.230 - - [17/Jan/2020 13:56:11] "GET / HTTP/1.1" 200 -
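
The check being made by eye here is that every request reaches the server with the egress IP as its source address. A minimal sketch of the same check, using the log lines above as input (the helper name is hypothetical):

```python
def client_ips(access_log):
    """Extract the client address from each http.server request line."""
    ips = []
    for line in access_log.splitlines():
        parts = line.split()
        # Request lines look like: <ip> - - [timestamp] "GET / HTTP/1.1" 200 -
        if len(parts) >= 3 and parts[1] == "-" and parts[2] == "-":
            ips.append(parts[0])
    return ips


LOG = """\
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
172.17.0.230 - - [17/Jan/2020 13:56:02] "GET / HTTP/1.1" 200 -
172.17.0.230 - - [17/Jan/2020 13:56:11] "GET / HTTP/1.1" 200 -
"""
EGRESS_IP = "172.17.0.230"  # egress IP shown in the netnamespace output above

ips = client_ips(LOG)
print(all(ip == EGRESS_IP for ip in ips))
```

Both requests, one from the pod on the egress node and one from the pod on the other node, arrive from the egress IP.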


It works in my environment.

