Bug 2016534 - externalIP does not work when egressIP is also present
Summary: externalIP does not work when egressIP is also present
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Patryk Diak
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks: 2082451 2093068
Reported: 2021-10-21 20:25 UTC by Alan Chan
Modified: 2022-09-02 03:37 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2093068 (view as bug list)
Environment:
Last Closed: 2022-08-10 10:39:06 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift sdn pull 423 0 None Merged Bug 2016534: Masquerade in cluster traffic that is marked for egress IP 2022-05-09 14:32:36 UTC
Github openshift sdn pull 428 0 None open [WIP] Bug 2016534: Exclude the default drop bit from egress IP VNID 2022-05-12 16:50:56 UTC
Red Hat Knowledge Base (Solution) 6704411 0 None None None 2022-02-04 07:51:02 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 10:39:33 UTC

Description Alan Chan 2021-10-21 20:25:23 UTC
Description of problem:
-----------------------

- A service's externalIP works on its own, and egressIP also works on its own.

- But when BOTH are configured in the same namespace, traffic to the service's externalIP seems to get dropped when connecting from within the cluster.

- Interestingly, the customer states that when it is failing, curl to the externalIP still works from outside the cluster but not from inside it, e.g. curl from a pod in the same namespace to the externalIP does not connect.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------

OCP 4.8 with OpenShiftSDN plugin


How reproducible:
-----------------

Seems to be always.


Questions that need answers:
----------------------------

1. Is this an inherent conflict between a service's externalIP and egressIP in the same namespace, i.e. is the combination unsupported?

2. Does the externalIP range need to be on a different subnet from the node's subnet?


Other BZs:
----------

- At the moment, I don't believe BZ#2008987 is at play here, because egressIP by itself seems to work fine.


Reproducer:
-----------

Here is a reproducer using quicklab. 10.0.88.18 (egressIP) and 10.0.88.28 (externalIP) are both unused as far as I know, and both are in the same subnet as the nodes.
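
As a quick sanity check of the addressing, the following sketch tests whether an IPv4 address falls inside a CIDR. It assumes the node subnet is 10.0.88.0/21, which is the egressCIDR used later in step 5; the report does not state the node netmask explicitly. The function names are illustrative only.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  # Split on dots via the positional parameters (POSIX sh compatible).
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit 0) if address $1 lies inside CIDR $2, e.g. 10.0.88.0/21.
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask for the prefix length, truncated to 32 bits.
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# Assumed subnet: 10.0.88.0/21 (taken from the egressCIDR in step 5).
for addr in 10.0.88.18 10.0.88.28 10.0.95.80; do
  if in_cidr "$addr" 10.0.88.0/21; then
    echo "$addr is inside 10.0.88.0/21"
  else
    echo "$addr is outside 10.0.88.0/21"
  fi
done
```

Under that assumption, the egressIP, the externalIP, and the worker-2 host IP all share one subnet, which is the situation question 2 asks about.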

---
1. Create a project

[quicklab@upi-0 ~]$ oc new-project alchan-t1

[quicklab@upi-0 ~]$ oc project
Using project "alchan-t1" on server "https://api.sharedocp4upi48.lab.upshift.rdu2.redhat.com:6443".


2. Create two independent pods, not replicas of each other, best if they are on different nodes

[quicklab@upi-0 ~]$ oc new-app httpd

[quicklab@upi-0 ~]$ oc new-app httpd --name=httpd2

[quicklab@upi-0 ~]$ oc get pod -owide
NAME                     READY   STATUS    RESTARTS   AGE     IP             NODE                                                   NOMINATED NODE   READINESS GATES
httpd-7c9df9c9b4-mbk8n   1/1     Running   0          7h39m   10.128.2.116   worker-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   <none>           <none>
httpd2-86478565d-b9fl7   1/1     Running   0          47m     10.131.1.241   worker-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   <none>           <none>


3. Set up the service's externalIP on the first pod's svc

[quicklab@upi-0 ~]$ oc patch svc httpd -p '{"spec":{"externalIPs":["10.0.88.28"]}}'

[quicklab@upi-0 ~]$ oc get svc
NAME     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
httpd    ClusterIP   172.30.227.231   10.0.88.28    8080/TCP,8443/TCP   19h
httpd2   ClusterIP   172.30.152.89    <none>        8080/TCP,8443/TCP   51m


4. Test connection from the second pod to the first pod's externalIP, while no egressIP has been applied yet

[quicklab@upi-0 ~]$ oc get netnamespace alchan-t1
NAME        NETID      EGRESS IPS
alchan-t1   13991661   

[quicklab@upi-0 ~]$ oc exec httpd2-86478565d-b9fl7 -- curl -sv 10.0.88.28:8080 --output /dev/null
* Rebuilt URL to: 10.0.88.28:8080/
*   Trying 10.0.88.28...
* TCP_NODELAY set
* Connected to 10.0.88.28 (10.0.88.28) port 8080 (#0)
> GET / HTTP/1.1
> Host: 10.0.88.28:8080
> User-Agent: curl/7.61.1
> Accept: */*
> 
< HTTP/1.1 403 Forbidden
< Date: Wed, 20 Oct 2021 17:20:03 GMT
< Server: Apache/2.4.37 (Red Hat Enterprise Linux) OpenSSL/1.1.1g
< Last-Modified: Thu, 12 Nov 2020 16:02:45 GMT
< ETag: "1181-5b3eb0ae4b340"
< Accept-Ranges: bytes
< Content-Length: 4481
< Content-Type: text/html; charset=UTF-8
< 
{ [3908 bytes data]
* Connection #0 to host 10.0.88.28 left intact

This shows that it works: the externalIP is reachable from inside the cluster (httpd answers, here with a 403). The OpenFlow and iptables rules appear to be doing the right thing.


5. Now apply an egressIP on the same netnamespace, using the third node (a different node from the two pods), i.e. worker-2

First, assign a node to host the desired IP:

[quicklab@upi-0 ~]$ oc patch hostsubnet worker-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com --type=merge -p '{"egressCIDRs": ["10.0.88.0/21"]}'

[quicklab@upi-0 ~]$ oc get hostsubnet
NAME                                                   HOST                                                   HOST IP       SUBNET          EGRESS CIDRS       EGRESS IPS
master-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   master-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.89.24    10.128.0.0/23                      
master-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   master-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.92.199   10.129.0.0/23                      
master-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   master-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.89.211   10.130.0.0/23                      
worker-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   worker-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.94.213   10.128.2.0/23                      
worker-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   worker-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.92.128   10.131.0.0/23                      
worker-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   worker-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.95.80    10.129.2.0/23   ["10.0.88.0/21"]

Second, assign the desired IP to the netnamespace:

[quicklab@upi-0 ~]$ oc patch netnamespaces alchan-t1 --type=merge -p '{"egressIPs": ["10.0.88.18"]}'

[quicklab@upi-0 ~]$ oc get netnamespace alchan-t1
NAME        NETID      EGRESS IPS
alchan-t1   13991661   ["10.0.88.18"]

[quicklab@upi-0 ~]$ oc get hostsubnet
NAME                                                   HOST                                                   HOST IP       SUBNET          EGRESS CIDRS       EGRESS IPS
master-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   master-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.89.24    10.128.0.0/23                      
master-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   master-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.92.199   10.129.0.0/23                      
master-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   master-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.89.211   10.130.0.0/23                      
worker-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   worker-0.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.94.213   10.128.2.0/23                      
worker-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   worker-1.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.92.128   10.131.0.0/23                      
worker-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   worker-2.sharedocp4upi48.lab.upshift.rdu2.redhat.com   10.0.95.80    10.129.2.0/23   ["10.0.88.0/21"]   ["10.0.88.18"]
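
Rather than reading the table by eye, the host that holds a given egress IP could be looked up with a small helper like the one below. This is only a sketch: the function name egress_ip_host is made up for illustration, and it assumes `oc` is logged in with permission to read hostsubnets and that the HostSubnet objects expose the top-level host and egressIPs fields shown in the output above.

```shell
# Print the host that currently holds egress IP $1, or nothing if none does.
# Assumption: `oc` is logged in and may list hostsubnets.
egress_ip_host() {
  oc get hostsubnet \
    -o jsonpath='{range .items[*]}{.host}{" "}{.egressIPs}{"\n"}{end}' |
    grep -wF "$1" |   # -w keeps 10.0.88.1 from matching 10.0.88.18
    cut -d' ' -f1
}

# Example (requires a cluster):
# egress_ip_host 10.0.88.18
```

An empty result would suggest the egress IP has not been assigned to any node yet, in which case step 6 would not be exercising the egressIP path at all.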


6. Now test the connection from the second pod to the first pod's externalIP again, with the egressIP set up

[quicklab@upi-0 ~]$ oc exec httpd2-86478565d-b9fl7 -- curl -sv 10.0.88.28:8080 --output /dev/null --connect-timeout 5
* Rebuilt URL to: 10.0.88.28:8080/
*   Trying 10.0.88.28...
* TCP_NODELAY set
* Connection timed out after 5000 milliseconds
* Closing connection 0
command terminated with exit code 28

You can see that it fails to connect. It seems like the packets get dropped somewhere inside the SDN.
---
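
The before/after comparison from steps 4 and 6 can be wrapped in a small helper so the check is easy to repeat. This is a sketch, not part of the original reproducer: the function name probe_from_pod is hypothetical, and it assumes `oc exec` passes curl's exit code through (consistent with the "command terminated with exit code 28" message above).

```shell
# Probe URL $2 from pod $1 and report how curl exited.
# curl exit code 28 means the operation timed out, as seen in step 6.
probe_from_pod() {
  rc=0
  oc exec "$1" -- curl -s --output /dev/null --connect-timeout 5 "$2" || rc=$?
  case "$rc" in
    0)  echo "reachable" ;;
    28) echo "timed out" ;;
    *)  echo "failed (exit code $rc)" ;;
  esac
}

# Example (requires the cluster and pod from the reproducer):
# probe_from_pod httpd2-86478565d-b9fl7 10.0.88.28:8080
```

Run once before and once after patching the netnamespace with the egressIP; on an affected cluster the result should change from "reachable" to "timed out". Note that an HTTP 403 still counts as reachable here, since curl exits 0 without -f.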


Expectation:
------------

- externalIP and egressIP should work fine with each other at the same time in the same namespace/netnamespace.

Comment 13 Dan Winship 2022-04-06 13:52:59 UTC
Earlier comment is wrong; although we originally felt that there was no reason to be using external IPs (or load balancer IPs) from within the cluster, we gave up on that idea a long time ago. (In fact, I had thought we fixed this case already...) Anyway, the current behavior is a bug. We should not blackhole traffic. Either it shouldn't go through the egress IP, or it should go through the egress IP but still work.

Comment 27 errata-xmlrpc 2022-08-10 10:39:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

