Bug 1878071 - [OVN] After configure egressIP, outgoing traffic broke
Summary: [OVN] After configure egressIP, outgoing traffic broke
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Alexander Constantinescu
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-11 09:10 UTC by huirwang
Modified: 2020-09-11 10:28 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-11 09:57:24 UTC
Target Upstream Version:



Description huirwang 2020-09-11 09:10:20 UTC
Description of problem:
After configuring an egressIP, outgoing traffic from the matching pods breaks.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-10-195619

How reproducible:
Always

Steps to Reproduce:
1. Label one node as the egressIP node:
oc label node compute-1 "k8s.ovn.org/egress-assignable"=""

2. Create namespace test with pods in it, then label the namespace:
oc label ns test team=red

oc get pods -o wide -n test
NAME            READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
hello-pod       1/1     Running   0          15m   10.128.2.100   compute-0   <none>           <none>
test-rc-mks8g   1/1     Running   0          18s   10.128.2.106   compute-0   <none>           <none>
test-rc-t26dw   1/1     Running   0          17s   10.131.0.9     compute-1   <none>           <none>

3. Apply the EgressIP object:
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip
spec:
  egressIPs:
  - 139.178.76.20 
  namespaceSelector:
    matchLabels:
      team: red

oc get egressip
NAME       EGRESSIPS       ASSIGNED NODE   ASSIGNED EGRESSIPS
egressip   139.178.76.20   compute-1       139.178.76.20

4. Access an external website from the test pods.
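
The assignment reported in step 3 can be checked from a script before running step 4. The following is a minimal sketch, not part of the original report: assigned_node is a hypothetical helper that parses the plain `oc get egressip` table, and it assumes the four-column layout shown in step 3, a single egress IP per object, and an empty ASSIGNED NODE column when nothing has been assigned yet.

```shell
# assigned_node NAME
# Reads `oc get egressip` table output on stdin and prints the
# ASSIGNED NODE column for the row whose NAME column matches.
# Returns 1 if the row is missing or has no assigned node.
# Assumption: an unassigned row has fewer than four columns.
assigned_node() {
    awk -v name="$1" '
        NR > 1 && $1 == name && NF >= 4 { print $3; found = 1 }
        END { exit !found }
    '
}
```

For the object in this report, the call would be:

oc get egressip | assigned_node egressip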



Actual results:
oc rsh -n test hello-pod     
/ # curl ifconfig.me --connect-timeout 5
curl: (7) Failed to connect to ifconfig.me port 80: Operation timed out
/ # 

Without the egressIP applied, outgoing traffic works:
oc rsh -n test hello-pod
~ $ curl ifconfig.me
139.178.76.9~ $ exit

Expected results:
Outgoing traffic should work, with the egressIP as the source IP.
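
The expected result can be turned into a scripted check. This is a rough sketch under stated assumptions rather than an official test: check_egress_source is a hypothetical helper that compares the address reported by the external site with the configured egress IP, and it treats an empty response as a failed or timed-out connection.

```shell
# check_egress_source OBSERVED EXPECTED
# OBSERVED is the source address reported by the external site
# (empty if the request failed), EXPECTED is the configured egress IP.
# Returns 0 on match, 1 on mismatch, 2 when there was no response.
check_egress_source() {
    observed=$1
    expected=$2
    if [ -z "$observed" ]; then
        echo "FAIL: no response (connection failed or timed out)"
        return 2
    fi
    if [ "$observed" = "$expected" ]; then
        echo "PASS: traffic leaves with egress IP $expected"
        return 0
    fi
    echo "FAIL: traffic leaves with $observed, expected $expected"
    return 1
}
```

With the pod and egress IP from this report, it could be called as:

check_egress_source "$(oc exec -n test hello-pod -- curl -s --connect-timeout 5 ifconfig.me)" 139.178.76.20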



Additional info:

Comment 2 Alexander Constantinescu 2020-09-11 09:57:24 UTC
Hi Huiran

As I mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1872098#c25, don't use nightly versions for testing right now. The release process for nightly versions is broken, so they cannot be trusted.

I figured out what is wrong on your cluster: the OVN version in that nightly (4.6.0-0.nightly-2020-09-10-195619) is not correct. It contains:

ovn2.13-host-20.06.1-6.el7fdp.x86_64
ovn2.13-vtep-20.06.1-6.el7fdp.x86_64
ovn2.13-20.06.1-6.el7fdp.x86_64
ovn2.13-central-20.06.1-6.el7fdp.x86_64

It should however contain:

https://github.com/openshift/cluster-network-operator/pull/767#issuecomment-686374055

Without that OVN fix, any pod matching an egress IP loses external connectivity, as you've now discovered.

You can use any of the latest green CI builds defined here: https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/#4.6.0-0.ci

/Alex
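
To compare a build against the OVN version named in the linked comment, the installed version can be pulled out of a package listing like the one above. This is a sketch only: ovn_build is a hypothetical helper, and it assumes the listing comes from `rpm -qa` run wherever the OVN packages are installed (the comment does not say how the list was obtained). It selects only the base ovn2.13 package and strips the architecture suffix.

```shell
# ovn_build
# Reads RPM package names (one per line) on stdin and prints the
# version-release of the base ovn2.13 package. Sub-packages such as
# ovn2.13-host or ovn2.13-central are skipped because the text after
# "ovn2.13-" must start with a digit.
ovn_build() {
    sed -n 's/^ovn2\.13-\([0-9][^ ]*\)\.[^.]*$/\1/p'
}
```

For example:

rpm -qa | ovn_build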

