Bug 1744322 - [OVN] hairpin service connections fail in ovn-kubernetes
Summary: [OVN] hairpin service connections fail in ovn-kubernetes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: ---
Target Release: 4.5.0
Assignee: Dan Winship
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-21 20:32 UTC by Weibin Liang
Modified: 2020-07-13 17:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:11:03 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:11:19 UTC

Description Weibin Liang 2019-08-21 20:32:02 UTC
Description of problem:
The same curl tests pass on an SDN cluster but fail on an ovn-kubernetes cluster.

Version-Release number of selected component (if applicable):
4.2.0-0.ci-2019-08-21-151437

How reproducible:
Always

Steps to Reproduce:
#### curl fails in ovn-kubernetes setup:
[root@dhcp-41-193 ~]# oc new-project p7
Now using project "p7" on server "https://api.weliangipi8212.qe.devcluster.openshift.com:6443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app django-psql-example

to build a new example application in Python. Or use kubectl to deploy a simple Kubernetes application:

    kubectl create deployment hello-node --image=gcr.io/hello-minikube-zero-install/hello-node

[root@dhcp-41-193 ~]# oc create -f https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/OSE3.3/blue-pod1.json
route.route.openshift.io/blue-route created
service/blue-service created
pod/blue-pod-1 created
[root@dhcp-41-193 ~]# oc get all
NAME             READY   STATUS    RESTARTS   AGE
pod/blue-pod-1   1/1     Running   0          18s

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/blue-service   ClusterIP   172.30.67.237   <none>        8080/TCP   18s

NAME                                  HOST/PORT          PATH   SERVICES       PORT    TERMINATION   WILDCARD
route.route.openshift.io/blue-route   blue.example.com          blue-service   <all>                 None
[root@dhcp-41-193 ~]# oc exec blue-pod-1 -n p7 --  curl blue-service.p7.svc.cluster.local:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:02:08 --:--:--     0curl: (7) Failed to connect to blue-service.p7.svc.cluster.local port 8080: Operation timed out
command terminated with exit code 7
[root@dhcp-41-193 ~]# oc exec blue-pod-1 -n p7 --  curl blue-service:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:02:10 --:--:--     0curl: (7) Failed to connect to blue-service port 8080: Operation timed out
command terminated with exit code 7
[root@dhcp-41-193 ~]# oc exec blue-pod-1 -n p7 --  curl google.com
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
100   219  100   219    0     0   5615      0 --:--:-- --:--:-- --:--:--  5615
[root@dhcp-41-193 ~]# 



#### curl passes in SDN with networkpolicy setup:
[root@dhcp-41-193 ~]# oc new-project p7
Now using project "p7" on server "https://api.qe-weliang-8211.qe.devcluster.openshift.com:6443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app django-psql-example

to build a new example application in Python. Or use kubectl to deploy a simple Kubernetes application:

    kubectl create deployment hello-node --image=gcr.io/hello-minikube-zero-install/hello-node

[root@dhcp-41-193 ~]# oc create -f https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/OSE3.3/blue-pod1.json
route.route.openshift.io/blue-route created
service/blue-service created
pod/blue-pod-1 created
[root@dhcp-41-193 ~]# oc get all
NAME             READY   STATUS    RESTARTS   AGE
pod/blue-pod-1   1/1     Running   0          18s

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/blue-service   ClusterIP   172.30.222.55   <none>        8080/TCP   18s

NAME                                  HOST/PORT          PATH   SERVICES       PORT    TERMINATION   WILDCARD
route.route.openshift.io/blue-route   blue.example.com          blue-service   <all>                 None
[root@dhcp-41-193 ~]# oc exec blue-pod-1 -n p7 --  curl blue-service.p7.svc.cluster.local:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    25  100    25    0     0   1923      0 --:--:-- --:--:-- --:--:--  1923
Hello Blue Pod-1 Example
[root@dhcp-41-193 ~]# oc exec blue-pod-1 -n p7 --  curl blue-service:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Hello Blue Pod-1 Example
100    25  100    25    0     0   3571      0 --:--:-- --:--:-- --:--:--  3571
[root@dhcp-41-193 ~]# oc exec blue-pod-1 -n p7 --  curl google.com
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
100   219  100   219    0     0   1938      0 --:--:-- --:--:-- --:--:--  1938
[root@dhcp-41-193 ~]#

Actual results:
curl fails in ovn-kubernetes

Expected results:
curl should pass in ovn-kubernetes
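
For triage, the exit code in the failing transcripts is informative: curl exits with 7 when it cannot connect to the host, which is distinct from 6 (could not resolve host), so the name was resolved and the failure happened at connection time. A minimal Python sketch of that classification; the function name `triage_curl_exit` and the category labels are invented here for illustration and are not part of curl or oc:

```python
def triage_curl_exit(code: int) -> str:
    """Map a curl exit code to a coarse failure category (labels are hypothetical)."""
    if code == 0:
        return "reachable"
    if code == 6:
        # curl: could not resolve host
        return "dns-failure"
    if code in (7, 28):
        # curl: failed to connect to host (7) or operation timed out (28)
        return "connect-failure"
    return "other"


# Exit code reported for both service curls on the ovn-kubernetes cluster.
print(triage_curl_exit(7))
```

A wrapper script could use this to separate DNS problems from the connection failure reported here.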


Additional info:

Comment 1 Weibin Liang 2019-08-22 15:40:22 UTC
More information from testing pods:


#### curl fails in ovn-kubernetes setup:
[root@dhcp-41-193 AWS]# oc rsh blue-pod-1
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1400 qdisc noqueue state UP 
    link/ether 2a:ba:63:81:02:09 brd ff:ff:ff:ff:ff:ff
    inet 10.129.2.8/23 brd 10.129.3.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::28ba:63ff:fe81:209/64 scope link 
       valid_lft forever preferred_lft forever
/ # ip route
default via 10.129.2.1 dev eth0 
10.129.2.0/23 dev eth0  src 10.129.2.8 
/ # cat /etc/resolv.conf 
search p7.svc.cluster.local svc.cluster.local cluster.local us-west-2.compute.internal
nameserver 172.30.0.10
options ndots:5
/ # nslookup blue-service.p7.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      blue-service.p7.svc.cluster.local
Address 1: 172.30.6.147 blue-service.p7.svc.cluster.local
/ # nslookup blue-service
nslookup: can't resolve '(null)': Name does not resolve

Name:      blue-service
Address 1: 172.30.6.147 blue-service.p7.svc.cluster.local
/ # curl blue-service:8080
^C
/ # exit
command terminated with exit code 130
[root@dhcp-41-193 AWS]# oc get ep
NAME           ENDPOINTS         AGE
blue-service   10.129.2.8:8080   17m
[root@dhcp-41-193 AWS]# oc rsh blue-pod-1
/ # curl 10.129.2.8:8080
Hello Blue Pod-1 Example
/ # 
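
The output above shows why this is a hairpin case: the only endpoint of blue-service is 10.129.2.8:8080, which is the client pod's own address, so a connection to the service IP is load-balanced back to the pod that opened it. A small Python sketch of that check, assuming the ENDPOINTS column of `oc get ep` is passed in untruncated as a comma-separated list of IPv4 `ip:port` pairs; `parse_endpoints` and `is_hairpin` are hypothetical helper names:

```python
import ipaddress
from typing import List, Tuple


def parse_endpoints(column: str) -> List[Tuple[str, int]]:
    """Parse an ENDPOINTS column such as '10.129.2.8:8080,10.129.2.9:8080'."""
    endpoints = []
    for item in column.split(","):
        item = item.strip()
        if not item or item == "<none>":
            continue
        host, _, port = item.rpartition(":")
        # ip_address() validates the host part and normalizes its text form.
        endpoints.append((str(ipaddress.ip_address(host)), int(port)))
    return endpoints


def is_hairpin(client_ip: str, column: str) -> bool:
    """True if the client pod is itself one of the service's backends."""
    client = str(ipaddress.ip_address(client_ip))
    return any(host == client for host, _ in parse_endpoints(column))


# Values taken from the ovn-kubernetes transcript above.
print(is_hairpin("10.129.2.8", "10.129.2.8:8080"))
```

With more than one endpoint, only the connections that happen to be balanced back to the client's own address would be hairpinned, so a failure could look intermittent rather than constant.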



#### curl passes in SDN with networkpolicy setup:
[root@dhcp-41-193 ~]# oc rsh blue-pod-1
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if21: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 8951 qdisc noqueue state UP 
    link/ether 0a:58:0a:83:00:10 brd ff:ff:ff:ff:ff:ff
    inet 10.131.0.16/23 brd 10.131.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::c064:15ff:fe7d:ad18/64 scope link 
       valid_lft forever preferred_lft forever
/ # ip route
default via 10.131.0.1 dev eth0 
10.128.0.0/14 dev eth0 
10.131.0.0/23 dev eth0  src 10.131.0.16 
172.30.0.0/16 via 10.131.0.1 dev eth0 
224.0.0.0/4 dev eth0 
/ # cat /etc/resolv.conf 
search p7.svc.cluster.local svc.cluster.local cluster.local us-east-2.compute.internal
nameserver 172.30.0.10
options ndots:5
/ # nslookup blue-service.p7.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      blue-service.p7.svc.cluster.local
Address 1: 172.30.11.32 blue-service.p7.svc.cluster.local
/ # nslookup blue-service
nslookup: can't resolve '(null)': Name does not resolve

Name:      blue-service
Address 1: 172.30.11.32 blue-service.p7.svc.cluster.local
/ # curl blue-service:8080
Hello Blue Pod-1 Example
/ # 
/ # 
/ # exit
[root@dhcp-41-193 ~]# oc get ep
NAME           ENDPOINTS          AGE
blue-service   10.131.0.16:8080   15m
[root@dhcp-41-193 ~]# oc rsh blue-pod-1
/ # curl 10.131.0.16:8080
Hello Blue Pod-1 Example
/ #
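
For comparing the two `ip route` outputs above, a longest-prefix-match lookup shows that in both pods traffic to the service IP goes to the gateway on eth0 (through the default route in the OVN pod, through the explicit 172.30.0.0/16 route in the SDN pod), which suggests the pod-side routing is probably not what differs. A rough Python sketch with the routes copied from the transcripts; `pick_route` is a hypothetical helper and ignores metrics and other details of real kernel route selection:

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[str, Optional[str]]  # (destination CIDR, gateway or None if on-link)

# Routes copied from the `ip route` output of each pod.
OVN_ROUTES: List[Route] = [
    ("0.0.0.0/0", "10.129.2.1"),
    ("10.129.2.0/23", None),
]
SDN_ROUTES: List[Route] = [
    ("0.0.0.0/0", "10.131.0.1"),
    ("10.128.0.0/14", None),
    ("10.131.0.0/23", None),
    ("172.30.0.0/16", "10.131.0.1"),
    ("224.0.0.0/4", None),
]


def pick_route(dest: str, routes: List[Route]) -> Route:
    """Return the most specific route whose CIDR contains dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(cidr), via)
        for cidr, via in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {dest}")
    net, via = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), via


# Service IPs as resolved by nslookup in each pod.
print(pick_route("172.30.6.147", OVN_ROUTES))
print(pick_route("172.30.11.32", SDN_ROUTES))
```

The direct curl to the pod's own IP (10.129.2.8) matches the on-link 10.129.2.0/23 route instead, which is consistent with that request succeeding while the request to the service IP times out.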

Comment 2 Dan Winship 2019-08-23 17:50:37 UTC
Filed an upstream bug: https://github.com/ovn-org/ovn-kubernetes/issues/817

Comment 4 Dan Winship 2020-05-07 14:57:26 UTC
Fixed with the upgrade to OVN 2.13.

(This has been fixed for a while, I just forgot there was a bz...)

Comment 7 zhaozhanqi 2020-05-08 11:17:47 UTC
Verified this bug on 4.5.0-0.nightly-2020-05-07-144853

 #oc get svc
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
test-service   ClusterIP   172.30.185.53   <none>        27017/TCP   79s
[zzhao@dhcp-140-240 ~]$ oc rsh -n z1 test-rc-d4x7d
~ $ curl test-service:27017
Hello OpenShift!
~ $

Comment 9 errata-xmlrpc 2020-07-13 17:11:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

