Description of problem:
The same service externalIP test case was run on v4.3 SDN and v4.3 OVN clusters. On OVN, pods cannot curl the service at externalIP:port; the same test passes on SDN.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-13-103541

How reproducible:
Always

Steps to Reproduce:

#### v4.3 SDN testing:

#oc edit networks.config.openshift.io cluster -o yaml to add
  externalIP:
    policy:
      allowedCIDRs:
      - 22.2.2.0/24

#oc login -u testuser-0 -p 06h3AVaB7RJ3
#oc new-project test
#curl -s https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/externalip_service1.json | sed s/10.5.0.1/22.2.2.10/g | oc create -f-

#[root@dhcp-41-193 FILE]# oc get svc
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service-unsecure   ClusterIP   172.30.111.21   22.2.2.10     27017/TCP   43s

#oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/caddy-docker.json

#[root@dhcp-41-193 FILE]# oc get pod
NAME           READY   STATUS    RESTARTS   AGE
caddy-docker   1/1     Running   0          61s

#[root@dhcp-41-193 FILE]# oc rsh caddy-docker
/srv $ curl 22.2.2.10:27017
Hello-OpenShift-1 http-8080
/srv $

#### v4.3 OVN testing:

#oc edit networks.config.openshift.io cluster -o yaml to add
  externalIP:
    policy:
      allowedCIDRs:
      - 22.2.2.0/24

#oc login -u testuser-0 -p eqDURNXRU3Vn
#oc new-project test
#curl -s https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/externalip_service1.json | sed s/10.5.0.1/22.2.2.10/g | oc create -f-

#[root@dhcp-41-193 verification-tests]# oc get svc
NAME               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
service-unsecure   ClusterIP   172.30.140.7   22.2.2.10     27017/TCP   11s

#oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/caddy-docker.json

#[root@dhcp-41-193 verification-tests]# oc get pods
NAME           READY   STATUS    RESTARTS   AGE
caddy-docker   1/1     Running   0          27s

#[root@dhcp-41-193 verification-tests]# oc rsh caddy-docker
/srv $ curl 22.2.2.10:27017
curl: (7) Failed to connect to 22.2.2.10 port 27017: Operation timed out
/srv $

Actual results:
#[root@dhcp-41-193 verification-tests]# oc rsh caddy-docker
/srv $ curl 22.2.2.10:27017
curl: (7) Failed to connect to 22.2.2.10 port 27017: Operation timed out

Expected results:
#[root@dhcp-41-193 FILE]# oc rsh caddy-docker
/srv $ curl 22.2.2.10:27017
Hello-OpenShift-1 http-8080
/srv $

Additional info:
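Before suspecting the network plugin, it can help to rule out a policy mismatch. The sketch below assumes that a service's external IP must fall inside one of the configured allowedCIDRs, and checks that for IPv4 only. The helper names `ip_to_int` and `ip_in_cidr` are hypothetical and are not part of `oc` or any OpenShift tooling.

```shell
# Hypothetical helpers (not part of oc): test whether an IPv4 address
# lies inside a CIDR such as the allowedCIDRs entry used in this report.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if address $1 is inside CIDR $2 (for example 22.2.2.0/24), else 1.
ip_in_cidr() {
  net=${2%/*}          # network part, before the slash
  bits=${2#*/}         # prefix length, after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# The external IP and CIDR from the reproduction steps.
if ip_in_cidr 22.2.2.10 22.2.2.0/24; then
  echo "22.2.2.10 is inside 22.2.2.0/24"
fi
```

In this report the external IP 22.2.2.10 is inside the allowed 22.2.2.0/24 on both clusters, so the policy itself does not explain the difference between SDN and OVN.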
This is what I did:

1. Launch an OVN cluster.
2. Create an elastic IP from the AWS console (in my case 3.136.100.230).
3. Associate the elastic IP with one of the cluster nodes.
4. Follow your steps:

[ricky@ricky-laptop ~]$ oc get svc
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP     PORT(S)     AGE
service-unsecure   ClusterIP   172.30.100.224   3.136.100.230   27017/TCP   110m

[ricky@ricky-laptop ~]$ oc get networks.config.openshift.io cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Network
metadata:
  creationTimestamp: "2019-12-03T10:51:25Z"
  generation: 3
  name: cluster
  resourceVersion: "22952"
  selfLink: /apis/config.openshift.io/v1/networks/cluster
  uid: 2da8890e-bff9-4b90-879b-441a548e8fd7
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  externalIP:
    policy:
      allowedCIDRs:
      - 3.136.100.230/32
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
status:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  clusterNetworkMTU: 8901
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16

5. From my laptop, try to curl 3.136.100.230 on port 27017:

[ricky@ricky-laptop ~]$ curl -s --connect-timeout 2 3.136.100.230:27017
[ricky@ricky-laptop ~]$ echo $?
28

I was expecting that to work, so I'll keep digging.
FWIW, I also opened port 27017 in the node's security group (SG).
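The two failures recorded so far surface different curl exit codes: the in-pod curl reported error 7, while the laptop curl with `--connect-timeout 2` exited with 28. A minimal sketch for labelling those codes when repeating the test follows. The function name `describe_curl_exit` is hypothetical, and only the codes seen in this report are mapped.

```shell
# Hypothetical helper: map the curl exit codes seen in this report
# to a short explanation.
describe_curl_exit() {
  case "$1" in
    0)  echo "success" ;;
    7)  echo "failed to connect to host" ;;
    28) echo "operation timed out" ;;
    *)  echo "other curl error ($1)" ;;
  esac
}

# Example with the exit code from the laptop test above.
describe_curl_exit 28
```

When repeating the check against a live cluster, the function can be called right after the curl command, for example `curl -s --connect-timeout 2 3.136.100.230:27017; describe_curl_exit $?`.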
The PR has merged upstream but is not in a downstream OCP build yet; testing on the latest v4.6 still fails.
Hi Weibin,

The code is in downstream ovn-kubernetes now; feel free to test with the latest 4.6 version.
Hi Alexander,

All test cases that were previously run on the SDN cluster have been tested and passed on a 4.6.0-0.nightly-2020-08-04-103153 OVN cluster.

Can QE close this bug even though it is in POST state?

Thanks,
Weibin
I would say so, but I am unsure what the process is for situations like this. @Ben, do you know? The fix has merged and been verified, but the bot never updated the status because we never had a specific PR for this bug (it was a back-port PR). Should we just update the status to ON_QE and let them set it to VERIFIED?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days