Bug 1875806 - When creating a service of type "LoadBalancer" (Kuryr, OVN), communication through this load balancer fails after 2-5 minutes.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Luis Tomas Bolivar
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1879389
Blocks:
Reported: 2020-09-04 12:30 UTC by Robert Heinzmann
Modified: 2024-03-25 16:25 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1879389 (view as bug list)
Environment:
Last Closed: 2021-02-24 15:17:26 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2020:5633  0        None      None    None     2021-02-24 15:17:46 UTC

Comment 9 Luis Tomas Bolivar 2020-09-24 12:52:57 UTC
This is the OVN fix for this problem: https://review.opendev.org/#/c/753833/

Comment 13 rlobillo 2020-11-20 13:02:31 UTC
Tested on OCP 4.7.0-0.nightly-2020-11-18-203317 over OSP 16.1 with OVN Octavia (compose: RHOS-16.1-RHEL-8-20201110.n.1)

Confirmed that RHOS-16.1-RHEL-8-20201110.n.1 includes the package with the fix:

http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/16.1-RHEL-8/RHOS-16.1-RHEL-8-20201110.n.1/compose/OpenStack/source/tree/Packages/python-networking-ovn-7.3.1-1.20200902233413.el8ost.src.rpm  

# Setting the environment using default ingress port:

oc new-project test
oc run --image kuryr/demo demo
oc expose service/demo

$ openstack port list | grep ingress
| 74d380f9-32c6-4161-a97a-212d1f5238d3 | ostest-9fn97-ingress-port | fa:16:3e:f0:7c:1a | ip_address='10.196.0.7', subnet_id='3ef39638-9aa4-4b26-8376-e7fdbca140ea' | DOWN   |
$ openstack floating ip set --port 74d380f9-32c6-4161-a97a-212d1f5238d3 10.46.22.104

(shiftstack) [stack@undercloud-0 ~]$ tail -1 /etc/hosts
10.46.22.104 demo-test.apps.ostest.shiftstack.com

(shiftstack) [stack@undercloud-0 ~]$ curl demo-test.apps.ostest.shiftstack.com
demo: HELLO! I AM ALIVE!!!


Next, I applied the 'Scaling for ingress traffic by using {rh-openstack} Octavia' procedure (ref: https://github.com/openshift/openshift-docs/pull/23467/files?short_path=6f43ec6#diff-6f43ec66e11c7071e703211dc4f7773b4c4c861924d39c24e6a7437cba323c2a):

$ cat external_router.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    ingresscontroller.operator.openshift.io/owning-ingresscontroller: default
  name: router-external-default
  namespace: openshift-ingress
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  - name: metrics
    port: 1936
    protocol: TCP
    targetPort: 1936
  selector:
    ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  sessionAffinity: None
  type: LoadBalancer


$ oc -n openshift-ingress get svc
NAME                      TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                     AGE
router-external-default   LoadBalancer   172.30.20.242   10.46.22.91   80:30620/TCP,443:31102/TCP,1936:30261/TCP   80m
router-internal-default   ClusterIP      172.30.16.228   <none>        80/TCP,443/TCP,1936/TCP                     24h
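
For anyone scripting this check, the EXTERNAL-IP could be read with a jsonpath query instead of from the table. This is only a minimal sketch: the helper names `lb_external_ip` and `wait_for_lb_ip`, the attempt count, and the delay are assumptions for illustration and were not part of the original verification. The namespace and service name are the ones shown above.

```shell
# Hypothetical helpers; not part of the original test run.

# Print the first load balancer ingress IP of a service (empty if none yet).
lb_external_ip() {
  local namespace="$1" service="$2"
  oc -n "$namespace" get svc "$service" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Poll until the service reports an external IP; give up after ATTEMPTS tries.
wait_for_lb_ip() {
  local namespace="$1" service="$2" attempts="${3:-30}" delay="${4:-10}"
  local i=1 ip
  while [ "$i" -le "$attempts" ]; do
    # Treat an oc error the same as "no IP yet".
    ip=$(lb_external_ip "$namespace" "$service" 2>/dev/null) || ip=""
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    # Do not sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Example (requires a logged-in oc session):
#   wait_for_lb_ip openshift-ingress router-external-default
```

The function returns non-zero if no IP shows up within the given number of attempts, so it can gate the later curl checks.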

# Changing the local resolution to use router-external-default svc:

$ tail -1 /etc/hosts
10.46.22.91 demo-test.apps.ostest.shiftstack.com
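
As an aside, editing /etc/hosts can be avoided by pinning the name to the address for a single request with curl's `--resolve` option. A minimal sketch, assuming plain HTTP on port 80 as in this test; the helper name `curl_via_ip` is hypothetical.

```shell
# Hypothetical helper: request http://HOST/ while forcing HOST to resolve to IP
# for this one curl invocation, leaving /etc/hosts untouched.
curl_via_ip() {
  local host="$1" ip="$2"
  curl --silent --show-error --max-time 10 \
    --resolve "${host}:80:${ip}" \
    "http://${host}/"
}

# Example:
#   curl_via_ip demo-test.apps.ostest.shiftstack.com 10.46.22.91
```

The Host header still carries the route name, so the router should match the route the same way as with the /etc/hosts entry.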

$ curl demo-test.apps.ostest.shiftstack.com -vvv
* Rebuilt URL to: demo-test.apps.ostest.shiftstack.com/
* Uses proxy env variable no_proxy == ',overcloud.ctlplane.redhat.local,overcloud.redhat.local'
*   Trying 10.46.22.91...
* TCP_NODELAY set
* Connected to demo-test.apps.ostest.shiftstack.com (10.46.22.91) port 80 (#0)
> GET / HTTP/1.1
> Host: demo-test.apps.ostest.shiftstack.com
> User-Agent: curl/7.61.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< date: Fri, 20 Nov 2020 12:46:28 GMT
< content-length: 27
< content-type: text/plain; charset=utf-8
< set-cookie: 368e81a3926bc2f60deaf1f00c88a746=aa04daf3e7f13f57765db9508e838ccb; path=/; HttpOnly
< cache-control: private
< 
demo: HELLO! I AM ALIVE!!!
* Connection #0 to host demo-test.apps.ostest.shiftstack.com left intact


# Retrying every 5 minutes over a 10-minute period:

$ while(true); do date && curl demo-test.apps.ostest.shiftstack.com && sleep 300; done
Fri Nov 20 12:47:43 UTC 2020
demo: HELLO! I AM ALIVE!!!
Fri Nov 20 12:52:43 UTC 2020
demo: HELLO! I AM ALIVE!!!
Fri Nov 20 12:57:43 UTC 2020
demo: HELLO! I AM ALIVE!!!
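
Note that the `&&` chain in the loop above skips the `sleep` whenever curl fails, so with a broken load balancer it would retry without any pause. Below is a sketch of a variant that logs every attempt and keeps the interval regardless of the result. The function name `probe_lb`, its arguments, and the log format are assumptions for illustration and were not used in this verification.

```shell
# Hypothetical probe: request URL COUNT times, INTERVAL seconds apart,
# logging a UTC timestamp, the HTTP status and OK/FAIL for every attempt.
# Returns non-zero if any attempt did not get a 200.
probe_lb() {
  local url="$1" count="${2:-3}" interval="${3:-300}"
  local i=1 failures=0 code
  while [ "$i" -le "$count" ]; do
    # --write-out '%{http_code}' prints the status; curl reports 000 when
    # no HTTP response was received (timeout, reset, refused).
    code=$(curl --silent --output /dev/null --max-time 10 \
      --write-out '%{http_code}' "$url") || code="${code:-000}"
    if [ "$code" = "200" ]; then
      echo "$(date -u '+%Y-%m-%d %H:%M:%S') $url $code OK"
    else
      echo "$(date -u '+%Y-%m-%d %H:%M:%S') $url $code FAIL"
      failures=$((failures + 1))
    fi
    # Sleep between attempts even after a failure, but not after the last one.
    if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    i=$((i + 1))
  done
  [ "$failures" -eq 0 ]
}

# Example: three probes, five minutes apart, matching the run above.
#   probe_lb http://demo-test.apps.ostest.shiftstack.com/ 3 300
```

Since the original failure appeared after 2-5 minutes, a shorter interval with a higher count may help narrow down when connectivity drops.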

Comment 19 errata-xmlrpc 2021-02-24 15:17:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

