Bug 1932401 - Cluster Ingress Operator degrades if external LB redirects http to https because of new "canary" route [NEEDINFO]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.7
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Stephen Greene
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks: 1932649
Reported: 2021-02-24 15:08 UTC by Josef Meier
Modified: 2021-11-15 04:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Exposing the default ingress controller via an external load balancer that redirects all HTTP traffic to HTTPS.
Consequence: Ingress canary endpoint checks performed by the ingress operator would fail, which would ultimately cause the ingress cluster operator to become degraded.
Fix: Convert the cleartext canary route to an edge-encrypted route.
Result: The canary route works via HTTPS-only load balancers when insecure traffic is redirected by the load balancer.
Clone Of:
: 1932649 (view as bug list)
Environment:
Last Closed: 2021-07-27 22:48:13 UTC
Target Upstream Version:
sgreene: needinfo?

Links
System                   ID                                            Private  Priority  Status  Summary                                                     Last Updated
Github                   openshift cluster-ingress-operator pull 556   0        None      open    Bug 1932401: Canary: Add edge termination to canary route   2021-02-24 19:38:26 UTC
Red Hat Product Errata   RHSA-2021:2438                                0        None      None    None                                                        2021-07-27 22:48:34 UTC

Description Josef Meier 2021-02-24 15:08:17 UTC
Hi,

in my company we use an external load balancer that redirects HTTP traffic to HTTPS.

During an upgrade from 4.6 to 4.7 the cluster-ingress-operator degraded because it couldn't reach the new canary route in openshift-ingress-canary.

I saw that this canary route is an HTTP route. This won't work in our setup.

I manually added edge termination to this route and immediately the upgrade proceeded.

This is a PR that should add 'edge' termination to the canary route:
https://github.com/openshift/cluster-ingress-operator/pull/555

Thanks and regards,

Josef
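
The manual workaround described above (adding edge termination to the canary route) could look roughly like the following. This is a sketch, not the reporter's exact command: the helper name `canary_edge_patch` is hypothetical, and the `Redirect` insecure policy is an assumption chosen to match the `edge/Redirect` termination shown in the verification output in comment 1.

```shell
# Hypothetical helper: prints a JSON merge patch that adds edge TLS
# termination to a Route and redirects insecure (HTTP) requests to HTTPS.
canary_edge_patch() {
  printf '%s\n' '{"spec":{"tls":{"termination":"edge","insecureEdgeTerminationPolicy":"Redirect"}}}'
}

# Usage (needs a logged-in oc session with rights on the namespace):
#   oc -n openshift-ingress-canary patch route canary \
#     --type=merge -p "$(canary_edge_patch)"
```

Keeping the patch in a function makes it easy to inspect before applying it to the cluster.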

Comment 1 Hongan Li 2021-02-25 03:12:15 UTC
verified with a cluster launched by cluster-bot (launch openshift/cluster-ingress-operator#556) and passed

$ oc get clusterversion
NAME      VERSION                                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2021-02-25-014749-ci-ln-lvfqbrt   True        False         33m     Cluster version is 4.8.0-0.ci.test-2021-02-25-014749-ci-ln-lvfqbrt

$ oc -n openshift-ingress-canary get route
NAME     HOST/PORT                                                                                      PATH   SERVICES         PORT   TERMINATION     WILDCARD
canary   canary-openshift-ingress-canary.apps.ci-ln-lvfqbrt-f76d1.origin-ci-int-gce.dev.openshift.com          ingress-canary   8080   edge/Redirect   None

$ curl -kL http://canary-openshift-ingress-canary.apps.ci-ln-lvfqbrt-f76d1.origin-ci-int-gce.dev.openshift.com
Hello OpenShift!

$ curl -k https://canary-openshift-ingress-canary.apps.ci-ln-lvfqbrt-f76d1.origin-ci-int-gce.dev.openshift.com
Hello OpenShift!
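
The two curl checks above can be folded into one scripted probe. The sketch below is an assumption about how one might automate it and is not part of the operator: `canary_probe_ok` is a hypothetical helper that inspects a summary string built from curl's `%{url_effective}` and `%{http_code}` write-out variables.

```shell
# Hypothetical helper: succeeds only if a probe summary of the form
# "<final-url> <status>" ended on an https:// URL with HTTP status 200.
canary_probe_ok() {
  case "$1" in
    https://*" 200") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (CANARY_HOST is a placeholder for the route's host name):
#   summary=$(curl -ksL -o /dev/null \
#     -w '%{url_effective} %{http_code}' "http://$CANARY_HOST")
#   canary_probe_ok "$summary" && echo "canary reachable over HTTPS"
```

Because curl follows redirects with `-L`, a request that starts on port 80 and ends on an `https://` URL with status 200 indicates the redirect path works end to end.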

Comment 4 Louis Santillan 2021-03-02 01:09:39 UTC
I have a customer (IHAC) who is also hitting this issue, since their F5 ELB is configured to drop all HTTP/80 traffic.  So this bug is related but may require another workaround.  Also, could I request an appropriate docs update (Release Notes and Install pages)?  It seems that HTTP/80 traffic is now fully required in order to upgrade to or install 4.7.

Comment 5 Stephen Greene 2021-03-02 14:13:23 UTC
(In reply to Louis Santillan from comment #4)
> IHAC that is also hitting this issue since their F5 ELB is configured to
> drop all HTTP/80 traffic.  So this bug is related but may require another
> workaround.  Also, could I request an appropriate docs update (Release Notes
> and Install pages)?  It seems now that HTTP/80 traffic is fully required in
> order to upgrade to/install 4.7.

There is a workaround mentioned here
https://github.com/openshift/openshift-docs/pull/29807

Comment 6 Louis Santillan 2021-03-03 18:11:36 UTC
I don't think the TLS termination matters if the packets on port 80 get dropped.

Comment 7 Stephen Greene 2021-03-03 18:21:46 UTC
(In reply to Louis Santillan from comment #6)
> I don't think the TLS termination matters if the packets on port 80 get
> dropped.

Using an edge terminated route means requests for the canary route will come into the cluster on port 443.

Comment 8 Stephen Greene 2021-03-03 18:24:00 UTC
(In reply to Stephen Greene from comment #7)
> (In reply to Louis Santillan from comment #6)
> > I don't think the TLS termination matters if the packets on port 80 get
> > dropped.
> 
> Using an edge terminated route means requests for the canary route will come
> into the cluster on port 443.

Well, I should be more specific. Requests for the edge-terminated canary route will come into the external load balancer on port 443 (which will forward them to the ingress controller's node port).

Comment 9 Stephen Greene 2021-03-03 18:41:53 UTC
Ah, but if traffic to port 80 is dropped, the canary requests won't be able to redirect to use https. Can the customer just use an external load balancer that redirects http traffic to https? Do we officially support using an external load balancer for ingress that drops traffic on port 80?

Sorry for churn with prior comments.
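
To tell the two load balancer behaviours discussed here apart (redirecting port 80 versus dropping it), one could classify a plain HTTP probe by curl's exit code and the status it reports. This is only an illustrative sketch: `classify_http_probe` is a hypothetical helper, and it assumes curl's documented exit codes 7 (failed to connect) and 28 (operation timed out).

```shell
# Hypothetical helper: classify a port-80 probe from curl's exit code ($1)
# and the HTTP status code it reported ($2).
classify_http_probe() {
  exit_code=$1 status=$2
  case "$exit_code" in
    0)
      case "$status" in
        30[1278]) echo "redirected" ;;  # 301, 302, 307 or 308
        *) echo "answered" ;;
      esac ;;
    7) echo "refused" ;;    # connection refused or host unreachable
    28) echo "dropped" ;;   # timed out, e.g. packets silently dropped
    *) echo "failed" ;;
  esac
}

# Usage (CANARY_HOST is a placeholder; no -L, so redirects are not followed):
#   status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
#     "http://$CANARY_HOST"); rc=$?
#   classify_http_probe "$rc" "$status"
```

A "redirected" result means a client can still reach the edge-terminated route by following the redirect, while "dropped" means there is no response to follow, which is the case raised in comment 4.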

Comment 10 Stephen Greene 2021-03-03 18:46:36 UTC
Would it be sufficient to have the canary controller make requests over https (rather than over http + resolve via the route redirect)?

If so, could you open a new BZ to address that issue (and attach a customer case)? Thanks!

Comment 11 Stephen Greene 2021-03-03 19:58:34 UTC
(In reply to Stephen Greene from comment #10)
> Would it be sufficient to have the canary controller make requests over
> https (rather than over http + resolve via the route redirect)?
> 
> If so, could you open a new BZ to address that issue (and attach a
> customer case)? Thanks!

Please see https://bugzilla.redhat.com/show_bug.cgi?id=1934773

Comment 12 Aleksey Usov 2021-03-12 08:12:30 UTC
Adding a DNS entry for the route (wildcards are not allowed by my customer's policy) and edge termination worked, but then I started seeing "x509 certificate signed by unknown authority" errors. I fixed it by adding the CA to the proxy, as described here: https://docs.openshift.com/container-platform/4.7/networking/enable-cluster-wide-proxy.html#nw-proxy-configure-object_config-cluster-wide-proxy.
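
For readers following the linked procedure, adding a CA to the cluster-wide proxy generally means placing the certificate in a ConfigMap in the `openshift-config` namespace and referencing it from `spec.trustedCA` of the `cluster` Proxy object. The sketch below assumes that procedure; the helper name `proxy_trusted_ca_patch`, the ConfigMap name `user-ca-bundle`, and the file path are illustrative and not taken from this report.

```shell
# Hypothetical helper: prints a JSON merge patch that points the
# cluster-wide Proxy object at a ConfigMap ($1) holding extra CA certs.
proxy_trusted_ca_patch() {
  printf '{"spec":{"trustedCA":{"name":"%s"}}}\n' "$1"
}

# Usage (names and path are placeholders):
#   oc create configmap user-ca-bundle -n openshift-config \
#     --from-file=ca-bundle.crt=/path/to/ca.pem
#   oc patch proxy/cluster --type=merge \
#     -p "$(proxy_trusted_ca_patch user-ca-bundle)"
```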

Comment 16 errata-xmlrpc 2021-07-27 22:48:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
