Bug 1826990 - HTTP/2 frontend support breaks oauth flow
Summary: HTTP/2 frontend support breaks oauth flow
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Miciah Dashiel Butler Masters
QA Contact: Hongan Li
URL:
Whiteboard:
Duplicates: 1826992
Depends On: 1825354
Blocks: 1826992
Reported: 2020-04-22 23:55 UTC by Miciah Dashiel Butler Masters
Modified: 2020-05-04 11:50 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1825354
Environment:
Last Closed: 2020-05-04 11:50:00 UTC
Target Upstream Version:

Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift origin pull 24913 | None | closed | Bug 1825354: test/extended/router: Disable HTTP/2 tests | 2020-07-11 04:42:13 UTC
Github | openshift router pull 123 | None | closed | Bug 1826990: Removes ALPN from haproxy frontends | 2020-07-11 04:42:13 UTC
Red Hat Product Errata | RHBA-2020:0581 | None | None | None | 2020-05-04 11:50:32 UTC

Internal Links: 1853711

Description Miciah Dashiel Butler Masters 2020-04-22 23:55:07 UTC
+++ This bug was initially created as a clone of Bug #1825354 +++

In OpenShift 4.4, we enabled HTTP/2 on the ingress controller's frontend; that is, clients can negotiate HTTP/2 with the ingress controller using ALPN.  Because the console and oauth-server are both behind the ingress controller's load balancer and use the same serving certificate (namely the ingress controller's default certificate), a browser might perform connection coalescing[1]: it re-uses the connection that it opened to the console route in order to reach the oauth route.  With HTTP/2 enabled on the frontend, the browser may therefore connect to the console route using HTTP/2 and then re-use that HTTP/2 connection for the oauth route, which fails.

In general, we cannot support HTTP/2 ALPN on routes that use the default certificate without risk of connection re-use/coalescing causing problems of this nature.  To unblock this issue, we can disable HTTP/2 on the frontend.  Later on, in order to support HTTP/2, we will need a solution that enables HTTP/2 only for routes that have custom certificates (which should prevent browsers from coalescing connections).

1. https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/
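
The coalescing decision described above can be sketched roughly as follows. This is a simplified model and not any browser's actual implementation: the names OpenConnection, may_coalesce and san_matches are hypothetical, the hostnames and IP address are placeholders, and the wildcard certificate name is an assumption about what the default certificate covers.

```python
from dataclasses import dataclass
from typing import FrozenSet


def san_matches(pattern: str, host: str) -> bool:
    """Match a certificate DNS name against a hostname.

    A leading '*.' is treated as covering exactly one leftmost label.
    """
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        first, _, rest = host.partition(".")
        return bool(first) and rest == pattern[2:]
    return pattern == host


@dataclass(frozen=True)
class OpenConnection:
    protocol: str                # negotiated via ALPN, e.g. "h2" or "http/1.1"
    peer_ip: str                 # address the connection was opened to
    cert_names: FrozenSet[str]   # DNS names in the served certificate (assumed)


def may_coalesce(conn: OpenConnection, host: str, host_ips: FrozenSet[str]) -> bool:
    """Return True if this simplified model would reuse conn for host."""
    if conn.protocol != "h2":
        return False  # only HTTP/2 connections are shared across hostnames
    if conn.peer_ip not in host_ips:
        return False  # the new host must resolve to the same address
    return any(san_matches(name, host) for name in conn.cert_names)


# Hypothetical cluster: both routes resolve to the same router address and
# are served with the same (assumed wildcard) default certificate.
router_ips = frozenset({"203.0.113.10"})
default_cert = frozenset({"*.apps.example.com"})
oauth_host = "oauth-openshift.apps.example.com"

over_h2 = OpenConnection("h2", "203.0.113.10", default_cert)
over_h1 = OpenConnection("http/1.1", "203.0.113.10", default_cert)

print(may_coalesce(over_h2, oauth_host, router_ips))  # True
print(may_coalesce(over_h1, oauth_host, router_ips))  # False
```

In this model, a route with its own certificate that does not cover the other hostname would fail the last check, which matches the reasoning for limiting HTTP/2 to routes with custom certificates.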

Comment 1 Daneyon Hansen 2020-04-23 00:12:52 UTC
*** Bug 1826992 has been marked as a duplicate of this bug. ***

Comment 2 Scott Dodson 2020-04-23 13:12:00 UTC
Miciah, Can you please answer the questions outlined on the bug this was cloned from, see https://bugzilla.redhat.com/show_bug.cgi?id=1825354#c3

Comment 3 Miciah Dashiel Butler Masters 2020-04-24 21:53:05 UTC
Who is impacted?
  Customers running 4.5.
What is the impact?
  Browsers that use HTTP/2 and perform aggressive connection coalescing may re-use a previous connection to one route to connect to a different route if the two routes use the same certificate.
  In particular, this may break the OAuth flow from the OpenShift console, because the console and OAuth routes both use the default certificate.
How involved is remediation?
  A user can wait 30 seconds for keepalive connections to time out and then retry, or use a browser that does not perform aggressive connection coalescing.
Is this a regression?
  Yes, the issue is caused by HTTP/2 support that was enabled in 4.4.

Comment 4 Miciah Dashiel Butler Masters 2020-04-24 21:55:26 UTC
Sorry, I mixed up my Bugzilla reports (we swapped target releases on this bug and bug 1825354 after I opened this one).  "Customers running 4.5" should be replaced with "Customers running 4.4".  Following is the corrected text:

Who is impacted?
  Customers running 4.4.
What is the impact?
  Browsers that use HTTP/2 and perform aggressive connection coalescing may re-use a previous connection to one route to connect to a different route if the two routes use the same certificate.
  In particular, this may break the OAuth flow from the OpenShift console, because the console and OAuth routes both use the default certificate.
How involved is remediation?
  A user can wait 30 seconds for keepalive connections to time out and then retry, or use a browser that does not perform aggressive connection coalescing.
Is this a regression?
  Yes, the issue is caused by HTTP/2 support that was enabled in 4.4.

Comment 7 Hongan Li 2020-04-26 08:49:03 UTC
Verified with an upgrade from 4.3.12 to 4.4.0-0.nightly-2020-04-25-191512; the issue has been fixed.

After following the steps below, the console, Prometheus, and Grafana UIs are all accessible after the upgrade.

Steps to Reproduce:
1. Create 4.3.12 cluster.
2. Apply OAuth templates.
3. Wait for authentication to finish processing the oauth changes.
4. Remove OAuth templates.
5. Wait for authentication to finish processing the oauth changes.
6. Upgrade to 4.4.0-0.nightly-2020-04-25-191512


The routes still use HTTP/1.1 after the upgrade:
$ curl https://console-openshift-console.apps.hongli-bv.qe.devcluster.openshift.com -k -I
HTTP/1.1 200 OK

$ curl https://oauth-openshift.apps.hongli-bv.qe.devcluster.openshift.com -k -I
HTTP/1.1 403 Forbidden
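
A complementary check is to look at what the frontend selects during ALPN negotiation, rather than at the HTTP status line. The sketch below uses Python's standard ssl module; the function names are hypothetical, certificate verification is disabled to mirror curl -k, and the hostname in the usage note is a placeholder.

```python
import socket
import ssl
from typing import Optional


def make_probe_context() -> ssl.SSLContext:
    """Build a client context that offers both h2 and http/1.1 via ALPN."""
    ctx = ssl.create_default_context()
    # Mirror `curl -k`: skip verification. Only suitable for test clusters.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(host: str, port: int = 443,
                        timeout: float = 5.0) -> Optional[str]:
    """Connect to host and return the ALPN protocol the server selected.

    Returns None when the server does not select any ALPN protocol.
    """
    ctx = make_probe_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def frontend_offers_h2(selected: Optional[str]) -> bool:
    """Interpret the probe result: True only if the server chose HTTP/2."""
    return selected == "h2"
```

For example, frontend_offers_h2(negotiated_protocol("console-openshift-console.apps.example.com")) would be expected to return False against a router whose frontends no longer advertise ALPN, assuming the host is reachable.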

Comment 9 errata-xmlrpc 2020-05-04 11:50:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581