1686476 – Login flows directing user to api discover page instead of console

Summary: Login flows directing user to api discover page instead of console

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	apiserver-auth
Sub Component:
Version:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Mo
QA Contact:	Chuan Yu
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-03-07 14:23 UTC by Justin Pierce
Modified:	2019-06-04 10:45 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:45:17 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0758	0	None	None	None	2019-06-04 10:45:33 UTC

Description Justin Pierce 2019-03-07 14:23:43 UTC

Description of problem:
Logging in to the console results in constant redirection to a page showing the available APIs.

{
  "paths": [
    "/apis",
    "/healthz",
    "/healthz/log",
    "/healthz/ping",
    "/healthz/poststarthook/oauth.openshift.io-startoauthclientsbootstrapping",
    "/metrics"
  ]
}

The preceding page content results from a 404 during the auth flow:
Request URL: https://console-openshift-console.apps.rhcos-cr.rhcos.sandbox.openshift.com/auth/callback?code=jS2....p4qc&state=90db818c
Request Method: GET
Status Code: 404 
Remote Address: 18.205.100.105:443



Version-Release number of selected component (if applicable):
version   4.0.0-0.alpha-2019-03-05-054505

How reproducible:
100% on two recent clusters

Comment 3 Dan Mace 2019-03-18 21:04:06 UTC

I now have reason to believe the underlying problem is https://bugzilla.redhat.com/show_bug.cgi?id=1688390 — need to do a bit more diagnosis to confirm.

Comment 4 Dan Mace 2019-03-18 21:17:56 UTC

Created https://bugzilla.redhat.com/show_bug.cgi?id=1690146 to track it in 4.0

Comment 5 Dan Mace 2019-03-20 14:26:05 UTC

Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1688390 and https://bugzilla.redhat.com/show_bug.cgi?id=1690146 — yesterday I was able to reproduce the console problem in a cluster which was NOT exhibiting the suspicious haproxy process issue, so the problems may not be related after all.

Comment 7 Hongan Li 2019-03-26 06:38:27 UTC

(In reply to Dan Mace from comment #5)
> Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1688390 and
> https://bugzilla.redhat.com/show_bug.cgi?id=1690146 — yesterday I was able
> to reproduce the console problem in a cluster which was NOT exhibiting the
> suspicious haproxy process issue, so the problems may not be related after
> all.

Hi Dan, I tried but could not reproduce this issue with several recent nightly builds.
Could you please provide detailed steps if you can reproduce it?
Thanks.

Comment 8 Samuel Padgett 2019-03-26 12:07:45 UTC

We're only seeing this on certain clusters. Unfortunately, we don't have specific steps to reproduce.

Comment 16 Dan Mace 2019-03-27 18:02:13 UTC

We've identified the root issue as an edge case in our passthrough route handling of HTTP2 endpoints. Given the following conditions:

* A wildcard ingress certificate (eg. *.apps.openshift.example.com)
* A DNS wildcard (eg. *.apps.openshift.example.com) with A records resolving to a static set of ingress load balancer IPs
* A passthrough route to an HTTP2 server (eg. auth.apps.openshift.example.com)
* An edge or reencrypted route to a server in the same subdomain (eg. console.apps.openshift.example.com)
* An HTTP2 client which coalesces connections (eg. Chrome/Firefox)

It's possible for packets destined for a proxy-terminated route (eg. console) to be misdirected to the passthrough/HTTP2 route (eg. auth).

In brief, a connection to the passthrough/HTTP2 server may be reused by the client for requests destined for other servers for which the wildcard certificate is valid. Because both route host names are valid for the certificate and resolve to the same IPs through DNS, the client considers an existing HTTP2 server connection reusable for the other servers' requests. However, the proxy binds a passthrough connection to the HTTP2 server once, based on the SNI value sent in the initial TLS handshake, and the encrypted traffic that follows is opaque to it. The proxy therefore cannot identify later requests as misdirected and forwards them all to the HTTP2 server.
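
The reuse decision described above can be sketched roughly as follows. This is a simplified model of client-side connection coalescing, not the logic of any particular browser: the function names (cert_covers, may_coalesce) and the example IPs are hypothetical, and real clients apply additional checks.

```python
from typing import Iterable


def cert_covers(hostname: str, cert_names: Iterable[str]) -> bool:
    """Return True if any certificate name matches hostname.

    Simplified wildcard rule: "*" may only be the left-most label and
    stands for exactly one label.
    """
    host_labels = hostname.lower().split(".")
    for pattern in cert_names:
        pat_labels = pattern.lower().split(".")
        if len(pat_labels) != len(host_labels):
            continue
        if pat_labels == host_labels:
            return True
        if pat_labels[0] == "*" and pat_labels[1:] == host_labels[1:]:
            return True
    return False


def may_coalesce(new_host: str, new_host_ips: Iterable[str],
                 conn_cert_names: Iterable[str], conn_ip: str) -> bool:
    """Model of the client's decision to reuse an existing HTTP2 connection.

    Reuse is allowed when the certificate presented on the existing
    connection is valid for the new host AND the new host resolves to
    the IP the connection is already using.
    """
    return cert_covers(new_host, conn_cert_names) and conn_ip in set(new_host_ips)


if __name__ == "__main__":
    # Hypothetical values: a connection opened to the auth route with the
    # wildcard ingress certificate, then a request for the console route.
    wildcard = ["*.apps.openshift.example.com"]
    lb_ips = ["192.0.2.10", "192.0.2.11"]
    print(may_coalesce("console.apps.openshift.example.com",
                       lb_ips, wildcard, "192.0.2.10"))
```

With the wildcard certificate and shared load balancer IPs from the conditions listed earlier, both checks pass, so the console request travels over the connection that the proxy has already bound to the auth server.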

Solutions could be:

1. Discontinue use of HTTP2 for these services
2. Implement mutual TLS in the ingress controller to enable terminating TLS at the proxy for the auth server
3. Implement HTTP 421 misdirected request support at the auth server to hint clients to stop reusing the connection for the request authority

Our current recommendation in the short term is (3), and longer term we would like to implement mTLS (2) for ease of use.
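
As an illustration of option (3), the check a server could make before handling a request might look like the sketch below. It is not the code from the eventual fix; the helper names (normalize_authority, check_authority) and the set of served hosts are assumptions made for the example.

```python
from http import HTTPStatus
from typing import Collection, Optional


def normalize_authority(authority: str) -> str:
    """Lower-case the request authority and drop a trailing ":port", if any."""
    host = authority.strip().lower()
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        return name
    return host


def check_authority(authority: str,
                    served_hosts: Collection[str]) -> Optional[HTTPStatus]:
    """Return 421 Misdirected Request if the authority is not one we serve.

    Returning None means the request is for this server and should be
    handled normally.
    """
    if normalize_authority(authority) in served_hosts:
        return None
    return HTTPStatus.MISDIRECTED_REQUEST


if __name__ == "__main__":
    # Hypothetical host name for the auth server behind the passthrough route.
    served = {"auth.apps.openshift.example.com"}
    print(check_authority("auth.apps.openshift.example.com:443", served))
    print(check_authority("console.apps.openshift.example.com", served))
```

A client that receives 421 is allowed to retry the request on a different connection, which is what lets the console request reach the proxy-terminated route instead of the auth server.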

References:

* https://httpwg.org/specs/rfc7540.html#reuse
* https://httpwg.org/specs/rfc7540.html#MisdirectedRequest
* https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing

Comment 17 Dan Mace 2019-03-28 15:09:37 UTC

Another viable solution Paul Weil presented is to use a separate ingresscontroller, domain, and cert for auth.

In the meantime, users can probably work around the issue by waiting for connections to expire (eg. ~30s) and reloading the console page (or using a new browser session).

Comment 20 Mo 2019-04-14 03:52:00 UTC

https://github.com/openshift/origin/pull/22529 is merged

Comment 22 Chuan Yu 2019-04-19 05:19:45 UTC

The issue was not found on the nightly build below, so moving to VERIFIED. Please re-open if the problem occurs again.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-04-18-210657   True        False         170m    Cluster version is 4.1.0-0.nightly-2019-04-18-210657

Comment 24 errata-xmlrpc 2019-06-04 10:45:17 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
