Bug 1743657
| Summary: | ERR_TOO_MANY_RETRIES loop logging in to console | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
| Component: | Management Console | Assignee: | Samuel Padgett <spadgett> |
| Status: | CLOSED ERRATA | QA Contact: | Yadan Pei <yapei> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.2.0 | CC: | aos-bugs, ccoleman, fdeutsch, jokerman, lszaszki, mfojtik, mfranczy, tjelinek, ukalifon, walemark, yapei |
| Target Milestone: | --- | | |
| Target Release: | 4.2.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1748727 | Environment: | |
| Last Closed: | 2019-10-16 06:36:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1748727 | | |
| Attachments: | | | |
Description
Mike Fiedler
2019-08-20 12:03:10 UTC
I couldn't reproduce the issue with a cluster in the following version:
Client Version: version.Info{Major:"4", Minor:"1+", GitVersion:"v4.1.0+f931880-1318", GitCommit:"f931880eb3", GitTreeState:"clean", BuildDate:"2019-08-19T15:12:51Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.0+ee8999f", GitCommit:"ee8999f", GitTreeState:"clean", BuildDate:"2019-08-20T15:03:50Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
OpenShift Version: 4.2.0-0.okd-2019-08-21-062313
Nevertheless, I decided to count the number of redirects, and I noticed that during or right after login to the console the number of redirects is small (<4), but right after that the web app starts to send lots of requests to https://console-openshift-console.apps.cluster-gda.devcluster.openshift.com/api/kubernetes//apis/subresources.kubevirt.io/v1alpha3/healthz.
I think that all requests to that URL end with a 301 HTTP status code (Moved Permanently). The Location header of the responses points to "/api/kubernetes/apis/subresources.kubevirt.io/v1alpha3/healthz". That triggers GET requests to https://console-openshift-console.apps.cluster-gda.devcluster.openshift.com/api/kubernetes/apis/subresources.kubevirt.io/v1alpha3/healthz, which end with a 404 HTTP status code (Not Found). I can imagine that modern browsers (including Chrome) have a fuse that counts the number of redirects and warns if there are too many.
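To illustrate how the doubled slash in the first URL can arise, here is a minimal TypeScript sketch. The `joinPath` helper is hypothetical and is not taken from the console source; the two path fragments are the ones visible in the URLs above, and how the console actually builds the URL is an assumption.

```typescript
// Hypothetical helper, not from the console source: joins a base path and a
// resource path with exactly one slash between them.
const joinPath = (base: string, path: string): string =>
  `${base.replace(/\/+$/, '')}/${path.replace(/^\/+/, '')}`;

// Path fragments as they appear in the URLs quoted in this comment.
const base = '/api/kubernetes';
const healthz = '/apis/subresources.kubevirt.io/v1alpha3/healthz';

// Naive concatenation adds a separator even though the path already starts
// with one, which yields the "//" that the server answers with a 301.
const naive = `${base}/${healthz}`;

// The helper strips the redundant slashes before joining.
const joined = joinPath(base, healthz);

console.log(naive);
console.log(joined);
```

Normalizing at the point where the two fragments are joined avoids the extra 301 round trip for every health check, regardless of whether the caller passes a leading slash.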
I tried with:
Client Version: version.Info{Major:"4", Minor:"1+", GitVersion:"v4.1.0+f931880-1318", GitCommit:"f931880eb3", GitTreeState:"clean", BuildDate:"2019-08-19T15:12:51Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.0+0eda05f", GitCommit:"0eda05f", GitTreeState:"clean", BuildDate:"2019-08-19T22:55:10Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
OpenShift Version: 4.2.0-0.ci-2019-08-20-075
It behaves exactly the same as with the "4.2.0-0.okd-2019-08-21-062313" version; see my previous comment.
Created attachment 1606455: before login
Created attachment 1606456: after login
I see a ton of "cancelled" requests. This may actually be a web console bug.

I think based on the info we have now we can move this to web console.

Tomas, it's one thing to see why the health check fails, but it also seems that the console is looking at a wrong URL (leading to a redirect).

Marcin, can you tell if the health check URL is correct?

The failed health checks are a symptom of Bug 1738292. I suspect it's a red herring, though. Here's the Chromium comment for TOO_MANY_RETRIES:

// An HTTP transaction was retried too many times due for authentication or
// invalid certificates. This may be due to a bug in the net stack that would
// otherwise infinite loop, or if the server or proxy continually requests fresh
// credentials or presents a fresh invalid certificate.
NET_ERROR(TOO_MANY_RETRIES, -375)

https://github.com/chromium/chromium/blob/966c0e95c915aba3b75eb432957cd421bac3ef86/net/base/net_error_list.h#L759-L763

Thinking more on this, it might be the problem.

I haven't been able to reproduce. A HAR with the network requests would be immensely helpful if anyone is able to reproduce.

1. Open Google Chrome
2. Open Developer Tools (Ctrl-Shift-I or Cmd-Shift-I for macOS)
3. Switch to the Network tab
4. Click the "Preserve log" checkbox
5. Reproduce the problem
6. Right click a row in the network tab and select "Save all as HAR with content"

Samuel, bug 1738292 would indeed explain it.

The leading slash here is wrong, causing the redirect:

https://github.com/spadgett/console/blob/ae8fe60fc1e1a1788aa01cacf2ebf5343cf47869/frontend/packages/kubevirt-plugin/src/plugin.tsx#L149-L158
https://github.com/spadgett/console/blob/ae8fe60fc1e1a1788aa01cacf2ebf5343cf47869/frontend/public/actions/dashboards.ts#L71

This PR fixes the redirect, although it's unclear if that's the cause of the TOO_MANY_RETRIES error.

https://github.com/openshift/console/pull/2438

*** Bug 1737423 has been marked as a duplicate of this bug. ***

GET chrome-extension://fmkadmapgofadopljbjfkapdkoienihi/build/backend.js net::ERR_UNKNOWN_URL_SCHEME
(anonymous) @ VM54:7
(anonymous) @ VM54:9

And I can see this error in the JS console.

It looks like the JS error is from an extension you have installed. I'm not sure if that is contributing to the problem or not.

I opened https://github.com/openshift/console/pull/2485 to make sure we're not trying to logout/login more than once due to multiple concurrent requests returning unauthorized. I believe that's the basic problem.

We've fixed the unnecessary redirect and canceled logout requests. Looking at the screenshot from comment #14, I'm not convinced that we've fixed the underlying problem. Console is redirecting to the `/authorize` endpoint on the OAuth server. The status is `(failed)` with no actual status code. I spoke with Mike, and he said he was not prompted to accept the certificate from the OAuth server. You can see it bouncing back and forth between the `login` and `authorize` endpoints, which results in the TOO_MANY_RETRIES error. I believe the other failed requests are red herrings. It seems like Chrome is blocking the request to the OAuth server, perhaps due to something in the OAuth server certificate or possibly an extension like an ad blocker? The console error referencing the extension is suspicious as well.

Assigning back to the Auth team for now. The console team has fixed the bad requests from comment #1 and comment #5, and the failing request based on the screenshots from Yadan in comment #14 is to the OAuth server.

I think that I finally know how to reproduce the issue and what caused it. tl;dr: I think that we have already fixed it with https://github.com/openshift/console/pull/2485

I think that all failed requests without status are indeed blocked/canceled by the browser and never reach the server.
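As a rough illustration of the idea described for pull request 2485 (logging out at most once even when several concurrent requests return unauthorized), a guard could look like the TypeScript sketch below. This is not the code from that pull request; every name here (`logoutOnce`, `redirectToLogin`, `handleResponseStatus`) is hypothetical, and the redirect is replaced by a counter so the behavior can be observed.

```typescript
// Hypothetical names throughout; this is not the code from the pull request.
let logoutStarted = false;
let logoutCalls = 0;

// Stand-in for whatever actually sends the browser into the login flow.
const redirectToLogin = (): void => {
  logoutCalls += 1;
};

// Starts the logout/login redirect at most once, however many requests fail.
const logoutOnce = (): void => {
  if (logoutStarted) {
    return;
  }
  logoutStarted = true;
  redirectToLogin();
};

// Called for every API response; only 401 Unauthorized triggers a logout.
const handleResponseStatus = (status: number): void => {
  if (status === 401) {
    logoutOnce();
  }
};

// Simulate several concurrent requests, three of which come back unauthorized.
[401, 401, 200, 401].forEach(handleResponseStatus);
console.log(logoutCalls);
```

Without the `logoutStarted` flag, each of the three unauthorized responses would start its own redirect, which matches the burst of repeated logout and login requests described in this thread.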
In general, browsers are allowed to block/cancel a request; one example would be too many concurrent connections to the same origin (browsers have limits). In our case, some requests are blocked because they are coming from an untrusted source (net::ERR_CERT_AUTHORITY_INVALID), and I suspect that the browser prefers to warn the user and get consent before proceeding.

After analysing the network traffic and the *.har file attached to this issue (fixing the unnecessary redirect helped a lot as it made it more apparent), I realised that we send multiple logout requests, which I suspect trigger login requests that are redirected to the /authorize endpoint and blocked by the browser (net::ERR_CERT_AUTHORITY_INVALID). The only issue with that theory was that I couldn't reproduce it. I needed more login requests. How can you increase the number of requests? You could ask the browser to resend them :). So I decided to enable throttling, and that was it: I started seeing the net::ERR_TOO_MANY_RETRIES error.

Steps to reproduce:

1. take a version before https://github.com/openshift/console/pull/2485
2. make sure you are not logged in
3. re-enable the security warnings for the site if you have already disabled them - I'm not sure the issue can be reproduced with the cert accepted
4. enable throttling (Fast3G, Slow3G)
5. try to log in but do not trust the certificate

You might need to repeat step 5 a few times before you run into ERR_TOO_MANY_RETRIES.

WDYT, does it make sense?

Moving to modified based on comment #20.

I didn't meet the issue on several builds with #2485. @Mike, is it ok to verify it?

Marking this verified on most recent nightly. I think there is still some suspicion that issues in this area remain, but we can work them under a new bz if needed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

There's a long-standing bug in Chromium regarding how links without protocols are handled. This error does not have a single solution to date because it arises for a multitude of reasons. The ERR_UNKNOWN_URL_SCHEME error is commonly caused by a browser issue: there's no application on your device that can handle that particular action. It is a Chromium bug. In Chrome version 40 and up, this bug has resurfaced, but only if you are manually entering the URL of the redirect page in the address bar. The bug in Chromium is responsible, yet every time a patch is added to solve it, the error finds a new way to resurface. The issue is on the Chromium issue tracker here: https://bugs.chromium.org/p/chromium/issues/detail?id=459156

More Info: http://net-informations.com/q/mis/scheme.html

Common solutions:

- Prefixing your links with http:// (or https://) should resolve the issue in some cases.
- If the Err_Unknown_Url_Scheme error occurs in mailto: or tel: links inside an iframe, you can try adding target="_blank" to the link.