Bug 2115814
| Summary: | Issues with samples in a disconnected cluster in OCP 4.9 | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Andy Bartlett <andbartl> |
| Component: | Dev Console | Assignee: | Christoph Jerolimov <cjerolim> |
| Status: | CLOSED ERRATA | QA Contact: | spathak <spathak> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.9 | CC: | cjerolim, nmukherj |
| Target Milestone: | --- | | |
| Target Release: | 4.12.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-01-17 19:54:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Description
Andy Bartlett
2022-08-05 13:02:21 UTC
@andbartl this happens after updating to 4.10, right? If so, I would like to update the affected version to 4.10.

My guess (we need to verify this) is that our UI waits in the catalog until all network calls have finished (successfully or not) before it shows the samples. Unfortunately, the network call for devfile samples doesn't fail immediately (why not?). It took "exactly" 30 seconds until the call timed out and we showed the result, which in this case means the other samples.

What we can do:

1. Reproduce this on a totally disconnected cluster. (I tested this today on a disconnected cluster and learned that this means a disconnected cluster with a proxy :))
2. Check whether there is any parameter that tells us we're running on a totally disconnected cluster, so that we don't fetch Devfile samples (from the official devfile registry) in this case.
3. Update the developer catalog so that it shows results after n seconds (n roughly 2-5 seconds?) even if other network calls are still pending. Waiting until either all network calls finish or these n seconds pass makes sense, so that the UI doesn't flicker when everything responds in 100-500 ms.

Hi Andy (andbartl),

TL;DR: Samples has the same issue as the developer catalog. Both show Devfiles, and we had timeout issues with them on disconnected clusters.

We have two fixes for the Samples implemented and backported:

1. The proxy support, so that Devfiles can be loaded on a disconnected cluster. (I need to check whether the import works as well. I will follow up on this asap.)
2. The developer catalog shows all items after 3 seconds, independent of any network call taking more time.

Both changes are available in our releases 4.12.0, 4.11.8, and 4.10.37. The proxy support is also available in 4.9.50. The UI fix to show other items after 3 seconds is in our merge queue and should be part of the next 4.9 release.

Additional fixes that are not released yet:

3. A reduced Devfile API timeout from 30 to 10 seconds is in code review.
4. 
We implemented a reduced timeout when loading Helm charts for the Developer catalog.

Let me know if you need more details. I will try to update this ticket from time to time until all PRs are merged.

========================================================================
Here is a full overview of all related issues. (October 27th)
========================================================================

## 1. Developer catalog fails to load => Proxy support added when loading Devfiles

Old versions of the Devfile API ignore the proxy configuration on a disconnected cluster. The new version uses the proxy configuration correctly. This doesn't help fully disconnected clusters: with this fix alone, the API calls still timed out after 30 seconds. (See the next two fixes!)

- 4.12.0 / https://bugzilla.redhat.com/show_bug.cgi?id=2112812 / https://github.com/openshift/console/pull/12011
- 4.11.5 / https://issues.redhat.com/browse/OCPBUGS-1030 / https://github.com/openshift/console/pull/12028
- 4.10.35 / https://issues.redhat.com/browse/OCPBUGS-1634 / https://github.com/openshift/console/pull/12040
- 4.9.50 / https://issues.redhat.com/browse/OCPBUGS-1635 / https://github.com/openshift/console/pull/12041
- 4.8 / doesn't load devfiles from the devfile registry, so no update is needed

========================================================================

## 2. Show already loaded catalog items after a timeout (3sec)

The first issue is that the Developer catalog and the Samples catalog wait 30 seconds (until the Devfile network call times out) before showing anything. This was a frontend issue that we fixed: after 3 seconds we now show everything that has loaded by then. It still takes 30 seconds until the error is shown, at least until the timeout in the next fix is merged.

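The behaviour described in section 2 (show whatever has loaded once a short grace period expires, even if a slower call is still pending) could be sketched roughly as follows. This is only an illustration in TypeScript, not the code from the linked PRs: `loadCatalog`, `Loader`, `CatalogItem`, `CatalogResult`, and the `graceMs` parameter are hypothetical names chosen for this example.

```typescript
// Hypothetical types and names; not the OpenShift console's actual code.
type CatalogItem = { name: string; source: string };
type Loader = () => Promise<CatalogItem[]>;

interface CatalogResult {
  items: CatalogItem[]; // everything that had loaded when we stopped waiting
  pending: number; // loaders that were still in flight at that moment
}

// Resolves as soon as every loader has settled, or after graceMs,
// whichever happens first. Failed loaders simply contribute no items.
async function loadCatalog(loaders: Loader[], graceMs = 3000): Promise<CatalogResult> {
  const items: CatalogItem[] = [];
  let pending = loaders.length;

  // Each chain swallows its own error, so `settled` never rejects.
  const settled = Promise.all(
    loaders.map((load) =>
      Promise.resolve()
        .then(load)
        .then((loaded) => {
          items.push(...loaded);
        })
        .catch(() => {
          // A failing source (for example an unreachable registry) is ignored here.
        })
        .then(() => {
          pending -= 1;
        }),
    ),
  );

  const grace = new Promise<void>((resolve) => {
    const timer = setTimeout(resolve, graceMs);
    // Cancel the grace timer once everything has settled on its own.
    settled.then(() => clearTimeout(timer));
  });

  await Promise.race([settled, grace]);
  // Return a snapshot; late loaders must not mutate what the caller renders.
  return { items: [...items], pending };
}
```

When all sources answer within a few hundred milliseconds, the result is returned immediately and the grace period never comes into play, which matches the goal of avoiding UI flicker. A real UI would presumably also re-render once the pending loaders finish; that part is left out of this sketch.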
- 4.12.0 / https://issues.redhat.com/browse/OCPBUGS-270 / https://github.com/openshift/console/pull/12019
- 4.11.8 / https://issues.redhat.com/browse/OCPBUGS-1523 / https://github.com/openshift/console/pull/12070
- 4.10.37 / https://issues.redhat.com/browse/OCPBUGS-1759 / https://github.com/openshift/console/pull/12106
- 4.9.? / https://issues.redhat.com/browse/OCPBUGS-2008 / https://github.com/openshift/console/pull/12136 / in merge queue
- 4.8.? / planned when 4.9 is merged

========================================================================

## 3. Developer catalog fails to load => Reduce Devfile timeout

On fully disconnected clusters the API call to the devfile registry takes up to 30 seconds. The devfile registry calls now use a reduced timeout of 10 seconds. Whatever delays the network call, this helps the UI show an error earlier.

- 4.12.0 / https://issues.redhat.com/browse/OCPBUGS-1106 / https://github.com/openshift/console/pull/12043 / needs validation
- 4.11.? / https://issues.redhat.com/browse/OCPBUGS-2716 / https://github.com/openshift/console/pull/12186 / in code review
- 4.10.? / https://issues.redhat.com/browse/OCPBUGS-2717 / https://github.com/openshift/console/pull/12191 / in code review
- 4.9.? / https://issues.redhat.com/browse/OCPBUGS-2718 / https://github.com/openshift/console/pull/12192 / in code review
- 4.8 / doesn't load devfiles from the devfile registry, so no update is needed

========================================================================

## 4. 
No helm chart could be loaded if one timed out (reduced timeout per chart repository to 5 seconds)

- 4.12.0 / https://issues.redhat.com/browse/OCPBUGS-803 / https://github.com/openshift/console/pull/12096
- 4.11.8 / https://issues.redhat.com/browse/OCPBUGS-1782 / https://github.com/openshift/console/pull/12107

(internal follow up)

- 4.12.0 / https://issues.redhat.com/browse/OCPBUGS-2344 / https://github.com/openshift/console/pull/12141
- 4.11.8 / https://issues.redhat.com/browse/OCPBUGS-2515 / https://github.com/openshift/console/pull/12182

UI change to show alerts when some chart repositories could not be fetched

- 4.12.0 / https://issues.redhat.com/browse/OCPBUGS-1959 / https://github.com/openshift/console/pull/12200
- 4.8-4.10 / tbd.

========================================================================

I updated this to in progress and would recommend closing it when the 4.9 PR https://github.com/openshift/console/pull/12136 is merged.

Hi @andbartl, we have already backported most of the PRs. In particular, "Show already loaded catalog items after a timeout (3sec)" is now released; it is part of 4.9.52. 4.8 is still in progress. I'm closing this. You can follow this ticket for 4.8: https://issues.redhat.com/browse/OCPBUGS-4120

Verified on a fully disconnected cluster with version: 4.9.0-0.nightly-2022-11-30-072039
Browser version: Chrome 106

Changed Doc Type to "No Doc Update" because, from the customer perspective, this issue is really similar to https://issues.redhat.com/browse/OCPBUGS-270. We kept two bugs because we improved both the backend and the frontend.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399
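
For reference, the idea behind section 4 (one slow chart repository should not prevent the others from loading, and the UI should be able to alert about the repositories that failed) can be illustrated with a per-repository timeout. The sketch below is an assumption-laden TypeScript illustration, not the code from the linked PRs: `withTimeout`, `loadCharts`, `FetchIndex`, `ChartCatalog`, and `perRepoMs` are hypothetical names, and the 5000 ms default only mirrors the 5 seconds mentioned above.

```typescript
// Hypothetical names throughout; this is not the OpenShift console's implementation.
type FetchIndex = (repo: string) => Promise<string[]>;

interface ChartCatalog {
  charts: string[]; // charts from every repository that answered in time
  failedRepos: string[]; // repositories that errored or timed out
}

type RepoOutcome =
  | { repo: string; ok: true; charts: string[] }
  | { repo: string; ok: false; reason: string };

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Fetches every repository independently so one failure cannot block the rest.
async function loadCharts(
  repos: string[],
  fetchIndex: FetchIndex,
  perRepoMs = 5000,
): Promise<ChartCatalog> {
  const outcomes = await Promise.all(
    repos.map((repo): Promise<RepoOutcome> =>
      withTimeout(Promise.resolve().then(() => fetchIndex(repo)), perRepoMs, repo).then(
        (charts): RepoOutcome => ({ repo, ok: true, charts }),
        (error: unknown): RepoOutcome => ({
          repo,
          ok: false,
          reason: error instanceof Error ? error.message : String(error),
        }),
      ),
    ),
  );

  const charts: string[] = [];
  const failedRepos: string[] = [];
  for (const outcome of outcomes) {
    if (outcome.ok) {
      charts.push(...outcome.charts);
    } else {
      failedRepos.push(outcome.repo);
    }
  }
  return { charts, failedRepos };
}
```

The `failedRepos` list is what a UI could use to show an alert for repositories that could not be fetched, while still rendering the charts that did load.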